CN107481723A - Channel matching method and device for voiceprint recognition - Google Patents

Channel matching method and device for voiceprint recognition

Info

Publication number
CN107481723A
Authority
CN
China
Prior art keywords
voice
channel
speech data
data
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710751356.2A
Other languages
Chinese (zh)
Inventor
梁永立
何亮
吴晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201710751356.2A priority Critical patent/CN107481723A/en
Publication of CN107481723A publication Critical patent/CN107481723A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/04 Training, enrolment or model building
    • G10L17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention proposes a channel matching method and device for voiceprint recognition, belonging to the fields of speech recognition and voice communication. The method first acquires speech data and encodes it with the speech codec of the communication mode to be simulated, obtaining compressed speech data. It then applies bit errors to the compressed speech data according to the bit error rate of the simulated channel under that communication mode, obtaining channel-simulated speech data. Finally, it decodes the channel-simulated speech data to obtain the speech under that communication mode. The device comprises a speech acquisition and reading module, a speech encoding module, a channel bit-error simulation module, a speech decoding module, and a data storage module. The invention can simulate voice communication over fixed-line telephony, VoIP network telephony, WeChat calls, QQ calls, 2G, 3G, and 4G, thereby producing training speech whose channel conditions match those of the test speech. This effectively solves the channel mismatch problem and suits the application needs of voiceprint recognition.

Description

Channel matching method and device for voiceprint recognition
Technical field
The present invention relates to the fields of speech recognition and voice communication, and specifically to a channel matching method and device for voiceprint recognition.
Technical background
Voiceprint recognition, also called speaker recognition, is a biometric identification technology that uses a computer to determine a speaker's identity automatically from speech. It can be classified in several ways depending on the application scenario. According to whether the spoken content is known, it is divided into text-dependent and text-independent recognition. According to the recognition task, it is divided into speaker identification and speaker verification. Voiceprint recognition is mainly applied in fields such as security monitoring, criminal investigation and justice, and e-commerce.
In recent years, mainstream text-independent speaker identification (hereinafter, speaker recognition) has been based on the Gaussian mixture model-universal background model (GMM-UBM) speaker recognition system proposed by Douglas A. Reynolds in 2000. From the perspective of speaker recognition, the GMM-UBM system provides a theoretical framework and an implementation for measuring the similarity of two speech segments, and it is regarded as a milestone.
Voice communication refers to communication by voice over a transmission medium, including fixed-line calls, mobile phone calls, walkie-talkie calls, and voice chat over networks, collectively called voice calls. Common voice communication modes today include fixed-line telephony, VoIP network telephony, WeChat calls, QQ calls, 2G, 3G, and 4G.
The public switched telephone network (PSTN) is the telephone network commonly used in daily life. The PSTN is a circuit-switched network based on analog technology, and its speech coding algorithm is G.711 A-law or μ-law coding. VoIP network telephony commonly uses the International Telecommunication Union G.723 standard, specifically algebraic code-excited linear prediction (ACELP) coding. WeChat calls and QQ calls use narrowband adaptive multi-rate (AMR-NB) coding. 2G communication, i.e. second-generation mobile communication, includes the GSM and CDMA1x systems: China Mobile and China Unicom use the GSM standard for 2G, while China Telecom uses the CDMA1x standard. GSM speech coding is regular pulse excitation with long-term prediction (RPE-LTP). 3G communication includes China Mobile's independently formulated TD-SCDMA, China Unicom's WCDMA, and China Telecom's CDMA2000. TD-SCDMA and WCDMA both use adaptive multi-rate coding, either AMR-NB or AMR-WB. China Telecom's 2G and 3G use the enhanced variable rate codec (EVRC) or QCELP coding. For 4G, China Mobile uses the TD-LTE (Time Division Long Term Evolution) standard, while China Unicom and China Telecom use the FDD-LTE standard. 4G voice uses high-definition VoLTE calls, with adaptive multi-rate (AMR) speech coding.
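The pairing of communication modes and codecs described above can be collected into a lookup table. The following is a minimal sketch; the dictionary keys and the helper name `codecs_for` are illustrative choices, not names from the patent, and only the codec names themselves are taken from the paragraph above.

```python
from typing import Dict, List

# Codec(s) per communication mode, as listed in the background section.
# Keys are hypothetical identifiers chosen for this sketch.
CODEC_BY_MODE: Dict[str, List[str]] = {
    "fixed_line": ["G.711 A-law", "G.711 mu-law"],
    "voip": ["G.723 (ACELP)"],
    "wechat": ["AMR-NB"],
    "qq": ["AMR-NB"],
    "gsm": ["RPE-LTP"],
    "cdma1x": ["EVRC", "QCELP"],
    "td_scdma": ["AMR-NB", "AMR-WB"],
    "wcdma": ["AMR-NB", "AMR-WB"],
    "cdma2000": ["EVRC", "QCELP"],
    "volte": ["AMR"],
}


def codecs_for(mode: str) -> List[str]:
    """Return the candidate speech codecs for a communication mode."""
    try:
        return CODEC_BY_MODE[mode]
    except KeyError:
        raise ValueError(f"unknown communication mode: {mode!r}") from None
```

A table of this kind lets the encoding and decoding stages pick matching algorithms from a single selected mode.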
With the wide use of digital voice communication systems, the training speech and the test speech available to a speaker recognition system in a real environment are often coded differently. Voiceprint recognition then faces a voice channel mismatch caused by the different coding of training and test speech, which severely degrades the performance of the speaker recognition system. Solving the channel mismatch problem is one of the keys to improving speaker recognition performance and making speaker recognition systems more practical.
To address channel mismatch in voiceprint recognition, the usual approach today is to study cross-channel voiceprint modeling algorithms. The main algorithms of this kind are nuisance attribute projection (NAP), joint factor analysis (JFA), speaker recognition modeling based on the identity vector (i-vector), and speaker recognition modeling that combines a speech recognition DNN acoustic model with the i-vector model.
NAP and JFA are subspace models proposed for the channel mismatch problem. NAP directly estimates a channel subspace and then removes it from the GMM mean supervector space to reduce the interference of channel information with speaker recognition. JFA assumes that the high-dimensional space of GMM mean supervectors contains two subspaces, one holding speaker information and the other holding channel information. Modeling the two subspaces jointly separates the speaker information in speech from the channel information more effectively and thus improves speaker recognition performance under complex channels. However, the channel space in the JFA model still contains a good deal of speaker information, so modeling speaker and channel separately can damage the speaker information considerably. In 2010, Dehak et al. proposed the i-vector model on the basis of JFA. The i-vector model defines only one subspace, called the total variability subspace, which contains both speaker information and channel information. Each speech segment is represented in this subspace as a low-dimensional vector, the i-vector. Channel compensation at the i-vector level then weakens the influence of the channel on speaker recognition performance. Compared with the JFA model, the i-vector model is far less complex, and channel compensation within the subspace is more flexible and yields better speaker recognition performance, which has made the i-vector model the most mainstream and advanced speaker recognition modeling method. In 2014, Lei, Kenny, et al. proposed a speaker recognition modeling method that combines a speech recognition DNN acoustic model with the i-vector model: when estimating the sufficient statistics of the i-vector model, the DNN acoustic model that classifies phoneme states in speech recognition replaces the traditional UBM for computing frame posterior probabilities. This method reduces the modeling complexity of the speaker recognition system and improves recognition noticeably.
Take the commonly used JFA voiceprint recognition method as an example. The method assumes that a given speech segment can be represented by a supervector, which can be written as:
$$M_h(s) = m_{ubm} + v\,y(s) + u\,x_h(s) \tag{1}$$
where $M_h(s)$ represents the $h$-th speech segment of a given speaker $s$, $m_{ubm}$ is the UBM mean supervector, $v$ is the speaker space matrix, $u$ is the channel space matrix, $y(s)$ is the speaker factor, and $x_h(s)$ is the channel factor of the $h$-th speech segment of speaker $s$.
In JFA, the two matrices $v$ and $u$, i.e. the speaker space matrix and the channel space matrix, must be estimated first. When a speaker is trained, the speaker factor $y(s)$ and the channel factor $x_h(s)$ are then estimated for each training speech segment, which gives the speaker model corresponding to that training speech:
$$M(s) = m_{ubm} + v\,y(s) \tag{2}$$
At test time, only the channel factor $x_h(s)$ of each test speech segment needs to be estimated. It is combined with the speaker model obtained above to perform speaker recognition.
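As a worked step that the text leaves implicit, subtracting formula (2) from formula (1) shows that, under the JFA assumption, the supervector of any one utterance differs from the speaker model only by a channel offset:

```latex
M_h(s) - M(s)
  = \bigl(m_{ubm} + v\,y(s) + u\,x_h(s)\bigr) - \bigl(m_{ubm} + v\,y(s)\bigr)
  = u\,x_h(s)
```

This is why only the channel factor $x_h(s)$ has to be estimated for a test utterance: once the offset $u\,x_h(s)$ is accounted for, the utterance can be compared directly with the stored speaker model $M(s)$.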
A JFA-based voiceprint recognition system needs speech data that divides by function into three parts: speech data for training the universal background Gaussian mixture model, speech data for training the target speakers, and speech data to be recognized.
The existing JFA-based voiceprint recognition method comprises a universal model training stage, a speaker space matrix and channel space matrix estimation stage, a speaker model training stage, and a test stage, wherein:
1) The universal model training stage comprises the following steps:
1-a) Convert the speech data for training the universal background Gaussian mixture model into spectral features through speech preprocessing and feature extraction;
1-b) Based on the extracted spectral features, initialize the universal background Gaussian mixture model using the K-means algorithm;
1-c) Update the universal background Gaussian mixture model initialized in step 1-b) using the expectation-maximization (EM) algorithm.
2) The speaker space matrix and channel space matrix estimation stage comprises the following steps:
2-a) Compute the first-order Baum-Welch statistics of the feature vectors of all speech of each speaker with respect to the Gaussian components of the universal background model, obtaining the corresponding mean supervectors;
2-b) Using the mean supervectors from step 2-a), estimate the speaker space matrix v iteratively with the EM algorithm;
2-c) Compute the first-order Baum-Welch statistics of the feature vectors of all speech under the same channel with respect to the Gaussian components of the universal background model, obtaining the corresponding mean supervectors;
2-d) Using the mean supervectors from step 2-c), estimate the channel space matrix u iteratively with the EM algorithm.
3) The speaker model training stage comprises the following steps:
3-a) Convert the speech data for training the target speaker into spectral features through speech preprocessing and feature extraction;
3-b) Based on the spectral features from step 3-a), compute the Baum-Welch statistics, obtaining the corresponding mean supervector;
3-c) Using the mean supervector from step 3-b), estimate the joint vector of the speaker factor y(s) and the channel factor x_h(s) with the E-step of the EM algorithm, and take its y(s) part;
3-d) Using the speaker space matrix v and the channel space matrix u obtained in stage 2) and the y(s) from step 3-c), compute the target speaker model.
In step 3-d), the speaker model corresponding to the training speech is computed according to formula (2).
4) The test stage comprises the following steps:
4-a) Convert the speech to be recognized into spectral features through speech preprocessing and feature extraction;
4-b) Based on the spectral features from step 4-a), compute the Baum-Welch statistics, obtaining the corresponding mean supervector;
4-c) Using the mean supervector from step 4-b), estimate the corresponding channel factor x_h(s) with the E-step of the EM algorithm;
4-d) Using the speaker models obtained in stage 3), compute the similarity score with a likelihood ratio;
4-e) Take the highest score computed in step 4-d) as the recognition result for the test utterance.
The above voiceprint recognition methods can address channel mismatch, but each has drawbacks. JFA, for example, requires a very large amount of training data and heavy computation at test time, so in practice it often fails to achieve good recognition results.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of the prior art by providing a channel matching method and device for voiceprint recognition. The invention can effectively perform software channel simulation of voice communication, simulating fixed-line, 2G, 3G, 4G and other voice communication processes, thereby obtaining training speech whose channel conditions match those of the test speech. This effectively solves the channel mismatch problem and suits practical application needs.
The technical solution adopted by the invention is as follows:
A channel matching method for voiceprint recognition, characterized in that the method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
1) The data acquisition and reading stage comprises the following steps:
1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
2) The speech encoding stage comprises the following steps:
2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of fixed-line, VoIP network telephony, WeChat calls, QQ calls, GSM, 3G, and 4G;
2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
3) The channel bit-error simulation stage comprises the following steps:
3-a) Select the channel corresponding to the voice communication mode chosen in step 2-a), and obtain the bit error rate of speech transmission over that channel under different signal-to-noise ratio conditions;
3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, i.e. the bit error rate corresponding to the selected signal-to-noise ratio;
4) The speech decoding stage comprises the following steps:
4-a) Select the speech decoding algorithm corresponding to the voice communication mode chosen in step 2-a);
4-b) Decode the channel-simulated speech data obtained in step 3-b) with that speech decoding algorithm;
4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech.
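The bit-error operation of step 3-b) can be sketched as follows. This is a minimal illustration that assumes errors are independent per bit, which is one plausible reading of "random errors"; the patent does not specify the error model, and the function name `inject_bit_errors` and its `seed` parameter are hypothetical.

```python
import random
from typing import Optional


def inject_bit_errors(data: bytes, ber: float, seed: Optional[int] = None) -> bytes:
    """Flip each bit of `data` independently with probability `ber`.

    Assumption of this sketch: bit errors are independent and identically
    distributed. `seed` makes the corruption reproducible.
    """
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must be in [0, 1]")
    rng = random.Random(seed)
    out = bytearray(data)
    for i in range(len(out)):
        mask = 0
        for bit in range(8):
            # rng.random() is in [0, 1), so ber=0 never flips and ber=1 always flips.
            if rng.random() < ber:
                mask |= 1 << bit
        out[i] ^= mask
    return bytes(out)
```

The function operates on the compressed bitstream produced in step 2-b), so the length of the data is preserved and the decoder in step 4-b) receives a stream of the expected size.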
Based on the above method, the invention also proposes a channel matching device for voiceprint recognition, characterized in that the device comprises the following five modules:
Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from it, obtaining a pure speech data block;
Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the pure speech data block obtained by the speech acquisition and reading module, obtaining compressed speech data under the corresponding communication mode;
Channel bit-error simulation module: selects the corresponding channel according to the communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech whose channel conditions match those of the test speech;
Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes each to the corresponding module.
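The way the modules hand data to one another can be sketched as a small pipeline. This is an illustrative software arrangement only: the class name `ChannelMatcher`, its fields, and the storage keys are assumptions of this sketch, the codec and channel are passed in as plain callables, and header handling by the acquisition module is left outside the class.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ChannelMatcher:
    """Chains encoding, bit-error simulation, and decoding (hypothetical sketch)."""

    encode: Callable[[bytes], bytes]   # speech encoding module
    corrupt: Callable[[bytes], bytes]  # channel bit-error simulation module
    decode: Callable[[bytes], bytes]   # speech decoding module
    store: Dict[str, bytes] = field(default_factory=dict)  # data storage module

    def run(self, pcm: bytes) -> bytes:
        """Process one header-stripped speech block and keep every intermediate."""
        self.store["original"] = pcm
        self.store["compressed"] = self.encode(pcm)
        self.store["channel_simulated"] = self.corrupt(self.store["compressed"])
        self.store["output"] = self.decode(self.store["channel_simulated"])
        return self.store["output"]
```

For example, `ChannelMatcher(encode=lambda b: b, corrupt=lambda b: b, decode=lambda b: b).run(pcm)` returns `pcm` unchanged, which is a convenient sanity check before real codec functions are plugged in.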
Features and beneficial effects of the invention:
(1) Compared with traditional voiceprint recognition methods, the method of the invention applies channel simulation to voiceprint recognition. Only the original speech data used for training needs to undergo channel simulation to obtain training speech whose channel conditions match those of the test speech, which solves the channel mismatch problem of traditional voiceprint recognition methods.
(2) Compared with cross-channel voiceprint modeling algorithms, the invention only performs channel simulation on the original training speech samples and does not change the voiceprint modeling algorithm. This reduces the complexity of the recognition algorithm, and the recognition results are better than those of cross-channel voiceprint modeling algorithms. The method of the invention is therefore better suited to voiceprint recognition tasks and meets practical application needs.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the invention.
Fig. 2 is a structural block diagram of the device of the invention.
Detailed description
The channel matching method and device for voiceprint recognition proposed by the invention are described below with reference to the drawings.
The flow of the channel matching method for voiceprint recognition proposed by the invention is shown in Fig. 1. The method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
1) The data acquisition and reading stage comprises the following steps:
1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
2) The speech encoding stage comprises the following steps:
2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of fixed-line, VoIP network telephony, WeChat calls, QQ calls, GSM, 3G, and 4G;
2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
3) The channel bit-error simulation stage comprises the following steps:
3-a) Select the channel corresponding to the voice communication mode chosen in step 2-a), and obtain the bit error rate of speech transmission over that channel under different signal-to-noise ratio conditions;
3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, i.e. the bit error rate corresponding to the selected signal-to-noise ratio;
4) The speech decoding stage comprises the following steps:
4-a) Select the speech decoding algorithm corresponding to the voice communication mode chosen in step 2-a);
4-b) Decode the channel-simulated speech data obtained in step 3-b) with that speech decoding algorithm;
4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech; this training speech data is then used for voiceprint recognition.
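Step 3-a) above calls for a bit error rate at each signal-to-noise ratio but does not say how it is obtained. As one hedged illustration, the sketch below uses the textbook bit error rate of BPSK over an additive white Gaussian noise channel, BER = 0.5 * erfc(sqrt(Eb/N0)). Both the modulation and the use of Eb/N0 as the signal-to-noise measure are assumptions of this sketch, not something the patent specifies; a real channel for a given communication mode would need its own measured or simulated curve.

```python
import math


def bpsk_awgn_ber(ebn0_db: float) -> float:
    """Theoretical BER of BPSK over AWGN for a given Eb/N0 in dB.

    Assumption: BPSK/AWGN stands in for the unspecified channel model.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))
```

The value returned for the selected signal-to-noise ratio would then be used as the error probability in step 3-b).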
In step 1-a) above, the original speech data must be in WAV format, sampled at 8 kHz or 16 kHz with 16-bit quantization; whether 8 kHz or 16 kHz is chosen depends on the speech coding scheme.
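Removing the WAV header in step 1-b) and restoring it in step 4-c) can be sketched with Python's standard `wave` module, which also makes it easy to enforce the sampling constraints just stated. The function names `read_pcm` and `write_wav` are illustrative, and working on in-memory bytes rather than files is a choice of this sketch.

```python
import io
import wave
from typing import Tuple


def read_pcm(wav_bytes: bytes) -> Tuple[int, int, bytes]:
    """Return (channels, sample_rate, raw PCM frames) from WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if w.getframerate() not in (8000, 16000):
            raise ValueError("expected 8 kHz or 16 kHz sampling")
        return w.getnchannels(), w.getframerate(), w.readframes(w.getnframes())


def write_wav(pcm: bytes, channels: int, rate: int) -> bytes:
    """Wrap raw 16-bit PCM frames in a WAV header."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 16-bit quantization
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()
```

Reading the header through a parser rather than dropping a fixed number of bytes avoids problems with WAV files that carry extra chunks before the audio data.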
After the training speech is obtained with the method of the invention, a conventional voiceprint recognition method can be used for recognition, for example the i-vector model: extract the i-vector features of the training speech and the test speech, compute the cosine score between them, and take the candidate with the highest score as the recognition result.
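The cosine scoring step can be sketched as follows, assuming the i-vectors have already been extracted by some existing toolkit. The function names and the plain-list representation of the vectors are assumptions of this sketch.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length vector")
    return dot / (norm_a * norm_b)


def identify(test_vec: Sequence[float], enrolled: Dict[str, Sequence[float]]) -> str:
    """Return the enrolled speaker whose i-vector scores highest against test_vec."""
    return max(enrolled, key=lambda spk: cosine(test_vec, enrolled[spk]))
```

Here `enrolled` would hold one i-vector per speaker, extracted from the channel-simulated training speech, so that the training and test vectors reflect the same channel conditions.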
The invention also proposes a channel matching device for voiceprint recognition that uses the above method, characterized in that the device comprises the following five modules:
Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from it, obtaining a pure speech data block;
Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the pure speech data block, obtaining compressed speech data under the corresponding voice communication mode;
Channel bit-error simulation module: selects the corresponding channel according to the voice communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech data whose channel conditions match those of the test speech;
Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes each to the corresponding module.
Each of the above modules can be implemented with conventional analog or digital integrated circuits.

Claims (2)

  1. A channel matching method for voiceprint recognition, characterized in that the method comprises a data acquisition and reading stage, a speech encoding stage, a channel bit-error simulation stage, and a speech decoding stage;
    1) The data acquisition and reading stage comprises the following steps:
    1-a) Acquire and read the original speech data, where the original speech data is in WAV format;
    1-b) Remove the file header of the original speech data according to the WAV format standard, obtaining a pure speech data block;
    2) The speech encoding stage comprises the following steps:
    2-a) Select the voice communication mode to be simulated according to the original speech data, the voice communication mode being any one of fixed-line, VoIP network telephony, WeChat calls, QQ calls, GSM, 3G, and 4G;
    2-b) Encode the pure speech data block obtained in step 1-b) according to the speech coding standard corresponding to the selected voice communication mode, obtaining compressed speech data;
    3) The channel bit-error simulation stage comprises the following steps:
    3-a) Select the channel corresponding to the voice communication mode chosen in step 2-a), and obtain the bit error rate of speech transmission over that channel under different signal-to-noise ratio conditions;
    3-b) Select a signal-to-noise ratio and apply a bit-error operation to the compressed speech data encoded in step 2-b), obtaining the speech data after channel simulation, which serves as the channel-simulated speech data; the bit-error operation introduces random errors into the compressed speech data according to the bit error rate under the selected communication mode, i.e. the bit error rate corresponding to the selected signal-to-noise ratio;
    4) The speech decoding stage comprises the following steps:
    4-a) Select the speech decoding algorithm corresponding to the voice communication mode chosen in step 2-a);
    4-b) Decode the channel-simulated speech data obtained in step 3-b) with that speech decoding algorithm;
    4-c) Add a WAV file header to the decoded speech data, obtaining training speech data whose channel conditions match those of the test speech.
  2. A channel matching device for voiceprint recognition using the method of claim 1, characterized in that the device comprises the following five modules:
    Speech acquisition and reading module: acquires and reads the speaker's original speech data and removes the audio file header from it, obtaining a pure speech data block;
    Speech encoding module: selects the voice communication mode to be simulated according to the original speech data and encodes the pure speech data block obtained by the speech acquisition and reading module, obtaining compressed speech data under the corresponding communication mode;
    Channel bit-error simulation module: selects the corresponding channel according to the communication mode and performs channel bit-error simulation on the compressed speech data, obtaining channel-simulated speech data;
    Speech decoding module: decodes the channel-simulated speech data with the decoding algorithm corresponding to the voice communication mode and adds a file header to the decoded speech, obtaining output speech whose channel conditions match those of the test speech;
    Data storage module: stores the original speech data, the compressed speech data, the channel-simulated speech data, and the output speech data whose channel conditions match those of the test speech, and passes each to the corresponding module.
CN201710751356.2A 2017-08-28 2017-08-28 Channel matching method and device for voiceprint recognition Pending CN107481723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710751356.2A CN107481723A (en) 2017-08-28 2017-08-28 Channel matching method and device for voiceprint recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710751356.2A CN107481723A (en) 2017-08-28 2017-08-28 Channel matching method and device for voiceprint recognition

Publications (1)

Publication Number Publication Date
CN107481723A true CN107481723A (en) 2017-12-15

Family

ID=60604022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710751356.2A Pending CN107481723A (en) 2017-08-28 2017-08-28 Channel matching method and device for voiceprint recognition

Country Status (1)

Country Link
CN (1) CN107481723A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109192216A (en) * 2018-08-08 2019-01-11 联智科技(天津)有限责任公司 Simulation-based acquisition method and acquisition device for a voiceprint recognition training dataset
CN111210809A (en) * 2018-11-22 2020-05-29 阿里巴巴集团控股有限公司 Voice training data adaptation method and device, voice data conversion method and electronic equipment
CN111312283A (en) * 2020-02-24 2020-06-19 中国工商银行股份有限公司 Cross-channel voiceprint processing method and device
CN111402899A (en) * 2020-03-25 2020-07-10 中国工商银行股份有限公司 Cross-channel voiceprint identification method and device
CN111817943A (en) * 2019-04-12 2020-10-23 腾讯科技(深圳)有限公司 Data processing method and device based on instant messaging application
CN112489678A (en) * 2020-11-13 2021-03-12 苏宁云计算有限公司 Scene recognition method and device based on channel characteristics
CN117765951A (en) * 2023-09-21 2024-03-26 南京龙垣信息科技有限公司 Information processing method and device for telephone voice recognition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1306352A (en) * 2000-03-01 2001-08-01 深圳市中兴通讯股份有限公司 Test method of echo cancel function
CN1791035A (en) * 2005-12-22 2006-06-21 西安交通大学 Design method for distributed wireless communication transmission technique test platform
JP2010093815A (en) * 2008-10-13 2010-04-22 Ntt Docomo Inc Method for time-space encoding, and method and apparatus for transmitting, receiving and decoding radio signal
CN103730112A (en) * 2013-12-25 2014-04-16 安徽讯飞智元信息科技有限公司 Multi-channel voice simulation and acquisition method
CN106374975A (en) * 2016-08-22 2017-02-01 王毅 Multiport-based digitalized power line channel simulation device and simulation method

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109192216A (en) * 2018-08-08 2019-01-11 联智科技(天津)有限责任公司 Simulated acquisition method and acquisition device for voiceprint recognition training datasets
CN111210809A (en) * 2018-11-22 2020-05-29 阿里巴巴集团控股有限公司 Voice training data adaptation method and device, voice data conversion method and electronic equipment
CN111210809B (en) * 2018-11-22 2024-03-19 阿里巴巴集团控股有限公司 Voice training data adaptation method and device, voice data conversion method and electronic equipment
CN111817943A (en) * 2019-04-12 2020-10-23 腾讯科技(深圳)有限公司 Data processing method and device based on instant messaging application
CN111817943B (en) * 2019-04-12 2022-06-14 腾讯科技(深圳)有限公司 Data processing method and device based on instant messaging application
US11683278B2 (en) 2019-04-12 2023-06-20 Tencent Technology (Shenzhen) Company Limited Spectrogram and message bar generation based on audio data in an instant messaging application
CN111312283A (en) * 2020-02-24 2020-06-19 中国工商银行股份有限公司 Cross-channel voiceprint processing method and device
CN111402899A (en) * 2020-03-25 2020-07-10 中国工商银行股份有限公司 Cross-channel voiceprint identification method and device
CN111402899B (en) * 2020-03-25 2023-10-13 中国工商银行股份有限公司 Cross-channel voiceprint recognition method and device
CN112489678A (en) * 2020-11-13 2021-03-12 苏宁云计算有限公司 Scene recognition method and device based on channel characteristics
CN112489678B (en) * 2020-11-13 2023-12-05 深圳市云网万店科技有限公司 Scene recognition method and device based on channel characteristics
CN117765951A (en) * 2023-09-21 2024-03-26 南京龙垣信息科技有限公司 Information processing method and device for telephone voice recognition

Similar Documents

Publication Publication Date Title
CN107481723A (en) Channel matching method and device for voiceprint recognition
Weng et al. Semantic communication systems for speech transmission
CN107633842B (en) Speech recognition method, device, computer equipment and storage medium
CN109272988B (en) Voice recognition method based on multi-path convolution neural network
CN108806667A (en) Neural-network-based method for synchronous recognition of speech and emotion
CN101447184B (en) Chinese-English bilingual speech recognition method based on phoneme confusion
CN108281137A (en) Universal voice wake-up recognition method and system under a full-phoneme framework
CN109977207A (en) Dialogue generation method, dialogue generation device, electronic equipment and storage medium
CN109754790B (en) Speech recognition system and method based on hybrid acoustic model
CN107408383A (en) Encoder selection
CN109036391A (en) Speech recognition method, apparatus and system
CN108922513A (en) Speech differentiation method, apparatus, computer equipment and storage medium
CN106297826A (en) Speech emotion recognition system and method
CN104036774A (en) Method and system for recognizing Tibetan dialects
CN109192216A (en) Simulated acquisition method and acquisition device for voiceprint recognition training datasets
CN110415701A (en) Lip-reading recognition method and device
CN107767861A (en) Voice wake-up method, system and intelligent terminal
CN110310619A (en) Polyphone prediction method, device, equipment and computer readable storage medium
CN101221766B (en) Method for switching audio encoder
CN108922521A (en) Voice keyword retrieval method, apparatus, equipment and storage medium
CN110176237A (en) Speech recognition method and device
CN111128211B (en) Voice separation method and device
CN110570869A (en) Voiceprint recognition method, device, equipment and storage medium
CN112131359A (en) Intention identification method based on graphical arrangement intelligent strategy and electronic equipment
CN113539232B (en) Voice synthesis method based on MOOC voice data set

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 2017-12-15)