SG11202010803VA - System and method for determining voice characteristics - Google Patents

System and method for determining voice characteristics

Info

Publication number
SG11202010803VA
Authority
SG
Singapore
Prior art keywords
voice characteristics
determining voice
determining
voice
Prior art date
Application number
SG11202010803VA
Other languages
English (en)
Inventor
Zhiming Wang
Kaisheng Yao
Xiaolong Li
Original Assignee
Alipay Hangzhou Inf Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Inf Tech Co Ltd
Publication of SG11202010803VA

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/04 Training, enrolment or model building
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/18 Artificial neural networks; Connectionist approaches
    • G10L17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)
  • Image Analysis (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
SG11202010803VA 2019-10-31 2019-10-31 System and method for determining voice characteristics SG11202010803VA (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/114812 WO2020035085A2 (fr) 2019-10-31 2019-10-31 System and method for determining voice characteristics

Publications (1)

Publication Number Publication Date
SG11202010803VA (en) 2020-11-27

Family

ID=69525955

Family Applications (2)

Application Number Title Priority Date Filing Date
SG11202010803VA SG11202010803VA (en) 2019-10-31 2019-10-31 System and method for determining voice characteristics
SG11202013135XA SG11202013135XA (en) 2019-10-31 2020-01-09 System and method for personalized speaker verification

Family Applications After (1)

Application Number Title Priority Date Filing Date
SG11202013135XA SG11202013135XA (en) 2019-10-31 2020-01-09 System and method for personalized speaker verification

Country Status (5)

Country Link
US (3) US10997980B2 (fr)
CN (2) CN111712874B (fr)
SG (2) SG11202010803VA (fr)
TW (1) TWI737462B (fr)
WO (2) WO2020035085A2 (fr)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108806696B (zh) * 2018-05-08 2020-06-05 平安科技(深圳)有限公司 Method, apparatus, computer device, and storage medium for establishing a voiceprint model
US11556848B2 (en) * 2019-10-21 2023-01-17 International Business Machines Corporation Resolving conflicts between experts' intuition and data-driven artificial intelligence models
CN111712874B (zh) * 2019-10-31 2023-07-14 支付宝(杭州)信息技术有限公司 Method, system, apparatus, and storage medium for determining voice characteristics
US11651767B2 (en) 2020-03-03 2023-05-16 International Business Machines Corporation Metric learning of speaker diarization
US11443748B2 (en) * 2020-03-03 2022-09-13 International Business Machines Corporation Metric learning of speaker diarization
CN111833855B (zh) * 2020-03-16 2024-02-23 南京邮电大学 Many-to-many speaker conversion method based on DenseNet STARGAN
CN111540367B (zh) * 2020-04-17 2023-03-31 合肥讯飞数码科技有限公司 Speech feature extraction method, apparatus, electronic device, and storage medium
CN111524525B (zh) * 2020-04-28 2023-06-16 平安科技(深圳)有限公司 Voiceprint recognition method, apparatus, device, and storage medium for raw speech
US20220067279A1 (en) * 2020-08-31 2022-03-03 Recruit Co., Ltd. Systems and methods for multilingual sentence embeddings
CN113555032B (zh) * 2020-12-22 2024-03-12 腾讯科技(深圳)有限公司 Multi-speaker scene recognition and network training method and apparatus
US11689868B2 (en) * 2021-04-26 2023-06-27 Mun Hoong Leong Machine learning based hearing assistance system
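CN113345454B (zh) * 2021-06-01 2024-02-09 平安科技(深圳)有限公司 Training and application methods, apparatus, device, and storage medium for a voice conversion model
CN114023343B (zh) * 2021-10-30 2024-04-30 西北工业大学 Voice conversion method based on semi-supervised feature learning
TWI795173B (zh) * 2022-01-17 2023-03-01 中華電信股份有限公司 Multilingual speech recognition system, method, and computer-readable medium
CN114529191A (zh) * 2022-02-16 2022-05-24 支付宝(杭州)信息技术有限公司 Method and apparatus for risk identification
CN114694658A (zh) * 2022-03-15 2022-07-01 青岛海尔科技有限公司 Speaker recognition model training method, speaker recognition method, and apparatus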
US20230352029A1 (en) * 2022-05-02 2023-11-02 Tencent America LLC Progressive contrastive learning framework for self-supervised speaker verification
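CN115035890B (zh) * 2022-06-23 2023-12-05 北京百度网讯科技有限公司 Training method, apparatus, electronic device, and storage medium for a speech recognition model
CN117495571B (zh) * 2023-12-28 2024-04-05 北京芯盾时代科技有限公司 Data processing method, apparatus, electronic device, and storage medium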

Family Cites Families (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0559349B1 (fr) * 1992-03-02 1999-01-07 AT&T Corp. Training method and apparatus for speech recognition
US5640429A (en) * 1995-01-20 1997-06-17 The United States Of America As Represented By The Secretary Of The Air Force Multichannel non-gaussian receiver and method
CN1302427A (zh) * 1997-11-03 2001-07-04 T-内提克斯公司 Model adaptation system and method for speaker verification
US6609093B1 (en) * 2000-06-01 2003-08-19 International Business Machines Corporation Methods and apparatus for performing heteroscedastic discriminant analysis in pattern recognition systems
US20030225719A1 (en) * 2002-05-31 2003-12-04 Lucent Technologies, Inc. Methods and apparatus for fast and robust model training for object classification
US9113001B2 (en) * 2005-04-21 2015-08-18 Verint Americas Inc. Systems, methods, and media for disambiguating call data to determine fraud
TWI297487B (en) * 2005-11-18 2008-06-01 Tze Fen Li A method for speech recognition
US9247056B2 (en) * 2007-02-28 2016-01-26 International Business Machines Corporation Identifying contact center agents based upon biometric characteristics of an agent's speech
US7958068B2 (en) * 2007-12-12 2011-06-07 International Business Machines Corporation Method and apparatus for model-shared subspace boosting for multi-label classification
EP2189976B1 (fr) * 2008-11-21 2012-10-24 Nuance Communications, Inc. Method for adapting a codebook for speech recognition
FR2940498B1 (fr) * 2008-12-23 2011-04-15 Thales Sa Method and system for authenticating a user and/or cryptographic data
WO2012041492A1 (fr) * 2010-09-28 2012-04-05 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Method and device for recovering a digital image from a sequence of observed digital images
US8442823B2 (en) * 2010-10-19 2013-05-14 Motorola Solutions, Inc. Methods for creating and searching a database of speakers
US9679561B2 (en) * 2011-03-28 2017-06-13 Nuance Communications, Inc. System and method for rapid customization of speech recognition models
US9967218B2 (en) * 2011-10-26 2018-05-08 Oath Inc. Online active learning in user-generated content streams
US9042867B2 (en) * 2012-02-24 2015-05-26 Agnitio S.L. System and method for speaker recognition on mobile devices
US8527276B1 (en) * 2012-10-25 2013-09-03 Google Inc. Speech synthesis using deep neural networks
US20140222423A1 (en) * 2013-02-07 2014-08-07 Nuance Communications, Inc. Method and Apparatus for Efficient I-Vector Extraction
US9406298B2 (en) * 2013-02-07 2016-08-02 Nuance Communications, Inc. Method and apparatus for efficient i-vector extraction
CN103310788B (zh) * 2013-05-23 2016-03-16 北京云知声信息技术有限公司 Speech information recognition method and system
US9514753B2 (en) * 2013-11-04 2016-12-06 Google Inc. Speaker identification using hash-based indexing
US9311932B2 (en) * 2014-01-23 2016-04-12 International Business Machines Corporation Adaptive pause detection in speech recognition
US9542948B2 (en) * 2014-04-09 2017-01-10 Google Inc. Text-dependent speaker identification
US10073985B2 (en) * 2015-02-27 2018-09-11 Samsung Electronics Co., Ltd. Apparatus and method for trusted execution environment file protection
US9687208B2 (en) * 2015-06-03 2017-06-27 iMEDI PLUS Inc. Method and system for recognizing physiological sound
US9978374B2 (en) * 2015-09-04 2018-05-22 Google Llc Neural networks for speaker verification
US10262654B2 (en) * 2015-09-24 2019-04-16 Microsoft Technology Licensing, Llc Detecting actionable items in a conversation among participants
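CN107274904A (zh) * 2016-04-07 2017-10-20 富士通株式会社 Speaker recognition method and speaker recognition device
CN105869630B (zh) * 2016-06-27 2019-08-02 上海交通大学 Deep-learning-based method and system for detecting speaker voice spoofing attacks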
US10535000B2 (en) 2016-08-08 2020-01-14 Interactive Intelligence Group, Inc. System and method for speaker change detection
US9824692B1 (en) * 2016-09-12 2017-11-21 Pindrop Security, Inc. End-to-end speaker recognition using deep neural network
US10553218B2 (en) 2016-09-19 2020-02-04 Pindrop Security, Inc. Dimensionality reduction of baum-welch statistics for speaker recognition
WO2018053518A1 (fr) * 2016-09-19 2018-03-22 Pindrop Security, Inc. Channel-compensated low-level features for speaker recognition
WO2018106971A1 (fr) * 2016-12-07 2018-06-14 Interactive Intelligence Group, Inc. System and method for neural network based speaker classification
US10140980B2 (en) * 2016-12-21 2018-11-27 Google LLC Complex linear projection for acoustic modeling
CN108288470B (zh) * 2017-01-10 2021-12-21 富士通株式会社 Voiceprint-based identity verification method and apparatus
CN106991312B (zh) * 2017-04-05 2020-01-10 百融云创科技股份有限公司 Internet anti-fraud authentication method based on voiceprint recognition
US11556794B2 (en) * 2017-08-31 2023-01-17 International Business Machines Corporation Facilitating neural networks
US10679129B2 (en) * 2017-09-28 2020-06-09 D5Ai Llc Stochastic categorical autoencoder network
JP6879433B2 (ja) * 2017-09-29 2021-06-02 日本電気株式会社 Regression device, regression method, and program
US20190213705A1 (en) * 2017-12-08 2019-07-11 Digimarc Corporation Artwork generated to convey digital messages, and methods/apparatuses for generating such artwork
CN108417217B (zh) * 2018-01-11 2021-07-13 思必驰科技股份有限公司 Speaker recognition network model training method, speaker recognition method, and system
WO2019161011A1 (fr) * 2018-02-16 2019-08-22 Dolby Laboratories Licensing Corporation Speech style transfer
US11468316B2 (en) * 2018-03-13 2022-10-11 Recogni Inc. Cluster compression for compressing weights in neural networks
US10347241B1 (en) * 2018-03-23 2019-07-09 Microsoft Technology Licensing, Llc Speaker-invariant training via adversarial learning
CN109065022B (zh) * 2018-06-06 2022-08-09 平安科技(深圳)有限公司 i-vector extraction method, speaker recognition method, apparatus, device, and medium
CN109256139A (zh) * 2018-07-26 2019-01-22 广东工业大学 Speaker recognition method based on Triplet-Loss
CN110164452B (zh) * 2018-10-10 2023-03-10 腾讯科技(深圳)有限公司 Voiceprint recognition method, model training method, and server
CN110428808B (zh) * 2018-10-25 2022-08-19 腾讯科技(深圳)有限公司 Speech recognition method and apparatus
US10510002B1 (en) * 2019-02-14 2019-12-17 Capital One Services, Llc Stochastic gradient boosting for deep neural networks
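CN110136729B (zh) * 2019-03-27 2021-08-20 北京奇艺世纪科技有限公司 Model generation method, audio processing method, apparatus, and computer-readable storage medium
CN109903774A (zh) * 2019-04-12 2019-06-18 南京大学 Voiceprint recognition method based on an angular margin loss function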
US10878575B2 (en) * 2019-04-15 2020-12-29 Adobe Inc. Foreground-aware image inpainting
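CN110223699B (zh) * 2019-05-15 2021-04-13 桂林电子科技大学 Speaker identity verification method, apparatus, and storage medium
CN111712874B (zh) * 2019-10-31 2023-07-14 支付宝(杭州)信息技术有限公司 Method, system, apparatus, and storage medium for determining voice characteristics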

Also Published As

Publication number Publication date
CN111418009B (zh) 2023-09-05
US20210110833A1 (en) 2021-04-15
WO2020035085A2 (fr) 2020-02-20
WO2020098828A2 (fr) 2020-05-22
US11031018B2 (en) 2021-06-08
US20210210101A1 (en) 2021-07-08
WO2020035085A3 (fr) 2020-08-20
US10997980B2 (en) 2021-05-04
TWI737462B (zh) 2021-08-21
US11244689B2 (en) 2022-02-08
SG11202013135XA (en) 2021-01-28
TW202119393A (zh) 2021-05-16
US20210043216A1 (en) 2021-02-11
CN111712874B (zh) 2023-07-14
WO2020098828A3 (fr) 2020-09-03
CN111712874A (zh) 2020-09-25
CN111418009A (zh) 2020-07-14

Similar Documents

Publication Publication Date Title
SG11202010803VA (en) System and method for determining voice characteristics
SG11202006772QA (en) System and method for decentralized-identifier creation
EP3736684A4 (fr) Method and system for executing a voice instruction
EP3873157A4 (fr) Method and apparatus for uplink determination
GB2596770B (en) Carrier-resolved photo-hall system and method
EP3726373C0 (fr) Application creation method and system
EP3909220C0 (fr) System and method for secure desegmentation
EP3796712A4 (fr) Method and device for establishing a voice service
EP3874580A4 (fr) System and method for determining Q factor
GB2585087B (en) Positioning system and method
SG11202012810TA (en) System and method for storage
EP3976825A4 (fr) Systems and methods for determining a sequence
EP3962157A4 (fr) Method, apparatus, and system for determining MDBV
EP3951311A4 (fr) Measurement system and measurement method
GB2600580B (en) System and method for preparing MRNA
GB2588760B (en) Interface system and corresponding method
GB201901644D0 (en) Testing system and method
EP3815028A4 (fr) Method and system for risk determination
GB2590126B (en) Navigation system and method
EP4025875C0 (fr) Method and system for determining position displacements
EP4037555A4 (fr) Method and system for determining cardiovascular parameters
EP4062126C0 (fr) Navigation system and method
SG11202110703TA (en) Method for determining reference value and terminal
EP4013701A4 (fr) Method and apparatus for determining the location of an object
EP3836410A4 (fr) Determination system and method