SG11202003722SA - Speaker separation model training method, two-speaker separation method and computing device - Google Patents
- Publication number
- SG11202003722SA
- Authority
- SG
- Singapore
- Prior art keywords
- speaker separation
- computing device
- model training
- speaker
- training method
- Prior art date
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L17/00—Speaker identification or verification techniques
        - G10L17/04—Training, enrolment or model building
        - G10L17/06—Decision making techniques; Pattern matching strategies
        - G10L17/18—Artificial neural networks; Connectionist approaches
      - G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0272—Voice signal separating
          - G10L21/0208—Noise filtering
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- Circuit For Audible Band Transducer (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
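CN201810519521.6A CN108766440B (zh) | 2018-05-28 | 2018-05-28 | Speaker separation model training method, two-speaker separation method and related device |
PCT/CN2018/100174 WO2019227672A1 (zh) | 2018-05-28 | 2018-08-13 | Speaker separation model training method, two-speaker separation method and related device |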
Publications (1)
Publication Number | Publication Date |
---|---|
SG11202003722SA (en) | 2020-12-30 |
Family
ID=64006219
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202003722SA (en) | 2018-05-28 | 2018-08-13 | Speaker separation model training method, two-speaker separation method and computing device |
Country Status (5)
Country | Link |
---|---|
US (1) | US11158324B2 |
JP (1) | JP2020527248A |
CN (1) | CN108766440B |
SG (1) | SG11202003722SA |
WO (1) | WO2019227672A1 |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
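CN109545186B (zh) * | 2018-12-16 | 2022-05-27 | 魔门塔(苏州)科技有限公司 | Speech recognition training system and method |
CN109686382A (zh) * | 2018-12-29 | 2019-04-26 | 平安科技(深圳)有限公司 | Speaker clustering method and device |
CN110197665B (zh) * | 2019-06-25 | 2021-07-09 | 广东工业大学 | Speech separation and tracking method for public security criminal investigation monitoring |
CN110444223B (zh) * | 2019-06-26 | 2023-05-23 | 平安科技(深圳)有限公司 | Speaker separation method and device based on recurrent neural network and acoustic features |
CN110289002B (zh) * | 2019-06-28 | 2021-04-27 | 四川长虹电器股份有限公司 | End-to-end speaker clustering method and system |
CN110390946A (zh) * | 2019-07-26 | 2019-10-29 | 龙马智芯(珠海横琴)科技有限公司 | Speech signal processing method and device, electronic device and storage medium |
CN110718228B (zh) * | 2019-10-22 | 2022-04-12 | 中信银行股份有限公司 | Speech separation method and device, electronic device and computer-readable storage medium |
CN111312256B (zh) * | 2019-10-31 | 2024-05-10 | 平安科技(深圳)有限公司 | Voice identity recognition method and device, and computer device |
CN110853618B (zh) * | 2019-11-19 | 2022-08-19 | 腾讯科技(深圳)有限公司 | Language identification method, model training method, apparatus and device |
CN110992940B (zh) * | 2019-11-25 | 2021-06-15 | 百度在线网络技术(北京)有限公司 | Voice interaction method, apparatus, device and computer-readable storage medium |
CN110992967A (zh) * | 2019-12-27 | 2020-04-10 | 苏州思必驰信息科技有限公司 | Speech signal processing method and device, hearing aid and storage medium |
CN111145761B (zh) * | 2019-12-27 | 2022-05-24 | 携程计算机技术(上海)有限公司 | Model training method, voiceprint verification method, system, device and medium |
CN111191787B (zh) * | 2019-12-30 | 2022-07-15 | 思必驰科技股份有限公司 | Training method and device for a neural network that extracts speaker embedding features |
CN111370032B (zh) * | 2020-02-20 | 2023-02-14 | 厦门快商通科技股份有限公司 | Speech separation method, system, mobile terminal and storage medium |
JP7359028B2 (ja) * | 2020-02-21 | 2023-10-11 | 日本電信電話株式会社 | Learning device, learning method, and learning program |
CN111370019B (zh) * | 2020-03-02 | 2023-08-29 | 字节跳动有限公司 | Sound source separation method and device, and neural network model training method and device |
CN111009258A (zh) * | 2020-03-11 | 2020-04-14 | 浙江百应科技有限公司 | Single-channel speaker separation model, training method and separation method |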
US11392639B2 (en) * | 2020-03-31 | 2022-07-19 | Uniphore Software Systems, Inc. | Method and apparatus for automatic speaker diarization |
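CN111477240B (zh) * | 2020-04-07 | 2023-04-07 | 浙江同花顺智能科技有限公司 | Audio processing method, apparatus, device and storage medium |
CN111524521B (zh) * | 2020-04-22 | 2023-08-08 | 北京小米松果电子有限公司 | Voiceprint extraction model training method, voiceprint recognition method, and devices and media therefor |
CN111524527B (zh) * | 2020-04-30 | 2023-08-22 | 合肥讯飞数码科技有限公司 | Speaker separation method and device, electronic device and storage medium |
CN111613249A (zh) * | 2020-05-22 | 2020-09-01 | 云知声智能科技股份有限公司 | Speech analysis method and device |
CN111640438B (zh) * | 2020-05-26 | 2023-09-05 | 同盾控股有限公司 | Audio data processing method and device, storage medium and electronic device |
CN111680631B (zh) * | 2020-06-09 | 2023-12-22 | 广州视源电子科技股份有限公司 | Model training method and device |
CN111785291B (zh) * | 2020-07-02 | 2024-07-02 | 北京捷通华声科技股份有限公司 | Speech separation method and speech separation device |
CN111933153B (zh) * | 2020-07-07 | 2024-03-08 | 北京捷通华声科技股份有限公司 | Method and device for determining speech segmentation points |
CN111985934B (zh) * | 2020-07-30 | 2024-07-12 | 浙江百世技术有限公司 | Method for constructing an intelligent customer service dialogue model and application thereof |
CN111899755A (zh) * | 2020-08-11 | 2020-11-06 | 华院数据技术(上海)有限公司 | Speaker speech separation method and related device |
CN112071329B (zh) * | 2020-09-16 | 2022-09-16 | 腾讯科技(深圳)有限公司 | Multi-speaker speech separation method and device, electronic device and storage medium |
CN112071330B (zh) * | 2020-09-16 | 2022-09-20 | 腾讯科技(深圳)有限公司 | Audio data processing method, device and computer-readable storage medium |
CN112489682B (zh) * | 2020-11-25 | 2023-05-23 | 平安科技(深圳)有限公司 | Audio processing method and device, electronic device and storage medium |
CN112700766B (zh) * | 2020-12-23 | 2024-03-19 | 北京猿力未来科技有限公司 | Speech recognition model training method and device, and speech recognition method and device |
CN112820292B (zh) * | 2020-12-29 | 2023-07-18 | 平安银行股份有限公司 | Method and device for generating meeting minutes, electronic device and storage medium |
CN112289323B (zh) * | 2020-12-29 | 2021-05-28 | 深圳追一科技有限公司 | Speech data processing method and device, computer device and storage medium |
CN113544700A (zh) * | 2020-12-31 | 2021-10-22 | 商汤国际私人有限公司 | Neural network training method and device, and associated object detection method and device |
KR20220115453A (ko) * | 2021-02-10 | 2022-08-17 | 삼성전자주식회사 | Electronic device supporting improved recognition of speech sections |
KR20220136750A (ko) * | 2021-04-01 | 2022-10-11 | 삼성전자주식회사 | Electronic device for processing user utterances and method for controlling the electronic device |
CN113178205B (zh) * | 2021-04-30 | 2024-07-05 | 平安科技(深圳)有限公司 | Speech separation method and device, computer device and storage medium |
KR20220169242A (ko) * | 2021-06-18 | 2022-12-27 | 삼성전자주식회사 | Electronic device and personalized voice processing method of the electronic device |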
US20220406324A1 (en) * | 2021-06-18 | 2022-12-22 | Samsung Electronics Co., Ltd. | Electronic device and personalized audio processing method of the electronic device |
WO2023281717A1 (ja) * | 2021-07-08 | 2023-01-12 | 日本電信電話株式会社 | Speaker diarization method, speaker diarization device, and speaker diarization program |
CN113362831A (zh) * | 2021-07-12 | 2021-09-07 | 科大讯飞股份有限公司 | Speaker separation method and related device |
CN113571085B (zh) * | 2021-07-24 | 2023-09-22 | 平安科技(深圳)有限公司 | Speech separation method, system, device and storage medium |
CN113657289B (zh) * | 2021-08-19 | 2023-08-08 | 北京百度网讯科技有限公司 | Training method and device for a threshold estimation model, and electronic device |
JPWO2023047475A1 (ja) * | 2021-09-21 | 2023-03-30 | ||
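KR20230042998A (ko) * | 2021-09-23 | 2023-03-30 | 한국전자통신연구원 | Speech section separation device and method therefor |
CN114363531B (zh) * | 2022-01-14 | 2023-08-01 | 中国平安人寿保险股份有限公司 | H5-based method, apparatus, device and medium for generating copywriting narration videos |
CN115171716B (zh) * | 2022-06-14 | 2024-04-19 | 武汉大学 | Continuous speech separation method and system based on spatial feature clustering, and electronic device |
CN115659162B (zh) * | 2022-09-15 | 2023-10-03 | 云南财经大学 | Method, system and device for extracting intra-pulse features of radar emitter signals |
CN117037255B (zh) * | 2023-08-22 | 2024-06-21 | 北京中科深智科技有限公司 | Directed-graph-based 3D facial expression synthesis method |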
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
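JPH0272398A (ja) * | 1988-09-07 | 1990-03-12 | Hitachi Ltd | Preprocessing device for speech signals |
KR100612840B1 (ko) | 2004-02-18 | 2006-08-18 | 삼성전자주식회사 | Model-variation-based speaker clustering method, speaker adaptation method, and speech recognition apparatus using the same |
JP2008051907A (ja) * | 2006-08-22 | 2008-03-06 | Toshiba Corp | Utterance section identification device and method |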
WO2016095218A1 (en) * | 2014-12-19 | 2016-06-23 | Dolby Laboratories Licensing Corporation | Speaker identification using spatial information |
JP6430318B2 (ja) * | 2015-04-06 | 2018-11-28 | 日本電信電話株式会社 | Fraudulent voice input determination device, method and program |
CN106683661B (zh) * | 2015-11-05 | 2021-02-05 | 阿里巴巴集团控股有限公司 | Voice-based role separation method and device |
JP2017120595A (ja) * | 2015-12-29 | 2017-07-06 | 花王株式会社 | Method for evaluating the application state of cosmetics |
KR102450441B1 (ko) * | 2016-07-14 | 2022-09-30 | 매직 립, 인코포레이티드 | Deep neural network for iris identification |
US9824692B1 (en) * | 2016-09-12 | 2017-11-21 | Pindrop Security, Inc. | End-to-end speaker recognition using deep neural network |
WO2018069974A1 (ja) * | 2016-10-11 | 2018-04-19 | エスゼット ディージェイアイ テクノロジー カンパニー リミテッド | Imaging device, imaging system, moving body, method and program |
US10497382B2 (en) * | 2016-12-16 | 2019-12-03 | Google Llc | Associating faces with voices for speaker diarization within videos |
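CN107221320A (zh) * | 2017-05-19 | 2017-09-29 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device and computer storage medium for training an acoustic feature extraction model |
CN107180628A (zh) * | 2017-05-19 | 2017-09-19 | 百度在线网络技术(北京)有限公司 | Method for building an acoustic feature extraction model, and method and device for extracting acoustic features |
CN107342077A (zh) * | 2017-05-27 | 2017-11-10 | 国家计算机网络与信息安全管理中心 | Factor-analysis-based speaker segmentation and clustering method and system |
CN107680611B (zh) | 2017-09-13 | 2020-06-16 | 电子科技大学 | Single-channel sound separation method based on convolutional neural network |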
US10529349B2 (en) * | 2018-04-16 | 2020-01-07 | Mitsubishi Electric Research Laboratories, Inc. | Methods and systems for end-to-end speech separation with unfolded iterative phase reconstruction |
US10963273B2 (en) * | 2018-04-20 | 2021-03-30 | Facebook, Inc. | Generating personalized content summaries for users |
2018
- 2018-05-28 CN CN201810519521.6A patent/CN108766440B/zh active Active
- 2018-08-13 JP JP2019572830A patent/JP2020527248A/ja active Pending
- 2018-08-13 SG SG11202003722SA patent/SG11202003722SA/en unknown
- 2018-08-13 WO PCT/CN2018/100174 patent/WO2019227672A1/zh active Application Filing
- 2018-08-13 US US16/652,452 patent/US11158324B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US11158324B2 (en) | 2021-10-26 |
CN108766440A (zh) | 2018-11-06 |
WO2019227672A1 (zh) | 2019-12-05 |
US20200234717A1 (en) | 2020-07-23 |
JP2020527248A (ja) | 2020-09-03 |
CN108766440B (zh) | 2020-01-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
SG11202003722SA (en) | Speaker separation model training method, two-speaker separation method and computing device | |
EP3742436A4 (en) | SPEECH SYNTHESIS METHOD, MODEL TRAINING METHOD, DEVICE AND COMPUTER DEVICE | |
EP3690763A4 (en) | MACHINE LEARNING MODEL TRAINING METHOD AND DEVICE AND ELECTRONIC DEVICE | |
EP3683725A4 (en) | METHOD OF GENERATION ABSTRACT DESCRIPTION, METHOD OF TRAINING ABSTRACT DESCRIPTION MODEL AND COMPUTER DEVICE | |
EP3611725A4 (en) | METHOD OF TRAINING A MODEL FOR VOICE SIGNAL PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM | |
EP3633610A4 (en) | LEARNING DEVICE, LEARNING METHOD, LEARNING MODEL, APPRAISAL DEVICE AND GRIPPING SYSTEM | |
EP3537349A4 (en) | MACHINE LEARNING MODEL FOR TRAINING METHOD AND DEVICE | |
EP3582118A4 (en) | METHOD AND DEVICE FOR TRAINING A CLASSIFICATION MODEL | |
EP3872705A4 (en) | DETECTION MODEL LEARNING PROCESS AND APPARATUS, AND TERMINAL DEVICE | |
EP3716156A4 (en) | NEURONAL NETWORK MODEL LEARNING PROCESS AND APPARATUS | |
SG11202000749RA (en) | Model training method and apparatus | |
EP3951646A4 (en) | IMAGE RECOGNITION NETWORK MODEL LEARNING METHOD, IMAGE RECOGNITION METHOD AND DEVICE | |
EP3579153A4 (en) | LEARNED MODEL PROVIDING METHOD AND LEARNED MODEL PROVIDING DEVICE | |
EP3633549A4 (en) | FACIAL DETECTION LEARNING PROCESS, APPARATUS AND ELECTRONIC DEVICE | |
EP3611657A4 (en) | MODEL TRAINING METHOD AND METHOD, DEVICE AND DEVICE FOR DETERMINING DATA SIMILARITY | |
EP3503980A4 (en) | EXERCISE SYSTEM AND METHOD | |
EP3399426A4 (en) | METHOD AND DEVICE FOR LEARNING A MODEL IN A DISTRIBUTED SYSTEM | |
EP3648044A4 (en) | METHOD, APPARATUS AND DEVICE FOR FORMING RISK MANAGEMENT MODEL AND RISK MANAGEMENT | |
EP3179473A4 (en) | Training method and apparatus for language model, and device | |
EP3540652A4 (en) | METHOD, DEVICE, CHIP AND SYSTEM FOR TRAINING A NEURAL NETWORK MODEL | |
EP3440659A4 (en) | CPR EXERCISE SYSTEM AND PROCEDURE | |
EP3579169A4 (en) | LEARNED MODEL PROVIDING METHOD AND LEARNED MODEL PROVIDING DEVICE | |
EP3346394A4 (en) | TRAINING DEVICE FOR QUESTION REPLY SYSTEM AND COMPUTER PROGRAM THEREFOR | |
EP3678072A4 (en) | MODEL INTEGRATION METHOD AND DEVICE | |
EP3627401A4 (en) | METHOD AND DEVICE FOR TRAINING A NEURONAL NETWORK |