CN104115221B - Audio human interactive proof based on text-to-speech and semantics - Google Patents
Audio human interactive proof based on text-to-speech and semantics
- Publication number
- CN104115221B (application CN201380009453.4A)
- Authority
- CN
- China
- Prior art keywords
- text
- speech
- audio
- answer
- hip
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2133—Verifying human interaction, e.g., Captcha
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/399,496 US10319363B2 (en) | 2012-02-17 | 2012-02-17 | Audio human interactive proof based on text-to-speech and semantics |
| US13/399,496 | 2012-02-17 | ||
| PCT/US2013/024245 WO2013122750A1 (en) | 2012-02-17 | 2013-02-01 | Audio human interactive proof based on text-to-speech and semantics |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN104115221A (zh) | 2014-10-22 |
| CN104115221B (zh) | 2017-09-01 |
Family
ID=48982943
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201380009453.4A (Expired - Fee Related), CN104115221B (zh) | Audio human interactive proof based on text-to-speech and semantics | 2012-02-17 | 2013-02-01 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US10319363B2 |
| EP (1) | EP2815398B1 |
| JP (1) | JP6238312B2 |
| KR (1) | KR102101044B1 |
| CN (1) | CN104115221B |
| ES (1) | ES2628901T3 |
| WO (1) | WO2013122750A1 |
Families Citing this family (31)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140067394A1 (en) * | 2012-08-28 | 2014-03-06 | King Abdulaziz City For Science And Technology | System and method for decoding speech |
| US10149077B1 (en) * | 2012-10-04 | 2018-12-04 | Amazon Technologies, Inc. | Audio themes |
| US9338162B2 (en) * | 2014-06-13 | 2016-05-10 | International Business Machines Corporation | CAPTCHA challenge incorporating obfuscated characters |
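| CN105047192B (zh) * | 2015-05-25 | 2018-08-17 | Shanghai Jiao Tong University | Statistical speech synthesis method and device based on hidden Markov model |
| CN105185379B (zh) * | 2015-06-17 | 2017-08-18 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voiceprint authentication method and device |
| CN105161098A (zh) * | 2015-07-31 | 2015-12-16 | Beijing Qihoo Technology Co., Ltd. | Speech recognition method and device for an interactive system |
| CN105161105A (zh) * | 2015-07-31 | 2015-12-16 | Beijing Qihoo Technology Co., Ltd. | Speech recognition method and device for an interactive system |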
| US10277581B2 (en) * | 2015-09-08 | 2019-04-30 | Oath, Inc. | Audio verification |
| US9466299B1 (en) | 2015-11-18 | 2016-10-11 | International Business Machines Corporation | Speech source classification |
| US10347247B2 (en) * | 2016-12-30 | 2019-07-09 | Google Llc | Modulation of packetized audio signals |
| US10332520B2 (en) | 2017-02-13 | 2019-06-25 | Qualcomm Incorporated | Enhanced speech generation |
| CN108630193B (zh) * | 2017-03-21 | 2020-10-02 | Beijing Didi Infinity Technology and Development Co., Ltd. | Speech recognition method and device |
| WO2018183290A1 (en) * | 2017-03-27 | 2018-10-04 | Orion Labs | Bot group messaging using general voice libraries |
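| CN107609389B (zh) * | 2017-08-24 | 2020-10-30 | Nanjing University of Science and Technology | Verification method and system based on image content correlation |
| JP6791825B2 (ja) * | 2017-09-26 | 2020-11-25 | Hitachi, Ltd. | Information processing device, dialogue processing method, and dialogue system |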
| EP3698358B1 (en) | 2017-10-18 | 2025-03-05 | Soapbox Labs Ltd. | Methods and systems for processing audio signals containing speech data |
| KR20190057687A (ko) * | 2017-11-20 | 2019-05-29 | Samsung Electronics Co., Ltd. | Electronic device for changing a chatbot and control method thereof |
| EP3794473B1 (en) | 2018-08-06 | 2024-10-16 | Google LLC | Captcha automated assistant |
| CN111048062B (zh) * | 2018-10-10 | 2022-10-04 | Huawei Technologies Co., Ltd. | Speech synthesis method and device |
| US11423073B2 (en) | 2018-11-16 | 2022-08-23 | Microsoft Technology Licensing, Llc | System and management of semantic indicators during document presentations |
| US11126794B2 (en) * | 2019-04-11 | 2021-09-21 | Microsoft Technology Licensing, Llc | Targeted rewrites |
| CN110390104B (zh) * | 2019-07-23 | 2023-05-05 | AI Speech Co., Ltd. | Irregular text transcription method and system for a voice dialogue platform |
| KR102663669B1 (ko) * | 2019-11-01 | 2024-05-08 | LG Electronics Inc. | Speech synthesis in a noisy environment |
| US20220035898A1 (en) * | 2020-07-31 | 2022-02-03 | Nuance Communications, Inc. | Audio CAPTCHA Using Echo |
| FR3122508A1 (fr) * | 2021-04-29 | 2022-11-04 | Orange | Characterization of a user by associating a sound with an interactive element |
| US20230142081A1 (en) * | 2021-11-10 | 2023-05-11 | Nuance Communications, Inc. | Voice captcha |
| CN114299919B (zh) * | 2021-12-27 | 2025-06-03 | Perfect World (Beijing) Software Technology Development Co., Ltd. | Text-to-speech method, apparatus, storage medium, and computer device |
| US12562150B2 (en) | 2023-04-21 | 2026-02-24 | Pindrop Security, Inc. | Deepfake detection |
| US12525224B2 (en) | 2023-04-21 | 2026-01-13 | Pindrop Security, Inc. | Deepfake detection |
| WO2024226757A2 (en) * | 2023-04-28 | 2024-10-31 | Pindrop Security, Inc. | Active voice liveness detection system |
| WO2024259486A1 (en) * | 2023-06-19 | 2024-12-26 | Macquarie University | Scam call system |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
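| CN1584979A (zh) * | 2004-06-01 | 2005-02-23 | Anhui USTC iFlytek Information Technology Co., Ltd. | Method for mixing background sound with text speech for output in a speech synthesis system |
| CN1758330A (zh) * | 2004-10-01 | 2006-04-12 | AT&T Corp. | Method and apparatus for preventing speech comprehension by interactive voice response systems |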
| US20100312562A1 (en) * | 2009-06-04 | 2010-12-09 | Microsoft Corporation | Hidden markov model based text to speech systems employing rope-jumping algorithm |
| US20120004914A1 (en) * | 2006-06-21 | 2012-01-05 | Tell Me Networks c/o Microsoft Corporation | Audio human verification |
Family Cites Families (31)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS63231496A (ja) * | 1987-03-20 | 1988-09-27 | Fujitsu Ltd. | Speech recognition response system |
| US6195698B1 (en) | 1998-04-13 | 2001-02-27 | Compaq Computer Corporation | Method for selectively restricting access to computer systems |
| US7054811B2 (en) | 2002-11-06 | 2006-05-30 | Cellmax Systems Ltd. | Method and system for verifying and enabling user access based on voice parameters |
| US7039949B2 (en) | 2001-12-10 | 2006-05-02 | Brian Ross Cartmell | Method and system for blocking unwanted communications |
| JP2003302999A (ja) * | 2002-04-11 | 2003-10-24 | Advanced Media Inc | Personal authentication system using voice |
| US20040254793A1 (en) * | 2003-06-12 | 2004-12-16 | Cormac Herley | System and method for providing an audio challenge to distinguish a human from a computer |
| US7841940B2 (en) * | 2003-07-14 | 2010-11-30 | Astav, Inc | Human test based on human conceptual capabilities |
| US8255223B2 (en) | 2004-12-03 | 2012-08-28 | Microsoft Corporation | User authentication by combining speaker verification and reverse turing test |
| US7945952B1 (en) * | 2005-06-30 | 2011-05-17 | Google Inc. | Methods and apparatuses for presenting challenges to tell humans and computers apart |
| US8145914B2 (en) | 2005-12-15 | 2012-03-27 | Microsoft Corporation | Client-side CAPTCHA ceremony for user verification |
| US20070165811A1 (en) | 2006-01-19 | 2007-07-19 | John Reumann | System and method for spam detection |
| US7680891B1 (en) * | 2006-06-19 | 2010-03-16 | Google Inc. | CAPTCHA-based spam control for content creation systems |
| US20090055193A1 (en) * | 2007-02-22 | 2009-02-26 | Pudding Holdings Israel Ltd. | Method, apparatus and computer code for selectively providing access to a service in accordance with spoken content received from a user |
| WO2008114258A1 (en) * | 2007-03-21 | 2008-09-25 | Vivotext Ltd. | Speech samples library for text-to-speech and methods and apparatus for generating and using same |
| CN101059830A (zh) | 2007-06-01 | 2007-10-24 | South China University of Technology | Method for identifying bot cheat programs that can incorporate game features |
| US8495727B2 (en) | 2007-08-07 | 2013-07-23 | Microsoft Corporation | Spam reduction in real time communications by human interaction proof |
| US20090249477A1 (en) * | 2008-03-28 | 2009-10-01 | Yahoo! Inc. | Method and system for determining whether a computer user is human |
| WO2010008722A1 (en) | 2008-06-23 | 2010-01-21 | John Nicholas Gross | Captcha system optimized for distinguishing between humans and machines |
| US8752141B2 (en) * | 2008-06-27 | 2014-06-10 | John Nicholas | Methods for presenting and determining the efficacy of progressive pictorial and motion-based CAPTCHAs |
| US8793135B2 (en) * | 2008-08-25 | 2014-07-29 | At&T Intellectual Property I, L.P. | System and method for auditory captchas |
| US8925057B1 (en) * | 2009-02-06 | 2014-12-30 | New Jersey Institute Of Technology | Automated tests to distinguish computers from humans |
| US9342508B2 (en) * | 2009-03-19 | 2016-05-17 | Microsoft Technology Licensing, Llc | Data localization templates and parsing |
| WO2012010743A1 (en) * | 2010-07-23 | 2012-01-26 | Nokia Corporation | Method and apparatus for authorizing a user or a user device based on location information |
| US8863233B2 (en) * | 2010-08-31 | 2014-10-14 | Rakuten, Inc. | Response determination apparatus, response determination method, response determination program, recording medium, and response determination system |
| US8719930B2 (en) * | 2010-10-12 | 2014-05-06 | Sonus Networks, Inc. | Real-time network attack detection and mitigation infrastructure |
| EP2647003A4 (en) * | 2010-11-30 | 2014-08-06 | Towson University | AUDIO-BASED PROOF OF HUMAN INTERACTIONS |
| JP2012163692A (ja) * | 2011-02-04 | 2012-08-30 | Nec Corp | Speech signal processing system, speech signal processing method, and speech signal processing method program |
| US20120232907A1 (en) * | 2011-03-09 | 2012-09-13 | Christopher Liam Ivey | System and Method for Delivering a Human Interactive Proof to the Visually Impaired by Means of Semantic Association of Objects |
| US8810368B2 (en) * | 2011-03-29 | 2014-08-19 | Nokia Corporation | Method and apparatus for providing biometric authentication using distributed computations |
| US8904517B2 (en) * | 2011-06-28 | 2014-12-02 | International Business Machines Corporation | System and method for contexually interpreting image sequences |
| US9146917B2 (en) * | 2011-07-15 | 2015-09-29 | International Business Machines Corporation | Validating that a user is human |
2012
- 2012-02-17: US application US13/399,496, published as US10319363B2 (en), status: Active

2013
- 2013-02-01: JP application JP2014557674A, published as JP6238312B2 (ja), status: Expired - Fee Related
- 2013-02-01: EP application EP13749405.0A, published as EP2815398B1 (en), status: Not-in-force
- 2013-02-01: CN application CN201380009453.4A, published as CN104115221B (zh), status: Expired - Fee Related
- 2013-02-01: KR application KR1020147022837A, published as KR102101044B1 (ko), status: Expired - Fee Related
- 2013-02-01: ES application ES13749405.0T, published as ES2628901T3 (es), status: Active
- 2013-02-01: WO application PCT/US2013/024245, published as WO2013122750A1 (en), status: Ceased
Non-Patent Citations (3)
| Title |
|---|
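| "A Reverse Turing Test Using Speech"; Greg Kochanski et al.; ICSLP 2002: 7th International Conference on Spoken Language Processing; 2002-09-30; full text * |
| "Research on an Intelligent Voice Transformation Technique"; Yuan Qiao; China Master's Theses Full-text Database; 2011-12-15 (No. S1); full text * |
| "Chinese Speech CAPTCHA Technology and Applications"; Guo Feng; China Master's Theses Full-text Database, Information Science and Technology; 2011-04-15 (No. 04); full text * |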
Also Published As
| Publication number | Publication date |
|---|---|
| EP2815398B1 (en) | 2017-03-29 |
| JP6238312B2 (ja) | 2017-11-29 |
| US20130218566A1 (en) | 2013-08-22 |
| EP2815398A4 (en) | 2015-05-06 |
| WO2013122750A1 (en) | 2013-08-22 |
| ES2628901T3 (es) | 2017-08-04 |
| CN104115221A (zh) | 2014-10-22 |
| US10319363B2 (en) | 2019-06-11 |
| JP2015510147A (ja) | 2015-04-02 |
| KR102101044B1 (ko) | 2020-04-14 |
| KR20140134653A (ko) | 2014-11-24 |
| EP2815398A1 (en) | 2014-12-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104115221B (zh) | | Audio human interactive proof based on text-to-speech and semantics |
| US12020687B2 (en) | | Method and system for a parametric speech synthesis |
| AU2019395322B2 (en) | | Reconciliation between simulated data and speech recognition output using sequence-to-sequence mapping |
| US10210861B1 (en) | | Conversational agent pipeline trained on synthetic data |
| US20230230576A1 (en) | | Text-to-speech synthesis method and system, and a method of training a text-to-speech synthesis system |
| KR20230003056A (ko) | | Speech recognition using unspoken text and speech synthesis |
| WO2017067206A1 (zh) | | Training method for personalized multiple acoustic models, speech synthesis method, and device |
| US10685644B2 (en) | | Method and system for text-to-speech synthesis |
| CN101551947A (zh) | | Computer system for assisting spoken language learning |
| US9437195B2 (en) | | Biometric password security |
| JPWO2016103652A1 (ja) | | Speech processing device, speech processing method, and program |
| US12118898B2 (en) | | Voice visualization system for english learning, and method therefor |
| Afzal et al. | | Recitation of The Holy Quran Verses Recognition System Based on Speech Recognition Techniques |
| Zahariev et al. | | An approach to speech ambiguities eliminating using semantically-acoustical analysis |
| Motyka et al. | | Information technology of transcribing Ukrainian-language content based on deep learning |
| KR102621954B1 (ko) | | Dialogue method and system for operating a dialogue model according to the presence or absence of related knowledge |
| CN116013246B (zh) | | Method and system for automatic generation of rap music |
| Johnson | | Towards Inclusive Low-Resource Speech Technologies: A Case Study of Educational Systems for African American English-Speaking Children |
| Ajayi et al. | | Indigenuous Vocabulary Reformulation For Continuous Yorùbá Speech Recognition In M-Commerce Using Acoustic Nudging-Based Gaussian Mixture Model |
| CN119181364A (zh) | | Text enhancement method, apparatus, electronic device, and storage medium |
| CN120472890A (zh) | | Speech processing method, apparatus, and electronic device |
| Rouhe | | Finite state models for recognition and validation of read prompts |
| Tratnik et al. | | Automatically Generating Text from Film Material–A Comparison of Three Models |
| Ford Jr | | Spoken Language Identification from Processing and Pattern Analysis of Spectrograms |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | ASS | Succession or assignment of patent right | Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150723 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20150723; Address after: Washington State; Applicant after: MICROSOFT TECHNOLOGY LICENSING, LLC; Address before: Washington State; Applicant before: Microsoft Corp. |
| | GR01 | Patent grant | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170901 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |