CN104115221B - Audio human interactive proof based on text-to-speech and semantics - Google Patents

Audio human interactive proof based on text-to-speech and semantics

Info

Publication number
CN104115221B
Authority
CN
China
Prior art keywords
text
speech
audio
answer
hip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201380009453.4A
Other languages
English (en)
Chinese (zh)
Other versions
CN104115221A (zh)
Inventor
Y·钱
B·B·朱
F·K-P·宋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Publication of CN104115221A
Application granted
Publication of CN104115221B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2133 Verifying human interaction, e.g., Captcha
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
CN201380009453.4A 2012-02-17 2013-02-01 Audio human interactive proof based on text-to-speech and semantics Active CN104115221B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/399,496 2012-02-17
US13/399,496 US10319363B2 (en) 2012-02-17 2012-02-17 Audio human interactive proof based on text-to-speech and semantics
PCT/US2013/024245 WO2013122750A1 (en) 2012-02-17 2013-02-01 Audio human interactive proof based on text-to-speech and semantics

Publications (2)

Publication Number Publication Date
CN104115221A (zh) 2014-10-22
CN104115221B (zh) 2017-09-01

Family

ID=48982943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380009453.4A Active CN104115221B (zh) 2012-02-17 2013-02-01 Audio human interactive proof based on text-to-speech and semantics

Country Status (7)

Country Link
US (1) US10319363B2
EP (1) EP2815398B1
JP (1) JP6238312B2
KR (1) KR102101044B1
CN (1) CN104115221B
ES (1) ES2628901T3
WO (1) WO2013122750A1

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140067394A1 (en) * 2012-08-28 2014-03-06 King Abdulaziz City For Science And Technology System and method for decoding speech
US10149077B1 (en) * 2012-10-04 2018-12-04 Amazon Technologies, Inc. Audio themes
US9338162B2 (en) * 2014-06-13 2016-05-10 International Business Machines Corporation CAPTCHA challenge incorporating obfuscated characters
CN105047192B (zh) * 2015-05-25 2018-08-17 上海交通大学 基于隐马尔科夫模型的统计语音合成方法及装置
CN105185379B (zh) * 2015-06-17 2017-08-18 百度在线网络技术(北京)有限公司 声纹认证方法和装置
CN105161105A (zh) * 2015-07-31 2015-12-16 北京奇虎科技有限公司 一种交互系统的语音识别方法和装置
CN105161098A (zh) * 2015-07-31 2015-12-16 北京奇虎科技有限公司 一种交互系统的语音识别方法和装置
US10277581B2 (en) * 2015-09-08 2019-04-30 Oath, Inc. Audio verification
US9466299B1 (en) 2015-11-18 2016-10-11 International Business Machines Corporation Speech source classification
US10347247B2 (en) * 2016-12-30 2019-07-09 Google Llc Modulation of packetized audio signals
US10332520B2 (en) 2017-02-13 2019-06-25 Qualcomm Incorporated Enhanced speech generation
CN108630193B (zh) * 2017-03-21 2020-10-02 北京嘀嘀无限科技发展有限公司 语音识别方法及装置
WO2018183290A1 (en) * 2017-03-27 2018-10-04 Orion Labs Bot group messaging using general voice libraries
CN107609389B (zh) * 2017-08-24 2020-10-30 南京理工大学 一种基于图像内容相关性的验证方法及系统
JP6791825B2 (ja) * 2017-09-26 2020-11-25 株式会社日立製作所 情報処理装置、対話処理方法及び対話システム
WO2019077013A1 (en) 2017-10-18 2019-04-25 Soapbox Labs Ltd. METHODS AND SYSTEMS FOR PROCESSING AUDIO SIGNALS CONTAINING VOICE DATA
KR20190057687A (ko) * 2017-11-20 2019-05-29 삼성전자주식회사 챗봇 변경을 위한 위한 전자 장치 및 이의 제어 방법
US11355125B2 (en) 2018-08-06 2022-06-07 Google Llc Captcha automated assistant
CN111048062B (zh) * 2018-10-10 2022-10-04 华为技术有限公司 语音合成方法及设备
US11423073B2 (en) 2018-11-16 2022-08-23 Microsoft Technology Licensing, Llc System and management of semantic indicators during document presentations
US11126794B2 (en) * 2019-04-11 2021-09-21 Microsoft Technology Licensing, Llc Targeted rewrites
CN110390104B (zh) * 2019-07-23 2023-05-05 思必驰科技股份有限公司 用于语音对话平台的不规则文本转写方法及系统
KR102663669B1 (ko) * 2019-11-01 2024-05-08 엘지전자 주식회사 소음 환경에서의 음성 합성
US20220035898A1 (en) * 2020-07-31 2022-02-03 Nuance Communications, Inc. Audio CAPTCHA Using Echo
FR3122508A1 (fr) * 2021-04-29 2022-11-04 Orange Caractérisation d’un utilisateur par association d’un son à un élément interactif
US20230142081A1 (en) * 2021-11-10 2023-05-11 Nuance Communications, Inc. Voice captcha
CN114299919B (zh) * 2021-12-27 2025-06-03 完美世界(北京)软件科技发展有限公司 文字转语音方法、装置、存储介质及计算机设备
US20240363119A1 (en) * 2023-04-28 2024-10-31 Pindrop Security, Inc. Active voice liveness detection system
WO2024259486A1 (en) * 2023-06-19 2024-12-26 Macquarie University Scam call system

Citations (4)

Publication number Priority date Publication date Assignee Title
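CN1584979A (zh) * 2004-06-01 2005-02-23 Anhui USTC iFlytek Information Technology Co., Ltd. Method for mixing and outputting background sound with text speech in a speech synthesis system
CN1758330A (zh) * 2004-10-01 2006-04-12 AT&T Corp. Method and apparatus for preventing speech comprehension by interactive voice response systems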
US20100312562A1 (en) * 2009-06-04 2010-12-09 Microsoft Corporation Hidden markov model based text to speech systems employing rope-jumping algorithm
US20120004914A1 (en) * 2006-06-21 2012-01-05 Tell Me Networks c/o Microsoft Corporation Audio human verification

Family Cites Families (31)

Publication number Priority date Publication date Assignee Title
JPS63231496A (ja) * 1987-03-20 1988-09-27 富士通株式会社 音声認識応答システム
US6195698B1 (en) 1998-04-13 2001-02-27 Compaq Computer Corporation Method for selectively restricting access to computer systems
US7054811B2 (en) 2002-11-06 2006-05-30 Cellmax Systems Ltd. Method and system for verifying and enabling user access based on voice parameters
US7039949B2 (en) 2001-12-10 2006-05-02 Brian Ross Cartmell Method and system for blocking unwanted communications
JP2003302999A (ja) * 2002-04-11 2003-10-24 Advanced Media Inc 音声による個人認証システム
US20040254793A1 (en) * 2003-06-12 2004-12-16 Cormac Herley System and method for providing an audio challenge to distinguish a human from a computer
US7841940B2 (en) * 2003-07-14 2010-11-30 Astav, Inc Human test based on human conceptual capabilities
US8255223B2 (en) 2004-12-03 2012-08-28 Microsoft Corporation User authentication by combining speaker verification and reverse turing test
US7945952B1 (en) * 2005-06-30 2011-05-17 Google Inc. Methods and apparatuses for presenting challenges to tell humans and computers apart
US8145914B2 (en) 2005-12-15 2012-03-27 Microsoft Corporation Client-side CAPTCHA ceremony for user verification
US20070165811A1 (en) 2006-01-19 2007-07-19 John Reumann System and method for spam detection
US7680891B1 (en) * 2006-06-19 2010-03-16 Google Inc. CAPTCHA-based spam control for content creation systems
US20090055193A1 (en) * 2007-02-22 2009-02-26 Pudding Holdings Israel Ltd. Method, apparatus and computer code for selectively providing access to a service in accordance with spoken content received from a user
BRPI0808289A2 (pt) * 2007-03-21 2015-06-16 Vivotext Ltd "biblioteca de amostras de fala para transformar texto em falta e métodos e instrumentos para gerar e utilizar o mesmo"
CN101059830A (zh) 2007-06-01 2007-10-24 华南理工大学 一种可结合游戏特征的机器人外挂识别方法
US8495727B2 (en) 2007-08-07 2013-07-23 Microsoft Corporation Spam reduction in real time communications by human interaction proof
US20090249477A1 (en) * 2008-03-28 2009-10-01 Yahoo! Inc. Method and system for determining whether a computer user is human
US8489399B2 (en) 2008-06-23 2013-07-16 John Nicholas and Kristin Gross Trust System and method for verifying origin of input through spoken language analysis
US8752141B2 (en) * 2008-06-27 2014-06-10 John Nicholas Methods for presenting and determining the efficacy of progressive pictorial and motion-based CAPTCHAs
US8793135B2 (en) * 2008-08-25 2014-07-29 At&T Intellectual Property I, L.P. System and method for auditory captchas
US8925057B1 (en) * 2009-02-06 2014-12-30 New Jersey Institute Of Technology Automated tests to distinguish computers from humans
US9342508B2 (en) * 2009-03-19 2016-05-17 Microsoft Technology Licensing, Llc Data localization templates and parsing
WO2012010743A1 (en) * 2010-07-23 2012-01-26 Nokia Corporation Method and apparatus for authorizing a user or a user device based on location information
WO2012029519A1 (ja) * 2010-08-31 2012-03-08 楽天株式会社 応答判定装置、応答判定方法、応答判定プログラム、記録媒体、および、応答判定システム
US8719930B2 (en) * 2010-10-12 2014-05-06 Sonus Networks, Inc. Real-time network attack detection and mitigation infrastructure
CA2819473A1 (en) * 2010-11-30 2012-06-07 Towson University Audio based human-interaction proof
JP2012163692A (ja) * 2011-02-04 2012-08-30 Nec Corp 音声信号処理システム、音声信号処理方法および音声信号処理方法プログラム
US20120232907A1 (en) * 2011-03-09 2012-09-13 Christopher Liam Ivey System and Method for Delivering a Human Interactive Proof to the Visually Impaired by Means of Semantic Association of Objects
US8810368B2 (en) * 2011-03-29 2014-08-19 Nokia Corporation Method and apparatus for providing biometric authentication using distributed computations
US8904517B2 (en) * 2011-06-28 2014-12-02 International Business Machines Corporation System and method for contexually interpreting image sequences
US9146917B2 (en) * 2011-07-15 2015-09-29 International Business Machines Corporation Validating that a user is human

Non-Patent Citations (3)

Title
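"A Reverse Turing Test Using Speech"; Greg Kochanski et al.; ICSLP 2002: 7th International Conference on Spoken Language Processing; 20020930; entire document *
"Research on an intelligent voice transformation technique"; Yuan Qiao; China Master's Theses Full-text Database; 20111215 (No. S1); entire document *
"Chinese audio CAPTCHA technology and its applications"; Guo Feng; China Master's Theses Full-text Database, Information Science and Technology; 20110415 (No. 04); entire document *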

Also Published As

Publication number Publication date
EP2815398B1 (en) 2017-03-29
WO2013122750A1 (en) 2013-08-22
JP6238312B2 (ja) 2017-11-29
US10319363B2 (en) 2019-06-11
KR102101044B1 (ko) 2020-04-14
CN104115221A (zh) 2014-10-22
ES2628901T3 (es) 2017-08-04
EP2815398A1 (en) 2014-12-24
US20130218566A1 (en) 2013-08-22
KR20140134653A (ko) 2014-11-24
JP2015510147A (ja) 2015-04-02
EP2815398A4 (en) 2015-05-06

Similar Documents

Publication Publication Date Title
CN104115221B (zh) Audio human interactive proof based on text-to-speech and semantics
US12020687B2 (en) Method and system for a parametric speech synthesis
AU2019395322B2 (en) Reconciliation between simulated data and speech recognition output using sequence-to-sequence mapping
US10210861B1 (en) Conversational agent pipeline trained on synthetic data
KR20230003056A (ko) Speech recognition using unspoken text and speech synthesis
US20230230576A1 (en) Text-to-speech synthesis method and system, and a method of training a text-to-speech synthesis system
WO2017067206A1 (zh) Training method for personalized multiple acoustic models, speech synthesis method, and device
US10685644B2 (en) Method and system for text-to-speech synthesis
CN101551947A (zh) Computer system for assisting spoken language learning
US9437195B2 (en) Biometric password security
JPWO2016103652A1 (ja) Speech processing device, speech processing method, and program
US12118898B2 (en) Voice visualization system for english learning, and method therefor
Zahariev et al. An approach to speech ambiguities eliminating using semantically-acoustical analysis
Motyka et al. Information technology of transcribing Ukrainian-language content based on deep learning
Louw et al. The Speect text-to-speech entry for the Blizzard Challenge 2016
Afzal et al. Recitation of The Holy Quran Verses Recognition System Based on Speech Recognition Techniques
KR102621954B1 (ko) Conversation method and system for operating a conversation model according to the presence or absence of related knowledge
Johnson Towards Inclusive Low-Resource Speech Technologies: A Case Study of Educational Systems for African American English-Speaking Children
Sulír et al. Speaker adaptation for Slovak statistical parametric speech synthesis based on hidden Markov models
Ajayi et al. Indigenuous Vocabulary Reformulation For Continuous Yorùbá Speech Recognition In M-Commerce Using Acoustic Nudging-Based Gaussian Mixture Model
CN119181364A (zh) Text enhancement method and apparatus, electronic device, and storage medium
CN120472890A (zh) Speech processing method and apparatus, and electronic device
Rouhe Finite state models for recognition and validation of read prompts
Tratnik et al. Automatically Generating Text from Film Material–A Comparison of Three Models
Ford Jr Spoken Language Identification from Processing and Pattern Analysis of Spectrograms

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150723

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20150723

Address after: Washington State

Applicant after: Microsoft Technology Licensing LLC

Address before: Washington State

Applicant before: Microsoft Corp.

GR01 Patent grant