CN116830559A - Systems and methods of handling speech audio stream interruptions - Google Patents

Systems and methods of handling speech audio stream interruptions

Publication number
CN116830559A
Authority
CN
China
Prior art keywords
stream
speech
text
audio stream
interrupt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180092238.XA
Other languages
English (en)
Chinese (zh)
Inventor
F. Olivieri
R. Westburg
S. Thagadur Shivappa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN116830559A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/027 Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/157 Conference systems defining a virtual conference space and using avatars or agents
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/39 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech synthesis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/60 Medium conversion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/20 Aspects of automatic or semi-automatic exchanges related to features of supplementary services
    • H04M2203/2088 Call or conference reconnect, e.g. resulting from isdn terminal portability

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)
CN202180092238.XA 2021-02-03 2021-12-09 Systems and methods of handling speech audio stream interruptions Pending CN116830559A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/166,250 2021-02-03
US17/166,250 US11580954B2 (en) 2021-02-03 2021-02-03 Systems and methods of handling speech audio stream interruptions
PCT/US2021/072831 WO2022169534A1 (en) 2021-02-03 2021-12-09 Systems and methods of handling speech audio stream interruptions

Publications (1)

Publication Number Publication Date
CN116830559A (zh) 2023-09-29

Family

ID=79283143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180092238.XA Pending CN116830559A (zh) 2021-02-03 2021-12-09 Systems and methods of handling speech audio stream interruptions

Country Status (7)

Country Link
US (1) US11580954B2
EP (1) EP4289129B1
JP (1) JP7798901B2
KR (1) KR20230133864A
CN (1) CN116830559A
BR (1) BR112023014966A2
WO (1) WO2022169534A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026007057A1 (en) * 2024-07-04 2026-01-08 Ringcentral, Inc. Systems and methods for recreating lost or inaudible speech in a conversation

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220303152A1 (en) * 2021-03-18 2022-09-22 Lenovo (Singapore) Pte. Ltd. Recordation of video conference based on bandwidth issue(s)
US11895263B2 (en) * 2021-05-25 2024-02-06 International Business Machines Corporation Interpreting conference call interruptions
US20240062750A1 (en) * 2022-08-18 2024-02-22 Avaya Management L.P. Speech transmission from a telecommunication endpoint using phonetic characters
CN118018137A (zh) 2022-11-08 2024-05-10 MediaTek Singapore Pte. Ltd. Audio playback method and apparatus
US20240428774A1 (en) * 2023-06-21 2024-12-26 International Business Machines Corporation Cognitive assistant voice amelioration model
US20260073903A1 (en) * 2024-09-12 2026-03-12 Cisco Technology, Inc. Augmenting audio of communication sessions with transcribed visual content

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170237784A1 (en) * 2013-03-15 2017-08-17 Swyme Ip Bv Methods and systems for dynamic adjustment of session parameters for effective video collaboration among heterogenous devices
CN107393544A (zh) * 2017-06-19 2017-11-24 Vivo Mobile Communication Co., Ltd. Voice signal restoration method and mobile terminal
US9843673B1 (en) * 2016-11-14 2017-12-12 Motorola Mobility Llc Managing calls
US20180218727A1 (en) * 2017-02-02 2018-08-02 Microsoft Technology Licensing, Llc Artificially generated speech for a communication session
CN111108740A (zh) * 2017-09-29 2020-05-05 Apple Inc. User interface for multi-user communication sessions

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
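JP2001230801A (ja) * 2000-02-14 2001-08-24 Sony Corp Communication system and method, communication service server, and communication terminal device
JP2008021058A (ja) * 2006-07-12 2008-01-31 Nec Corp Mobile phone device with translation function, voice data translation method, voice data translation program, and program recording medium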
US9922641B1 (en) * 2012-10-01 2018-03-20 Google Llc Cross-lingual speaker adaptation for multi-lingual speech synthesis
KR101787594B1 (ko) 2013-08-29 2017-10-18 Unify GmbH & Co. KG Maintaining audio communication in a congested communication channel
DE102014018205A1 (de) * 2014-12-09 2016-06-09 Unify Gmbh & Co. Kg Conference system and method for controlling the conference system
US9883144B2 (en) * 2016-05-12 2018-01-30 Fuji Xerox Co., Ltd. System and method for replacing user media streams with animated avatars in live videoconferences
US20180226073A1 (en) * 2017-02-06 2018-08-09 International Business Machines Corporation Context-based cognitive speech to text engine
US20180358003A1 (en) * 2017-06-09 2018-12-13 Qualcomm Incorporated Methods and apparatus for improving speech communication and speech interface quality using neural networks
US20200090648A1 (en) 2018-09-14 2020-03-19 International Business Machines Corporation Maintaining voice conversation continuity
US10971161B1 (en) * 2018-12-12 2021-04-06 Amazon Technologies, Inc. Techniques for loss mitigation of audio streams
KR102740698B1 (ko) * 2019-08-22 2024-12-11 LG Electronics Inc. Method and apparatus for speech synthesis based on emotion information
US11889128B2 (en) * 2021-01-05 2024-01-30 Qualcomm Incorporated Call audio playback speed adjustment

Also Published As

Publication number Publication date
BR112023014966A2 (pt) 2024-01-23
JP2024505944A (ja) 2024-02-08
WO2022169534A1 (en) 2022-08-11
TW202236084A (zh) 2022-09-16
US20220246133A1 (en) 2022-08-04
EP4289129A1 (en) 2023-12-13
US11580954B2 (en) 2023-02-14
EP4289129B1 (en) 2025-09-03
JP7798901B2 (ja) 2026-01-14
EP4289129C0 (en) 2025-09-03
KR20230133864A (ko) 2023-09-19

Similar Documents

Publication Publication Date Title
US11580954B2 (en) Systems and methods of handling speech audio stream interruptions
CN110634483B (zh) Human-computer interaction method and apparatus, electronic device, and storage medium
US10680995B1 (en) Continuous multimodal communication and recording system with automatic transmutation of audio and textual content
CN113127609A (zh) Voice control method and apparatus, server, terminal device, and storage medium
CN109147779A (zh) Voice data processing method and apparatus
US10228899B2 (en) Monitoring environmental noise and data packets to display a transcription of call audio
US20130304457A1 (en) Method and system for operating communication service
CN108920128B (zh) Method and system for operating a presentation
CN107995105B (zh) Intelligent terminal with blind-operation software
US20150163610A1 (en) Audio keyword based control of media output
JP6904357B2 (ja) Information processing device, information processing method, and program
US20240087597A1 (en) Source speech modification based on an input speech characteristic
CN108288469A (zh) Speaker and interaction method
JP7842767B2 (ja) Call audio playback speed adjustment
US20240402980A1 (en) Disabling audio coding of media content when a no volume condition of a device is detected
TWI914456B (zh) Systems and methods of handling speech audio stream interruptions
US12603957B2 (en) Conference calls
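CN118433437A (zh) Live-room voice streaming method and apparatus, live streaming system, electronic device, and medium
CN113271491B (zh) Electronic device and playback control method
TWI917501B (zh) Call audio playback speed adjustment
JP2021131404A (ja) Media playback device and control method therefor
CN121509379A (zh) Communication method, communication system, and electronic device
CN121438808A (zh) Speech translation method, apparatus, device, and medium for a device-cloud translation system
CN121415765A (zh) Streaming simultaneous speech interpretation method, related device, and computer program product
CN121435991A (zh) Cross-application simultaneous interpretation method and system supporting stream processing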

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination