PL401372A1 - Hybrid compression of voice data in text-to-speech systems

Hybrid compression of voice data in text-to-speech systems

Info

Publication number
PL401372A1
Authority
PL
Poland
Prior art keywords
compression
speech
text
time domain
voice data
Prior art date
Application number
PL401372A
Inventor
Michał T. Kaszczuk
Łukasz M. Osowski
Original Assignee
Ivona Software Spółka Z Ograniczoną Odpowiedzialnością
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ivona Software Spółka Z Ograniczoną Odpowiedzialnością
Priority to PL401372A (publication PL401372A1)
Priority to US13/720,900 (publication US9064489B2)
Publication of PL401372A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Recorded or synthesized speech segments from text-to-speech systems are compressed using both time-domain compression and perceptual compression techniques. The doubly compressed recording is divided into speech segments corresponding to words and subwords for use in the TTS system. The degree of time-domain compression, and the ratio of time-domain compression to perceptual compression, are varied for any given speech segment. The amount or ratio of compression is determined from the linguistic or acoustic properties of the word or subword that the speech segment represents. Different amounts and ratios of compression are applied to different parts of a given speech segment.
PL401372A 2012-10-26 2012-10-26 Hybrid compression of voice data in text-to-speech systems PL401372A1 (pl)
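
The per-segment idea in the abstract can be illustrated with a small sketch. It is not the patented method: the acoustic classes, the `SETTINGS` values, and all function names are hypothetical, linear-interpolation shortening stands in for a real time-domain compressor (it shifts pitch, which a production time-scale method would avoid), and coarse uniform quantization stands in for a perceptual codec. The only point shown is that each segment gets its own amount of time-domain compression and its own bit budget, chosen from a property of the unit it represents.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    label: str            # hypothetical acoustic class of the word/subword unit
    samples: List[float]  # mono samples in [-1.0, 1.0]


# Hypothetical per-class settings: (time-domain length factor, quantizer bits).
# Illustrative values only; the abstract does not give any numbers.
SETTINGS = {
    "vowel": (0.75, 6),
    "consonant": (1.0, 8),
    "silence": (0.5, 4),
}


def time_compress(samples: List[float], factor: float) -> List[float]:
    """Shorten a signal to round(len * factor) samples by linear interpolation.

    Stand-in for a real time-domain compressor.
    """
    n_out = max(1, round(len(samples) * factor))
    if n_out == 1 or len(samples) < 2:
        return samples[:1]
    last = len(samples) - 1
    step = last / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def quantize(samples: List[float], bits: int) -> List[int]:
    """Uniform quantization to signed integer codes.

    Stand-in for a perceptual codec; no psychoacoustic model is applied.
    """
    scale = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * scale) for s in samples]


def compress_segment(seg: Segment) -> Tuple[List[int], int]:
    """Apply both stages with the settings chosen for this segment's class."""
    factor, bits = SETTINGS[seg.label]
    return quantize(time_compress(seg.samples, factor), bits), bits


if __name__ == "__main__":
    wave = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    for label in ("vowel", "consonant", "silence"):
        codes, bits = compress_segment(Segment(label, wave))
        print(label, len(codes), "samples at", bits, "bits")
```

In this sketch the same eight-sample input is shortened and quantized differently depending on its label. Applying different settings to different parts of one segment, as the last sentence of the abstract describes, would amount to splitting the segment and calling `compress_segment` on each part.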

Priority Applications (2)

Application Number Priority Date Filing Date Title
PL401372A PL401372A1 (pl) 2012-10-26 2012-10-26 Hybrid compression of voice data in text-to-speech systems
US13/720,900 US9064489B2 (en) 2012-10-26 2012-12-19 Hybrid compression of text-to-speech voice data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PL401372A PL401372A1 (pl) 2012-10-26 2012-10-26 Hybrid compression of voice data in text-to-speech systems

Publications (1)

Publication Number Publication Date
PL401372A1 PL401372A1 (pl) 2014-04-28

Family

ID=50515002

Family Applications (1)

Application Number Priority Date Filing Date Title
PL401372A PL401372A1 (pl) 2012-10-26 2012-10-26 Hybrid compression of voice data in text-to-speech systems

Country Status (2)

Country Link
US (1) US9064489B2 (pl)
PL (1) PL401372A1 (pl)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9734817B1 (en) * 2014-03-21 2017-08-15 Amazon Technologies, Inc. Text-to-speech task scheduling
US9554207B2 (en) 2015-04-30 2017-01-24 Shure Acquisition Holdings, Inc. Offset cartridge microphones
US9565493B2 (en) 2015-04-30 2017-02-07 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
US10367948B2 (en) 2017-01-13 2019-07-30 Shure Acquisition Holdings, Inc. Post-mixing acoustic echo cancellation systems and methods
EP3803867B1 (en) * 2018-05-31 2024-01-10 Shure Acquisition Holdings, Inc. Systems and methods for intelligent voice activation for auto-mixing
WO2019231632A1 (en) 2018-06-01 2019-12-05 Shure Acquisition Holdings, Inc. Pattern-forming microphone array
US11297423B2 (en) 2018-06-15 2022-04-05 Shure Acquisition Holdings, Inc. Endfire linear array microphone
WO2020061353A1 (en) 2018-09-20 2020-03-26 Shure Acquisition Holdings, Inc. Adjustable lobe shape for array microphones
CN113841419A (zh) 2019-03-21 2021-12-24 Shure Acquisition Holdings, Inc. Housing and associated design features for ceiling array microphones
US11558693B2 (en) 2019-03-21 2023-01-17 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality
WO2020191380A1 (en) 2019-03-21 2020-09-24 Shure Acquisition Holdings,Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality
TW202101422A (zh) 2019-05-23 2021-01-01 Shure Acquisition Holdings, Inc. (US) Steerable speaker array, system, and method thereof
EP3977449A1 (en) 2019-05-31 2022-04-06 Shure Acquisition Holdings, Inc. Low latency automixer integrated with voice and noise activity detection
JP2022545113A (ja) 2019-08-23 2022-10-25 Shure Acquisition Holdings, Inc. One-dimensional array microphone with improved directivity
US11552611B2 (en) 2020-02-07 2023-01-10 Shure Acquisition Holdings, Inc. System and method for automatic adjustment of reference gain
US11706562B2 (en) 2020-05-29 2023-07-18 Shure Acquisition Holdings, Inc. Transducer steering and configuration systems and methods using a local positioning system
CN116918351A (zh) 2021-01-28 2023-10-20 Shure Acquisition Holdings, Inc. Hybrid audio beamforming system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5920840A (en) * 1995-02-28 1999-07-06 Motorola, Inc. Communication system and method using a speaker dependent time-scaling technique
JP4132109B2 (ja) * 1995-10-26 2008-08-13 Sony Corporation Speech signal reproduction method and apparatus, speech decoding method and apparatus, and speech synthesis method and apparatus
DE19610019C2 (de) * 1996-03-14 1999-10-28 Data Software Gmbh G Digital speech synthesis method
EP1187100A1 (en) * 2000-09-06 2002-03-13 Koninklijke KPN N.V. A method and a device for objective speech quality assessment without reference signal
US6658383B2 (en) * 2001-06-26 2003-12-02 Microsoft Corporation Method for coding speech and music signals
US7454348B1 (en) * 2004-01-08 2008-11-18 At&T Intellectual Property Ii, L.P. System and method for blending synthetic voices
DE602005026778D1 (de) * 2004-01-16 2011-04-21 Scansoft Inc Corpus-based speech synthesis based on segment recombination
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8321222B2 (en) * 2007-08-14 2012-11-27 Nuance Communications, Inc. Synthesis by generation and concatenation of multi-form segments

Also Published As

Publication number Publication date
US20140122060A1 (en) 2014-05-01
US9064489B2 (en) 2015-06-23

Similar Documents

Publication Publication Date Title
PL401372A1 (pl) Hybrid compression of voice data in text-to-speech systems
Künzel Some general phonetic and forensic aspects of speaking tempo
WO2015009586A3 (en) Performing an operation relative to tabular data based upon voice input
EP3937165A4 (en) SPEECH SYNTHESIS METHOD AND APPARATUS, AND COMPUTER READABLE STORAGE MEDIUM
WO2014197334A3 (en) System and method for user-specified pronunciation of words for speech synthesis and recognition
MX2014010795A (es) Device for extracting information from a dialogue.
PL401371A1 (pl) Voice development for automated text-to-speech conversion
PH12016502029B1 (en) In-call translation
EP3282448A3 (en) Compression of decomposed representations of a sound field
WO2010041131A8 (en) Associating source information with phonetic indices
GB201205790D0 (en) Transcription of speech
EP1922723A4 (en) SYSTEMS AND METHODS FOR RESPONDING TO A VOICE STATEMENT IN A NATURAL LANGUAGE
WO2015008162A3 (en) Systems and methods for textual content creation from sources of audio that contain speech
GB2529564A (en) Method, apparatus and system for regenerating voice intonation in automatically dubbed videos
DE602008003781D1 (de) System and method for hybrid speech synthesis
ATE374991T1 (de) Method and system for text-to-speech conversion
WO2012108680A3 (ko) Bandwidth extension method and apparatus
WO2013134641A3 (en) Recognizing speech in multiple languages
WO2007103520A3 (en) Codebook-less speech conversion method and system
GB2484615A (en) A text to speech method and system
NZ721890A (en) Harmonic bandwidth extension of audio signals
EP3935622A4 (en) SYSTEMS AND METHODS FOR TRANSFORMING SPOKE OR TEXTUAL INPUT INTO MUSIC
CA2694317A1 (en) Apparatus, systems and methods for language instruction
WO2014134568A3 (en) Apparatus and methods for bidirectional hyperelastic stent covers
WO2013127825A8 (en) Computer-implemented method and system for generating a report