PL401372A1 - Hybrid compression of voice data in text-to-speech systems - Google Patents

Hybrid compression of voice data in text-to-speech systems (Hybrydowa kompresja danych głosowych w systemach zamiany tekstu na mowę)

Info

Publication number
PL401372A1
Authority
PL
Poland
Prior art keywords
compression
speech
text
time domain
voice data
Prior art date
Application number
PL401372A
Inventor
Michał T. Kaszczuk
Łukasz M. Osowski
Original Assignee
Ivona Software Spółka Z Ograniczoną Odpowiedzialnością
Priority date
2012-10-26
Filing date
2012-10-26
Publication date
2014-04-28
Application filed by Ivona Software Spółka Z Ograniczoną Odpowiedzialnością
Priority to PL401372A
Priority to US13/720,900 (US9064489B2)
Publication of PL401372A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04: Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04: Time compression or expansion

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Recorded or synthesized speech segments from text-to-speech systems are compressed using both time-domain compression and perceptual compression techniques. The twice-compressed recording is divided into speech segments corresponding to words and subwords for use in a TTS system. The degree of compression within time-domain compression, and the ratio of time-domain compression to perceptual compression, are modified for any given speech segment. The amount or ratio of compression is determined from the linguistic or acoustic properties of the word or subword that the speech segment represents. Different amounts and ratios of compression are applied to different parts of a given speech segment.
PL401372A 2012-10-26 2012-10-26 Hybrid compression of voice data in text-to-speech systems PL401372A1 (pl)
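
The abstract above describes the scheme only at a high level. As a rough illustration, the following Python sketch applies a time-domain step and then a stand-in for the perceptual step to each speech segment, choosing the settings per segment. Everything in it is an assumption made for illustration and is not taken from the patent: the `Segment` type, the vowel-based rule in `choose_settings`, frame dropping as the time-domain method, and uniform quantization as a crude placeholder for a real perceptual codec. It also compresses segments individually, whereas the abstract compresses the recording before dividing it into segments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """Hypothetical speech segment: a subword label plus mono samples in [-1, 1]."""
    label: str
    samples: List[float]


@dataclass
class CompressedSegment:
    label: str
    codes: List[int]  # quantized samples
    bits: int         # bits per stored sample

    @property
    def size_in_bits(self) -> int:
        return len(self.codes) * self.bits


VOWELS = set("aeiou")  # assumption: a toy linguistic property


def choose_settings(seg: Segment) -> Tuple[float, int]:
    """Return (fraction of frames to keep, bits per sample) for a segment.

    Hypothetical rule: vowels get more time-domain compression and finer
    quantization; other units keep their full duration but fewer bits.
    """
    if seg.label in VOWELS:
        return 0.5, 8
    return 1.0, 6


def time_domain_compress(samples: List[float], keep_ratio: float,
                         frame: int = 4) -> List[float]:
    """Shorten the signal by keeping roughly keep_ratio of fixed-size frames."""
    out: List[float] = []
    acc = 0.0
    for start in range(0, len(samples), frame):
        acc += keep_ratio
        if acc >= 1.0:  # keep this frame, otherwise drop it
            out.extend(samples[start:start + frame])
            acc -= 1.0
    return out


def quantize(samples: List[float], bits: int) -> List[int]:
    """Placeholder for perceptual coding: uniform quantization to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * levels) for s in samples]


def compress_segment(seg: Segment) -> CompressedSegment:
    """Apply both steps with settings chosen from the segment's properties."""
    keep_ratio, bits = choose_settings(seg)
    shortened = time_domain_compress(seg.samples, keep_ratio)
    return CompressedSegment(seg.label, quantize(shortened, bits), bits)
```

Under these toy rules, a 16-sample segment labeled `a` is halved to 8 samples stored at 8 bits each, while one labeled `t` keeps all 16 samples at 6 bits each. A fuller sketch could also vary the settings across different parts of a single segment, as the abstract mentions.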

Priority Applications (2)

Application Number | Publication | Priority Date | Filing Date | Title
PL401372A | PL401372A1 (pl) | 2012-10-26 | 2012-10-26 | Hybrid compression of voice data in text-to-speech systems
US13/720,900 | US9064489B2 (en) | 2012-10-26 | 2012-12-19 | Hybrid compression of text-to-speech voice data

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
PL401372A | PL401372A1 (pl) | 2012-10-26 | 2012-10-26 | Hybrid compression of voice data in text-to-speech systems

Publications (1)

Publication Number | Publication Date
PL401372A1 (pl) | 2014-04-28

Family

ID=50515002

Family Applications (1)

Application Number | Publication | Title | Priority Date | Filing Date
PL401372A | PL401372A1 (pl) | Hybrid compression of voice data in text-to-speech systems | 2012-10-26 | 2012-10-26

Country Status (2)

Country | Link
US (1) | US9064489B2 (pl)
PL (1) | PL401372A1 (pl)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9734817B1 (en) * | 2014-03-21 | 2017-08-15 | Amazon Technologies, Inc. | Text-to-speech task scheduling
US9565493B2 | 2015-04-30 | 2017-02-07 | Shure Acquisition Holdings, Inc. | Array microphone system and method of assembling the same
US9554207B2 | 2015-04-30 | 2017-01-24 | Shure Acquisition Holdings, Inc. | Offset cartridge microphones
US10367948B2 | 2017-01-13 | 2019-07-30 | Shure Acquisition Holdings, Inc. | Post-mixing acoustic echo cancellation systems and methods
EP3803867B1 (en) * | 2018-05-31 | 2024-01-10 | Shure Acquisition Holdings, Inc. | Systems and methods for intelligent voice activation for auto-mixing
US11523212B2 | 2018-06-01 | 2022-12-06 | Shure Acquisition Holdings, Inc. | Pattern-forming microphone array
US11297423B2 | 2018-06-15 | 2022-04-05 | Shure Acquisition Holdings, Inc. | Endfire linear array microphone
US11310596B2 | 2018-09-20 | 2022-04-19 | Shure Acquisition Holdings, Inc. | Adjustable lobe shape for array microphones
US11558693B2 | 2019-03-21 | 2023-01-17 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality
CN118803494B (zh) | 2019-03-21 | 2025-09-19 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality
CN113841419B | 2019-03-21 | 2024-11-12 | Shure Acquisition Holdings, Inc. | Housings and associated design features for ceiling array microphones
CN114051738B | 2019-05-23 | 2024-10-01 | Shure Acquisition Holdings, Inc. | Steerable speaker array, system, and method for the same
CN114051637B | 2019-05-31 | 2025-10-28 | Shure Acquisition Holdings, Inc. | Low-latency automixer with integrated voice and noise activity detection
CN114467312A | 2019-08-23 | 2022-05-10 | Shure Acquisition Holdings, Inc. | Two-dimensional microphone array with improved directivity
US12028678B2 | 2019-11-01 | 2024-07-02 | Shure Acquisition Holdings, Inc. | Proximity microphone
US11552611B2 | 2020-02-07 | 2023-01-10 | Shure Acquisition Holdings, Inc. | System and method for automatic adjustment of reference gain
WO2021243368A2 (en) | 2020-05-29 | 2021-12-02 | Shure Acquisition Holdings, Inc. | Transducer steering and configuration systems and methods using a local positioning system
CN116918351A | 2021-01-28 | 2023-10-20 | Shure Acquisition Holdings, Inc. | Hybrid audio beamforming system
US12452584B2 | 2021-01-29 | 2025-10-21 | Shure Acquisition Holdings, Inc. | Scalable conferencing systems and methods
US12542123B2 | 2021-08-31 | 2026-02-03 | Shure Acquisition Holdings, Inc. | Mask non-linear processor for acoustic echo cancellation
WO2023059655A1 (en) | 2021-10-04 | 2023-04-13 | Shure Acquisition Holdings, Inc. | Networked automixer systems and methods
EP4427465A1 (en) | 2021-11-05 | 2024-09-11 | Shure Acquisition Holdings, Inc. | Distributed algorithm for automixing speech over wireless networks
WO2023133513A1 (en) | 2022-01-07 | 2023-07-13 | Shure Acquisition Holdings, Inc. | Audio beamforming with nulling control system and methods

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5920840A (en) * | 1995-02-28 | 1999-07-06 | Motorola, Inc. | Communication system and method using a speaker dependent time-scaling technique
JP4132109B2 (ja) * | 1995-10-26 | 2008-08-13 | Sony Corporation | Method and apparatus for reproducing speech signals, method and apparatus for decoding speech, and method and apparatus for synthesizing speech
DE19610019C2 (de) * | 1996-03-14 | 1999-10-28 | Data Software Gmbh G | Digital speech synthesis method
EP1187100A1 (en) * | 2000-09-06 | 2002-03-13 | Koninklijke KPN N.V. | A method and a device for objective speech quality assessment without reference signal
US6658383B2 (en) * | 2001-06-26 | 2003-12-02 | Microsoft Corporation | Method for coding speech and music signals
US7454348B1 (en) * | 2004-01-08 | 2008-11-18 | At&T Intellectual Property Ii, L.P. | System and method for blending synthetic voices
DE602005026778D1 (de) * | 2004-01-16 | 2011-04-21 | Scansoft Inc | Corpus-based speech synthesis based on segment recombination
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant
US8321222B2 (en) * | 2007-08-14 | 2012-11-27 | Nuance Communications, Inc. | Synthesis by generation and concatenation of multi-form segments

Also Published As

Publication number | Publication date
US20140122060A1 (en) | 2014-05-01
US9064489B2 (en) | 2015-06-23

Similar Documents

Publication | Title
PL401372A1 (pl) | Hybrid compression of voice data in text-to-speech systems
Mohammadi et al. | An overview of voice conversion systems
WO2015009586A3 (en) | Performing an operation relative to tabular data based upon voice input
PH12014500482A1 (en) | Systems and methods for language learning
EP4531037A3 (en) | End-to-end speech conversion
SG11202100900QA (en) | Text-based speech synthesis method and device, computer device, and non-transitory computer-readable storage medium
MX2014010795A (es) | Device for extracting information from a dialog
PL401371A1 (pl) | Voice development for automated text-to-speech conversion
EP4465662A3 (en) | Compression of decomposed representations of a sound field
CL2016003003A1 (es) | Translation during a call
GB201212783D0 (en) | A speech processing system
GB201205790D0 (en) | Transcription of speech
EP1922723A4 (en) | Systems and methods for responding to a voice statement in a natural language
ATE374991T1 (de) | Method and system for text-to-speech conversion
WO2013134641A3 (en) | Recognizing speech in multiple languages
EP2530671A3 (en) | Voice synthesis apparatus
WO2013181272A3 (en) | Object-based audio system using vector base amplitude panning
WO2008142836A1 (ja) | Voice quality conversion device and voice quality conversion method
WO2014182565A3 (en) | Separating carbon dioxide and hydrogen sulfide from a natural gas stream using co-current contacting systems
MX346294B (es) | Method and system for voice command recognition
NZ721890A (en) | Harmonic bandwidth extension of audio signals
GB2484615A (en) | A text to speech method and system
WO2014107635A3 (en) | Speech modification for distributed story reading
EP4425488A3 (en) | Acoustic model training using corrected terms
EP4234483A3 (en) | Synthesis and hydrogen storage properties of novel metal hydrides