EP1498872A1 - Method and system for the audio reproduction of a text with emotional information - Google Patents

Method and system for the audio reproduction of a text with emotional information

Info

Publication number
EP1498872A1
EP1498872A1 EP03291765A EP03291765A EP1498872A1 EP 1498872 A1 EP1498872 A1 EP 1498872A1 EP 03291765 A EP03291765 A EP 03291765A EP 03291765 A EP03291765 A EP 03291765A EP 1498872 A1 EP1498872 A1 EP 1498872A1
Authority
EP
European Patent Office
Prior art keywords
text
codes
sentences
expressions
tts
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP03291765A
Other languages
German (de)
English (en)
Inventor
Jean Luc Guevel
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel CIT SA
Alcatel Lucent SAS
Original Assignee
Alcatel CIT SA
Alcatel SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel CIT SA, Alcatel SA
Priority to EP03291765A
Publication of EP1498872A1
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Definitions

  • the present invention relates to the conversion of text data into voice or speech data, more particularly in connection with so-called Text To Speech (TTS) systems.
  • the present invention concerns a method and a system for rendering a text in audio form with greater expressiveness, based on the state of mind of the author at the time of producing said text.
  • the receiver or listener of the message usually has no indication of the state of mind, mood or general state of the sender of the message, as the generated audio signal is "flat" and the output voice generally monotone. This can obscure a great part of the meaning and/or of the strength of the text or message.
  • It has been proposed, in EP-A-1 102 242, to personalise the features of the output voice. But said personalisation is performed on the initiative and according to the wishes of the receiver or listener, and does not reflect the feelings, the mood or the state of mind of the writer or sender.
  • the main object of the present invention is a method for rendering in an audio form a text using a given Text To Speech system, program or engine, said text including symbolic signs corresponding to the mental, emotional and/or physical state or state of mind of the author and/or sender of said text, method characterised in that it consists in:
  • the inventive method mainly comprises the following steps:
  • the text consists of an e-mail or SMS message
  • the symbolic signs consist of smileys
  • the processable codes are output-configuring escape codes belonging to the TTS system 1 used.
  • each symbolic sign inserted in the text to be rendered in audio form affects only the output voice for the word, sentence or expression which immediately precedes it, the corresponding escape code being put immediately in front of the corresponding translated audio word, sentence or expression.
  • each symbolic sign inserted in the text to be rendered in audio form affects the output voice for all the words, sentences and/or expressions which precede it, up to the respective preceding symbolic sign, the beginning of the text or a predetermined text-cutting sign.
  • the method according to the invention can also comprise a preliminary step of building up a [symbolic signs / TTS processable codes] translation library, specifically adapted to the possibilities of the TTS system used, i.e. the plurality of codes it is able to process.
  • Said library can be integrated into or be separate from the pretreatment means, and its content can be specifically adapted to the escape codes of the TTS system used and evolve with the appearance of new symbolic signs and the disappearance of older ones which have become obsolete.
  • the text to be treated can also comprise symbolic signs which are not in said library.
  • the method can further comprise the step of deleting or inhibiting, during the first treatment, each symbolic sign which cannot be transcribed into a configuration code processable by the TTS system 1.
  • the adjustable or tunable acoustic parameters may comprise parameters selected from the group consisting of volume, rate, pitch, tone or analogous utterance or voice intonation characteristics.
  • the present invention also concerns a system 1 for rendering in an audio form a text, said text including symbolic signs corresponding to the mental, emotional and/or physical state or state of mind of the author or sender of said text, system 2.
  • Said system 1 is characterised in that it comprises:
  • system 1 will further comprise adapted means 9 to implement the method as described hereinbefore.
  • the first software module may or may not be integrated into the TTS program.
  • escape codes can be specific codes proposed by a given TTS (such as "Realspeak" for example), or can be generic codes belonging to the hardware provider.
  • the text to be processed is either passed directly to the TTS with escape codes it understands, or an adapted software module analyses the generic escape codes and calls the configuring functions (for example in C language) provided by the target TTS.
  • TTS which does manage escape codes, but does also offer an API
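
The pretreatment described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the application: the library contents, the escape-code strings (`<happy>`, `<sad>`, `<neutral>`), the smiley pattern and the function name `pretreat` are all hypothetical, and a real TTS engine defines its own escape codes. The sketch follows the variant in which a sign affects all the text preceding it up to the previous sign or the beginning of the text; predetermined text-cutting signs and the single-word variant are not handled. Resetting the trailing text to a neutral voice is a further assumption.

```python
import re

# Hypothetical translation library: symbolic sign -> TTS escape code.
# The code strings are placeholders, not the syntax of any real engine.
SIGN_TO_CODE = {":-)": "<happy>", ":-(": "<sad>"}
NEUTRAL_CODE = "<neutral>"  # assumption: resets the voice for trailing text

# Simple pattern for ASCII smileys, including signs absent from the library.
SIGN_RE = re.compile(r"[:;]-?[()DPp|]")


def normalise(pieces):
    """Join text pieces and collapse runs of whitespace."""
    return " ".join("".join(pieces).split())


def pretreat(text: str) -> str:
    """Put each known sign's escape code in front of the text it qualifies.

    Signs that are not in the library are deleted and do not close a segment.
    """
    parts = []    # finished "code segment" chunks
    pending = []  # text collected since the last known sign
    pos = 0
    for match in SIGN_RE.finditer(text):
        pending.append(text[pos:match.start()])
        pos = match.end()
        code = SIGN_TO_CODE.get(match.group())
        if code is None:
            continue  # unknown sign: drop it, keep growing the segment
        segment = normalise(pending)
        pending = []
        if segment:
            parts.append(f"{code} {segment}")
    pending.append(text[pos:])
    tail = normalise(pending)
    if tail:
        parts.append(f"{NEUTRAL_CODE} {tail}")
    return " ".join(parts)


if __name__ == "__main__":
    print(pretreat("Great news :-) I lost my keys :-( see you ;-) tomorrow"))
    # prints: <happy> Great news <sad> I lost my keys <neutral> see you tomorrow
```

Keeping the library as a plain dictionary reflects the idea that it can live apart from the pretreatment code and evolve as signs appear or become obsolete; here `;-)` is recognised as a sign but, having no library entry, is simply removed.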
EP03291765A 2003-07-16 2003-07-16 Method and system for the audio reproduction of a text with emotional information Withdrawn EP1498872A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP03291765A EP1498872A1 (fr) 2003-07-16 2003-07-16 Method and system for the audio reproduction of a text with emotional information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP03291765A EP1498872A1 (fr) 2003-07-16 2003-07-16 Method and system for the audio reproduction of a text with emotional information

Publications (1)

Publication Number Publication Date
EP1498872A1 (fr) 2005-01-19

Family

ID=33462248

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03291765A Withdrawn EP1498872A1 (fr) 2003-07-16 2003-07-16 Method and system for the audio reproduction of a text with emotional information

Country Status (1)

Country Link
EP (1) EP1498872A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7983910B2 (en) 2006-03-03 2011-07-19 International Business Machines Corporation Communicating across voice and text channels with emotion preservation
CN102244788A (zh) * 2010-05-10 2011-11-16 索尼公司 信息处理方法、信息处理装置、场景元数据提取装置、丢失恢复信息生成装置和程序
CN106294296A (zh) * 2016-08-16 2017-01-04 唐哲敏 一种文字信息会话管理方法

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020191757A1 (en) * 2001-06-04 2002-12-19 Hewlett-Packard Company Audio-form presentation of text messages
US20030028380A1 (en) * 2000-02-02 2003-02-06 Freeland Warwick Peter Speech system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MARC SCHRÖDER: "Emotional Speech Synthesis: A Review", PROCEEDINGS EUROSPEECH 2001, vol. 1, Aalborg, pages 561 - 564, XP007005064 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7983910B2 (en) 2006-03-03 2011-07-19 International Business Machines Corporation Communicating across voice and text channels with emotion preservation
US8386265B2 (en) 2006-03-03 2013-02-26 International Business Machines Corporation Language translation with emotion metadata
CN102244788A (zh) * 2010-05-10 2011-11-16 索尼公司 信息处理方法、信息处理装置、场景元数据提取装置、丢失恢复信息生成装置和程序
CN102244788B (zh) * 2010-05-10 2015-11-25 索尼公司 信息处理方法、信息处理装置和丢失恢复信息生成装置
CN106294296A (zh) * 2016-08-16 2017-01-04 唐哲敏 一种文字信息会话管理方法

Legal Events

Code: Title (Description)
PUAI: Public reference made under Article 153(3) EPC to a published international application that has entered the European phase (Free format text: ORIGINAL CODE: 0009012)
AK: Designated contracting states (Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR)
AX: Request for extension of the European patent (Extension state: AL LT LV MK)
AKX: Designation fees paid
REG: Reference to a national code (Ref country code: DE; Ref legal event code: 8566)
STAA: Information on the status of an EP patent application or granted EP patent (Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN)
18D: Application deemed to be withdrawn (Effective date: 20050720)