EP1498872A1 - Method and system for audio rendering of a text with emotional information (original French title: Procédé et système pour la reproduction audio d'un texte avec des informations émotionelles)
- Publication number
- EP1498872A1 (application EP03291765A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- text
- codes
- sentences
- expressions
- tts
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Definitions
- the present invention is related to the field of the conversion of text data into voice or speech data, more particularly in connection with so-called Text To Speech systems (TTS).
- TTS: Text To Speech systems
- the present invention concerns a method and a system for rendering a text in an audio form with a better expressiveness, based on the state of mind of the author at the time of producing said text.
- most of the time, the receiver or listener of the message has no indication of the state of mind, mood or general state of the sender, because the generated audio signal is "flat" and the output voice is generally monotone. This can obliterate a great part of the meaning and/or strength of the text or message.
- It has been proposed, in EP-A-1 102 242, to personalise the features of the output voice. But said personalisation is performed on the initiative and according to the wishes of the receiver or listener, and does not reflect the feelings, mood or state of mind of the writer or sender.
- the main object of the present invention is a method for rendering in an audio form a text using a given Text To Speech system, program or engine, said text including symbolic signs corresponding to the mental, emotional and/or physical state or state of mind of the author and/or sender of said text, method characterised in that it consists in:
- the inventive method mainly comprises the following steps:
- the text consists of an e-mail or SMS message
- the symbolic signs consist of smileys
- the processable codes are output-configuring escape codes belonging to the TTS system 1 in use.
- each symbolic sign inserted in the text to be rendered in audio form affects only the output voice for the word, sentence or expression which immediately precedes it, the corresponding escape code being placed immediately in front of the corresponding translated audio word, sentence or expression.
- each symbolic sign inserted in the text to be rendered in audio form affects the output voice for all the words, sentences and/or expressions which precede it, up to a respective preceding symbolic sign, the beginning of the text or a predetermined text-cutting sign.
- the method according to the invention can also comprise a preliminary step of building up a [symbolic signs / TTS processable codes] translation library, specifically adapted to the capabilities of the TTS system used, i.e. the set of codes it is able to process.
- Said library can be integrated into or be separate from the pretreatment means, and its content can be specifically adapted to the escape codes of the TTS system used and evolve with the appearance of new symbolic signs and the disappearance of older ones that have become obsolete.
- the text to be treated can also comprise symbolic signs which are not in said library.
- the method can further comprise the step of deleting or inhibiting, during the first treatment, each symbolic sign which cannot be transcribed into a configuration code processable by the TTS system 1.
- the adjustable or tunable acoustic parameters may comprise parameters selected from the group consisting of volume, rate, pitch, tone or analogous utterance or voice intonation characteristics.
- the present invention also concerns a system 1 for rendering in an audio form a text, said text including symbolic signs corresponding to the mental, emotional and/or physical state or state of mind of the author or sender of said text, system 2.
- Said system 1 is characterised in that it comprises:
- system 1 will further comprise adapted means 9 to implement the method as described hereinbefore.
- the first software module may or may not be integrated into the TTS program.
- escape codes can be specific codes proposed by a given TTS (such as "Realspeak", for example), or generic codes belonging to the hardware provider.
- the text to be processed is either passed directly to the TTS with escape codes it understands, or an adapted software module analyses the generic escape codes and calls the configuring functions (for example, in the C language) provided by the target TTS.
- TTS which does manage escape codes, but does also offer an API
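
The pretreatment described in the definitions above (a translation library, escape codes placed in front of the text they qualify, and deletion of signs the TTS cannot process) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the bracketed codes such as `[voice=happy]` are placeholders and not the escape-code syntax of RealSpeak or any other engine, the smiley pattern is deliberately rough, and `pretreat` is a hypothetical helper. The sketch follows the variant in which a sign affects all the text that precedes it, back to the previous sign or the beginning of the text.

```python
import re

# Hypothetical [symbolic sign -> TTS escape code] translation library.
# The bracketed codes are placeholders, not the syntax of any real engine.
TRANSLATION_LIBRARY = {
    ":-)": "[voice=happy]",
    ":-(": "[voice=sad]",
    ";-)": "[voice=playful]",
}

# Deliberately rough smiley detector: eyes, optional nose, mouth.
SMILEY_PATTERN = re.compile(r"[:;]-?[()DPp]")


def pretreat(text, library=TRANSLATION_LIBRARY):
    """Move each smiley's escape code in front of the text it qualifies.

    A smiley applies to everything between the previous smiley (or the
    start of the text) and itself. Smileys that are missing from the
    library are deleted without emitting any code.
    """
    parts = []
    start = 0
    for match in SMILEY_PATTERN.finditer(text):
        segment = text[start:match.start()].strip()
        code = library.get(match.group())  # None when the sign is unknown
        if segment:
            parts.append(f"{code} {segment}" if code else segment)
        start = match.end()
    tail = text[start:].strip()  # text after the last smiley gets no code
    if tail:
        parts.append(tail)
    return " ".join(parts)


print(pretreat("See you tonight :-) I missed the train :-( Bye"))
```

Whether an escape code stays in effect until the next one is engine-specific, so a real integration would probably also need a reset code in front of untagged segments; the sketch leaves that out. Restricting the effect to the single word before the sign, as in the other variant, would only change how `segment` is split.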
Priority Applications (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
EP03291765A | EP1498872A1 (fr) | 2003-07-16 | 2003-07-16 | Method and system for audio rendering of a text with emotional information |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1498872A1 (fr) | 2005-01-19 |
Family
ID=33462248
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
EP03291765A | Withdrawn | EP1498872A1 (fr) | 2003-07-16 | 2003-07-16 | Method and system for audio rendering of a text with emotional information |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP1498872A1 (fr) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030028380A1 (en) * | 2000-02-02 | 2003-02-06 | Freeland Warwick Peter | Speech system |
US20020191757A1 (en) * | 2001-06-04 | 2002-12-19 | Hewlett-Packard Company | Audio-form presentation of text messages |
Non-Patent Citations (1)
Title |
---|
MARC SCHRÖDER: "Emotional Speech Synthesis: A Review", PROCEEDINGS EUROSPEECH 2001, vol. 1, Aalborg, pages 561 - 564, XP007005064 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7983910B2 (en) | 2006-03-03 | 2011-07-19 | International Business Machines Corporation | Communicating across voice and text channels with emotion preservation |
US8386265B2 (en) | 2006-03-03 | 2013-02-26 | International Business Machines Corporation | Language translation with emotion metadata |
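CN102244788A (zh) * | 2010-05-10 | 2011-11-16 | Sony Corporation | Information processing method, information processing device, scene metadata extraction device, loss recovery information generation device, and program |
CN102244788B (zh) * | 2010-05-10 | 2015-11-25 | Sony Corporation | Information processing method, information processing device, and loss recovery information generation device |
CN106294296A (zh) * | 2016-08-16 | 2017-01-04 | 唐哲敏 | A text information session management method |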
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7490042B2 (en) | Methods and apparatus for adapting output speech in accordance with context of communication | |
US7062439B2 (en) | Speech synthesis apparatus and method | |
US6725199B2 (en) | Speech synthesis apparatus and selection method | |
US7644000B1 (en) | Adding audio effects to spoken utterance | |
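CN101727904B (zh) | Speech translation method and device | |
JP3895766B2 (ja) | Speech synthesis device | |
KR100590553B1 (ko) | Method and apparatus for generating conversational prosody structure, and speech synthesis system using the same | |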
US7191132B2 (en) | Speech synthesis apparatus and method | |
US20050192793A1 (en) | System and method for generating a phrase pronunciation | |
US20090024393A1 (en) | Speech synthesizer and speech synthesis system | |
US7747440B2 (en) | Methods and apparatus for conveying synthetic speech style from a text-to-speech system | |
US8355484B2 (en) | Methods and apparatus for masking latency in text-to-speech systems | |
EP1498872A1 (fr) | Method and system for audio rendering of a text with emotional information | |
US20040122668A1 (en) | Method and apparatus for using computer generated voice | |
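JP3404055B2 (ja) | Speech synthesis device | |
JP2002132282A (ja) | Electronic text reading device | |
Plumpe et al. | Which is More Important in a Concatenative Text to Speech System - Pitch, Duration, or Spectral Discontinuity? | |
JPH05134691A (ja) | Speech synthesis method and apparatus | |
JP4056647B2 (ja) | Waveform-concatenation speech synthesis apparatus and method | |
KR102116014B1 (ko) | Speaker voice-imitation system using a speech recognition engine and a voice-imitation speech synthesis engine | |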
US20240153482A1 (en) | Non-transitory computer-readable medium and voice generating system | |
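KR101129124B1 (ko) | Text-to-speech terminal using personal voice characteristics and text-to-speech method used therein | |
CN113421549A (zh) | Speech synthesis method and apparatus, computer device, and storage medium | |
JP2002351791A (ja) | E-mail communication apparatus, e-mail communication method, and e-mail communication program | |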
Bharthi et al. | Unit selection based speech synthesis for converting short text message into voice message in mobile phones |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
 | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
 | AX | Request for extension of the european patent | Extension state: AL LT LV MK |
 | AKX | Designation fees paid | |
 | REG | Reference to a national code | Ref country code: DE; Ref legal event code: 8566 |
 | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
 | 18D | Application deemed to be withdrawn | Effective date: 20050720 |