EP0919052A1 - Method and system for converting speech signals into speech signals - Google Patents
Method and system for converting speech signals into speech signals
- Publication number
- EP0919052A1 (application EP97919840A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- speech
- information
- input
- model
- fundamental tone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
Definitions
- the invention relates to a speech-to-speech conversion system and method which are capable of matching the dialect of speech outputs to that of the respective speech inputs, and to a voice responsive communication system including a speech-to-speech conversion system and operating in accordance with a speech-to-speech conversion method.
- the speech information which is stored in a database and used to provide appropriate synthesised spoken responses to voice inputs utilising a speech-to-speech conversion system, is normally reproduced in a dialect which conforms to a standard national dialect.
- this can make it difficult for the database of known voice responsive communication systems to interpret received speech information, i.e. the voice inputs. It may also be difficult for the person making the voice inputs to fully understand the spoken response. Even if such responses are understandable to a recipient, it would be more user friendly if the dialect of the spoken response were the same as the dialect of the related voice input.
- a word can have widely different meanings depending on language stress.
- the meaning of one and the same sentence can be given a different significance depending on where the stress is placed.
- the stressing of sentences, or parts thereof determines sections which are emphasised in the language and which may be of importance in determining the precise meaning of the spoken language.
- apart from the technical problems of correctly interpreting speech, it is necessary, in voice responsive/control systems, for the verbal instructions, or commands, to be correctly interpreted, otherwise it would not be possible to provide proper responses, or effect proper control of different types of equipment, and/or services, for example, in a telecommunication network. In order to overcome these difficulties, it would be necessary for a voice responsive communication system to be capable of interpreting the received speech information, irrespective of dialect, and to match the dialect of speech outputs to that of the respective speech inputs.
- the invention provides a speech-to-speech conversion system for providing, at the output thereof, spoken responses to speech inputs to the system including speech recognition means for the input speech; interpretation means for interpreting the content of the recognised input speech; and a database containing speech information data for use in the formulation of said spoken responses, the output of said interpretation means being used to access said database and obtain speech information data therefrom, characterised in that the system further includes extraction means for extracting prosody information from the input speech; means for obtaining dialectal information from said prosody information; and text-to-speech conversion means for converting the speech information data obtained from said database into a spoken response using said dialectal information, the dialect of the spoken response being matched to that of the input speech.
- the speech recognition means may be adapted to identify a number of phonemes from a segment of the input speech and to interpret the phonemes, as possible words, or word combinations, to establish a model of the speech, the speech model having word and sentence accents according to a standardised pattern for the language of the input speech.
- the prosody information extracted from the input speech is preferably the fundamental tone curve of the input speech.
- the means for obtaining dialectal information from said prosody information includes first analysing means for determining the intonation pattern of the fundamental tone of the input speech and thereby the maximum and minimum values of the fundamental tone curve and their respective positions; second analysing means for determining the intonation pattern of the fundamental tone curve of the speech model and thereby the maximum and minimum values of the fundamental tone curve and their respective positions; comparison means for comparing the intonation pattern of the input speech with the intonation pattern of the speech model to identify a time difference between the occurrence of the maximum and minimum values of the fundamental tone curves of the incoming speech in relation to the maximum and minimum values of the fundamental tone curve of the speech model, the identified time difference being indicative of dialectal characteristics of the input speech.
- the time difference may be determined in relation to an intonation pattern reference point, for example, the point at which a consonant/vowel limit occurs.
- the speech-to-speech conversion system may include means for obtaining information on sentence accents from said prosody information.
- the speech recognition means includes checking means for lexically checking the words in the speech model and for syntactically checking the phrases in the speech model, the words and phrases which are not linguistically possible being excluded from the speech model.
- the checking means are, with this arrangement, adapted to check the orthography and phonetic transcription of the words in the speech model, the transcription information including lexically abstracted accent information, of type stressed syllables, and information relating to the location of secondary accent.
- the accent information may, for example, relate to tonal word accent I and accent II.
- the sentence accent information and/or sentence stressing may be used, to advantage, in the interpretation of the content of the recognised input speech.
- the speech-to-speech conversion system may include dialogue management means for managing a dialogue with the database, said dialogue being initiated by the interpretation means.
- the dialogue with the database results in the application of speech information data to the text-to-speech conversion means.
- the invention also provides, in a voice responsive communication system, a method for providing a spoken response to a speech input to the system, said response having a dialect to match that of the speech input, said method including the steps of recognising and interpreting the input speech, and utilising the interpretation to obtain speech information data from a database for use in the formulation of said spoken response, characterised in that said method further includes the steps of extracting prosody information from the input speech, obtaining dialectal information from said prosody information, and converting the speech information data obtained from said database into said spoken response using said dialectal information.
- the recognition and interpretation of the input speech includes the steps of identifying a number of phonemes from a segment of the input speech and interpreting the phonemes, as possible words, or word combinations, to establish a model of the speech, the speech model having word and sentence accents according to a standardised pattern for the language of the input speech.
- the prosody information extracted from the input speech is the fundamental tone curve of the input speech.
- the method according to the present invention includes the steps of determining the intonation pattern of the fundamental tone of the input speech and thereby the maximum and minimum values of the fundamental tone curve and their respective positions; determining the intonation pattern of the fundamental tone curve of a speech model and thereby the maximum and minimum values of the fundamental tone curve and their respective positions; comparing the intonation pattern of the input speech with the intonation pattern of the speech model to identify a time difference between the occurrence of the maximum and minimum values of the fundamental tone curves of the incoming speech in relation to the maximum and minimum values of the fundamental tone curve of the speech model, the identified time difference being indicative of dialectal characteristics of the input speech.
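The comparison step above can be sketched in code. This is an illustrative reconstruction, not the patented implementation: the frame rate, the simple local-extremum detection, and the nearest-extremum pairing are all assumptions; the patent only specifies that time differences between corresponding maxima/minima of the two fundamental tone curves indicate dialectal characteristics.

```python
def find_extrema(f0):
    """Return (index, value) pairs for local maxima and minima of an F0 contour."""
    maxima, minima = [], []
    for i in range(1, len(f0) - 1):
        if f0[i] > f0[i - 1] and f0[i] >= f0[i + 1]:
            maxima.append((i, f0[i]))
        elif f0[i] < f0[i - 1] and f0[i] <= f0[i + 1]:
            minima.append((i, f0[i]))
    return maxima, minima

def intonation_time_differences(input_f0, model_f0, frame_ms=10):
    """Pair each extremum of the input curve with the nearest extremum of the
    same kind in the model curve and return the time offsets in milliseconds;
    per the method, these offsets are indicative of dialectal characteristics."""
    in_max, in_min = find_extrema(input_f0)
    mo_max, mo_min = find_extrema(model_f0)
    diffs = []
    for inp, mod in ((in_max, mo_max), (in_min, mo_min)):
        for i, _ in inp:
            if mod:
                j = min(mod, key=lambda m: abs(m[0] - i))[0]
                diffs.append((i - j) * frame_ms)
    return diffs
```

In practice the offsets would be measured relative to a reference point such as the consonant/vowel boundary mentioned below, rather than by nearest-neighbour pairing.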
- the time difference may be determined in relation to an intonation pattern reference point, for example, the point at which a consonant/vowel limit occurs.
- the method may include the step of obtaining information on sentence accents from said prosody information.
- the words in the speech model are checked lexically and the phrases in the speech model are checked syntactically, the words and phrases which are not linguistically possible being excluded from the speech model.
- the orthography and phonetic transcription of the words in the speech model may be checked, the transcription information including lexically abstracted accent information, of type stressed syllables, and information relating to the location of secondary accent.
- the accent information may relate to tonal word accent I and accent II.
- sentence accent information and/or sentence stressing may be used in the interpretation of the content of the recognised input speech.
- the method according to the present invention may include the step of initiating a dialogue with the database to obtain speech information data for formulating said spoken response, said dialogue being initiated following the interpretation of the input speech.
- the dialogue with the database may result in the application of speech information data to text-to-speech conversion means.
- the invention further provides a voice responsive communication system which includes a speech-to-speech conversion system as outlined in the preceding paragraphs, or utilises a method as outlined in the preceding paragraphs for providing a spoken response to a speech input to the system.
- the characteristic features of the speech-to-speech conversion system and method, according to the present invention, are that:
- prosody information is extracted from speech, applied to the input of the system, and handled by the method;
- the prosody information is in the form of the fundamental tone curve of the input speech;
- the fundamental tone curve is used to obtain dialectal, sentence accent and sentence stressing information for the input speech;
- the sentence accent and stressing information is used in the interpretation of the speech inputs, the result of the interpretation being used to obtain speech information data from a database which is used in the formulation of voice responses to the speech inputs;
- the dialectal information is used to ensure that the voice responses to the speech inputs have a dialect to match that of respective speech inputs.
- a speech-to-speech conversion system includes, at the input 1 thereof, a speech recognition unit 2 and an extraction unit 3 for extracting prosody information from speech applied to the system input 1, i.e. the fundamental tone curve of the input speech.
- speech inputs, applied to the input 1 are simultaneously applied to the units 2 and 3.
- the output of the speech recognition unit 2 and an output of the extraction unit 3 are connected to separate inputs of an interpretation unit 4, the output of which is connected to a database management unit 5.
- the database management unit 5 which is adapted for two way communication with a database 6, is connected at the output thereof to the input of a text-to- speech converter 7.
- the dialogue between the database 6 and the database management unit 5 can be effected by any known database communication language, for example, SQL (Structured Query Language).
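As an illustration of such a dialogue only: the patent does not specify a schema, so the table and column names below are hypothetical, and SQLite stands in for whatever database the system would actually use.

```python
import sqlite3

# Hypothetical response database; the patent only states that a known
# database communication language such as SQL may be used for the dialogue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (topic TEXT, text TEXT)")
conn.execute(
    "INSERT INTO responses VALUES ('banking', 'Your balance is available on request.')"
)

def fetch_response(topic):
    """Dialogue step of the database management unit 5: fetch the speech
    information data used to formulate a spoken response to the interpreted input."""
    row = conn.execute(
        "SELECT text FROM responses WHERE topic = ?", (topic,)
    ).fetchone()
    return row[0] if row else None
```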
- a further output of the extraction unit 3 is connected to the input of a prosody analyzer unit 8 which is adapted for two way communication with the text-to-speech converter 7.
- the prosody analyzer unit 8 is adapted, as a part of the text-to-speech conversion process of the converter 7, to analyze the prosody information, i.e. the fundamental tone curve, of the synthesised speech and make any necessary corrections to the intonation pattern of the synthesised speech in accordance with the dialectal information extracted from the input speech.
- the dialect of the synthesised speech output of the speech-to-speech conversion system will match that of the input speech.
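The connections between units 1 to 8 described above can be summarised as a dataflow sketch. This is only a structural illustration under the assumption that each unit can be modelled as a function; the patent describes hardware/software units, not this interface.

```python
def speech_to_speech(audio, recognizer, prosody_extractor, interpreter,
                     db_manager, tts, prosody_analyzer):
    """Dataflow of the described system: the input (1) feeds the speech
    recognition unit (2) and the prosody extraction unit (3) in parallel;
    the interpretation unit (4) drives the database management unit (5) and
    database (6); the text-to-speech converter (7) output is corrected by
    the prosody analyzer (8) using the extracted prosody information."""
    model = recognizer(audio)                    # unit 2
    prosody = prosody_extractor(audio)           # unit 3
    meaning = interpreter(model, prosody)        # unit 4
    response_text = db_manager(meaning)          # units 5 and 6
    synthetic = tts(response_text)               # unit 7
    return prosody_analyzer(synthetic, prosody)  # unit 8
```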
- the present invention is adapted to provide a spoken response to a speech input to the speech-to-speech conversion system which has a dialect to match that of the speech input and that this conversion process includes the steps of recognising and interpreting the input speech, utilising the interpretation to obtain speech information data from a database for use in the formulation of the spoken response, extracting prosody information from the input speech, obtaining dialectal information from the prosody information, and converting the speech information data obtained from said database into the spoken response using the dialectal information.
- This will be outlined in the following paragraphs.
- the speech inputs to the speech-to-speech conversion system which may be in many forms, for example, requests for information on particular topics, such as banking or telephone services, or general enquiries concerning such services, are applied to the input 1 and thereby to the inputs of the units 2 and 3.
- the speech recognition unit 2 and interpretation unit 4 are adapted to operate, in a manner well known to persons skilled in the art, to recognise and interpret the speech inputs to the system.
- the speech recognition unit 2 may, for example, operate by using a Hidden Markov model, or an equivalent speech model.
- the function of the units 2 and 4 is to convert speech inputs to the system into a form which is a faithful representation of the content of the speech inputs and suitable for application to the input of the database management unit 5.
- the content of the textual information data at the output of the interpretation unit 4 must be an accurate representation of the speech input and be usable by the database management unit 5 to access, and extract speech information data from, the database 6 for use in the formulation of a synthesised spoken response to the speech input.
- this process would, in essence, be effected by identifying a number of phonemes from a segment of the input speech which are combined into allophone strings, the phonemes being interpreted as possible words, or word combinations, to establish a model of the speech.
- the established speech model will have word and sentence accents according to a standardised pattern for the language of the input speech.
- the information, concerning the recognised words and word combinations, generated by the speech recognition unit 2 may, in practice, be checked both lexically (using a lexicon, with orthography and transcription) and syntactically.
- the purpose of these checks is to identify and exclude any words which do not exist in the language concerned, and/or any phrase whose syntax does not correspond with the language concerned.
- the speech recognition unit 2 ensures that only those words, and word combinations, which are found to be acceptable both lexically and syntactically, are used to create a model of the input speech.
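A minimal sketch of this double filtering, assuming the lexicon is a set of known word forms and the syntactic check is supplied as a predicate; both are stand-ins for the lexicon (with orthography and transcription) and syntax analysis the patent refers to.

```python
def filter_candidates(candidate_phrases, lexicon, is_syntactically_valid):
    """Keep only word sequences that pass both checks: every word must exist
    in the lexicon (lexical check) and the phrase as a whole must be
    grammatical in the language concerned (syntactic check). Only the
    surviving candidates are used to build the model of the input speech."""
    kept = []
    for phrase in candidate_phrases:
        if all(word in lexicon for word in phrase) and is_syntactically_valid(phrase):
            kept.append(phrase)
    return kept
```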
- the intonation pattern of the speech model is a standardised intonation pattern for the language concerned, or an intonation pattern which has been established by training, or explicit knowledge, using a number of dialects of the language concerned.
- the prosody information i.e. the fundamental tone curve, extracted from the input speech by the extraction unit 3 can be used to obtain dialectal, sentence accent and sentence stressing, information, for use by the speech-to-speech conversion system and method of the present invention.
- the dialectal information can be used by the speech-to-speech conversion system and method to match the dialect of the output speech to that of the input speech and the sentence accent and stressing information can be used in the recognition and interpretation of the input speech.
- the means for obtaining dialectal information from the prosody information includes:
- first analysing means for determining the intonation pattern of the fundamental tone of the input speech and thereby the maximum and minimum values of the fundamental tone curve and their respective positions;
- second analysing means for determining the intonation pattern of the fundamental tone curve of the speech model and thereby the maximum and minimum values of the fundamental tone curve and their respective positions;
- comparison means for comparing the intonation pattern of the input speech with the intonation pattern of the speech model to identify a time difference between the occurrence of the maximum and minimum values of the fundamental tone curves of the incoming speech in relation to the maximum and minimum values of the fundamental tone curve of the speech model, the identified time difference being indicative of the dialectal characteristics of the input speech.
- the time difference may be determined in relation to an intonation pattern reference point.
- the difference, in terms of intonation pattern, between different dialects can be described by different points in time for word and sentence accent, i.e. the time difference can be determined in relation to an intonation pattern reference point, for example, the point at which a consonant/vowel limit occurs.
- the reference against which the time difference is measured is the point at which the consonant/vowel boundary, i.e. the CV-boundary, occurs.
- the identified time difference which, as stated above, is indicative of the dialect in the input speech, i.e. the spoken language, is applied to the text-to-speech converter 7 to enable the intonation pattern, and thereby the dialect, of the speech output of the system to be corrected so that it corresponds to the intonation pattern of the corresponding words and/or phrase of the input speech.
- this corrective process enables the dialectal information in the input speech to be incorporated into the output speech.
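A minimal sketch of the corrective step, assuming a single dialect-indicative time offset (in frames) has already been identified. Real pitch modification of synthesised speech would require prosodic resynthesis, which is beyond this illustration; here only the F0 contour is shifted.

```python
def shift_f0_curve(f0, offset_frames):
    """Shift the intonation events of a synthesised F0 contour in time by the
    dialect-indicative offset, clamping at the edges so the corrected contour
    keeps its original length. A positive offset delays the intonation
    events, moving the accents later, as the input dialect requires."""
    n = len(f0)
    shifted = []
    for i in range(n):
        j = min(max(i - offset_frames, 0), n - 1)
        shifted.append(f0[j])
    return shifted
```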
- the fundamental tone curve of the speech model is based on information resulting from the lexical check, i.e. the orthography and phonetic transcription of the words in the speech model.
- the transcription information includes lexically abstracted accent information, of type stressed syllables, i.e. tonal word accents I and II, and information relating to the location of secondary accent, i.e. information given, for instance, in dictionaries.
- This information can be used to adjust the recognition pattern of the speech recognition model, for example, the Hidden Markov model, to take account of the transcription information.
- a more exact model of the input speech is, therefore, obtained during the interpretation process.
- a further consequence of this speech model corrective process is that, through time, the speech model will have an intonation pattern which has been established by a training process.
- the speech model is compared with a spoken input sequence, and any difference there between can be determined and used to bring the speech model into conformity with the spoken sequence and/or to determine stresses in the spoken sequence.
- relative sentence stresses can be determined by classifying the ratio between variations and declination of the fundamental tone curve, whereby emphasised sections, or individual words can be determined.
- the pitch of the speech can be determined from the declination of the fundamental tone curve.
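The declination and variation measures above can be sketched as follows. This is an interpretation under stated assumptions: declination is approximated by a least-squares line through the F0 contour, and "variation" by the spread of the residuals around that line; the patent does not fix these definitions.

```python
def declination(f0):
    """Least-squares slope and intercept of the F0 contour; the declination
    (overall downward trend) is taken as an indicator of the pitch of the speech."""
    n = len(f0)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(f0) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, f0))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def variation_ratio(f0):
    """Ratio between the variation of the F0 curve around its declination line
    and the span of the declination line itself; a large ratio suggests an
    emphasised (stressed) section, a small one a flat, unstressed stretch."""
    slope, intercept = declination(f0)
    residuals = [y - (slope * x + intercept) for x, y in enumerate(f0)]
    variation = max(residuals) - min(residuals)
    decl_span = abs(slope) * (len(f0) - 1)
    return variation / decl_span if decl_span else float("inf")
```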
- the extraction unit 3, in association with the interpretation unit 4, is adapted to determine the characteristics of the fundamental tone curve outlined below.
- classification of the ratio between the variation and declination of the fundamental tone curve makes it possible to identify/determine relative sentence stresses, and emphasised sections, or words.
- the relation between the variation and declination of the fundamental tone curve can be utilised to determine the dynamic range of the fundamental tone curve.
- the information obtained in respect of the fundamental tone curve concerning dialect, sentence accent and stressing can be used for the interpretation of speech by the interpretation unit 4, i.e. the information can be used, in the manner outlined above, to obtain a better understanding of the content of the input speech and bring the intonation pattern of the speech model into conformity with the input speech.
- since the corrected speech model exhibits the language characteristics (including dialect information, sentence accent and stressing) of the input speech, it can be used to give an increased understanding of the input speech and be effectively used by the database management unit 5 to obtain the required speech information data from the database 6 to formulate a response to a voice input to the speech-to-speech conversion system.
- the ability to detect speech, irrespective of dialect variations, in accordance with the system and method of the present invention makes it possible to use speech in many different voice-responsive applications.
- the system is, therefore, adapted to recognise and accurately interpret the content of speech inputs and to tailor the dialect of the voice response to match the dialect of the voice input.
- This process provides a user friendly system because the language of the man-machine dialogue is in accordance with the dialect of the user concerned.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Use Of Switch Circuits For Exchanges And Methods Of Control Of Multiplex Exchanges (AREA)
Abstract
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE9601811A SE506003C2 (sv) | 1996-05-13 | 1996-05-13 | Metod och system för tal-till-tal-omvandling med extrahering av prosodiinformation |
SE9601811 | 1996-05-13 | ||
PCT/SE1997/000583 WO1997043756A1 (fr) | 1996-05-13 | 1997-04-08 | Procede et systeme de conversion de signaux vocaux en signaux vocaux |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0919052A1 true EP0919052A1 (fr) | 1999-06-02 |
EP0919052B1 EP0919052B1 (fr) | 2003-07-09 |
Family
ID=20402543
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP97919840A Expired - Lifetime EP0919052B1 (fr) | 1996-05-13 | 1997-04-08 | Procede et systeme de conversion de signaux vocaux en signaux vocaux |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP0919052B1 (fr) |
DE (1) | DE69723449T2 (fr) |
DK (1) | DK0919052T3 (fr) |
NO (1) | NO318557B1 (fr) |
SE (1) | SE506003C2 (fr) |
WO (1) | WO1997043756A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3389043A4 (fr) * | 2015-12-07 | 2019-05-15 | Yamaha Corporation | Dispositif d'interaction vocale et procédé d'interaction vocale |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1159702C (zh) * | 2001-04-11 | 2004-07-28 | 国际商业机器公司 | 具有情感的语音-语音翻译系统和方法 |
US7181397B2 (en) | 2005-04-29 | 2007-02-20 | Motorola, Inc. | Speech dialog method and system |
DE102007011039B4 (de) * | 2007-03-07 | 2019-08-29 | Man Truck & Bus Ag | Freisprecheinrichtung in einem Kraftfahrzeug |
US8150020B1 (en) | 2007-04-04 | 2012-04-03 | At&T Intellectual Property Ii, L.P. | System and method for prompt modification based on caller hang ups in IVRs |
US8024179B2 (en) * | 2007-10-30 | 2011-09-20 | At&T Intellectual Property Ii, L.P. | System and method for improving interaction with a user through a dynamically alterable spoken dialog system |
JP5282469B2 (ja) * | 2008-07-25 | 2013-09-04 | ヤマハ株式会社 | 音声処理装置およびプログラム |
CN113470670B (zh) * | 2021-06-30 | 2024-06-07 | 广州资云科技有限公司 | 电音基调快速切换方法及系统 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2165969B (en) * | 1984-10-19 | 1988-07-06 | British Telecomm | Dialogue system |
JPH0772840B2 (ja) * | 1992-09-29 | 1995-08-02 | 日本アイ・ビー・エム株式会社 | 音声モデルの構成方法、音声認識方法、音声認識装置及び音声モデルの訓練方法 |
SE9301596L (sv) * | 1993-05-10 | 1994-05-24 | Televerket | Anordning för att öka talförståelsen vid översätttning av tal från ett första språk till ett andra språk |
SE504177C2 (sv) * | 1994-06-29 | 1996-12-02 | Telia Ab | Metod och anordning att adaptera en taligenkänningsutrustning för dialektala variationer i ett språk |
-
1996
- 1996-05-13 SE SE9601811A patent/SE506003C2/sv unknown
-
1997
- 1997-04-08 DK DK97919840T patent/DK0919052T3/da active
- 1997-04-08 WO PCT/SE1997/000583 patent/WO1997043756A1/fr active IP Right Grant
- 1997-04-08 EP EP97919840A patent/EP0919052B1/fr not_active Expired - Lifetime
- 1997-04-08 DE DE69723449T patent/DE69723449T2/de not_active Expired - Fee Related
-
1998
- 1998-11-06 NO NO19985179A patent/NO318557B1/no unknown
Non-Patent Citations (1)
Title |
---|
See references of WO9743756A1 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3389043A4 (fr) * | 2015-12-07 | 2019-05-15 | Yamaha Corporation | Dispositif d'interaction vocale et procédé d'interaction vocale |
Also Published As
Publication number | Publication date |
---|---|
DE69723449D1 (de) | 2003-08-14 |
SE9601811L (sv) | 1997-11-03 |
DK0919052T3 (da) | 2003-11-03 |
SE9601811D0 (sv) | 1996-05-13 |
WO1997043756A1 (fr) | 1997-11-20 |
DE69723449T2 (de) | 2004-04-22 |
NO318557B1 (no) | 2005-04-11 |
NO985179L (no) | 1998-11-11 |
NO985179D0 (no) | 1998-11-06 |
EP0919052B1 (fr) | 2003-07-09 |
SE506003C2 (sv) | 1997-11-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5752227A (en) | Method and arrangement for speech to text conversion | |
US5806033A (en) | Syllable duration and pitch variation to determine accents and stresses for speech recognition | |
US6910012B2 (en) | Method and system for speech recognition using phonetically similar word alternatives | |
JP4536323B2 (ja) | 音声−音声生成システムおよび方法 | |
US7062440B2 (en) | Monitoring text to speech output to effect control of barge-in | |
US7191132B2 (en) | Speech synthesis apparatus and method | |
EP0767950B1 (fr) | Procede et dispositif pour adapter un equipement de reconnaissance de la parole aux variantes dialectales dans une langue | |
JPH09500223A (ja) | 多言語音声認識システム | |
GB2380380A (en) | Speech synthesis method and apparatus | |
CN101281518A (zh) | 语音翻译装置和方法 | |
US5950162A (en) | Method, device and system for generating segment durations in a text-to-speech system | |
US5677992A (en) | Method and arrangement in automatic extraction of prosodic information | |
EP0919052B1 (fr) | Procede et systeme de conversion de signaux vocaux en signaux vocaux | |
Kadambe et al. | Language identification with phonological and lexical models | |
WO1997043707A1 (fr) | Ameliorations relatives a la conversion voix-voix | |
Chou et al. | Automatic segmental and prosodic labeling of Mandarin speech database | |
Alam et al. | Development of annotated Bangla speech corpora | |
Wester et al. | Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project | |
Potisuk et al. | Using stress to disambiguate spoken Thai sentences containing syntactic ambiguity | |
KR20220036237A (ko) | 딥러닝을 기반으로 하는 가이드 음성 제공 시스템 | |
Williams | The segmentation and labelling of speech databases | |
Meinedo et al. | The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language. | |
KR0136423B1 (ko) | 발음 제어 기호의 유효성 판정을 이용한 음운 변동 처리 방법 | |
Martin et al. | Cross Lingual Modelling Experiments for Indonesian | |
Mercier et al. | Recognition of speaker-dependent continuous speech with Keal-Nevezh |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 19981214 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): CH DE DK FI FR GB LI NL |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
RIC1 | Information provided on ipc code assigned before grant |
Free format text: 7G 10L 13/08 A, 7G 06F 3/16 B |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Designated state(s): CH DE DK FI FR GB LI NL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20030709 Ref country code: LI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20030709 Ref country code: CH Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20030709 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REF | Corresponds to: |
Ref document number: 69723449 Country of ref document: DE Date of ref document: 20030814 Kind code of ref document: P |
|
REG | Reference to a national code |
Ref country code: DK Ref legal event code: T3 |
|
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | ||
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
ET | Fr: translation filed | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20040414 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DK Payment date: 20080411 Year of fee payment: 12 Ref country code: DE Payment date: 20080418 Year of fee payment: 12 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FI Payment date: 20080415 Year of fee payment: 12 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20080412 Year of fee payment: 12 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20080421 Year of fee payment: 12 |
|
REG | Reference to a national code |
Ref country code: DK Ref legal event code: EBP |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20090408 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20091231 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20090408 Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20091103 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20090408 Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20091222 Ref country code: DK Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20090430 |