WO2002031811A1 - Acoustic output of networked documents - Google Patents
- Publication number
- WO2002031811A1 (application PCT/DE2000/003550)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- documents
- commands
- output
- synthesis module
- conditioner
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4938—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/60—Medium conversion
Definitions
- The invention relates to a device for the acoustic output of networked documents.
- US Pat. No. 5,825,854 describes a telephone access system for access to a computer by means of a telephone set.
- The device shown there can output text stored in coded form, i.e. as a character string, through a speech output unit. It also shows that the structure of a text is analyzed and that a controller is provided by means of which output can proceed along the structural elements so determined. This system must be reprogrammed for every type of document; examples include electronic mail, the file system, and other text documents to which the structure analysis applies.
- A mail or appointment system of the kind addressed in that document can store an unlimited number of documents. However, these are always documents that relate to the user: they belong to his mail, appear in his diary, or are part of the file system accessible to him. Any new application that opens up a wider range of documents must be reprogrammed to allow application-specific navigation through the documents.
- The invention uses the insight that a markup language with reference elements, such as the Hypertext Markup Language HTML, in combination with an open data network such as the Internet, makes it possible to access in a uniform manner an a priori indefinite and unlimited set of documents that is not assigned to the user.
- This approach is combined with the fact that the processing of the markup language is separate from the output.
- Use is made of a speech synthesis that allows ongoing speech output to be influenced without disturbing noises or pauses. Such a synthesis is described, for example, in the application filed in parallel by the same inventor under the title "Control for voice output".
- That synthesis uses a chain of converters. A controller monitors the data transmitted within the chain and sends asynchronous commands to the converter concerned; indices are generated in addition to the data, and the monitoring preferably relates to these indices. The invention is therefore a device for acoustic access to networked documents via a voice network, in which a controller separates the commands of a user into navigation commands and output commands and feeds them to a processor for navigation and to a synthesis module for output, respectively.
- FIG. 1 shows a block diagram of a device for the acoustic output of networked documents according to the invention.
- A voice network 10, usually the ordinary telephone network, is connected to a modem 12. This is a modem that, in particular, outputs on an output interface the tone dialing signals that a conventional handset can transmit. These signals, usually coded as dual-tone multi-frequency (DTMF) signals, are output on the interface as characters, usually the digits 0..9 and the special characters * and #. These are fed to a controller 14.
- The controller is in turn connected to a processor 16 for networked documents, which has access to a data network 20.
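The separation of user commands described above can be illustrated with a minimal sketch. The key assignment is an assumption made only for this example and is not taken from the source: digits terminated by `#` select a numbered reference (navigation), and `*` followed by one digit is an output command. The names `separate_commands` and `OUTPUT_COMMANDS` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

DIGITS = set("0123456789")

# Hypothetical output commands, selected by '*' followed by one digit.
OUTPUT_COMMANDS: Dict[str, str] = {
    "0": "cancel",
    "1": "quieter",
    "2": "louder",
    "3": "slower",
    "4": "faster",
}


def separate_commands(keys: Iterable[str]) -> Tuple[List[int], List[str]]:
    """Split a stream of DTMF characters into navigation and output commands.

    Assumed convention (not from the source): digits terminated by '#'
    select a numbered reference; '*' plus one digit is an output command.
    """
    navigation: List[int] = []  # would be fed to the processor for navigation
    output: List[str] = []      # would be fed to the synthesis module
    digits = ""
    star_pending = False
    for key in keys:
        if star_pending:
            star_pending = False
            if key in OUTPUT_COMMANDS:
                output.append(OUTPUT_COMMANDS[key])
            # unknown '*x' sequences are ignored in this sketch
        elif key == "*":
            star_pending = True
            digits = ""  # a '*' discards a partially typed reference number
        elif key == "#":
            if digits:
                navigation.append(int(digits))
            digits = ""
        elif key in DIGITS:
            digits += key
    return navigation, output
```

In a real device the two result lists would not be collected but dispatched immediately; collecting them here only keeps the sketch easy to inspect.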
- The networked documents are preferably stored in HTML and retrieved via the Internet as the data network. This is done using a conditioner, in particular a slightly modified form of the LYNX program, which converts the HTML pages into textual form.
- The modifications to LYNX concern the fact that the markup elements, in particular the references, are identified by textual means, in particular by numbering them.
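As a rough illustration of such a conditioner, the following sketch converts HTML into plain text and numbers the references by textual means. It is not the modified LYNX program; it uses Python's standard `html.parser` instead, and the bracket notation `[n]` as well as the names `ReferenceNumberingConditioner` and `condition` are assumptions for this example.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class ReferenceNumberingConditioner(HTMLParser):
    """Collects the text of a page and numbers every <a href=...> reference."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []
        self.references: Dict[int, str] = {}  # reference number -> target
        self._skip_depth = 0                  # > 0 inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                number = len(self.references) + 1
                self.references[number] = href
                self.parts.append(f"[{number}] ")  # textual marking (assumed form)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def condition(html: str) -> Tuple[str, Dict[int, str]]:
    """Return the conditioned text and the table of numbered references."""
    parser = ReferenceNumberingConditioner()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.parts).split())  # normalize whitespace
    return text, parser.references
```

The reference table returned alongside the text is what would let a navigation step resolve a number entered by the user to the document to fetch next.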
- Commands of the user can regulate the volume and the playback speed. The controller feeds these directly to the speech synthesis 18. They include a command that cancels the current announcement. An immediate, abrupt change or termination of the announcement is usually unpleasant, since it is perceived as an irregularity that demands attention.
- Indices are therefore used, which are inserted into the input of the speech synthesis and to which the commands to the speech synthesis refer. Either the conditioner can be made, via control parameters, to insert the indices into the conditioned text, or, if this is not possible or sensible, the output of the conditioner is not passed directly to the speech synthesis, as shown in FIG. 1, but is first taken over by the controller, which inserts the indices before passing it on. Such an index can mark a reference, for example.
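The second variant, in which the controller inserts the indices itself, might look like the following sketch. It assumes that references in the conditioned text are marked textually as `[n]`, which is an assumption of this example, and it represents an index simply as an integer token in the stream handed to the synthesis. `insert_indices` and `Token` are hypothetical names.

```python
import re
from typing import Dict, List, Tuple, Union

Token = Union[str, int]  # str: word to be spoken, int: index marker (not spoken)

REFERENCE = re.compile(r"^\[(\d+)\]$")  # assumed textual marking of a reference


def insert_indices(text: str) -> Tuple[List[Token], Dict[int, int]]:
    """Replace each reference marker by an index and remember which is which."""
    tokens: List[Token] = []
    index_of_reference: Dict[int, int] = {}  # reference number -> index
    next_index = 0
    for word in text.split():
        match = REFERENCE.match(word)
        if match:
            tokens.append(next_index)
            index_of_reference[int(match.group(1))] = next_index
            next_index += 1
        else:
            tokens.append(word)
    return tokens, index_of_reference
```

Keeping the mapping from reference number to index is what allows a later command to name a position in the running output rather than a position in the original text.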
- The controller determines its index and commands the synthesis module to abort the output after the current word and then to start outputting the indexed reference.
- With the above-mentioned synthesis module, which uses a pipeline of components that can be influenced in a targeted manner, the following can be achieved: since the cancel command takes effect not immediately but at the end of the word, the components can be reloaded and the preparation started, so that by the end of the word the synthesis process for announcing the reference has progressed far enough to connect naturally to the end of the current word. Similar considerations apply when faster output is desired.
- The new speed can take effect after the current word or sentence.
- The controller determines the appropriate index for this and relates the new speed to this index. Since the remaining parts of the conditioned text can remain in the pipeline, an abort with a subsequent restart is avoided.
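The word-boundary behaviour can be mimicked with a small simulation. This is only a sketch under simplifying assumptions and not the pipeline of the parallel application: the token stream mixes words (strings) with index markers (integers), a command is keyed by the number of the word being spoken when it arrives, and a speed change simply applies from the next word on. `simulate_output` and the command names `"jump"` and `"speed"` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Token = Union[str, int]          # str: word, int: index marker (not spoken)
Command = Tuple[str, float]      # ("jump", index) or ("speed", factor)


def simulate_output(tokens: List[Token],
                    commands: Dict[int, Command]) -> List[Tuple[str, float]]:
    """Return the (word, speed) pairs that would be spoken.

    `commands` maps the 0-based count of the word being spoken at the moment
    a command arrives to that command. Every command takes effect only after
    the current word has finished, never in the middle of it.
    """
    position_of_index = {t: i for i, t in enumerate(tokens) if isinstance(t, int)}
    spoken: List[Tuple[str, float]] = []
    speed = 1.0
    position = 0
    word_count = 0
    while position < len(tokens):
        token = tokens[position]
        position += 1
        if isinstance(token, int):
            continue                      # index markers are silent
        spoken.append((token, speed))     # the current word always completes
        command = commands.get(word_count)
        word_count += 1
        if command is None:
            continue
        kind, value = command
        if kind == "jump":                # continue right after the indexed marker
            position = position_of_index[int(value)] + 1
        elif kind == "speed":             # remaining tokens stay queued
            speed = value
    return spoken
```

Because a jump or speed change only alters where and how the loop continues, the tokens already queued are never discarded and re-submitted, which mirrors the point that an abort with a subsequent restart is avoided.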
- Since the control refers to documents that are already structured and networked, no special processing is necessary beyond the existing one, e.g. by the LYNX program. Only a few control commands are therefore needed to access a large number of documents.
- If the controller is implemented in the Perl programming language, it can also use the "LWP::Simple" module instead of the LYNX program, as can be obtained from "http://www.perl.com/CPAN/modules". This module represents an alternative form of a conditioner with a functional interface.
Abstract
Device for acoustic access to networked documents via a voice network. A controller separates a user's commands into navigation commands and output commands and feeds them to a processor for navigation and to a synthesis module for output, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/DE2000/003550 WO2002031811A1 (fr) | 2000-10-10 | 2000-10-10 | Acoustic output of networked documents |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/DE2000/003550 WO2002031811A1 (fr) | 2000-10-10 | 2000-10-10 | Acoustic output of networked documents |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002031811A1 (fr) | 2002-04-18 |
Family
ID=5647971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DE2000/003550 WO2002031811A1 (fr) | Acoustic output of networked documents | 2000-10-10 | 2000-10-10 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2002031811A1 (fr) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0847179A2 (fr) * | 1996-12-04 | 1998-06-10 | AT&T Corp. | System and method for a voice interface to hyperlinked information |
EP0848373A2 (fr) * | 1996-12-13 | 1998-06-17 | Siemens Corporate Research, Inc. | Interactive communication system |
WO2000021057A1 (fr) * | 1998-10-01 | 2000-04-13 | Mindmaker, Inc. | Method and apparatus for displaying information |
- 2000-10-10: WO application PCT/DE2000/003550 (WO2002031811A1), status: active, Application Filing
Non-Patent Citations (1)
Title |
---|
BROWN M: "PhoneBrowser : A Web-Content-Programmable Speech Processing Platform", W3.ORG WORKSHOP 1998, XP002175282, Retrieved from the Internet <URL:http://www.w3.org/Voice/1998/Workshop/Michael-Brown.html> [retrieved on 20010820] * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE4436175B4 (de) | | Device for remote access to a computer from a telephone handset |
DE60020773T2 (de) | | Graphical user interface and method for changing pronunciations in speech synthesis and recognition systems |
DE3317325C2 (fr) | | |
DE69828141T2 (de) | | Method and device for speech recognition |
EP0802522B1 (fr) | | Apparatus and method for determining an action, and use of the apparatus and method |
DE60032846T2 (de) | | Method and system for offering alternatives for texts derived from stochastic input sources |
DE3910467C2 (de) | | Method and device for generating reports |
DE60313706T2 (de) | | Speech recognition and response system, speech recognition and response program, and associated recording medium |
DE2551632C2 (de) | | Method for composing voice messages |
DE2818974A1 (de) | | Data terminal for data processing systems |
DE2808577A1 (de) | | Electronic calculator with synthetic speech display |
DE2946856C2 (de) | | Word storage device |
DE60123153T2 (de) | | Voice-controlled browser system |
DE102006036338A1 (de) | | Method for generating a context-based speech dialog output in a speech dialog system |
DE3630611A1 (de) | | Electronic musical instrument |
DE69233622T2 (de) | | Device for generating announcements |
EP1321851B1 (fr) | | Method and system for using user-selectable markers as entry points in the menu structure of a speech dialog system |
DE4243181C2 (de) | | Voice-controlled device and method for operating it |
WO2002031811A1 (fr) | | Acoustic output of networked documents |
DE3141254A1 (de) | | Speech output device |
DE10030369A1 (de) | | Speech recognition system |
DE3040032C2 (de) | | Calculator with speech output |
DE19911719A1 (de) | | Acoustic output of networked documents |
WO2000016310A1 (fr) | | Method and device for digital speech processing |
EP0414238B1 (fr) | | Voice-controlled archiving system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1; Designated state(s): US |
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| 122 | EP: PCT application non-entry in European phase | |