WO2002031811A1 - Acoustic output of networked documents - Google Patents

Acoustic output of networked documents

Publication number
WO2002031811A1
Authority
WO
WIPO (PCT)
Prior art keywords
documents
commands
output
synthesis module
conditioner
Prior art date
Application number
PCT/DE2000/003550
Other languages
German (de)
English (en)
Inventor
Klaus-Peter Wegge
Original Assignee
Siemens Aktiengesellschaft
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-10-10
Filing date
2000-10-10
Publication date
2002-04-18
Application filed by Siemens Aktiengesellschaft
Priority to PCT/DE2000/003550
Publication of WO2002031811A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4938 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/60 Medium conversion

Definitions

  • The invention relates to a device for the acoustic output of networked documents.
  • US Pat. No. 5,825,854 describes a telephone access system for access to a computer by means of a telephone set.
  • The device shown there can output text that is stored in coded form, i.e. as a character string, through a speech output unit. It also shows that the structure of a text is analyzed and that a controller is provided by means of which output can follow the structural elements so determined. This system must be reprogrammed for every type of document; examples include electronic mail, the file system, and other text documents to which the structure analysis is applied.
  • The mail or appointment system addressed in that document can store an unlimited number of documents. However, these are always documents that relate to the user: they belong to his mail, appear in his diary, or are part of the file system accessible to him. Any new application that opens up a wider range of documents must be reprogrammed to allow application-specific navigation through the documents.
  • The invention uses the insight that a markup language with reference elements, such as the Hypertext Markup Language HTML, in conjunction with an open data network such as the Internet, makes it possible to access in a uniform manner a set of documents that is a priori indeterminate, unlimited, and not assigned to the user.
  • This approach is combined with the fact that the processing of the markup language is kept separate from the output.
  • Use is made of a speech synthesis that allows an ongoing speech output to be influenced without disturbing noises or pauses. Such a synthesis is described, for example, in the application filed in parallel by the same inventor under the title "Control for voice output".
  • That synthesis uses a chain of converters. A controller monitors the data transmitted within the chain and sends asynchronous commands to the converter concerned; indices are generated in addition to the data, and the monitoring preferably relates to these indices. The result is a device for acoustic access to networked documents via a voice network, in which a controller separates the commands of a user into navigation and output commands and feeds them to a processor for navigation and to a synthesis module for output.
  • FIG. 1 shows a block diagram of a device for the acoustic output of networked documents according to the invention.
  • A voice network 10, usually the ordinary telephone network, is connected to a modem 12. This is a modem that, in particular, outputs on an output interface the tone-dialing signals that a conventional handset can transmit. These signals, usually coded as dual-tone multi-frequency (DTMF) tones, are output on the interface as characters, usually the digits 0 to 9 and the special characters * and #. These characters are fed to a controller 14.
  • The controller is in turn connected to a processor 16 for networked documents, which has access to a data network 20.
  • The networked documents are preferably stored in HTML and retrieved via the Internet as the data network. This is done using a conditioner, in particular a slightly modified form of the LYNX program, which converts the HTML pages into a textual form.
  • The modification of LYNX consists in identifying the markings, in particular the references, by textual means, in particular by numbering them.
  • Commands of the user can regulate the volume and the playback speed. The controller feeds these directly to the speech synthesis 18. They include a command that cancels the current announcement. An immediate, abrupt change or termination of the announcement is usually unpleasant, since it is perceived as an irregularity that demands attention.
  • Indices are used that are inserted into the input of the speech synthesis and to which the commands concerning that input relate. Either the conditioner can be instructed via parameters to insert the indices into the conditioned text or, if this is not possible or sensible, the output of the conditioner is not passed directly to the speech synthesis, as shown in FIG. 1, but is first taken over by the controller, which inserts the indices before passing it on. Such an index can mark a reference, for example.
  • The controller determines its index and instructs the synthesis module to abort the output after the current word and then to start outputting the indexed reference.
  • With the above-mentioned synthesis module, which uses a pipeline of components that can be influenced in a targeted manner, the following can be achieved: since the abort command does not take effect immediately but only at the end of the word, the components can be reloaded and the preparation started, so that by the end of the word the synthesis of the reference announcement has progressed far enough to follow the current word naturally. Similar considerations apply when a faster output is desired.
  • The new speed can take effect after the current word or sentence.
  • The controller determines the appropriate index for this and ties the new speed to this index. Since the remaining parts of the conditioning can stay in the pipeline, an abort with a subsequent restart is avoided.
  • Since the control relates to documents that are already structured and networked, no special processing beyond the existing conditioning by, e.g., the LYNX program is necessary. Only a few control commands are therefore needed to access a large number of documents.
  • If the controller is implemented in the Perl programming language, it can also use the "LWP::Simple" module instead of the LYNX program, which can be obtained from "http://www.perl.com/CPAN/modules". This module represents an alternative form of a conditioner with a functional interface.
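
The conditioning step described above (HTML pages converted to text, with the references numbered by textual means) can be illustrated with a minimal sketch. It uses Python's standard html.parser instead of a modified LYNX; the class name, the function name condition, and the "[n]" marker format are assumptions made for illustration and do not come from the source.

```python
from html.parser import HTMLParser


class LinkNumberingConditioner(HTMLParser):
    """Flatten HTML to text and tag each <a href> with a running number."""

    def __init__(self):
        super().__init__()
        self.parts = []   # text fragments in document order
        self.links = []   # links[n - 1] is the target of reference n

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.links.append(href)
            # Hypothetical marker format: "[n] " spoken before the link text.
            self.parts.append(f"[{len(self.links)}] ")

    def handle_data(self, data):
        self.parts.append(data)

    def text(self):
        # Collapse runs of whitespace left over from the markup.
        return " ".join("".join(self.parts).split())


def condition(html):
    """Return (plain_text, link_targets) for one HTML page."""
    conditioner = LinkNumberingConditioner()
    conditioner.feed(html)
    conditioner.close()
    return conditioner.text(), conditioner.links
```

A caller entering the number of a reference on the keypad could then be mapped to links[n - 1], which is the kind of lookup a navigation processor would need.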
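
The separation of user commands into navigation and output commands, as performed by the controller 14 on the DTMF characters delivered by the modem, might look roughly like the sketch below. The key assignment ("*" cancels, "#" speeds up) and all names are hypothetical; the source only states that digits and the characters * and # arrive as characters and that the controller routes them.

```python
NAVIGATION, OUTPUT = "navigation", "output"

# Hypothetical key assignment; the source does not define one.
OUTPUT_KEYS = {"*": "cancel", "#": "faster"}


def route(keys):
    """Split a DTMF character stream into (target, command) pairs.

    Digits accumulate into a reference number that goes to the
    navigation processor once a non-digit arrives or the input ends.
    '*' and '#' go straight to the synthesis module.
    """
    routed, digits = [], ""
    for key in keys:
        if key in "0123456789":
            digits += key
            continue
        if digits:
            routed.append((NAVIGATION, int(digits)))
            digits = ""
        if key in OUTPUT_KEYS:
            routed.append((OUTPUT, OUTPUT_KEYS[key]))
    if digits:
        routed.append((NAVIGATION, int(digits)))
    return routed
```

Keeping the routing table separate from the loop means the few control commands can be reassigned without touching the navigation logic.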
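
The index mechanism described above, in which a command takes effect only after the current word so that the announcement is never cut off mid-word, can be sketched as follows. This is a simplified, synchronous model under stated assumptions: one index per word, commands supplied up front as a mapping rather than arriving asynchronously, and hypothetical command names "rate" and "jump". It does not reproduce the converter pipeline of the parallel application.

```python
def index_words(text):
    """Attach a running index to every word of the conditioned text."""
    return list(enumerate(text.split()))


def speak(indexed_words, commands):
    """Yield (word, rate) pairs, applying commands at index boundaries.

    `commands` maps an index to ("rate", value) or ("jump", target_index).
    A command keyed by index i takes effect after word i has been output,
    so the current word is always finished first.
    """
    pending = dict(commands)  # copy, so each command fires only once
    rate, position = 1.0, 0
    while position < len(indexed_words):
        index, word = indexed_words[position]
        yield word, rate
        position += 1
        kind, value = pending.pop(index, (None, None))
        if kind == "rate":
            rate = value        # new speed applies from the next word on
        elif kind == "jump":
            position = value    # continue at the indexed reference
```

For example, a jump keyed to index 1 lets the second word finish and then continues at the target index without restarting the output from the beginning.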

Abstract

Device for acoustic access to networked documents via a voice network. A controller separates the commands of a user into navigation commands and output commands and feeds them to a navigation processor and to a synthesis module for output, respectively.
PCT/DE2000/003550 2000-10-10 2000-10-10 Acoustic output of networked documents WO2002031811A1

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/DE2000/003550 WO2002031811A1 2000-10-10 2000-10-10 Acoustic output of networked documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/DE2000/003550 WO2002031811A1 2000-10-10 2000-10-10 Acoustic output of networked documents

Publications (1)

Publication Number Publication Date
WO2002031811A1 2002-04-18

Family

ID=5647971

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/DE2000/003550 WO2002031811A1 2000-10-10 2000-10-10 Acoustic output of networked documents

Country Status (1)

Country Link
WO (1) WO2002031811A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0847179A2 * 1996-12-04 1998-06-10 AT&T Corp. System and method for a voice interface to hyperlinked information
EP0848373A2 * 1996-12-13 1998-06-17 Siemens Corporate Research, Inc. Interactive communication system
WO2000021057A1 * 1998-10-01 2000-04-13 Mindmaker, Inc. Method and apparatus for displaying information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BROWN M: "PhoneBrowser : A Web-Content-Programmable Speech Processing Platform", W3.ORG WORKSHOP 1998, XP002175282, Retrieved from the Internet <URL:http://www.w3.org/Voice/1998/Workshop/Michael-Brown.html> [retrieved on 20010820] *

Similar Documents

Publication Title
DE4436175B4 Device for remotely accessing a computer from a telephone handset
DE60020773T2 Graphical user interface and method for modifying pronunciations in speech synthesis and recognition systems
DE3317325C2
DE69828141T2 Method and device for speech recognition
EP0802522B1 Apparatus and method for determining an action, and use of the apparatus and method
DE60032846T2 Method and system for offering alternatives for texts derived from stochastic input sources
DE3910467C2 Method and device for generating reports
DE60313706T2 Speech recognition and response system, speech recognition and response program, and associated recording medium
DE2551632C2 Method for composing voice messages
DE2818974A1 Data terminal for data processing systems
DE2808577A1 Electronic calculator with synthetic speech display
DE2946856C2 Word storage device
DE60123153T2 Voice-controlled browser system
DE102006036338A1 Method for generating a context-based speech dialog output in a speech dialog system
DE3630611A1 Electronic musical instrument
DE69233622T2 Device for generating announcements
EP1321851B1 Method and system for using user-selectable markers as entry points into the menu structure of a speech dialog system
DE4243181C2 Voice-controlled device and method for operating it
WO2002031811A1 Acoustic output of networked documents
DE3141254A1 Speech output device
DE10030369A1 Speech recognition system
DE3040032C2 Calculator with speech output
DE19911719A1 Acoustic output of networked documents
WO2000016310A1 Method and device for digital speech processing
EP0414238B1 Voice-controlled archiving system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 EP: The EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase