DE19514849A1

DE19514849A1 - Remote control of device through communications network

Info

Publication number: DE19514849A1
Application number: DE1995114849
Authority: DE
Inventors: Thomas Dipl Ing Hoermann; Dieter Dipl Ing Kopp
Original assignee: Alcatel SEL AG
Current assignee: Alcatel Lucent Deutschland AG
Priority date: 1995-04-26
Filing date: 1995-04-26
Publication date: 1996-10-31

Abstract

The remote control is menu-driven according to the skip-and-scan principle. The device (AB) being controlled has a connection means (AS) for the communication network, e.g. ISDN, and an analog interface with an analog-to-digital/digital-to-analog converter, to which a microphone (MIC) and a loudspeaker (LS) are connected. A memory (MEM), which can be a DRAM or an EEPROM, is used for recording and reproducing digitized, compressed speech and for storing greetings, announcements and messages.

Description

The present invention relates to a method for menu-driven remote control according to the preamble of claim 1 and to a device according to the preamble of claim 5.

Such a method and such a device are known from the article "Stimmen aus der Ferne" ("Voices from afar") by Thomas Becker, c′t 1994, issue 9, pages 122 to 133. The article describes that a menu-oriented dialogue structure can be provided for the remote control of voice transmission systems. This means that a user who remotely controls the voice transmission system from a distant location via the telephone network is offered various menu items to choose from. The user must signal to the voice transmission system which menu item he selects. The selection is made by means of multi-frequency dialing tones (DTMF), with the user pressing a key on his terminal, or by spoken commands if the voice transmission system contains a speech recognition system. Several menu variants are known for structuring the menu. The menu based on the so-called skip-and-scan principle has proven particularly user-friendly. In a skip-and-scan menu, the user is offered one menu item at a time, with the offered menu items ordered by importance of use. At each offered menu item, the user can activate this menu item, move on to the next menu item, or have the previous menu item offered again. Three terms (e.g. next, back, execute) or three digits suffice for control. These three terms or digits have the same meaning at every menu item. This contributes greatly to user-friendliness, since these terms are easy to remember.
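The skip-and-scan navigation described above can be illustrated with a minimal sketch. The class name, the menu item texts, and the choice to stay on the first or last item when the user moves past either end are assumptions made for illustration; the cited article and this document do not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class SkipAndScanMenu:
    """Hypothetical skip-and-scan menu: one item is offered at a time."""
    items: list                      # menu items, most important first
    position: int = 0                # index of the item currently offered
    executed: list = field(default_factory=list)

    def current(self):
        return self.items[self.position]

    def handle(self, command):
        # The same three commands apply at every menu item.
        if command == "next":
            # Assumption: stay on the last item instead of wrapping around.
            self.position = min(self.position + 1, len(self.items) - 1)
        elif command == "back":
            # Assumption: stay on the first item instead of wrapping around.
            self.position = max(self.position - 1, 0)
        elif command == "execute":
            self.executed.append(self.current())
        else:
            raise ValueError(f"unknown command: {command!r}")
        return self.current()
```

Because the command set never changes, a caller only has to remember three words (or three digits), regardless of how many menu items exist.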

A speech recognition system is required in the voice transmission system to recognize the three terms. It is generally known that such speech recognition systems can be implemented as speaker-independent or speaker-dependent systems. With speaker-dependent speech recognition systems, a new user must train the system on his individual voice in a learning phase before he can use it. To do so, the user speaks each word of the vocabulary into the speech recognition system once or several times during the learning phase. The speech recognition system usually has a microphone for this purpose.

The object of the present invention is to specify a low-effort and user-friendly device or method for remote control.

This object is achieved by the teaching of claim 1 or of claim 5.

Advantageously, the learning phase is started automatically after a predetermined number of rejections of a control word for remote control.

Further advantageous embodiments of the invention can be found in the dependent claims.

The invention is described below with reference to exemplary embodiments and Figs. 1, 2A and 2B. The figures show:

Fig. 1 an exemplary embodiment of a device according to the invention, and

Figs. 2A and 2B a flow diagram of an exemplary embodiment of a method according to the invention.

Fig. 1 shows the exemplary embodiment of a device AB according to the invention for remote control. In the present exemplary embodiments, the remote control is a menu-driven remote control based on the known skip-and-scan principle. It is, however, also possible to use a different kind of remote control that requires the input of speech references for speaker-dependent speech recognition. In the present exemplary embodiment, the device AB according to the invention is an answering machine; it can equally be a different kind of voice information system. The answering machine AB has a connection means AS for a communication network. In the present exemplary embodiment, this is a communication network according to the ISDN (Integrated Services Digital Network) standard. The device AB according to the invention can also be used in other communication networks, in which case it must be adapted to the conditions of that network.

The answering machine AB furthermore has an analog interface with an analog-to-digital/digital-to-analog converter AD/DA, via which a microphone MIC for analog voice input and a loudspeaker LS for analog voice output are connected. A first memory MEM is used for recording and reproducing digitized, compressed speech. It is used in particular to store a greeting announcement and messages received via the communication network. The first memory MEM can be, for example, a commercially available DRAM, or an electrically erasable programmable read-only memory, a so-called EEPROM (Electrically Erasable and Programmable Read Only Memory), which here also includes the block-erasable flash EPROM. The answering machine AB has a control means HOST for controlling the functions that can be carried out with the answering machine AB.

The described components of the answering machine AB are connected to an internal bus BUS for the exchange of data. The components of the answering machine AB can, however, also be connected to one another in a different way.

The control means HOST has a direct connection to a second memory ROM for storing program modules. These program modules contain the instructions with which the control means HOST controls the functions that can be carried out with the answering machine AB. The second memory ROM can be any common commercially available memory and need not necessarily be read-only.

The control means HOST furthermore has a direct connection to a third memory SPSR for storing speech references and a direct connection to a fourth memory SPWO for storing control words for the remote control that are received from the communication network. In the exemplary embodiment, the first memory MEM, the second memory ROM, the third memory SPSR and the fourth memory SPWO are functionally separate memories, which can nevertheless be implemented in a single physical memory. A preprocessing means SE, a comparison means VM and an output and receiving means AEM are integrated in the control means HOST. These use the program modules stored in the second memory ROM and, based on the instructions contained therein, carry out the functions described below.

The preprocessing means SE detects the beginning and the end of a word, either of a control word received from the communication network for menu-oriented remote control or of a speech reference received from the communication network that is to be newly stored in the third memory SPSR. The preprocessing means SE then extracts the features of the received control word and creates feature vectors for these features, each representing 20 ms of the control word. The comparison means VM compares the extracted feature vectors of the received control word with the feature vectors of the speech references. To do so, it accesses the respective memory SPSR or SPWO to read out the required feature vectors. The comparison means VM determines a difference between the compared feature vectors. This difference is a measure of the similarity between the received control word and a stored speech reference.

The output and receiving means AEM receives the control words, speaker-dependent speech references and other messages transmitted to the answering machine AB via the communication network. If necessary, it decompresses compressed data. The output and receiving means AEM also initiates the output of announcements into the communication network. Such announcements are in particular the greeting announcement stored in the first memory MEM, an announcement for spoken user guidance stored in the second memory ROM, and an announcement requesting that a speaker-dependent speech reference be transmitted to the answering machine AB via the communication network. For this purpose, it decompresses the data to be output where necessary. The output and receiving means AEM likewise initiates the output of messages stored in the first memory MEM via the analog interface AD/DA and the loudspeaker LS.
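The work of the preprocessing means SE and the comparison means VM can be sketched roughly as follows. Only the 20 ms frame length comes from the description. The 8 kHz sampling rate, the two features per frame (log energy and zero-crossing rate), and the linear frame alignment used for the difference are assumptions chosen to keep the sketch short; the description does not state which features or which distance measure are used.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumption: telephone-band audio
FRAME_MS = 20           # from the description: one feature vector per 20 ms


def split_into_frames(samples, rate=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Cut the word into consecutive, non-overlapping 20 ms frames."""
    n = rate * frame_ms // 1000
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def feature_vector(frame):
    """Assumed features: log energy and zero-crossing rate of one frame."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log1p(energy), crossings / len(frame))


def extract(samples):
    """Feature vectors of a word whose start and end are already detected."""
    return [feature_vector(f) for f in split_into_frames(samples)]


def difference(word, reference):
    """Mean Euclidean distance after linear alignment (illustrative only)."""
    if not word or not reference:
        return math.inf
    total = 0.0
    for i, vector in enumerate(word):
        # Map frame i of the word onto the proportionally placed reference frame.
        j = min(i * len(reference) // len(word), len(reference) - 1)
        total += math.dist(vector, reference[j])
    return total / len(word)
```

A smaller difference means a closer match, which is the sense in which the description calls the difference a measure of similarity.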

Figs. 2A and 2B show the flow diagram of the exemplary embodiment of the method according to the invention for remote control of the answering machine AB according to Fig. 1. Here too, the remote control is the menu-driven remote control based on the skip-and-scan principle. In a step 1, the answering machine AB is put into a remote control mode. Methods that make this possible are generally known from the prior art. In a step 2, the answering machine AB outputs an information announcement for user guidance of the menu-oriented remote control into the communication network to the user who activated the remote control mode. In a step 3, the answering machine AB then branches into the menu, and an announcement of the current menu item is output into the communication network. Branching into the menu is state-dependent, i.e. it depends, for example, on whether or not new messages are stored in the first memory MEM. If there are new messages, the user is offered these new messages for playback; if there are none, the menu branches directly to the next menu item, which here can be, for example, the playback of old messages still stored in the first memory MEM. The answering machine AB now awaits an input from the user via the communication network.

In a step 4, it is checked whether or not the user makes an input in the form of a control word. In the present exemplary embodiment, the control words that can be used for the menu-oriented remote control are next, back and execute. If the user makes no input, the method according to the invention ends. It is also possible at this point to transmit a help announcement to the user, or to branch to the next menu item and interpret the absence of an input as the control word next. If an input is made, the method according to the invention branches to a step 5, in which the speech recognition is activated. The preprocessing means SE detects the beginning and the end of the received control word. In a step 6, the preprocessing means SE then extracts the features of the received word and creates the feature vectors. In a subsequent step 7, these feature vectors are stored in the fourth memory SPWO. This storage is initiated by the control means HOST. In step 8, the feature vectors of the speech references for the control words next, back and execute stored in the third memory SPSR and the feature vectors of the received control word stored in the fourth memory SPWO are read out, and in a step 9 they are compared with one another by the comparison means VM. The comparison means VM determines the differences between the compared feature vectors.

In a step 10, it is then checked whether the received control word must be rejected. In the present exemplary embodiment, a rejection occurs if either no feature vectors of speech references are yet stored in the third memory SPSR, or the difference between the feature vectors of the received control word and the feature vectors of the speech references is too large. For this purpose, a threshold value is specified, below which the respective difference must fall. A rejection also occurs if, when at least two speech references are present, the gap between the smallest and the second-smallest difference is too small. In this case, the speaker-dependent speech recognition cannot unambiguously assign the received control word to a stored speech reference. If an unambiguous assignment is made in step 10, i.e. the received control word is not rejected, then this user input is executed by the control means HOST in a step 11. In a step 12, it is then checked whether the menu has ended. If so, a closing announcement is output to the user into the communication network in a step 13, and the method according to the invention then ends. If, however, it is determined in step 12 that the menu has not ended, the method according to the invention branches to step 3 in order to continue with the next menu item.
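The rejection test of step 10 can be summarized in a short sketch. The function name, the parameter names and the numeric values in the usage note are hypothetical; the description only states that a threshold exists and that the two best matches must be sufficiently far apart.

```python
def decide(differences, threshold, min_margin):
    """Return the recognized control word, or None if it must be rejected.

    differences: dict mapping each stored control word to the difference
    between its speech reference and the received word.
    """
    # Rejection case 1: no speech references have been stored yet.
    if not differences:
        return None

    ranked = sorted(differences.items(), key=lambda pair: pair[1])
    best_word, best = ranked[0]

    # Rejection case 2: even the best match does not fall below the threshold.
    if best >= threshold:
        return None

    # Rejection case 3: the two best matches are too close to tell apart.
    if len(ranked) >= 2 and ranked[1][1] - best < min_margin:
        return None

    return best_word
```

With an assumed threshold of 1.0 and an assumed margin of 0.2, `decide({"next": 0.3, "back": 0.9, "execute": 1.2}, 1.0, 0.2)` returns `"next"`, whereas `decide({"next": 0.3, "back": 0.4}, 1.0, 0.2)` returns `None` because the assignment is ambiguous.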

If, however, the control means HOST rejects the received control word in step 10, then in a step 14 a variable R is incremented by 1. This variable was set to zero at the start of the method according to the invention and indicates the number of rejections since then. In a step 15, it is then checked whether the number of rejections since the start of the method has reached the predetermined maximum number R_max of rejections. If not, an announcement again requesting the transmission of the control word is output to the user into the communication network in a step 16. The method according to the invention then branches to step 4, in which the answering machine AB checks whether the user makes an input.

If, however, it is determined in step 15 that the maximum number of permitted rejections has been reached, then in a step 17 a short announcement is output into the communication network requesting that speaker-dependent speech references be transmitted to the answering machine AB via the communication network. Step 17 thus starts a learning phase of the speaker-dependent speech recognition in dialogue form. The user can transmit the speaker-dependent speech references to the answering machine AB "online" via the communication network. In a step 18, it is then checked whether the user transmits an input, and thus a new speech reference, to the answering machine AB. If not, a help announcement is output into the communication network in a step 22, giving the user detailed information on entering the speech reference via the communication network. The menu then branches back to step 18 and awaits the input and transmission of the new speech reference. If the user makes the input, the speaker-dependent speech recognition is activated again in a step 19, and the preprocessing means SE detects the beginning and the end of the transmitted speech reference received by the output and receiving means AEM. In a step 20, the preprocessing means SE then extracts the features of the speech reference and creates the associated feature vectors. In a step 21, these feature vectors are stored as new speech references in the memory SPSR.
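The interplay of the rejection counter R, the limit R_max and the learning phase (steps 14 to 21) can be sketched as a small state holder. The class, method and return-value names are hypothetical. The description does not say whether R is reset after the learning phase, so this sketch leaves it unchanged.

```python
class RemoteControlSession:
    """Hypothetical bookkeeping for steps 14 to 21 of the flow diagram."""

    def __init__(self, r_max, vocabulary=("next", "back", "execute")):
        self.r_max = r_max            # predetermined maximum number R_max
        self.vocabulary = vocabulary  # control words needed by the menu
        self.rejections = 0           # variable R, zero at the start
        self.references = {}          # stands in for the memory SPSR

    def on_rejection(self):
        self.rejections += 1                  # step 14
        if self.rejections >= self.r_max:     # step 15
            return "request_references"       # step 17: start learning phase
        return "repeat_prompt"                # step 16: ask for the word again

    def store_reference(self, word, feature_vectors):
        """Step 21: store a new reference; True once training is complete."""
        self.references[word] = feature_vectors
        return all(w in self.references for w in self.vocabulary)
```

A caller would keep requesting references until `store_reference` returns `True` and then prompt for a control word again.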

It is also possible to store these feature vectors of the speech references in addition to those already stored in the memory SPSR. In this way, for example, several users can remotely control the answering machine AB by voice without each having to enter new speech references again. Furthermore, it is not necessary to enter the speaker-dependent speech references in a learning phase before the answering machine AB is put into operation. According to the invention, the user can do this "online" via the communication network whenever needed.

Once the learning phase of the speaker-dependent speech recognition is completed, i.e. once all speech references required for carrying out the skip-and-scan menu have been entered into the answering machine AB in steps 17 to 21, the method according to the invention branches to step 16 in order to transmit to the user once more an announcement requesting the input of a control word.

Because the speaker-dependent speech references are entered into the answering machine AB "online" via the communication network, the speech references are at the same time adapted to the currently prevailing performance characteristics of the communication network. Elaborate filter devices in the answering machine AB are therefore not necessary.

Claims

1. A method for remote control of a device (AB) via a communication network, in which the device (AB) is brought into a remote control mode and in which a speaker-dependent speech recognition is carried out, characterized in that a speaker-dependent speech reference is transmitted via the communication network to the device (AB) and that the transmitted speaker-dependent speech reference is received and stored in the device (AB).

2. The method according to claim 1, characterized in that, in the speaker-dependent speech recognition, a rejection (R) of a control word received via the communication network for remote control of the device (AB) takes place if this control word is not uniquely identified, that the number of rejections (R) in the speaker-dependent speech recognition is recorded, that a number (R_max) of maximum permitted rejections is specified for the speaker-dependent speech recognition, and that the device (AB) outputs an announcement into the communication network with the request to transmit the speaker-dependent speech reference via the communication network to the device (AB) when the number of rejections (R) has reached the number (R_max) of maximum permitted rejections.

3. The method according to claim 1 or 2, characterized in that the remote control is menu-controlled and that the menu used for the remote control is a menu based on the skip-and-scan principle.

4. The method according to any one of claims 1-3, characterized in that different speaker-dependent speech references can be stored for one of the control words for remote control of the device (AB).

5. Device (AB) with a remote control, with a connection means (AS) for a communication network, with a control means (HOST) for controlling it, and with a means (SE, VM) for speaker-dependent speech recognition, characterized in that the device (AB) has a mode in which it receives and stores a speaker-dependent speech reference transmitted via the communication network.

6. The device according to claim 5, characterized in that it has a first memory (SPSR) for storing speaker-dependent speech references and a second memory (ROM), that a program module is stored in the second memory (ROM) by means of which the control means (HOST) causes an announcement to be output into the communication network with a request to transmit a speaker-dependent speech reference via the communication network to the device (AB), that a receiving means (AEM) receives the speaker-dependent speech reference transmitted via the communication network, and that the control means (HOST) causes the speaker-dependent speech reference to be stored in the first memory (SPSR).

7. The device according to claim 6, characterized in that the control means (HOST) causes a rejection (R) of a speaker-dependent control word for remote control, transmitted via the communication network and received by the receiving means (AEM), if the means (SE, VM) for speaker-dependent speech recognition do not uniquely identify this control word, that the control means (HOST) records the number of rejections (R), that a number (R_max) of maximum permitted rejections is stored in the second memory (ROM), and that the control means (HOST) causes the announcement with the request to transmit the speaker-dependent speech reference via the communication network to the device (AB) to be output into the communication network when the number of rejections (R) has reached the number (R_max) of maximum permitted rejections.

8. The device according to claim 5, 6 or 7, characterized in that the remote control is menu-controlled and that the menu used for the remote control is a menu based on the skip-and-scan principle.

9. Device according to one of claims 5-8, characterized in that it is an answering machine.