EP1209663A1

EP1209663A1 - Device and method for access control

Info

Publication number: EP1209663A1
Application number: EP00125914A
Authority: EP
Inventors: Meinrad Niemoeller; Reinhart Vogl
Original assignee: Siemens AG
Current assignee: Siemens AG
Priority date: 2000-11-27
Filing date: 2000-11-27
Publication date: 2002-05-29
Also published as: US20030004726A1; EP1342229A1; WO2002043050A1

Abstract

The voice-controlled arrangement has at least one access controller (3',5',7',9') for releasing or blocking access, in particular to a restricted spatial area (7,9), technical equipment (3,5) or a data or telecommunications network, and a voice input unit (11) connected to the access controller via a message connection, in particular a wireless one. An independent claim is also included for an access control method.

Description

The invention relates to a method for access control according to the preamble of claim 10 and to a corresponding access control arrangement.

Controlling access to delimited spatial areas, to complicated technical devices that are demanding to operate and carry a high risk potential if operated incorrectly, and to data or telecommunications networks is an essential security aspect of using such areas or systems. With the growing number of areas and systems in daily life that are subject to special access conditions, the number of keys or codes that many users must hold to obtain access is rising sharply. Storing them securely on the one hand, and reaching them immediately and reliably on the other, is therefore becoming increasingly problematic.

Various efforts have therefore been made to ease the burden on users by unifying the "keys" needed for different rooms, devices, networks, etc. On the one hand, however, this raises compatibility problems between access control systems with different security levels; on the other hand, the consequences of losing the "key" or having it stolen become, in sum, ever more serious both for the user and for the systems secured with that single key.

Work has therefore long been under way on using biometric data of users - such as the papillary lines (fingerprint ridges), the retinal pattern, or the voice or speech - for access control. These "keys" are in principle impossible to lose and relatively hard to forge, and above all they are extremely easy for the user to use.

Electronic speaker verification or identification uses methods similar to those of speech recognition. Its goal, however, is not to convert utterances into text but to identify or verify a person on the basis of their utterance. Known speaker verification systems are relatively complex and expensive and have therefore not yet become widespread. A contributing problem is that conventional speech recognition systems must be initialized or trained for the user or users in a process also called "enrollment". This problem is particularly disadvantageous when a user must or wishes to gain access to different spatial areas, buildings, devices, networks or the like by speaker identification and has to train each individual system beforehand.

It is therefore the object of the invention to specify a voice-controlled access control system that is simple, inexpensive to implement and easy for the user or users to handle, together with a corresponding method for access control.

With regard to its device aspect, this object is achieved by an access control arrangement having the features of claim 1 and, with regard to its method aspect, by a method having the features of claim 10.

The invention includes the basic idea of dividing the overall sequence of access control by speaker identification (from voice input to the release or blocking of access) between two subsystems or partial method sequences, one of which can be used for a large number of access control situations. This is a mobile voice input unit that carries out one part of the speaker identification process, while the other part of the overall arrangement - more precisely, of a large number of possible overall arrangements - consists of an access control device that effects the actual access control. Another part of the speaker identification is carried out in this device, and in particular a vocabulary used for authorizing the user is stored there.

In a preferred embodiment of the arrangement, the or each access control device comprises, in addition to a corresponding control-device vocabulary memory, a control word transmission unit for transmitting words from the stored vocabulary to the voice input unit. The voice input unit correspondingly has a control word receiving unit for receiving the control words, a microphone and a downstream audio-frequency (AF) stage for voice input, a speaker feature extraction stage (speech recognizer), and a speaker feature transmission stage for transmitting an extracted speaker feature set to the respective access control device. The latter also has a corresponding speaker feature receiving stage, a speaker feature reference memory for storing speaker features of predetermined users, and a speaker feature comparator unit which, depending on the result of a comparison of the currently determined speaker features with pre-stored speaker features, generates an access release signal or an access blocking signal.

The mobile voice input unit expediently comprises a buffer, connected between the control word receiving unit and the speaker feature extraction stage or speech recognizer, for the selected control or identification words received from the access control device. Likewise, the access control device expediently has a speaker feature buffer, connected between the speaker feature receiving stage and the speaker feature comparator unit, for the speaker features received from the voice input unit. These memories can be permanent or semi-permanent. For one and the same access control device interacting with one and the same voice input unit in an overall system of several voice input units and/or access control devices, they can, depending on the specific system configuration, store a set of control or identification words, or the features of a person speaking to request access, for a longer or shorter period.

As described above, voice input and feature extraction take place on the mobile voice input unit. In the preferred embodiment, however, this unit holds no knowledge of which words a user seeking access is to speak for the purpose of speaker verification. As soon as a voice input unit comes into contact with an access control device, the voice input unit transmits, for example, a user name or user code to the access control device. In return, the access control device transmits words or a text on the basis of which speaker verification is to be carried out for the user seeking access. (These words or this text are referred to here for short as "control words".) In a preferred embodiment, these control words are selected from a predefined list (vocabulary) by a random generator.
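The exchange described above, in which the access control device answers a user code with randomly chosen control words, can be sketched as follows. This is only an illustrative sketch: the vocabulary contents, the function names and the choice of three words per challenge are assumptions and are not taken from the patent.

```python
import secrets

# Hypothetical vocabulary; the patent does not list any actual words.
VOCABULARY = ["garden", "window", "seven", "orange",
              "river", "candle", "harbor", "pencil"]


def select_control_words(vocabulary, count, rng=None):
    """Pick `count` distinct control words at random from the stored vocabulary."""
    if count > len(vocabulary):
        raise ValueError("count exceeds vocabulary size")
    # SystemRandom draws from the operating system's entropy source, so a
    # challenge cannot be predicted from earlier challenges.
    rng = rng or secrets.SystemRandom()
    return rng.sample(vocabulary, count)


def handle_access_request(user_code, known_users, vocabulary, count=3):
    """Answer a received user code with a fresh challenge, or None if unknown."""
    if user_code not in known_users:
        return None
    return select_control_words(vocabulary, count)
```

Because the words are drawn anew for each request, the voice input unit cannot know them in advance, which matches the property the description relies on.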

The next task of the mobile voice input unit is to present the words to be spoken in a verification dialog, to prompt the user for voice input and to record the user's utterance. Displays with menu guidance and audio front ends that are known per se are used for this purpose.

The aforementioned extraction of the speaker features is then carried out using structures and algorithms of speech recognition that are known per se - in particular on the basis of a hidden Markov model or a neural network. The features are then transmitted back to the access control device and compared there with previously stored speaker feature sets or vectors of authorized speakers - in particular with the speaker feature vector of the specific user identified by the name or user code. A classification stage of the access device, implemented using a threshold discriminator, then decides on the basis of a statistical evaluation whether the speech patterns are sufficiently similar to one another and, as a result of this comparison, outputs an access release signal or an access blocking signal. It goes without saying that the arrangement can be trained or initialized for a single authorized user, with access released only for that user; in general, however, the speaker feature reference memory of the access control device will have a plurality of speaker feature memory areas, each addressable via a user name or user code.
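The comparison and threshold decision on the access control device might look roughly like the sketch below. The patent does not specify the similarity measure or the statistical evaluation; cosine similarity is used here purely as a stand-in, and all names, vectors and threshold values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors in [-1, 1] (stand-in measure)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def decide_access(user_code, features, reference_store, threshold):
    """Compare received features with the stored reference for `user_code`.

    Returns "RELEASE" if the similarity reaches the threshold, else "BLOCK".
    Unknown user codes are always blocked.
    """
    reference = reference_store.get(user_code)
    if reference is None:
        return "BLOCK"
    score = cosine_similarity(features, reference)
    return "RELEASE" if score >= threshold else "BLOCK"
```

The `reference_store` dictionary plays the role of the speaker feature reference memory with one memory area per user name or user code.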

Communication between the voice input unit and the access control device or devices expediently takes place as wireless communication, in particular over a radio link. Currently preferred are a radio link based on the Bluetooth or DECT standard (for example with a cordless telephone) and the use of a mobile radio network with voice and data transmission according to the GSM or UMTS standard. In this case, in particular the vocabulary transmission unit and the speaker feature receiving stage of the respective access control device, and the vocabulary receiving unit and the speaker feature transmission stage of the voice input unit, are designed as radio transmitting or receiving units. In principle, proven infrared interfaces can also be used.

In the preferred embodiment of the speaker feature extraction stage with a phoneme-based hidden Markov model, the pre-stored reference speaker features need not have been obtained from the words currently serving as control words. Rather, the access control device can specify new control words for each user seeking access and/or at each access attempt, or at periodic intervals, without the speech recognizer in the voice input unit having to be retrained.

Training, or enrollment, plays an important role in this context. It can basically be divided into two parts: recording a word or utterance and computing the features on the voice input unit on the one hand, and storing the features together with a speaker identification code on an access device on the other. These two parts of enrollment can also be carried out at separate times, and in particular speaker features obtained once on a voice input unit can be transmitted to different access devices.
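The two-part enrollment can be illustrated with a small sketch in which features are computed once on the voice input unit and later copied to several access devices. The class names are hypothetical, and the element-wise averaging of repeated recordings is only a placeholder for the actual feature extraction, which the description leaves to known speech recognition techniques.

```python
from dataclasses import dataclass, field


def average_features(samples):
    """Placeholder for feature extraction: element-wise mean of recordings."""
    if not samples:
        raise ValueError("at least one recording is required")
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


@dataclass
class AccessControlDevice:
    name: str
    reference_store: dict = field(default_factory=dict)

    def register(self, user_code, features):
        # Part 2 of enrollment: store the features under the speaker's code.
        self.reference_store[user_code] = list(features)


@dataclass
class VoiceInputUnit:
    enrolled: dict = field(default_factory=dict)

    def enroll(self, user_code, samples):
        # Part 1 of enrollment: record and compute features, done only once.
        self.enrolled[user_code] = average_features(samples)

    def transfer_to(self, user_code, device):
        # May happen later and for any number of compatible devices.
        device.register(user_code, self.enrolled[user_code])
```

For example, after a single `enroll("alice", recordings)` call, `transfer_to("alice", safe)` and `transfer_to("alice", garage)` would register the same features on two devices without any new recording.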

Overall, the proposed arrangement and the proposed method offer a multitude of advantages over known methods:

According to a preferred embodiment of the invention, the words to be spoken in order to obtain access authorization cannot be faked with sound recordings made in advance, since the access device decides case by case which words are to be spoken and analyzed to obtain access authorization.
On the access devices, the only speaker verification components that need to be provided are those for word selection, reference feature storage and classification or threshold discrimination, which simplifies the access devices and reduces their cost.
Since the feature comparison and the classification or threshold discrimination take place in the access device, the system as a whole is well protected against intrusion from outside. Particularly strong encryption of the communication between the voice input unit and the access devices is not required, since the words used for speaker verification are not known before the access procedure is initiated.
The processing-intensive part of speaker verification, namely feature extraction, takes place on the voice input unit, which can be used for a large number of access control tasks. This reduces the overall hardware and software outlay for complex access control systems.
With suitable forms of implementation (mobile telephone, cordless telephone or the like), the voice input unit can use an audio front end (microphone, A/D converter, possibly a digital signal processor) that is already present anyway.
The time-consuming part of enrollment, namely the recording (in particular repeated recording) and feature extraction of a training vocabulary, needs to be carried out only once in the voice input unit for different access control applications. Since the results are reused when registering with a new - naturally system-compatible - access control device, this registration is considerably shortened and handling of the access system as a whole is made simpler and more convenient for the user.

Further advantages and useful features of the invention emerge from the dependent claims and from the following outline description of exemplary embodiments, partly with reference to the figure.

The figure shows, in outline form as a functional block diagram, a complex access control configuration 1 made up of several devices, objects or spatial areas whose access is controlled by speaker verification, namely a television set 3, a computer system 5, a safe 7 and a garage door system 9, each of which has an access control device 3', 5', 7' or 9', together with a mobile telephone 11 as the voice input unit.

The access control devices 3' to 9' each have a vocabulary memory 3a to 9a, a control word selection stage 3b to 9b connected to it, and a control word transmission stage 3c to 9c connected to that stage, for storing, selecting and transmitting to the voice input unit 11 control words for the speaker verification of a user seeking access.

The voice input unit has a control word receiving unit 11a for receiving the respective control words and a display unit 11b for showing the user the control words to be spoken. It also has an audio front end 11c for voice input by the user; a speaker feature extraction stage 11d, implemented as a speech recognizer with a hidden Markov model and connected to the audio front end on the one hand and to the control word receiving unit on the other; and a speaker feature transmission stage 11e, connected to the output of the speaker feature extraction stage 11d, for transmitting speaker features extracted from the voice input to the access control devices 3' to 9'. (In this respect the functionality of the voice input unit 11 goes beyond that of a normal mobile telephone; in the example, however, it is assumed that the voice input unit is formed by a correspondingly "upgraded" mobile telephone. The usual components of such a telephone are not shown and are not described here.)

In each of the access control devices 3' to 9', the currently determined speaker features are received by a speaker feature receiving stage 3d to 9d, which in turn is connected to a speaker feature comparator unit 3e to 9e. The comparator unit is also connected to a speaker feature reference memory 3f to 9f, which stores speaker features of a predetermined group of users as a reference for speaker verification. It compares the currently determined speaker feature vectors with the stored ones and outputs a measure of agreement as the result of a statistical comparison process.

Downstream of the comparator unit there is in each case a classifier stage (threshold discriminator) 3g to 9g, which classifies the comparison result against a predetermined threshold of the measure of agreement. Depending on the result of the threshold discrimination, this classifier stage ultimately outputs an access release signal or an access blocking signal as the final control signal of the speaker verification. The thresholds of the individual access control devices can be chosen differently depending on the desired strength of protection against unauthorized use of the room or system to be secured. Likewise, the vocabularies of the individual access control devices can be chosen differently, and the size of the control word set or control text selected from the overall vocabulary for speaker verification can differ.
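Per-device thresholds and challenge sizes could be represented as in the following sketch. The device names follow the example configuration (television, computer system, safe, garage door), but every numeric value and identifier is an assumption chosen only to show that a stricter device can reject a match score that a more lenient one accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    name: str
    threshold: float       # minimum measure of agreement for release
    challenge_words: int   # number of control words per verification


# Hypothetical settings: stronger protection means a higher threshold
# and a longer challenge.
PROFILES = {
    "television": DeviceProfile("television", 0.70, 2),
    "computer": DeviceProfile("computer", 0.85, 4),
    "safe": DeviceProfile("safe", 0.95, 6),
    "garage": DeviceProfile("garage", 0.80, 3),
}


def classify(profile, match_score):
    """Threshold discriminator: map a match score to the final control signal."""
    return "RELEASE" if match_score >= profile.threshold else "BLOCK"
```

With these illustrative values, a match score of 0.8 would release the television and the garage door but block the computer system and the safe.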

In this embodiment, the user seeking access is identified by an evaluation (not shown) of data from the SIM card of the mobile telephone 11 transmitted to the access control devices, which must of course have a mobile radio transceiver. This additionally increases security against unauthorized access to the devices, since the mobile telephone 11 can itself be used only after activation with a PIN known exclusively to the user.

In a modified embodiment, not shown, the first step of the access procedure is for the user to speak his or her name, which is transmitted to the respective access control device in order to address a speaker feature reference memory. This memory has a plurality of memory areas for speaker feature sets, each addressable via the user name.
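A reference memory addressed by user name can be pictured as a simple keyed store. The following Python sketch is only an illustration; the class and method names and the feature values are hypothetical and do not come from the source.

```python
from typing import Dict, List, Optional

FeatureSet = List[float]


class SpeakerFeatureReferenceMemory:
    """Reference memory with one memory area per user name (illustrative only)."""

    def __init__(self) -> None:
        self._areas: Dict[str, List[FeatureSet]] = {}

    def store(self, user_name: str, feature_set: FeatureSet) -> None:
        # Create the user's memory area on first use, then append to it.
        self._areas.setdefault(user_name, []).append(list(feature_set))

    def address(self, user_name: str) -> Optional[List[FeatureSet]]:
        # Returns None for unknown names so the caller can block access.
        return self._areas.get(user_name)


memory = SpeakerFeatureReferenceMemory()
memory.store("alice", [0.12, 0.87, 0.33])  # made-up feature values
memory.store("bob", [0.91, 0.05, 0.44])

references = memory.address("alice")  # reference sets handed to the comparator
unknown = memory.address("mallory")   # None: no memory area for this name
```

Addressing by name means the comparator only has to check the current feature set against one user's references rather than against every stored speaker.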

Another embodiment uses Bluetooth technology for wireless communication between a voice input unit and the access control devices. The voice input unit, in particular for home use, is for example a cordless telephone retrofitted with a Bluetooth module, or a PDA or handheld PC, into which the above-mentioned speaker feature extraction stage is integrated. Here too, the presence of the required audio components allows a cost-effective implementation of the voice input unit.

The implementation of the invention is not limited to the examples described above; within the scope of the appended claims, it is also possible in a large number of variations that lie within the ability of a person skilled in the art.

Claims

Voice-controlled access control arrangement (1) with at least one access control device (3', 5', 7', 9') for releasing or blocking access, in particular to a delimited spatial area (7, 9), a technical device (3, 5) or a data or telecommunications network, and with a mobile voice input unit (11) connected to the access control device via a communication link, in particular a wireless one.

Access control arrangement according to claim 1,
characterized in that
the or each access control device (3', 5', 7', 9') has a control device vocabulary memory (3a, 5a, 7a, 9a) for storing a predetermined vocabulary,
a control word transmission unit (3c, 5c, 7c, 9c) for transmitting words from the stored vocabulary to the voice input unit (11) as control words,
a speaker feature reception stage (3d, 5d, 7d, 9d) for receiving speaker features extracted in the voice input unit,
a speaker feature reference memory (3f, 5f, 7f, 9f) for storing speaker features of predetermined users as feature vectors and
a speaker feature comparator unit (3e, 5e, 7e, 9e) for comparing currently determined speaker feature vectors with stored ones and for outputting an access release signal or access lock signal depending on the comparison result, and
the voice input unit (11) has a control word receiving unit (11a) for receiving the control words transmitted by the control device,
a control word display unit (11b),
means for voice input (11c),
a speaker feature extraction stage (11d), connected to the means for voice input and at least indirectly to the vocabulary receiving unit, for obtaining a speaker feature set and
a speaker feature transmission stage (11e) for transmitting the extracted speaker feature set to the access control device.

Access control arrangement according to claim 2,
characterized in that
the voice input unit (11) has a control word buffer connected between the control word reception unit (11a) and the speaker feature extraction stage (11d), and
the access control device has a speaker feature buffer connected between the speaker feature receiving stage (3d, 5d, 7d, 9d) and the speaker feature comparator unit (3e, 5e, 7e, 9e).

Access control arrangement according to claim 1 or 2,
characterized in that
the or each access control device (3', 5', 7', 9'), in particular its control word transmission unit (3c, 5c, 7c, 9c) and speaker feature reception stage (3d, 5d, 7d, 9d), and the mobile voice input unit (11), in particular its control word reception unit (11a) and speaker feature transmission stage (11e), are designed as radio transmission or reception units, in particular mobile radio transmission or reception units or Bluetooth or DECT transmission or reception units.

Access control arrangement according to one of the preceding claims,
characterized in that
the mobile voice input unit (11) has means (11b) for user guidance during voice input based on the control words received from the access control device (3', 5', 7', 9').

Access control arrangement according to one of the preceding claims,
characterized in that
the or each access control device (3', 5', 7', 9') has a selection device (3b, 5b, 7b, 9b), in particular one working according to the random generator principle, for selecting a set of control words from the stored vocabulary on a case-by-case basis.
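As an illustration of random selection of a control word set from a stored vocabulary, a minimal Python sketch follows. The function name, the example vocabulary and the set size are assumptions; the source does not prescribe a particular random generator, and `secrets.SystemRandom` is used here only as one plausible choice.

```python
import secrets
from typing import List, Sequence


def select_control_words(vocabulary: Sequence[str], count: int) -> List[str]:
    """Pick `count` distinct control words at random from the stored vocabulary."""
    if count > len(vocabulary):
        raise ValueError("control word set cannot be larger than the vocabulary")
    # SystemRandom draws from the operating system's entropy source, so the
    # selected set is hard to predict from earlier selections.
    rng = secrets.SystemRandom()
    return rng.sample(list(vocabulary), count)


# Hypothetical vocabulary of one access control device.
vocabulary = ["harbor", "seven", "window", "orange", "ladder", "quiet"]
control_words = select_control_words(vocabulary, 3)
```

Because a fresh set is drawn for each access attempt, a recording of an earlier utterance would generally not match the words requested later.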

Access control arrangement according to one of the preceding claims, in particular one of claims 2 to 6,
characterized in that
the speaker feature reference memory (3f, 5f, 7f, 9f) of the or each access control device (3', 5', 7', 9') has a plurality of speaker feature memory areas addressable via a user name or a user code and
the voice input unit (11) has a buffer (11b) for storing an entered user name or user code, which is connected to the speaker feature transmission stage (11e) for transmission to the access control device in connection with the extracted speaker features.

Access control arrangement according to one of the preceding claims, in particular one of claims 2 to 7,
characterized in that
the speaker feature extraction stage (11d) of the voice input unit (11) is designed as a speech recognizer in which a hidden Markov model or neural network suitable for speaker verification is implemented, which is initialized or can be initialized for at least one user, in particular for a plurality of users.

Access control arrangement according to one of the preceding claims, in particular one of claims 4 to 8,
characterized in that
a voice input unit (11) designed as a mobile radio terminal is designed to transmit user data from the SIM card to the access control device and
the access control device has an evaluation device for evaluating the transmitted user data in connection with data determined during the speaker feature extraction.

Method for controlling access, in particular to a delimited area (7, 9), technical device (3, 5) or data or telecommunications network, with the evaluation of speech utterances of at least one user, from which a set of speaker characteristics is derived using methods of speech recognition and compared with at least one pre-stored set of speaker characteristics, the access being enabled or blocked as a result of the comparison,
characterized in that
the extraction of the speaker features from the speech utterance and the comparison of the speaker feature set with the pre-stored speaker feature set are carried out in a distributed manner, in a voice input device (11) on the one hand and an access control device (3', 5', 7', 9') on the other.

A method according to claim 10,
characterized in that
control words for the utterance are specified from a pre-stored vocabulary, in particular selected at random.

A method according to claim 10 or 11,
characterized in that
the vocabulary is stored in the access control device (3', 5', 7', 9'), the control words are selected in the access control device and the selected control words are temporarily stored in the voice input device (11) and output to the user as part of user guidance.

Method according to one of claims 10 to 12, in particular according to claim 11,
characterized by
a wireless transmission of the selected control words from the access control device (3', 5', 7', 9') to the voice input unit (11) and of the speaker features from the voice input unit to the access control device.

Method according to one of claims 10 to 13,
characterized in that
a hidden Markov model or a neural network for speech recognition is initialized in the voice input unit (11) in an enrollment before the method is carried out,
wherein each speaker identifies himself by speaking identification words and a predetermined set of speaker features is extracted from the speech data he has spoken and is stored together with the user name or a user code.

Method according to one of claims 10 to 14, in particular claim 14,
characterized in that
the voice data together with the spoken control word and/or a corresponding phonetic transcription of the control word are transmitted to an access control device and are stored there in a speaker characteristic reference memory.

Method according to one of claims 10 to 15,
characterized in that the enrollment process is carried out in the steps

(1) recording the control word and extracting the speaker characteristics and

(2) transferring the features with the corresponding control word, the phonetic transcription and a user code or name to an access control device,

wherein step (2) can be carried out individually for several access control devices.
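The two enrollment steps can be sketched in Python as follows. All names are hypothetical, and the feature extractor is a deliberately trivial placeholder (mean and peak of the samples) standing in for the hidden Markov model or neural network mentioned in the claims; it is not a usable speaker feature extractor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class EnrollmentRecord:
    user_code: str
    control_word: str
    phonetic_transcription: str
    features: Tuple[float, ...]


class AccessControlDevice:
    """Holds a reference memory keyed by user code (illustrative only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.reference_memory: Dict[str, List[EnrollmentRecord]] = {}

    def receive(self, record: EnrollmentRecord) -> None:
        self.reference_memory.setdefault(record.user_code, []).append(record)


def enroll(user_code: str, control_word: str, transcription: str,
           samples: Sequence[float],
           extract: Callable[[Sequence[float]], Sequence[float]],
           devices: Sequence[AccessControlDevice]) -> EnrollmentRecord:
    # Step (1): extract speaker features from the recorded control word.
    features = tuple(extract(samples))
    record = EnrollmentRecord(user_code, control_word, transcription, features)
    # Step (2): transfer the record, carried out once per access control device.
    for device in devices:
        device.receive(record)
    return record


def placeholder_extract(samples: Sequence[float]) -> List[float]:
    # Placeholder only: NOT a real speaker feature extraction.
    return [sum(samples) / len(samples), max(samples)]


door = AccessControlDevice("room_7")
machine = AccessControlDevice("device_3")
record = enroll("user-001", "window", "w I n d @U", [0.0, 1.0, 0.5],
                placeholder_extract, [door, machine])
```

Step (1) runs once in the voice input unit, while step (2) is repeated for each device, which matches the wording that the transfer can be carried out individually for several access control devices.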

Method according to one of claims 10 to 16,
characterized in that
for each comparison of a currently obtained set of speaker features with a pre-stored set of speaker features, a statistical measure of agreement of the speaker features is determined,
the measure of agreement is discriminated against a predetermined threshold value and
access is only released if the measure of agreement for the user concerned is above the threshold value.

Method according to one of claims 10 to 17,
characterized in that
the storage of the control words in the vocabulary memory of the access control devices is expanded by the storage of the corresponding phonetic transcription in order to facilitate speech recognition on a phoneme basis.