DE102007005704B4

DE102007005704B4 - Digital method for authenticating a person and arrangement for carrying it out

Info

Publication number: DE102007005704B4
Application number: DE200710005704
Authority: DE
Inventors: Christian Pilz; Bianca Aschenberner
Original assignee: Voice Trust AG
Current assignee: AI SOFTWARE LLC ( N.D. GES. D. STAATES DELAWAR, US
Priority date: 2007-02-05
Filing date: 2007-02-05
Publication date: 2008-10-30
Anticipated expiration: 2027-02-06
Also published as: DE102007005704A1; WO2008095827A1; EP2122611A1

Abstract

Digital method for authenticating a person by comparing a current voice profile with a pre-stored initial voice profile, wherein, to determine the respective voice profile, the person speaks at least one speech sample, the spoken speech sample is fed to a voice profile calculation unit, and the voice profile is calculated from it using a predetermined voice profile algorithm, characterized in that, for the or each speech sample, speech recognition followed by phonematic analysis determines a phoneme structure and a sequence of weighting factors assigned to the phonemes and/or a phonematic evaluation coefficient, and the weighting factors or the evaluation coefficient are used to determine a confidence value of the voice profile and/or to control whether the respective spoken speech sample, or parts of it, are fed to the voice profile calculation unit.

Description

The invention relates to a method for authenticating a person according to the preamble of claim 1 and to an arrangement for carrying out this method.

Traditional methods of authenticating a person are based on checking whether the person in question possesses certain objects (traditionally a seal or passport, more recently also an access card or token) or individualized knowledge (such as a PIN or password). Biometric authentication methods, by contrast, make use of certain physical characteristics of the person, such as a fingerprint, a retina pattern, or a voice profile. Extensive development work on the latter methods has been under way for several years and has already led to marketable products. Within these developments, much attention is given to the usability of "traces" of the person for authentication or for an initial registration (enrollment), both with regard to the reliability of recognition or authentication and with regard to user acceptance, namely the avoidance of long and cumbersome procedures.

In methods of the generic type, the concern is the usability of supplied speech samples for registration (enrollment) or authentication of the user, and the resulting control influences on how the method is conducted. Practical trials of systems already implemented have shown that certain speech samples either yield a level of security below the (high) requirements if the procedure is to be kept short and user-friendly, or that reaching a given security level in some cases requires uncomfortably long enrollment or verification procedures. The concrete results evidently depend on the speech material used.

WO 2003075540 A2 discloses a digital method for authenticating a person by comparing a current voice profile with a pre-stored initial voice profile, in which, to determine the respective voice profile, the person speaks at least one speech sample [= voice template], the spoken speech sample is fed to a voice profile calculation unit, and the voice profile is calculated from it using a predetermined voice profile algorithm [= noise-filtering algorithm]. Similar methods, and arrangements operating according to them, are also known from EP 1843 325 A1, from Q. Li, Qiru Zhou, C. H. Lee: "Automatic verbal information verification for user authentication", in: IEEE Trans. Speech and Audio Proc., Vol. 8, No. 5, pp. 585–596, September 2000, and from WO 1998022936.

The object of the invention is to provide an improved method and a corresponding arrangement with which a high degree of user-friendliness and user acceptance can be advantageously combined with the fulfilment of high security requirements.

In its method aspect, this object is achieved by a method having the features of claim 1, and in its device aspect by an arrangement having the features of claim 14. Expedient developments of the inventive concept are the subject of the respective dependent claims.

The invention includes the essential idea of calculating a phonematic evaluation coefficient for the or each speech sample by speech recognition followed by phonematic analysis. The invention further includes the idea that this coefficient is used to determine a confidence value of the voice profile and/or to control whether the respective spoken speech sample is fed to the voice profile calculation unit. Weighting factors that can be assigned to individual phonemes of the speech sample can already be used in an analogous way. Put somewhat simply, speech recognition and phonematic analysis are placed upstream of the determination of a voice profile from a speech sample in order to optimize the voice profile determination.

According to different variants of implementing the invention, this can take place before, and separately in time from, the actual speaking of the speech samples to be evaluated, or else in (quasi) real time during speaking, that is, in the course of an enrollment or an authentication. In both cases the cost/benefit ratio can be improved, and the first-mentioned variant is particularly suited to a procedure that takes little time and thus aims at high user acceptance.

In the second variant, it is provided in particular that the weighting factors or the phonematic evaluation coefficient are subjected to threshold discrimination against a predetermined minimum weight value, and that feeding to the voice profile calculation unit is controlled depending on the discrimination result. Specifically, the minimum weight value is back-calculated from a predetermined minimum confidence value of the voice profile or from a predetermined security level of the authentication. The use of a spoken speech sample for voice analysis (voice profile calculation) is thus made dependent on whether the phonematic structure of the speech sample, alone or in the context of further partial speech samples, is at all suitable for achieving a certain confidence value of the voice profile or, ultimately, a required security level of the authentication.
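The threshold discrimination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `GateResult` and `gate_sample` and the example values are hypothetical, and because the text does not say how the minimum weight value is back-calculated from the minimum confidence value, the sketch simply takes it as a parameter.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    feed_to_profile_unit: bool  # True: sample is passed on to voice profile calculation
    blocked: bool               # True: a blocking control signal would be raised


def gate_sample(evaluation_coefficient: float, min_weight: float) -> GateResult:
    """Compare a sample's phonematic evaluation coefficient with a minimum weight value.

    Hypothetical helper: the plain ">=" comparison and the caller-supplied
    threshold are assumptions of this sketch.
    """
    passes = evaluation_coefficient >= min_weight
    return GateResult(feed_to_profile_unit=passes, blocked=not passes)


# Example values are invented for illustration only.
accepted = gate_sample(0.82, min_weight=0.70)
rejected = gate_sample(0.55, min_weight=0.70)
```

In this sketch a blocked sample never reaches the voice profile calculation; what happens next (requesting or specifying a replacement sample) is left to the caller.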

If this is not the case, a blocking control signal, which blocks the feeding of a speech sample to the voice profile calculation unit, serves at the same time as a control signal for specifying or requesting a replacement speech sample. In the context of a concrete enrollment or authentication sequence with suitable user guidance, this means that the blocking control signal triggers, within the user guidance, the output of a prompt to speak a replacement speech sample that is not fixed in advance, and the phonematic evaluation coefficient is calculated for the replacement speech sample then received. If the new speech sample also proves insufficiently suitable, this sequence may have to be repeated.

An alternative embodiment provides that the blocking control signal controls, within the user guidance, the output of a predefined replacement speech sample with a predetermined phonematic evaluation coefficient. The user to be authenticated is thus given a speech sample to speak whose usability from the standpoint of phonematic evaluation has been checked in advance and is assured. This avoids burdening the user with further speaking attempts that prolong the procedure and may cause annoyance.

In a consistent continuation of this approach, the or each speech sample is specified and output within the user guidance, and the associated phonematic evaluation coefficient is predetermined. The phonematic evaluation has thus taken place beforehand and yielded a selection of well-suited speech samples, some of which are offered to the user for speaking during enrollment or during later authentication.

A further embodiment, which can be combined with those discussed above, provides that the verification process comprises the speaking of several speech samples and that a resulting confidence value or security level is calculated from their associated phonematic evaluation coefficients. This may at first serve simply to have the determined confidence value or security level available as an accompanying statement for a completed enrollment or authentication and, if appropriate, to pass it on to a later evaluation (for example for statistical purposes).

In particular, however, it can also be provided that after each spoken speech sample, or after a predetermined number of speech samples have been spoken, the resulting confidence value is subjected to threshold discrimination against a predetermined minimum confidence value, or the resulting security level value is subjected to threshold discrimination against a predetermined minimum security level value. In response to the discrimination result, either the termination of the verification process or the request for a further speech sample is then controlled. The threshold mentioned can be adjustable via the system control.
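The control loop described above (check the resulting confidence after each sample, then either stop or ask for another sample) might look like the following sketch. The text does not state how the per-sample evaluation coefficients are combined into a resulting confidence value; the additive combination used here, the function name `run_verification`, and the example numbers are assumptions made purely for illustration.

```python
from typing import Iterable, Tuple


def run_verification(coefficients: Iterable[float],
                     min_confidence: float) -> Tuple[bool, int, float]:
    """Consume per-sample evaluation coefficients until the threshold is met.

    Returns (threshold_reached, samples_used, resulting_confidence).
    Assumption of this sketch: the resulting confidence is the running sum
    of the coefficients of all samples spoken so far.
    """
    confidence = 0.0
    used = 0
    for coeff in coefficients:
        confidence += coeff
        used += 1
        if confidence >= min_confidence:
            # Threshold reached: the verification process can be terminated.
            return True, used, confidence
        # Otherwise a further speech sample is requested (next iteration).
    return False, used, confidence


# Invented example values.
reached, used, confidence = run_verification([0.3, 0.25, 0.4, 0.5], min_confidence=0.9)
```

With these example values the loop stops after the third sample, so the fourth is never requested. Making `min_confidence` a parameter mirrors the remark that the threshold can be adjustable.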

In connection with the procedure mentioned above, in which speech samples are specified to the user, it can specifically be provided that a minimum confidence value or minimum security level value is entered and that, in response, a subset is selected from a total set of predefined speech samples, each with a predetermined phonematic evaluation coefficient, for output within the user guidance. This achieves a procedure precisely adapted to given security requirements while avoiding any unusable speech sample inputs, and hence an advantageous combination of a defined security standard with high user acceptance.
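One way to picture this subset selection is the sketch below. It is not taken from the patent: the greedy strategy, the assumption that coefficients add up to a resulting confidence, the function name `select_prompts`, and the example prompts with their coefficients are all hypothetical.

```python
from typing import Dict, List


def select_prompts(pool: Dict[str, float], min_confidence: float) -> List[str]:
    """Pick predefined prompts until the assumed additive confidence reaches the minimum.

    pool maps prompt text to its predetermined phonematic evaluation coefficient.
    Prompts are taken in descending order of coefficient so that as few prompts
    as possible are presented to the user.
    """
    chosen: List[str] = []
    total = 0.0
    for text, coeff in sorted(pool.items(), key=lambda item: item[1], reverse=True):
        if total >= min_confidence:
            break
        chosen.append(text)
        total += coeff
    if total < min_confidence:
        raise ValueError("the pool cannot reach the requested minimum confidence")
    return chosen


# Invented example pool.
prompt_pool = {
    "seven nine three": 0.5,
    "my voice is my password": 0.6,
    "open sesame": 0.2,
}
selected = select_prompts(prompt_pool, min_confidence=1.0)
```

Because every prompt in the pool has been rated in advance, the user is only ever asked to speak samples that are known to contribute to the required level.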

Expediently, the proposed method is carried out such that each phoneme of the or each speech sample is assigned a weighting factor derived from the respective equal error rate. The phonematic evaluation coefficient of the speech sample is calculated from the weighting factors according to a predetermined evaluation algorithm. Very simple or somewhat more complex algorithms can be used for this; simple examples are explained further below.
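As a rough illustration of this step, the sketch below derives a weighting factor from a per-phoneme equal error rate and averages the factors over a sample. The specific formulas (weight = 1 - EER, coefficient = arithmetic mean), the phoneme labels, and the error rates are assumptions chosen for simplicity; this passage does not fix the derivation or the evaluation algorithm.

```python
from statistics import mean
from typing import Dict, Sequence

# Invented per-phoneme equal error rates (as fractions), for illustration only.
PHONEME_EER: Dict[str, float] = {"a": 0.05, "m": 0.10, "s": 0.20, "t": 0.30}


def weighting_factor(eer: float) -> float:
    """Assumed mapping: a lower equal error rate gives a higher weight."""
    return 1.0 - eer


def evaluation_coefficient(phonemes: Sequence[str],
                           eer_table: Dict[str, float] = PHONEME_EER) -> float:
    """Assumed evaluation algorithm: arithmetic mean of the phoneme weights."""
    weights = [weighting_factor(eer_table[p]) for p in phonemes]
    return mean(weights)


# Phoneme sequence as it might come out of speech recognition (hypothetical).
coefficient = evaluation_coefficient(["m", "a", "s"])
```

A mean keeps the coefficient independent of sample length; a sum or a length-weighted variant would be an equally plausible choice if longer samples should score higher.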

In the proposed method, automatic execution in quasi real time, with output of a predefined user guidance, is preferred. As already noted above, however, part of the method, namely the phonematic evaluation of speech samples and the assignment of a corresponding evaluation coefficient, can take place in advance of an actual enrollment or authentication process, for which the examined speech samples and assigned evaluation coefficients are then made available. It is moreover also possible to evaluate spoken speech samples according to the invention not in real time but after the fact, for instance by selecting relevant speech samples from a larger stock of spoken and stored speech samples.

Essential device aspects of the invention follow readily for the person skilled in the art from the method aspects explained above, so that they need not be explained again here. The following is pointed out, however:
A first essential device element of the invention is a speech recognition unit for the phonematic analysis of speech samples, which are typically (but not necessarily) spoken as part of an enrollment or verification procedure. A further essential device element of the invention is an evaluation coefficient calculation unit, which determines a phonematic (overall) evaluation coefficient of the speech sample from its phonemes and the evaluation or weighting factors known for them. Downstream of this unit, finally, a speech sample feed control is provided for controlling the feeding of the phonematically evaluated speech sample to the voice profile calculation unit, and/or a confidence value calculation unit is provided for calculating the confidence value of the voice profile obtained from it.

According to the above, in a preferred embodiment of the invention the system (a system server) also comprises a user guidance unit for providing user guidance, in particular for requesting spoken speech samples and/or for outputting predetermined speech samples to be spoken by the person to be identified. In a further development of this embodiment, the user guidance unit is connected via a control input, at least indirectly, to an output of the evaluation coefficient calculation unit and/or to an output of the confidence value calculation unit, such that outputs within the user guidance can be controlled depending on the results of the evaluation coefficient or confidence value calculation.

For efficient execution of the required calculations, in a further embodiment the system comprises a weighting factor storage unit, connected to the weighting factor input of the evaluation coefficient calculation unit, for storing phoneme weighting factors. The weighting factors are stored in the storage unit in the manner of a lookup table, each assigned to one of the phonemes occurring in a speech analysis. As an alternative to providing a dedicated storage unit, the weighting factors can also be accessed in an external database where appropriate.
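The lookup-table storage and the external-database alternative can be pictured with the small sketch below. The class name `WeightingFactorStore`, the idea of combining a local table with an optional external lookup in one object, and all values are hypothetical; the text only says that factors are stored per phoneme or accessed externally.

```python
from typing import Callable, Dict, Optional


class WeightingFactorStore:
    """Phoneme-to-weight lookup table with an optional external fallback."""

    def __init__(self,
                 table: Dict[str, float],
                 external_lookup: Optional[Callable[[str], Optional[float]]] = None):
        self._table = dict(table)                # local lookup table
        self._external_lookup = external_lookup  # stands in for an external database

    def get(self, phoneme: str) -> float:
        # Local table first; fall back to the external source if one is configured.
        if phoneme in self._table:
            return self._table[phoneme]
        if self._external_lookup is not None:
            value = self._external_lookup(phoneme)
            if value is not None:
                return value
        raise KeyError(f"no weighting factor stored for phoneme {phoneme!r}")


# Invented values; a plain dict's .get plays the role of the external database.
external_db = {"s": 0.80}
store = WeightingFactorStore({"a": 0.95, "m": 0.90}, external_lookup=external_db.get)
```

Raising `KeyError` for unknown phonemes is a design choice of this sketch; a real system might instead assign a default weight or skip the phoneme.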

The connection mentioned above between the output of the evaluation coefficient calculation unit and a control input of the voice sample calculation control unit can be designed such that a threshold discriminator unit is inserted into this connection for threshold discrimination of the calculated evaluation coefficients against a predetermined minimum weight value. A control input of the user guidance unit can moreover also be connected to the output of this threshold discriminator unit, so that the comparison or discrimination result can be used for adapted user guidance (requesting or specifying further speech samples).

A second threshold discriminator unit can be provided at the output of the confidence value calculation unit in order to compare a confidence value calculated from the speech samples with a predetermined minimum value, or a calculated security level value with a predetermined minimum value. The second threshold discriminator unit, too, can be connected via a control input to the user guidance unit in order to adapt the user guidance to the results of the phonematic evaluation of the speech samples.

In a device-side implementation of the procedure (enrollment or authentication) with predefined suitable speech samples, characterized above as particularly efficient, a speech sample memory is provided for the ordered storage of a set of predefined speech samples, each with an associated predetermined phonematic evaluation coefficient. The user guidance unit and the confidence value calculation unit are connected to this memory for retrieving selected speech samples together with their respective phonematic evaluation coefficients.

Further advantages and expedient features of the invention emerge from the exemplary embodiments and aspects of the invention described below with reference to the figures. Of these:

Fig. 1 shows a schematic representation of a first exemplary embodiment of the invention in the form of a functional block diagram,

Fig. 2 shows a schematic representation of a second exemplary embodiment of the invention in the form of a functional block diagram, and

Fig. 3 shows a schematic representation of a third exemplary embodiment of the invention in the form of a functional block diagram.

Fig. 1 schematically shows a first arrangement 100 for voice-profile-based authentication of a person, in which a section of a system server 101 that is essential for carrying out the invention is shown in communication with a user's mobile telephone 103. It is pointed out that, in addition to the components and functions described below, the system server 101 may contain or perform further, application-specific components and functions.

The system server 101 is temporarily connected to the mobile telephone 103, on the input side via a speech sample input interface 105 and on the output side via a user guidance output interface 107, in order to guide the user through an enrollment or verification procedure and to allow the user to enter at least one speech sample into the system. Further input/output interfaces may also be provided, for example for data entry into the system via the mobile telephone keypad. These are not needed to explain the invention, however, and are therefore neither shown nor described here.

The speech sample input interface 105 is connected internally to the input of a speech recognition unit 109 and, in parallel, to the input of a speech sample feed controller 111. On its output side, the speech recognition unit 109 is connected both to a weighting factor storage unit 113 and to the input of an evaluation coefficient calculation unit 115. Via a further input, the evaluation coefficient calculation unit 115 is connected to the weighting factor storage unit 113 in order to receive from it pre-stored phoneme weighting factors for those phonemes that the speech recognition, i.e. the phonemic analysis of the received speech sample, has identified as constituents of that sample.

On its output side, the evaluation coefficient calculation unit is connected to an evaluation coefficient threshold discriminator (first threshold discriminator) 117, whose threshold can be set via a threshold setting unit 118. The first threshold discriminator 117 is connected on its output side both to a control input of the speech sample feed controller 111 and to a user guidance unit 119. Depending on the result of the threshold discrimination of the phonemic evaluation coefficient calculated in the calculation unit 115, it either passes the received speech sample on to voice profile analysis or blocks it, and it triggers the output of corresponding user guidance (a request for a further speech sample).

If a new speech sample is needed, the user guidance unit 119 outputs a corresponding request, in response to the received control signal, to the interface 107 and via it to the mobile telephone 103. The procedure described then repeats. If, on the other hand, the received and evaluated speech sample is usable for voice analysis (voice profile calculation) in terms of its phonemic evaluation, it is fed to a voice profile calculation unit 121, which derives from it a voice profile of the user of the mobile telephone 103. The signal connection shown in the figure illustrates that this profile is stored in a voice profile storage unit 123, as is required during the user's initial enrollment. The dotted signal lines indicate that, for a later verification of the user, the voice profile can also be fed to a voice profile comparator unit 125 and compared there with an initial voice profile stored in the storage unit 123, and that an output signal of the comparator unit 125 characterizing the comparison result can be passed to subsequent stages of the system server 101.
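The control flow of this first embodiment can be illustrated with a minimal Python sketch. It is not taken from the patent: a speech sample is represented simply by its list of recognized phonemes, the evaluation coefficient is assumed to be the mean weighting factor, and the names `evaluation_coefficient`, `acquire_usable_sample` and `next_sample` are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def evaluation_coefficient(phonemes: List[str], weights: Dict[str, float]) -> float:
    """Mean weighting factor of the recognized phonemes (assumed rule).

    Phonemes without a stored weighting factor count as 0.0.
    """
    if not phonemes:
        return 0.0
    return sum(weights.get(p, 0.0) for p in phonemes) / len(phonemes)


def acquire_usable_sample(
    next_sample: Callable[[], Optional[List[str]]],
    weights: Dict[str, float],
    threshold: float,
) -> Optional[List[str]]:
    """Request samples until one passes the threshold discriminator.

    next_sample stands in for user guidance plus speech recognition; it
    returns None when the user supplies no further sample.
    """
    while True:
        phonemes = next_sample()
        if phonemes is None:
            return None  # no usable sample was obtained
        # Assumption: a sample exactly at the threshold is accepted.
        if evaluation_coefficient(phonemes, weights) >= threshold:
            return phonemes  # would be fed to the voice profile calculation
        # Otherwise the sample is blocked and a further one is requested.
```

In this sketch a blocked sample is discarded entirely, which mirrors the replacement behaviour described for the arrangement of Fig. 1.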

Various algorithms can be used for the phonemic evaluation itself. They build on the results of empirical studies of "recognition performance", from which specific weightings can be derived for the sound constituents (phonemes) of speech samples that are obtained by speech recognition. Besides the recognition-related quality of the individual phonemes, their quantity (number) can also enter into the overall evaluation of a speech sample; this is advantageously done when speech samples of differing length are used and the necessary processing capabilities are available.

The finding that the individual sound units of a language differ in recognition quality leads to a refinement of the method that can be used in the first embodiment described above as well as in the further examples described below: only sound units with high suitability for recognition (above a certain threshold) are passed on to further processing, i.e. voice profile calculation, while sound units with low suitability are not processed further. This phoneme-level control is not shown in the figures, which, for the sake of clarity, are limited to depicting control of the method at the level of whole speech samples.
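A minimal sketch of such phoneme-level control is given below. It assumes that speech recognition has already split the sample into (symbol, audio segment) pairs; the function name `select_units`, the byte-string segments and the example weights are illustrative assumptions, not part of the patent.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, bytes]  # (phoneme symbol, audio data of that sound unit)


def select_units(
    units: List[Segment],
    weights: Dict[str, float],
    min_weight: float,
) -> List[Segment]:
    """Keep only sound units whose weighting factor lies above min_weight.

    Units without a stored weighting factor are treated as unsuitable.
    """
    return [(symbol, audio) for symbol, audio in units
            if weights.get(symbol, 0.0) > min_weight]


# Illustrative weights; only the retained units would reach the
# voice profile calculation.
example_weights = {"a": 0.9, "b": 0.6, "c": 0.4, "e": 0.8}
example_units = [("c", b"\x01"), ("a", b"\x02"), ("b", b"\x03"), ("e", b"\x04")]
kept = select_units(example_units, example_weights, 0.7)
```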

Fig. 2 shows an arrangement 200, modified with respect to the arrangement 100 of Fig. 1, for implementing a modified version of the method. Components that correspond functionally to those in Fig. 1 bear reference numerals derived from them and are not explained again below.

Speech recognition (phonemic analysis) and phonemic evaluation are explained below using simplified examples.

Examples of equal error rates determined empirically by means of speech recognition for selected sound units or phonemes are given in Table 1.

Table 1
SAMPA symbol    EER (equal error rate)
a:              8.2%
E               10.6%
m               8.5%
N               9.7%
F               21.0%
V               24.7%
T               25.3%
K               23.7%

Taking the difference from the minimum possible error (zero) and normalizing to 1 gives the weighting factor, for example:

for a: 100 - 8.2 = 91.8 => 0.918
for k: 100 - 23.7 = 76.3 => 0.763
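The conversion from equal error rate to weighting factor can be sketched in a few lines of Python. The function name `weighting_factor` and the dictionary layout are assumptions made for illustration; the EER values are those of Table 1.

```python
def weighting_factor(eer_percent: float) -> float:
    """Weighting factor as the distance from 100 % error, normalized to 1."""
    if not 0.0 <= eer_percent <= 100.0:
        raise ValueError("EER must be given in percent (0..100)")
    return (100.0 - eer_percent) / 100.0


# Equal error rates from Table 1, keyed by SAMPA symbol as printed there.
TABLE_1_EER = {
    "a:": 8.2, "E": 10.6, "m": 8.5, "N": 9.7,
    "F": 21.0, "V": 24.7, "T": 25.3, "K": 23.7,
}

# Weighting factors rounded to three decimals, as in the worked examples.
TABLE_1_WEIGHTS = {
    symbol: round(weighting_factor(eer), 3)
    for symbol, eer in TABLE_1_EER.items()
}
```

For an EER of 8.2% the function returns (100 - 8.2) / 100 = 0.918, and for 23.7% it returns 0.763.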

If a word (a speech sample) is now examined for its sound units, the corresponding weighting can be applied to each unit found, and an overall weighting (an evaluation coefficient) can thus be determined for the word. Alternatively, the sound unit weighting factors can be used in a particularly simple way to evaluate a word: a minimum weighting factor is defined, only those sound units whose weighting factor lies above this minimum are classified as usable, and their number is then divided by the total number of sound units in the word.

Assuming, for example, the (fictitious) sound units and weightings in Table 2 and a minimum or threshold value of 0.7, the phoneme-related suitability noted in the table results.

Table 2
a: 0.9 -> suitable
b: 0.6
c: 0.4
d: 0.5
e: 0.8 -> suitable

For a symbol sequence "ceabde", which is composed of six sound units taken from the table, it follows that three of the sound units are suitable and the other three unsuitable, so the evaluation coefficient of the symbol sequence (the word) determined in the manner just described would be 0.5.
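This counting method can be reproduced with a short Python sketch. The weights and the threshold are those of Table 2; the function name `suitable_fraction` is a hypothetical choice, and each character of the string is treated as one sound unit.

```python
from typing import Dict

# Fictitious weighting factors from Table 2.
TABLE_2_WEIGHTS: Dict[str, float] = {
    "a": 0.9, "b": 0.6, "c": 0.4, "d": 0.5, "e": 0.8,
}


def suitable_fraction(units: str, weights: Dict[str, float],
                      min_weight: float) -> float:
    """Share of sound units whose weighting factor lies above min_weight."""
    if not units:
        raise ValueError("the symbol sequence must not be empty")
    suitable = sum(1 for u in units if weights[u] > min_weight)
    return suitable / len(units)


print(suitable_fraction("ceabde", TABLE_2_WEIGHTS, 0.7))  # prints 0.5
```

Only e, a and e exceed the threshold of 0.7, so three of the six units count as suitable.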

Another possible method is to sum the weights of the individual sound units and divide the result by the number of sound units. For the example above, with c (0.4), e (0.8), a (0.9), b (0.6), d (0.5), e (0.8), the evaluation coefficient K would be

K = (0.4 + 0.8 + 0.9 + 0.6 + 0.5 + 0.8) / 6 = 4.0 / 6 = 0.667

Taking the sequence "ceabda" as an additional example, the first method described above again yields an evaluation coefficient of 0.5, whereas the second method, with c (0.4), e (0.8), a (0.9), b (0.6), d (0.5), a (0.9), yields

K = (0.4 + 0.8 + 0.9 + 0.6 + 0.5 + 0.9) / 6 = 4.1 / 6 = 0.683

The value of the evaluation coefficient can therefore depend on the method chosen, in some circumstances considerably.
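The two methods can be compared side by side in a small Python sketch. The weights are the fictitious values of Table 2; the function names `suitable_fraction` and `mean_weight` are hypothetical, and each character is treated as one sound unit.

```python
from typing import Dict

# Fictitious weighting factors from Table 2.
WEIGHTS: Dict[str, float] = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.5, "e": 0.8}


def suitable_fraction(units: str, weights: Dict[str, float],
                      min_weight: float) -> float:
    """First method: share of units whose weight lies above min_weight."""
    return sum(1 for u in units if weights[u] > min_weight) / len(units)


def mean_weight(units: str, weights: Dict[str, float]) -> float:
    """Second method: sum of the weights divided by the number of units."""
    return sum(weights[u] for u in units) / len(units)


for word in ("ceabde", "ceabda"):
    print(f"{word}: fraction {suitable_fraction(word, WEIGHTS, 0.7):.3f}, "
          f"mean {mean_weight(word, WEIGHTS):.3f}")
# ceabde: fraction 0.500, mean 0.667
# ceabda: fraction 0.500, mean 0.683
```

The counting method cannot distinguish the two sequences, while the mean reflects that "a" carries a higher weight than "e".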

To further illustrate possible variants of the method, Table 3 lists passwords with their phonetic transcription and an associated evaluation coefficient K. K was determined by the first method explained above, on the assumption that the phonemes a, e, i, o, y, 6, m, j and s are suitable and that all other phonemes are unsuitable (according to a predetermined threshold).

Table 3
ID       Password                         Transcription               K
Cr1      rosemarie_maximilian             rotz@mari:maksImi:Ila:n     12/19 = 0.6316
Cr2      lieselotte_sebastian             li:z@IOt@z@bastIa:n         6/17 = 0.3529
Cr3      veronika_ferdinand               ve:ro:nIkafErdi:nant        10/17 = 0.5882
Cr4      evamaria_konstantin              e:famari:akOnstanti:n       12/18 = 0.6667
Cr5      christiane_dagobert              krIstIa:n@da:go:bErt        7/17 = 0.4118
GID      GHI456                           ge:ha:i:fi:rfYnfsEks        7/16 = 0.4375
GName    karoline_mustermann              karo:li:n@mUst@rmann        8/17 = 0.4706
GPhrase  Meine Stimme ist mein Passwort   maIn@StIm@IstmaInpasvOrt    14/24 = 0.583

Table 4 then shows the externally determined equal error rates of the individual passwords, together with the associated value of the evaluation coefficient.

Table 4
ID        EER in %   K
CR4       1.07       0.6667
CR1       1.66       0.6316
CR3       1.95       0.5882
GPhrase   2.00       0.583
CR2       2.23       0.3529
GName     2.46       0.4706
CR5       2.55       0.4118
GID       4.83       0.4375

It turns out that recognition performance is indeed also high for passwords with a high phonemic evaluation coefficient, which demonstrates the usefulness of the method in the context of registering or authenticating persons on the basis of their voice profile.
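One way to check this trend numerically is to compute a correlation coefficient between K and the equal error rate in Table 4. The patent does not do this; the choice of the Pearson coefficient and the function name `pearson` are assumptions made only for illustration, and eight data points allow no more than a rough indication.

```python
from math import sqrt
from typing import List, Sequence, Tuple

# (ID, EER in %, K) as listed in Table 4.
TABLE_4: List[Tuple[str, float, float]] = [
    ("CR4", 1.07, 0.6667), ("CR1", 1.66, 0.6316), ("CR3", 1.95, 0.5882),
    ("GPhrase", 2.00, 0.583), ("CR2", 2.23, 0.3529), ("GName", 2.46, 0.4706),
    ("CR5", 2.55, 0.4118), ("GID", 4.83, 0.4375),
]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


eers = [row[1] for row in TABLE_4]
ks = [row[2] for row in TABLE_4]
r = pearson(ks, eers)
print(f"Pearson correlation between K and EER: {r:.2f}")
```

A negative value of r means that a higher evaluation coefficient goes along with a lower error rate, which is the direction the text describes.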

A significant difference between the arrangement 200 of Fig. 2 and the arrangement 100 of Fig. 1 is that no speech sample feed controller is provided. Instead, every received speech sample is passed not only to the speech recognition unit 209 but also to the voice profile calculation unit 221 and is used to calculate a voice profile regardless of its phonemic evaluation. The output signal of the first threshold discriminator unit 217 here goes to a second threshold discriminator (security discriminator unit) 227, which is connected via another input to a confidence value or security level setting unit 229. This setting unit is used to set a predetermined minimum confidence value for the voice profile to be determined, or a predetermined security level for a verification process to be carried out.

At the output of the second threshold discriminator 227, a signal is available that indicates whether the voice analysis of a received speech sample, taken by itself, is able to meet predetermined confidence or security requirements. This signal can be used in subsequent stages of the system server 201 and is also fed to the user guidance unit 219, where it controls the request for a further speech sample if necessary. Unlike in the embodiment of Fig. 1, however, one or more further speech samples supplied by the user do not replace the first (and any subsequent) speech sample(s) in the voice analysis. They are additionally included in it, so that analyzing a plurality of speech samples ultimately yields an overall confidence or security level that meets the defined minimum requirements. Fig. 2 is not sufficiently detailed with regard to this linked evaluation of several speech samples, but on the basis of the above description a person skilled in the art can readily implement such combined processing of several speech samples, none of which alone ensures sufficient confidence or security.
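The accumulation of speech samples until a minimum confidence is reached can be sketched as follows. The patent leaves open how the individual evaluation coefficients are combined into an overall confidence value, so the rule `noisy_or` used here is purely a hypothetical placeholder, as are the names `collect_until_confident` and `max_samples`.

```python
from typing import Callable, Iterable, List, Tuple


def noisy_or(coefficients: List[float]) -> float:
    """Hypothetical combination rule: 1 - product of (1 - k).

    Chosen only because every additional sample can raise the result;
    the patent does not prescribe this formula.
    """
    remaining = 1.0
    for k in coefficients:
        remaining *= 1.0 - k
    return 1.0 - remaining


def collect_until_confident(
    coefficient_stream: Iterable[float],
    min_confidence: float,
    combine: Callable[[List[float]], float] = noisy_or,
    max_samples: int = 5,
) -> Tuple[bool, List[float]]:
    """Add samples (represented by their coefficients) until the combined
    confidence reaches min_confidence or max_samples have been used."""
    used: List[float] = []
    for k in coefficient_stream:
        used.append(k)  # every sample is kept, none is replaced
        if combine(used) >= min_confidence:
            return True, used  # requirements met, stop requesting samples
        if len(used) >= max_samples:
            break
    return False, used  # requirements not met with the available samples
```

With three samples that each have a coefficient of 0.5 and a minimum confidence of 0.8, the placeholder rule yields 0.5, 0.75 and 0.875, so the loop stops after the third sample.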

As can be seen from Fig. 2, in the second embodiment the second threshold discriminator 227 takes the place of the first threshold discriminator unit 117 of the first embodiment, and the associated setting unit 229 accordingly replaces the setting unit 218 of the first embodiment. Here, the evaluation coefficient of the respective speech sample calculated in the evaluation coefficient calculation unit 215 is fed to a confidence value calculation unit 216, which determines from it an expected confidence value for a voice profile calculated from this speech sample.

Several speech samples can be processed by various algorithms to derive a voice profile with sufficient confidence. The simplest approach is to feed the speech samples to the voice profile calculation unit without any weighting. In another variant, indicated in Fig. 2 by a dotted line, the voice profile calculation unit receives the calculated evaluation coefficient of the respective speech sample as an additional control signal, and the calculation result for each speech sample is weighted with the associated evaluation coefficient.
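Both variants can be illustrated with a short Python sketch. It assumes, beyond what the patent states, that the calculation result for each speech sample is a feature vector of fixed length and that weighting means a weighted average of these vectors; the name `combine_profiles` is hypothetical.

```python
from typing import List, Optional, Sequence


def combine_profiles(
    profiles: Sequence[Sequence[float]],
    coefficients: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-sample feature vectors into one voice profile.

    Without coefficients every sample counts equally (unweighted variant);
    with coefficients each sample is weighted by its evaluation coefficient.
    """
    if not profiles:
        raise ValueError("at least one per-sample profile is required")
    if coefficients is None:
        coefficients = [1.0] * len(profiles)
    if len(coefficients) != len(profiles):
        raise ValueError("one coefficient per profile is required")
    total = sum(coefficients)
    if total <= 0.0:
        raise ValueError("the coefficients must sum to a positive value")
    dim = len(profiles[0])
    return [
        sum(c * p[i] for p, c in zip(profiles, coefficients)) / total
        for i in range(dim)
    ]


# Two illustrative per-sample results with coefficients 0.75 and 0.25.
weighted = combine_profiles([[1.0, 0.0], [0.0, 1.0]], [0.75, 0.25])
unweighted = combine_profiles([[1.0, 0.0], [0.0, 1.0]])
```

In the weighted call the sample with the higher evaluation coefficient dominates the combined profile, while the unweighted call averages both samples equally.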

In the first and second embodiments described above, the voice analysis/voice profile calculation is based on speech samples that the user, i.e. the person to be authenticated, chooses himself (for example his name, a code word or the like). By contrast, voice analysis based on speech samples specified by the system, both during enrollment and during authentication, can achieve a higher security level and/or shorten the procedure and thus increase user acceptance. In the context of the invention, the speech samples to be provided for such a method are selected according to phonemic evaluation criteria. The method thus includes an upstream phase in which a larger pool of speech samples undergoes phoneme analysis and phonemic evaluation and in which the speech samples to be used preferentially, namely those with a high phonemic evaluation coefficient, are designated for the later actual enrollment or authentication procedure.

A corresponding arrangement 300 is sketched in Fig. 3. Here too, components that are functionally comparable to components of the first and second embodiments bear reference numerals derived from those in Figs. 1 and 2 and are not explained in more detail below. The part of the arrangement 300 concerned with voice analysis is shown with the signal connections that exist in the verification phase.

Unlike the first and second arrangements described above, the arrangement 300 has two speech sample input interfaces 305A, 305B. The former is connected, in the preparation stage, to a microphone 302; the latter is connected, in the actual authentication (or enrollment) phase, to a mobile telephone 303 of a person to be authenticated (or registered).

In the preparation phase, speech samples spoken into the microphone 302, which are to undergo only speech recognition and phonemic evaluation and not voice analysis, pass to a first server section 301A, in which speech recognition and the determination of phonemic evaluation coefficients proceed as in the first embodiment. As a result, an evaluation coefficient threshold discriminator 317 outputs a forwarding control signal to a speech sample buffer 320, which initially receives and temporarily stores every speech sample spoken into the microphone 302. If the evaluation of the speech sample is positive, this control signal causes the buffered speech sample to pass to a speech sample store 322. From there, during a later registration or authentication, it is fed into the user guidance 319 so that it is presented to the person to be registered or authenticated as the speech sample to be spoken (i.e. repeated).
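The preparation phase amounts to filtering a pool of candidate speech samples by their evaluation coefficient. The sketch below illustrates this with the K values of Table 3; the function name `build_prompt_store`, the threshold of 0.55 and the ordering by descending K are assumptions and are not specified in the patent.

```python
from typing import Dict, List

# Candidate speech samples with their evaluation coefficients (Table 3).
CANDIDATES: Dict[str, float] = {
    "rosemarie_maximilian": 0.6316,
    "lieselotte_sebastian": 0.3529,
    "veronika_ferdinand": 0.5882,
    "evamaria_konstantin": 0.6667,
    "christiane_dagobert": 0.4118,
    "GHI456": 0.4375,
    "karoline_mustermann": 0.4706,
    "Meine Stimme ist mein Passwort": 0.583,
}


def build_prompt_store(candidates: Dict[str, float],
                       threshold: float) -> List[str]:
    """Return the samples whose coefficient lies above the threshold,
    best first; these would be stored for later user guidance."""
    accepted = [(k, text) for text, k in candidates.items() if k > threshold]
    accepted.sort(reverse=True)  # highest evaluation coefficient first
    return [text for _, text in accepted]


prompt_store = build_prompt_store(CANDIDATES, 0.55)
```

With the assumed threshold of 0.55, four of the eight candidates would be kept as prompts for later enrollment or authentication.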

User guidance and voice analysis proceed in a second server section 301B essentially as in the second embodiment of Fig. 2. Authentication (or registration) can be carried out with a single speech sample taken from the store 322 into the user guidance or with several speech samples, which will depend largely on the specified security level. Optionally, again as in the second embodiment, the numerical results of the phonemic evaluation can be used in the voice profile calculation unit 321 so that, when several speech samples are used to derive the voice profile, each speech sample is assigned a weight corresponding to its phonemic evaluation coefficient. This is again symbolized by a dotted line in the figure.

The implementation of the invention is not limited to the examples and highlighted aspects explained above; it is equally possible in a large number of modifications that lie within the scope of ordinary skill in the art.

Claims

Digital method for authenticating a Person by comparing a current voice profile with a pre-stored one initial voice profile, whereby the person to determine the respective Voice Profiles at least one voice sample, the spoken Voice sample of a voice profile calculation unit is supplied and from this due to a predetermined voice profile algorithm the voice profile is calculated, characterized in that for the or each speech sample by speech recognition followed by phonemic Analysis of a phoneme structure and a sequence of weighting factors assigned to the phonemes and / or a phonemic weighting coefficient and the Weighting factors or the evaluation coefficient for determination a confidence value of the voice profile and / or for controlling it be used, whether the respective voice sample or speech Parts of the same are supplied to the voice profile calculation unit.

Method according to claim 1, characterized in that the weighting factors or the phonemic evaluation coefficient are subjected to threshold discrimination against a predetermined minimum weight value, and the feeding to the voice profile calculation unit is controlled depending on the discrimination result.

Method according to claim 2, characterized in that the minimum weight value is calculated back from a predetermined minimum confidence value of the voice profile or from a predetermined security level of the authentication.

Method according to one of the preceding claims, characterized in that a blocking control signal, which blocks the feeding of a voice sample to the voice profile calculation unit, simultaneously serves as a control signal for specifying or requesting a replacement voice sample.
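As a non-authoritative illustration of the threshold discrimination and the dual-purpose blocking signal, the following sketch gates a voice sample on its phonemic evaluation coefficient. The names `GateDecision` and `gate_sample` are hypothetical, and the choice to forward a sample whose coefficient exactly equals the minimum weight value is an assumption; the claims do not state how ties are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    forward_to_profile_unit: bool  # sample is fed to the voice profile calculation
    request_replacement: bool      # user guidance should ask for another sample


def gate_sample(coefficient: float, min_weight: float) -> GateDecision:
    """Threshold discrimination of a sample's evaluation coefficient.

    Assumption: a coefficient equal to min_weight still passes.
    """
    blocked = coefficient < min_weight
    # The same blocking signal both stops the feed and triggers the
    # request for a replacement voice sample.
    return GateDecision(
        forward_to_profile_unit=not blocked,
        request_replacement=blocked,
    )
```

Because both flags are derived from one signal, a sample is never forwarded and replaced at the same time.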

Method according to claim 4, characterized in that the blocking control signal controls, within a user guidance, the output of a request to speak a replacement voice sample that is not predetermined, and the phonemic evaluation coefficient is calculated for the replacement voice sample received in response.

Method according to claim 4, characterized in that the blocking control signal controls, within a user guidance, the output of a predetermined replacement voice sample with a predetermined phonemic evaluation coefficient.

Method according to one of claims 1 to 3, characterized in that the or each voice sample is predetermined and output within a user guidance, and the associated phonemic evaluation coefficient is predetermined.

Method according to one of the preceding claims, characterized in that the verification process comprises the speaking of several voice samples, and a resulting confidence value or security level is calculated from their associated phonemic evaluation coefficients.

Method according to claim 8, characterized in that, after each voice sample has been spoken or after a predetermined number of voice samples have been spoken, the resulting confidence value is subjected to threshold discrimination against a predetermined minimum confidence value, or the resulting security level value is subjected to threshold discrimination against a predetermined minimum security level value, and, in response to the discrimination result, the termination of the verification process or the request for a further voice sample is controlled.
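The control loop described here can be sketched as follows. The claims do not disclose how the resulting confidence value is derived from the individual coefficients, so the sketch assumes, for illustration only, that each coefficient lies in [0, 1] and that the coefficients combine as 1 minus the product of (1 - c). The function names and the `max_samples` limit are hypothetical.

```python
from typing import Iterable, List, Tuple


def combined_confidence(coefficients: Iterable[float]) -> float:
    """Assumed combination rule: 1 - product(1 - c) over all coefficients."""
    remaining = 1.0
    for c in coefficients:
        if not 0.0 <= c <= 1.0:
            raise ValueError("coefficients are assumed to lie in [0, 1]")
        remaining *= 1.0 - c
    return 1.0 - remaining


def verify_until_confident(
    coefficient_stream: Iterable[float],
    min_confidence: float,
    max_samples: int = 5,
) -> Tuple[bool, int, float]:
    """Request samples until the confidence threshold is reached.

    Returns (threshold reached, samples used, resulting confidence).
    """
    used: List[float] = []
    for c in coefficient_stream:
        used.append(c)
        confidence = combined_confidence(used)
        if confidence >= min_confidence:
            return True, len(used), confidence  # terminate verification
        if len(used) >= max_samples:
            break  # hypothetical cap on the number of requests
    return False, len(used), combined_confidence(used)
```

Under this assumed rule, three samples with a coefficient of 0.5 each reach a minimum confidence of 0.8 on the third sample.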

Method according to claim 8 or 9, characterized in that a minimum confidence value or minimum security level value is input and, in response, a subset is selected from a set of predetermined voice samples, each with a predetermined phonemic evaluation coefficient, for output within a user guidance.
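One conceivable way to pick such a subset is a greedy selection that takes the stored voice samples with the highest coefficients first. This is only a sketch: the selection strategy, the function name `select_prompts`, and the combination rule 1 minus the product of (1 - c) are assumptions and are not taken from the claims.

```python
import math
from typing import Dict, List


def select_prompts(stored: Dict[str, float], min_confidence: float) -> List[str]:
    """Greedily choose predetermined samples until the target is reached.

    `stored` maps a prompt text to its predetermined evaluation coefficient,
    assumed to lie in [0, 1]. Raises ValueError if even the full set of
    stored samples cannot reach min_confidence.
    """
    chosen: List[str] = []
    # Highest coefficient first, so the subset stays as small as possible.
    for text, _ in sorted(stored.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(text)
        confidence = 1.0 - math.prod(1.0 - stored[t] for t in chosen)
        if confidence >= min_confidence:
            return chosen
    raise ValueError("stored samples cannot reach the requested confidence")
```

The selected prompts would then be output one after another by the user guidance.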

Method according to one of the preceding claims, characterized in that each phoneme of the or each voice sample is assigned a weighting factor derived from the respective equal error rate, and the phonemic evaluation coefficient of the voice sample is calculated from the weighting factors according to a predetermined evaluation algorithm.
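The claim leaves open both how a weighting factor is derived from the equal error rate (EER) and what the evaluation algorithm is. The following sketch makes two labeled assumptions: the weight falls linearly from 1 at an EER of 0 to 0 at an EER of 0.5 (chance level), and the coefficient of a voice sample is the arithmetic mean of its phoneme weights. The function names and the EER table values are hypothetical.

```python
from typing import Dict, Sequence


def weight_from_eer(eer: float) -> float:
    """Assumed mapping: EER 0.0 -> weight 1.0, EER 0.5 -> weight 0.0."""
    if not 0.0 <= eer <= 0.5:
        raise ValueError("EER is assumed to lie in [0, 0.5]")
    return 1.0 - 2.0 * eer


def sample_coefficient(phonemes: Sequence[str], eer_table: Dict[str, float]) -> float:
    """Assumed evaluation algorithm: mean weight over the sample's phonemes."""
    if not phonemes:
        raise ValueError("a voice sample needs at least one phoneme")
    weights = [weight_from_eer(eer_table[p]) for p in phonemes]
    return sum(weights) / len(weights)
```

With this mapping, phonemes that separate speakers well (low EER) raise the coefficient of any voice sample that contains them.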

Method according to one of claims 2 to 11, characterized in that, for calculating the voice profile, only those phonemes of a voice sample whose weighting factor lies above the minimum weight value are fed to the voice profile calculation unit, while the remaining phonemes are not processed further.
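The phoneme-level selection can be expressed as a simple filter. In this sketch a voice sample is represented, hypothetically, as a list of (phoneme, weighting factor) pairs; the function name `phonemes_for_profile` is likewise an assumption.

```python
from typing import List, Sequence, Tuple


def phonemes_for_profile(
    weighted_phonemes: Sequence[Tuple[str, float]],
    min_weight: float,
) -> List[Tuple[str, float]]:
    """Keep only phonemes whose weighting factor lies above min_weight.

    Phonemes at or below the threshold are dropped and not processed further.
    """
    return [(p, w) for p, w in weighted_phonemes if w > min_weight]
```

Only the returned phonemes would be passed on to the voice profile calculation.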

Method according to one of the preceding claims, characterized by automatic execution in quasi real time, with output of a predetermined user guidance.

Arrangement for carrying out the method according to one of the preceding claims, with
- a voice sample input interface,
- a voice profile calculation unit connected on its input side to the voice sample input interface,
- a speech recognition unit, connected on its input side to the voice sample input interface in parallel with the voice profile calculation unit, for phonemic analysis of a received voice sample,
- an evaluation coefficient calculation unit for calculating the phonemic evaluation coefficient of the received voice sample analyzed by the speech recognition unit, and
- a voice sample feed controller, connected to the output of the evaluation coefficient calculation unit, for controlling the feeding of the received voice sample to the voice profile calculation unit, or a confidence value calculation unit for calculating the confidence value of the voice profile.

Arrangement according to claim 14, characterized by a user guidance unit for implementing a user guidance, in particular for requesting voice samples and/or for outputting predetermined voice samples to be spoken by the person to be identified.

Arrangement according to claim 15, characterized in that the user guidance unit is connected via a control input, at least indirectly, to an output of the evaluation coefficient calculation unit and/or to an output of the confidence value calculation unit, such that outputs within the user guidance can be controlled depending on results of the evaluation coefficient calculation or the confidence value calculation.

Arrangement according to one of claims 14 to 16, characterized by a weighting factor memory unit, connected to the weighting factor input of the evaluation coefficient calculation unit, for storing phoneme weighting factors.

Arrangement according to one of claims 14 to 17, characterized by a first threshold discriminator unit, connected to the output of the evaluation coefficient calculation unit, for threshold discrimination of the calculated evaluation coefficient against a predetermined minimum weight value, wherein the threshold discriminator unit is connected via a control input to the voice sample feed controller and optionally to the user guidance unit.

Arrangement according to one of claims 14 to 18, characterized by a second threshold discriminator unit, connected to the output of the confidence value calculation unit, for discriminating a calculated confidence value against a predetermined minimum confidence value, or a calculated security level value against a predetermined minimum security level value, wherein the second threshold discriminator unit is connected via a control input to the user guidance unit.

Arrangement according to one of claims 14 to 19, characterized by a voice sample memory for the ordered storage of a set of predetermined voice samples, each with an associated predetermined phonemic evaluation coefficient, wherein the user guidance unit and the confidence value calculation unit are connected to the voice sample memory for retrieving selected voice samples with their respective phonemic evaluation coefficients.