EP2122611A1

EP2122611A1 - Digital method for authenticating a person and arrangement for performing the same

Info

Publication number: EP2122611A1
Application number: EP08708336A
Authority: EP
Inventors: Christian Pilz; Bianca Aschenberner
Original assignee: Voice Trust AG
Current assignee: VOICETRUST ESERVICES CANADA Inc
Priority date: 2007-02-05
Filing date: 2008-01-29
Publication date: 2009-11-25
Also published as: WO2008095827A1; DE102007005704A1; DE102007005704B4

Abstract

The invention relates to a digital method for authenticating a person by comparing a current voice profile to a previously stored initial voice profile, wherein the person speaks at least one speech sample in order to determine the respective voice profile, the spoken speech sample is fed to a voice profile computing unit, and the voice profile is computed based on a predefined voice profile algorithm. For the, or every, speech sample, a phoneme structure and a sequence of weighting factors associated with the phonemes, and/or a phonematic evaluation coefficient, are determined by means of voice recognition with subsequent phonematic analysis. The weighting factors and/or the evaluation coefficient are used to determine a confidence value of the voice profile and/or to control whether the respectively spoken speech sample or parts thereof are fed to the voice profile computing unit.

Description


The invention relates to a method for authentication of a person according to the preamble of claim 1 and to an arrangement for carrying out this method.

Traditional methods for authenticating a person are based on checking whether the person to be checked possesses certain items (traditionally a seal or passport, more recently also an access card or token) or individualized knowledge (such as a PIN or password). Authentication methods based on biometrics, on the other hand, make use of certain physical characteristics of the person, such as their fingerprint, retinal pattern or voice profile. For the last few years, extensive development work has been carried out on these latter methods, and it has already led to marketable products. In the context of these developments, the question of the usability of "traces" of the person for authentication or for an initial registration (enrollment) occupies considerable space, both from the viewpoint of recognition security and from the viewpoint of user acceptance, namely the avoidance of long and cumbersome procedures.

In methods of the generic type, this concerns the usability of delivered speech samples for the enrollment or authentication of the user and the resulting control effects on the process flow. Practical testing of already implemented systems has shown that, with certain speech samples, either only a security level below the (high) requirements is achieved if the process is to be kept short and user-friendly, or inconveniently long enrollment or verification procedures are needed to achieve a certain security level. The concrete results evidently depend on the speech material used. It is therefore an object of the invention to provide an improved method and a corresponding arrangement with which high user-friendliness and acceptance can advantageously be combined with the fulfillment of high security requirements.

This object is achieved in its procedural aspect by a method having the features of claim 1 and in its device aspect by an arrangement having the features of claim 13. Expedient developments of the inventive concept are the subject of the respective dependent claims.

The invention includes the essential idea of calculating a phonematic weighting coefficient for the or each speech sample by means of speech recognition followed by phonemic analysis. Furthermore, the invention includes the idea that this coefficient is used to determine a confidence value of the voice profile and/or to control whether the respectively spoken speech sample is supplied to the voice profile calculation unit. In an analogous manner, weighting factors that can be assigned to individual phonemes of the speech sample can already be used. Put somewhat simply, the determination of a voice profile from a speech sample is thus preceded by speech recognition and phonematic analysis in order to optimize the voice profile determination.

According to different variants of the realization of the invention, this can be done beforehand and temporally detached from the actual speaking of the speech samples to be evaluated, or in (quasi) real time during speaking, i.e. in the course of enrollment or authentication. In both cases, an improvement of the cost/benefit ratio can be achieved, and the former variant is particularly suitable for realizing a process flow that takes little time and is thus aimed at high user acceptance.

In the second variant, it is provided in particular that the weighting factors or the phonemic weighting coefficient are subjected to a threshold discrimination with a predetermined weight minimum value, and the supply to the voice profile calculation unit is controlled depending on the discrimination result. Specifically, the weight minimum value is calculated back from a predetermined confidence minimum value of the voice profile or from a predetermined security level of the authentication. The use of a speech sample for voice analysis (voice profile calculation) is thus made dependent on whether the phonemic structure of the speech sample, alone or together with other partial speech samples, is at all suitable for achieving a certain confidence value of the voice profile or, ultimately, a required security level of the authentication.

If this is not the case, a blocking control signal which blocks the supply of a speech sample to the voice profile calculation unit also serves as a control signal for prescribing or requesting a replacement speech sample. In a concrete enrollment or authentication process with appropriate user guidance, the blocking control signal controls the output, as part of the user guidance, of a request to speak a substitute speech sample that is not predetermined, and a phonemic weighting coefficient is calculated for the replacement speech sample then received. If the new speech sample also proves insufficient, this procedure is repeated as necessary.
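The gating described in the two preceding paragraphs can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the names `GateDecision` and `gate_sample` are hypothetical, the weighting coefficient is assumed to have been computed already, and treating a coefficient equal to the minimum as sufficient is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    forward_to_profile_unit: bool  # sample is passed to voice profile calculation
    request_replacement: bool      # user guidance should ask for another sample


def gate_sample(weighting_coefficient: float, weight_minimum: float) -> GateDecision:
    """Threshold discrimination of one sample's phonemic weighting coefficient."""
    if weighting_coefficient >= weight_minimum:  # assumption: the minimum itself passes
        return GateDecision(True, False)
    # The blocking signal doubles as the trigger for a replacement request.
    return GateDecision(False, True)
```

A caller would repeat the request until `forward_to_profile_unit` is true, which mirrors the repetition mentioned in the text.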

In an alternative embodiment, it is provided that the blocking control signal controls the output of a predetermined substitute speech sample with a predetermined phonemic evaluation coefficient as part of a user guidance. The user to be authenticated is therefore given a speech sample to speak whose usability has been checked and assured in advance under phonematic evaluation criteria. This avoids the user being expected to make further attempts at speaking a sample, which could prolong the procedure and provoke displeasure.

In a consistent continuation of this approach, the or each speech sample is predetermined and output as part of a user guidance, and the associated phonemic weighting coefficient is predetermined. The phonemic evaluation has thus taken place beforehand and yields a selection of readily usable speech samples, some of which are offered to the user for speaking during enrollment or later authentication.

Another embodiment, which can be combined with the above-mentioned embodiments, provides that the verification process comprises the speaking of several speech samples and that a resulting confidence value or security level is calculated from their associated phonemic evaluation coefficients. Initially, this can serve simply to have the determined confidence value or security level available as accompanying information for a completed enrollment or authentication and, where appropriate, to supply it to a later evaluation (for statistical purposes, for example).

Specifically, however, it can also be provided that, after each speech sample is spoken or after a predetermined number of speech samples have been received, the resulting confidence value is subjected to a threshold discrimination with a predetermined confidence minimum value, or the resulting security level value is subjected to a threshold discrimination with a predetermined security level minimum value. Depending on the discrimination result, either the termination of the verification process or the request for a further speech sample is then controlled. The threshold value can be adjustable via the system control.
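As a rough sketch of this loop: the text does not state how the resulting confidence value is derived from the individual coefficients, so the combination rule below (one minus the product of the complements) is purely an illustrative assumption, chosen only because it grows with every additional sample. The function names are hypothetical.

```python
from typing import Iterable, Optional


def combined_confidence(coefficients: Iterable[float]) -> float:
    """Illustrative combination rule (assumption): 1 - prod(1 - k)."""
    remaining = 1.0
    for k in coefficients:
        remaining *= (1.0 - k)
    return 1.0 - remaining


def samples_until_confident(coefficients: Iterable[float],
                            confidence_minimum: float) -> Optional[int]:
    """Return how many samples are needed, or None if the minimum is never reached."""
    seen = []
    for k in coefficients:
        seen.append(k)
        # Threshold discrimination after each received sample.
        if combined_confidence(seen) >= confidence_minimum:
            return len(seen)  # verification process can terminate
    return None  # a further speech sample would have to be requested
```

With coefficients of 0.5 each and a minimum of 0.7, the first sample is insufficient on its own and the process ends after the second.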

In connection with the above-mentioned specification of speech samples to the user, it may be specifically provided that a confidence minimum value or security level minimum value is entered and, in response, a subset for output as part of a user guidance is selected from a totality of predetermined speech samples, each having a predetermined phonemic weighting coefficient. This achieves process control precisely adapted to given security requirements while avoiding any unusable speech sample inputs, and thus an advantageous combination of a defined security standard with high user acceptance.
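One conceivable way to pick such a subset is sketched below. The stored prompts, the function name `select_prompts` and the direct use of the entered minimum as a per-sample coefficient threshold are all assumptions made for illustration; the text leaves the selection rule open.

```python
from typing import Dict, List


def select_prompts(stored: Dict[str, float], minimum: float, limit: int = 3) -> List[str]:
    """Pick up to `limit` stored prompts whose predetermined coefficient meets the minimum."""
    eligible = [(text, k) for text, k in stored.items() if k >= minimum]
    # Offer the prompts with the highest weighting coefficient first.
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [text for text, _ in eligible[:limit]]


# Hypothetical speech sample memory: prompt text mapped to its coefficient K.
STORED_PROMPTS = {"alpha": 0.8, "beta": 0.5, "gamma": 0.9}
```

For a minimum of 0.7, only "gamma" and "alpha" would be offered to the user, in that order.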

Conveniently, the proposed method is carried out such that each phoneme of the or each speech sample is assigned a weighting factor derived from the respective equal error rate. The phonemic weighting coefficient of the speech sample is calculated from the weighting factors according to a predetermined evaluation algorithm. Very simple or somewhat more complex algorithms can be considered for this purpose; simple examples are explained below.

Preferably, the proposed method is executed automatically, with output of a predetermined user guidance, in quasi-real time. As already noted above, however, part of the method, namely the phonemic evaluation of speech samples and the assignment of a corresponding weighting coefficient, can take place prior to an actual enrollment or authentication process, for which the examined speech samples and associated weighting coefficients are then provided. Incidentally, it is also possible to evaluate, in accordance with the invention, speech samples that have already been recorded, for example by selecting relevant speech samples from a larger stock of recorded and stored speech samples.

Essential device aspects of the invention will be readily apparent to those skilled in the art from the above-discussed aspects of the method, so that their repeated explanation is not indicated herein. However, attention is drawn to the following:

A first essential device element of the invention is a speech recognition unit for the phonemic analysis of speech samples that are typically (but not necessarily) spoken as part of an enrollment or verification procedure. A further essential device element is a weighting coefficient calculation unit which determines a phonematic (total) weighting coefficient of the speech sample from the phonemes of the speech sample and the weighting factors known for them. Finally, on its output side, a speech sample feed control is provided for controlling the feeding of the phonematically evaluated speech sample to the voice profile calculation unit, and/or a confidence value calculation unit is provided for calculating the confidence value of the voice profile obtained from it.

According to the above, the system (a system server) in a preferred embodiment of the invention additionally comprises a user guidance unit for providing a user guidance, in particular for requesting speech samples and/or for outputting predetermined speech samples to be spoken by the person to be identified. In a further development of this embodiment, the user guidance unit is connected via a control input, at least indirectly, to an output of the weighting coefficient calculation unit and/or an output of the confidence value calculation unit, such that outputs in the context of user guidance can be controlled depending on the results of the weighting coefficient or confidence value calculation.

To perform the required calculations efficiently, in another embodiment the system comprises a weighting factor storage unit, connected to the weighting factor input of the weighting coefficient calculation unit, for storing phoneme weighting factors. The weighting factors are stored in the memory unit in the manner of a lookup table, each in association with the phonemes occurring in a speech analysis. As an alternative to providing a dedicated memory unit, the weighting factors may also be retrieved from an external database.
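A lookup table of this kind could, for instance, be held as a simple mapping. In the sketch below the factor values are fictitious, and both the name `factors_for` and the fallback of 0.0 for phonemes missing from the table are assumptions.

```python
from typing import Dict, Iterable, List

# Fictitious weighting factors per sound unit (illustrative values only).
WEIGHTING_FACTORS: Dict[str, float] = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.5, "e": 0.8}


def factors_for(phonemes: Iterable[str],
                table: Dict[str, float] = WEIGHTING_FACTORS,
                default: float = 0.0) -> List[float]:
    """Look up the stored weighting factor for each recognized phoneme."""
    # Assumption: phonemes without an entry contribute the default factor.
    return [table.get(p, default) for p in phonemes]
```

Passing a different `table` argument would correspond to the external database alternative mentioned above.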

The above-mentioned connection between the output of the weighting coefficient calculation unit and a control input of the speech sample feed control may be configured such that a threshold discriminator unit, which compares the calculated weighting coefficient with a predetermined weight minimum value, is looped into this connection. The output of this threshold discriminator unit can also be connected to a control input of the user guidance unit, so that the comparison or discrimination result can be used for an adapted user guidance (requesting or specifying further speech samples).

A second threshold discriminator unit may be provided at the output of the confidence value calculation unit for comparing a confidence value calculated from the speech samples with a predetermined minimum value, or a calculated security level value with a predetermined minimum value. The second threshold discriminator unit can also be connected to the user guidance unit via a control input in order to adapt the user guidance to the results of the phonemic evaluation of the speech samples.

For the device-side realization of the process control (enrollment or authentication) with predetermined suitable speech samples, identified above as particularly efficient, a speech sample memory is provided for the ordered storage of a set of predefined speech samples, each with an associated predetermined phonemic weighting coefficient. The user guidance unit and the confidence value calculation unit are connected to this memory for retrieving selected speech samples and the respective phonemic weighting coefficients.

Further advantages and expedient features of the invention will become apparent from the following description of embodiments and aspects of the invention with reference to the figures. These show:

Fig. 1 is a schematic representation of a first embodiment of the invention in the form of a functional block diagram,

Fig. 2 is a schematic representation of a second embodiment of the invention in the form of a functional block diagram and

Fig. 3 is a schematic representation of a third embodiment of the invention in the form of a functional block diagram.

Fig. 1 schematically shows a first arrangement 100 for voice-profile-based authentication of a person, in which a section of a system server 101 essential for the implementation of the invention is shown in communication with a mobile telephone 103 of a user. It should be noted that the system server 101 may include or execute other application-specific components and functions in addition to the components and functions described below.

The system server 101 is in temporary communication with the mobile telephone 103 on the input side via a speech sample input interface 105 and on the output side via a user guidance output interface 107, in order to guide the user through an enrollment or verification procedure and to have him input at least one speech sample into the system. In addition, further input/output interfaces may be provided, for example for data entry into the system via the mobile phone keypad. These are not required for the explanation of the invention, however, and are therefore not shown or described here.

The speech sample input interface 105 is internally connected to the input of a speech recognition unit 109 and, in parallel, to the input of a speech sample feed control 111. The speech recognition unit 109 is connected on the output side to a weighting factor storage unit 113 on the one hand and to the input of a weighting coefficient calculation unit 115 on the other hand. Via another input, the weighting coefficient calculation unit 115 is connected to the weighting factor storage unit 113 in order to receive from it the prestored phoneme weighting factors for those phonemes which the speech recognition has identified as constituents of the received speech sample.

On the output side, the weighting coefficient calculation unit 115 is connected to a weighting coefficient threshold discriminator (first threshold discriminator) 117, whose threshold value can be set via a threshold setting unit 118. The output of the first threshold discriminator 117 is connected both to a control input of the speech sample feed control 111 and to a user guidance unit 119. Depending on the result of the threshold discrimination of the phonemic weighting coefficient computed in the calculation unit 115, the received speech sample is either passed on to voice analysis or blocked, and in the latter case a corresponding user guidance output (a request for another speech sample) is triggered.

If a new speech sample is needed, the user guidance unit 119 outputs a corresponding request to the interface 107 in response to the received control signal, and via this interface to the mobile telephone 103. The described procedure is then repeated. If, on the other hand, the received and evaluated speech sample is usable for voice analysis (voice profile calculation) from the viewpoint of its phonemic evaluation, it is supplied to a voice profile calculation unit 121, which determines from it a voice profile of the user of the mobile telephone 103. The signal connection shown in the figure illustrates that this voice profile is stored in a voice profile storage unit 123, as required for an initial enrollment of the user. The dotted signal lines indicate that, in the case of a later verification of the user, the voice profile is also supplied to a voice profile comparator unit 125, where it is compared with an initial voice profile stored in the storage unit 123, and that an output signal of the comparator unit 125 indicating the comparison result can be output to subsequent stages of the system server 101.

Various algorithms can be used for the actual phonemic evaluation. Based on the results of empirical investigations of recognition performance, specific weightings can be derived for the speech constituents (phonemes) extracted from speech samples in a speech recognition process. Their quantity (number) can also be included, which is advantageous when speech samples of different lengths are used under given processing conditions. From the finding that the individual sound units of a language differ in recognition quality follows a process variant (usable both in the first embodiment described above and in the further examples described below) in which only sound units with high recognition suitability (above a certain threshold value) are passed on for further processing, i.e. voice profile calculation, while sound units with low recognition suitability are not processed further. This control at the level of individual phonemes cannot be seen in the figures, since for the sake of clarity these are limited to presenting the process at the level of whole speech samples.
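The phoneme-level variant can be pictured as a simple filter over the recognized sound units. The following is a minimal sketch with fictitious factor values; the name `keep_suitable_units` and the treatment of unknown units as unsuitable are assumptions.

```python
from typing import Dict, List, Sequence

# Fictitious weighting factors per sound unit (illustrative values only).
EXAMPLE_FACTORS: Dict[str, float] = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.5, "e": 0.8}


def keep_suitable_units(units: Sequence[str],
                        factors: Dict[str, float],
                        minimum: float) -> List[str]:
    """Keep only sound units whose weighting factor lies above the threshold."""
    # Assumption: units without a stored factor are treated as unsuitable.
    return [u for u in units if factors.get(u, 0.0) > minimum]
```

Only the units that survive this filter would then be handed to the voice profile calculation.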

FIG. 2 shows an arrangement 200, modified from the arrangement 100 according to FIG. 1, for implementing a modified process control. Components functionally corresponding to those of the arrangement in Fig. 1 are denoted by corresponding reference numerals and will not be explained again below.

In the following, speech recognition (phonematic analysis) and phonemic evaluation are explained using simplified examples.

Examples of equal error rates of selected sound units or phonemes empirically determined by means of speech recognition are given in Table 1.

Table 1

SAMPA symbol    EER (equal error rate)
a:              8.2%
E               10.6%
m               8.5%
N               9.7%
F               21.0%
V               24.7%
T               25.3%
K               23.7%

By forming the difference from the minimum possible error (zero) and normalizing, the weighting factor is, for example:

for a: 100 - 8.2 = 91.8 => 0.918
for k: 100 - 23.7 = 76.3 => 0.763
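The conversion from an equal error rate to a weighting factor can be written out as a one-line function. This is a minimal sketch: the name `weighting_factor` and the rounding to three decimals are assumptions, while the arithmetic follows the difference-and-normalize rule stated above.

```python
def weighting_factor(eer_percent: float) -> float:
    """Weighting factor = (100 - EER) / 100, with the EER given in percent."""
    if not 0.0 <= eer_percent <= 100.0:
        raise ValueError("EER must be a percentage between 0 and 100")
    # Rounding to three decimals is an assumption made for readability.
    return round((100.0 - eer_percent) / 100.0, 3)
```

For an EER of 8.2% this gives 0.918, and for an EER of 23.7% it gives 0.763.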

If a word (a speech sample) is now examined for its sound units, the corresponding weighting can be applied to each unit found, and a total weighting (a weighting coefficient) for this word can thus be determined. However, the sound unit weighting factors can also be used in a particularly simple manner to evaluate a word: a weighting factor minimum value is set, only those sound units whose weighting factor lies above the minimum value are classified as suitable, and finally their number is put in proportion to the total number of sound units of the word.

For example, assuming (fictitious) sound units and weightings according to Table 2 and setting the minimum or threshold value to 0.7, the result is the phoneme-related suitability noted in the table.

Table 2

a: 0.9 -> suitable
b: 0.6
c: 0.4
d: 0.5
e: 0.8 -> suitable

For a symbol sequence "ceabde", which is composed of six sound units from the table, it thus follows that three of the sound units are suitable and the other three are unsuitable, i.e. the evaluation coefficient of the symbol sequence (of the word) determined in the aforementioned manner would be 0.5. Another possible method is to sum the weightings of the individual sound units and divide the result by the number of sound units. For the above example, with c (0.4), e (0.8), a (0.9), b (0.6), d (0.5), e (0.8), this would give a weighting coefficient K of

K = (0.4 + 0.8 + 0.9 + 0.6 + 0.5 + 0.8) / 6 = 4.0 / 6 = 0.667

If we consider the sequence "ceabda" as an additional example symbol sequence, the first-mentioned method again yields a weighting coefficient of 0.5, while the last-mentioned method, with c (0.4), e (0.8), a (0.9), b (0.6), d (0.5), a (0.9), yields

K = (0.4 + 0.8 + 0.9 + 0.6 + 0.5 + 0.9) / 6 = 4.1 / 6 = 0.683

The value of the determined weighting coefficient can therefore depend (in some cases significantly) on the chosen method.
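Both evaluation methods can be reproduced with a few lines of code. The sketch below uses the fictitious factors of Table 2; the function names `share_coefficient` and `mean_coefficient` are hypothetical labels for the two methods described in the text.

```python
from typing import Dict, Sequence

# Fictitious weighting factors taken from Table 2.
TABLE_2: Dict[str, float] = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.5, "e": 0.8}


def share_coefficient(units: Sequence[str], factors: Dict[str, float],
                      minimum: float) -> float:
    """Share of sound units whose weighting factor lies above the minimum value."""
    if not units:
        raise ValueError("speech sample contains no sound units")
    suitable = sum(1 for u in units if factors[u] > minimum)
    return suitable / len(units)


def mean_coefficient(units: Sequence[str], factors: Dict[str, float]) -> float:
    """Sum of the weighting factors divided by the number of sound units."""
    if not units:
        raise ValueError("speech sample contains no sound units")
    return sum(factors[u] for u in units) / len(units)
```

For "ceabde" and "ceabda" with a threshold of 0.7, `share_coefficient` returns 0.5 in both cases, whereas `mean_coefficient` distinguishes the two sequences (about 0.667 versus about 0.683).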

Table 3 shows, for further illustration of possible process designs, a list of passwords with their respective phonetic transcription and an associated weighting coefficient K, which was determined according to the method explained above on the assumption that the phonemes a, e, i, o, y, 6, m, j and s are suitable, whereas all other phonemes are unsuitable (according to a predetermined threshold value).

Table 3

Table 4 then shows a compilation of externally determined equal error rates of the individual passwords together with the associated value of the weighting coefficient.

Table 4

It turns out that the recognition performance for passwords with a high phonemic weighting coefficient is in fact also high, which proves the usefulness of the method in the context of the registration or authentication of persons on the basis of their voice profile.

A significant change in the arrangement 200 of FIG. 2 compared with the arrangement 100 of FIG. 1 is that no speech sample feed control is provided; instead, each received speech sample is passed not only to the speech recognition unit 209 but also to the voice profile calculation unit 221 and is used for calculating a voice profile regardless of the phonemic evaluation. Here, the output signal of the first threshold discriminator unit 217 reaches a second threshold discriminator (security discriminator unit) 227, which is connected via another input to a confidence or security level setting unit 229. Via this setting unit, a predetermined minimum confidence value of the voice profile to be determined, or a predetermined security level of a verification process to be performed, can be set. At the output of the second threshold discriminator 227, a signal is available which indicates whether the voice analysis of a received speech sample, taken by itself, is suitable for fulfilling predetermined confidence or security requirements. On the one hand, this signal can be used in subsequent stages of the system server 201; on the other hand, it is fed to the user guidance unit 219 in order, where appropriate, to control the request for a further speech sample. Unlike in the embodiment of FIG. 1, one or more further speech samples supplied by the user are not intended to replace the first (and possibly subsequent) speech sample(s) in the voice analysis, but are additionally included in the voice analysis in order to finally reach, by analyzing several speech samples, an overall confidence or security level that satisfies the defined minimum requirements. With regard to the interlinked evaluation of several speech samples, the illustration in FIG. 2 is not sufficiently detailed, but on the basis of the above description the person skilled in the art can readily implement such combined processing of several speech samples, each of which alone does not ensure sufficient confidence or security.

As Fig. 2 shows, in the second embodiment the second threshold discriminator 227 takes the place of the first threshold discriminator unit 117 of the first embodiment, and the associated setting unit 229 accordingly replaces the setting unit 118 of the first embodiment. Here, therefore, the weighting coefficient of the respective speech sample calculated in the weighting coefficient calculation unit 215 is supplied to a confidence value calculation unit 216, which determines from it the expected confidence value of a voice profile calculated from this speech sample.

The processing of multiple speech samples to derive a voice profile with sufficient confidence can follow different algorithms. The simplest way is to supply the speech samples to the voice profile calculation unit without any weighting. In another variant, indicated by a dotted line in FIG. 2, the voice profile calculation unit receives the calculated weighting coefficient of the respective speech sample as an additional control signal, and the calculation result for the respective speech sample is weighted with the associated weighting coefficient.
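The weighted variant might be sketched as a weighted mean over per-sample results. Note the assumptions: the text does not say what form the "calculation result" of a speech sample takes, so it is modeled here as a plain feature vector, and `weighted_profile` is a hypothetical name.

```python
from typing import List, Sequence


def weighted_profile(features: Sequence[Sequence[float]],
                     coefficients: Sequence[float]) -> List[float]:
    """Combine per-sample feature vectors, weighting each by its coefficient K."""
    if not features or len(features) != len(coefficients):
        raise ValueError("need one weighting coefficient per speech sample")
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("weighting coefficients must sum to a positive value")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    # Weighted mean per vector component.
    return [sum(k * f[i] for f, k in zip(features, coefficients)) / total
            for i in range(dim)]
```

Passing equal coefficients reproduces the unweighted case named first in the text.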

In the first and second embodiments described above, the voice analysis (voice profile calculation) is based on speech samples that the user, i.e. the person to be authenticated, specifies himself (such as his name, a codeword or the like). By contrast, voice analysis based on speech samples prescribed by the system, both in enrollment and in authentication, makes it possible to achieve a higher security level and/or to shorten the process flow and thus increase user acceptance. In the context of the invention, it is provided that the speech samples to be prescribed for such a method are selected according to phonemic evaluation criteria. The method thus includes an upstream phase comprising the phoneme analysis and phonemic evaluation of a larger speech sample reservoir and the definition of preferred speech samples, namely those with a high phonemic weighting coefficient, for the later actual enrollment or authentication procedure.

A corresponding arrangement 300 is shown schematically in Fig. 3. Again, components that are functionally comparable to components of the first and second embodiments are designated with reference numerals based on those of Figures 1 and 2 and will not be explained in more detail below. The arrangement 300 is shown in its voice analysis part with the signal connections as present in the verification phase.

In contrast to the first and second arrangements described above, the arrangement 300 has two speech sample input interfaces 305A, 305B. The former is connected in the preparatory phase to a microphone 302, and the latter is connected, in the actual authentication (or enrollment) phase, to a mobile phone 303 of a person to be authenticated (or registered). In the preparatory phase, speech samples which are recorded via the microphone 302, and which are to be subjected not to voice analysis but merely to speech recognition and phonemic evaluation, are passed to a first server section 301A, in which speech recognition and determination of phonematic evaluation coefficients are carried out as in the first embodiment. As a result, a weighting coefficient threshold discriminator 317 outputs a forwarding control signal to a speech sample buffer 320, which each speech sample received via the microphone 302 first enters and where it is temporarily stored. In the case of a positive evaluation result, this control signal causes the buffered speech sample to be transferred to a speech sample memory 322, from which it is supplied to the user guidance unit 319 in a later registration or authentication, in order to be given to the person to be registered or authenticated as the speech sample to be repeated.

The user guidance and voice analysis run in a second server section 301B substantially as in the second embodiment of FIG. 2. The authentication (or registration) can be done with a single speech sample taken from the memory 322 and output in the user guidance, or with multiple speech samples, which will depend significantly on the given security level. Optionally, similarly to the second embodiment, the numerical results of the phonemic evaluation may be used in the voice profile calculation unit 321 to assign each speech sample a weight corresponding to its phonemic weighting coefficient when multiple speech samples are used to derive the voice profile. This is again symbolized by a dotted line in the figure.

The embodiment of the invention is not limited to the examples and highlighted aspects discussed above, but is also possible in a variety of variations that are within the scope of skill in the art.

Claims


1. A digital method of authenticating a person by comparing a current voice profile with a previously stored initial voice profile, wherein, for determining the respective voice profile, the person speaks at least one speech sample, the speech sample is supplied to a voice profile calculation unit, and the voice profile is calculated therefrom on the basis of a predetermined voice profile algorithm, and wherein, for the or each speech sample, a phoneme structure and a sequence of weighting factors assigned to the phonemes and / or a phonemic weighting coefficient are determined by means of speech recognition with subsequent phonematic analysis, and the weighting factors or the weighting coefficient are used to determine a confidence value of the voice profile and / or to control whether the respectively spoken speech sample or parts thereof are supplied to the voice profile calculation unit.

2. The method of claim 1, wherein the weighting factors or the phonemic weighting coefficient are subjected to a threshold discrimination with a predetermined weight minimum value and the supply to the voice profile calculation unit is controlled as a function of the discrimination result.

3. The method of claim 2, wherein a minimum weight value is recalculated from a predetermined confidence minimum value of the voice profile or a predetermined security level of the authentication.

4. Method according to one of the preceding claims, characterized in that a blocking control signal which inhibits the supply of a speech sample to the voice profile calculation unit, serves as a control signal for specifying or requesting a replacement speech sample.

5. The method as claimed in claim 4, wherein the blocking control signal controls the output, within a user guidance, of a request to speak a non-predefined substitute speech sample, and a phonemic weighting coefficient is calculated for the replacement speech sample then received.

6. The method of claim 4, wherein the blocking control signal controls the output of a predetermined substitute speech sample having a predetermined phonemic weighting coefficient as part of a user guidance.

7. Method according to one of claims 1 to 3, characterized in that the or each speech sample is predetermined and output within the framework of a user guidance and the associated phonemic evaluation coefficient is predetermined.

8. Method according to one of the preceding claims, characterized in that the verification process comprises the speaking of several speech samples, and a resulting confidence value or security level is calculated from the associated phonemic weighting coefficients.

9. The method as claimed in claim 8, characterized in that, after each speech sample is spoken or after a predetermined number of speech samples have been spoken, the resulting confidence value is subjected to a threshold discrimination with a predetermined confidence minimum value, or the resulting security level value is subjected to a threshold discrimination with a predetermined security level minimum value, and, in response to the discrimination result, the completion of the verification process or the request for a further speech sample is controlled.

10. The method according to claim 8 or 9, characterized in that a confidence minimum value or security level minimum value is entered and, in response to this, a subset for output as part of a user guidance is selected from a total set of predetermined speech samples, each having a predetermined phonemic weighting coefficient.

11. Method according to one of the preceding claims, characterized in that each phoneme of the or each speech sample is assigned a weighting factor derived from the respective equal error rate, and the phonemic weighting coefficient of the speech sample is calculated from the weighting factors according to a predetermined evaluation algorithm.

12. Method according to one of claims 2 to 11, characterized in that only those phonemes of a speech sample whose weighting factor lies above the weight minimum value are supplied to the voice profile calculation unit for calculating the voice profile, while the remaining phonemes are not processed further.

13. A method according to any one of the preceding claims, characterized by automatic execution in quasi-real time, with output of a predetermined user guidance.

14. Arrangement for carrying out the method according to one of the preceding claims, with

a speech sample input interface,

a voice profile calculation unit connected on the input side to the speech sample input interface,

a speech recognition unit, connected to the speech sample input interface in parallel with the voice profile calculation unit, for phonemic analysis of a received speech sample,

a weighting coefficient calculation unit for calculating the phonemic weighting coefficient of the received speech sample analyzed in the speech recognition unit, and

a speech sample feed control, connected to the output of the weighting coefficient calculation unit, for controlling the supply of the received speech sample to the voice profile calculation unit, or a confidence value calculation unit for calculating the confidence value of the voice profile.

15. An arrangement according to claim 14, comprising a user guidance unit for providing a user guidance, in particular for requesting speech samples and/or for outputting predetermined speech samples to be spoken by the person to be identified.

16. Arrangement according to claim 15, characterized in that the user guidance unit is connected via a control input at least indirectly to an output of the weighting coefficient calculation unit and/or an output of the confidence value calculation unit, such that outputs within the user guidance are controllable as a function of results of the weighting coefficient calculation or the confidence value calculation.

17. Arrangement according to claim 14, comprising a weighting factor storage unit, connected to a weighting factor input of the weighting coefficient calculation unit, for storing phoneme weighting factors.

18. Arrangement according to one of claims 14 to 17, characterized by a first threshold discriminator unit, connected to the output of the weighting coefficient calculation unit, for threshold discrimination of the calculated weighting coefficients with a predetermined weight minimum value, wherein the first threshold discriminator unit is connected via a control input to the speech sample feed control and optionally to the user guidance unit.

19. Arrangement according to claim 14, comprising a second threshold discriminator unit, connected to the output of the confidence value calculation unit, for threshold discrimination of a calculated confidence value with a predetermined confidence minimum value or of a calculated security level value with a predetermined security level minimum value, wherein the second threshold discriminator unit is connected via a control input to the user guidance unit.

20. Arrangement according to claim 14, comprising a speech sample memory for the ordered storage of a set of predefined speech samples, each with a respective predetermined phonemic weighting coefficient, wherein the user guidance unit and the confidence value calculation unit are connected to the speech sample memory in order to retrieve selected speech samples or the respective associated phonemic weighting coefficients.
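
The control flow described in claims 1, 2, 8, 9, 11 and 12 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claims leave the evaluation algorithm and the confidence calculation open, so the linear mapping from equal error rate to weighting factor, the use of the mean as the phonemic weighting coefficient, and the independence-style combination of coefficients into a confidence value are all assumptions made for illustration. All names (Phoneme, weighting_factor, weighting_coefficient, select_phonemes, combined_confidence, verify) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phoneme:
    symbol: str
    eer: float  # assumed per-phoneme equal error rate, in [0, 0.5]


def weighting_factor(eer: float) -> float:
    # Assumption: a lower equal error rate means a more discriminative phoneme.
    # Linear mapping: EER 0.0 -> weight 1.0, EER 0.5 (chance level) -> weight 0.0.
    return max(0.0, 1.0 - 2.0 * eer)


def weighting_coefficient(phonemes: List[Phoneme]) -> float:
    # Assumption: the "predetermined evaluation algorithm" is a plain mean.
    if not phonemes:
        return 0.0
    return sum(weighting_factor(p.eer) for p in phonemes) / len(phonemes)


def select_phonemes(phonemes: List[Phoneme], weight_min: float) -> List[Phoneme]:
    # Claim 12 idea: forward only phonemes whose weight lies above the minimum.
    return [p for p in phonemes if weighting_factor(p.eer) > weight_min]


def combined_confidence(coefficients: List[float]) -> float:
    # Assumption: coefficients combine like independent evidence, 1 - prod(1 - c).
    remaining = 1.0
    for c in coefficients:
        remaining *= 1.0 - c
    return 1.0 - remaining


def verify(samples: List[List[Phoneme]], weight_min: float,
           confidence_min: float) -> Tuple[bool, int]:
    # Claims 2, 8, 9 idea: block weak samples, accumulate confidence over the
    # accepted ones, and stop as soon as the confidence minimum is reached.
    accepted: List[float] = []
    for phonemes in samples:
        coefficient = weighting_coefficient(phonemes)
        if coefficient < weight_min:
            continue  # blocked; a real system would request a replacement sample
        accepted.append(coefficient)
        if combined_confidence(accepted) >= confidence_min:
            return True, len(accepted)
    return False, len(accepted)
```

In this sketch, verify returns whether the confidence minimum was reached and how many samples were actually fed forward; a blocked sample simply does not contribute, which stands in for the replacement request of claims 4 to 6.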