NL2012300C2 - Automated audio optical system for identity authentication. - Google Patents

Info

Publication number
NL2012300C2
Authority
NL
Netherlands
Prior art keywords
input
audio
user
optical
authentication
Prior art date
Application number
NL2012300A
Other languages
Dutch (nl)
Inventor
Joost Johannes Hendrikus Christiaan Doremalen
Martijn Enter
Original Assignee
Novolanguage B V
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Novolanguage B V
Priority to NL2012300A
Application granted
Publication of NL2012300C2

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships

Description

Automated audio optical system for identity authentication
DESCRIPTION
FIELD OF THE INVENTION
The present invention is in the field of automated systems and methods for audio optical identity authentication.
BACKGROUND OF THE INVENTION
Authentication relates to confirming the authenticity of an attribute and specifically of a person. Such may involve confirming an identity of a person. Authentication often involves verifying the validity of at least one form of identification.
One type of authentication is comparing characteristics of an object (e.g. person) to what is known about objects of that origin. For example, an art expert might look for similarities in the style of painting, check the location and form of a signature, or compare the object to an old photograph. Physics of sound and light, and comparison with a known physical environment, such as a previously recorded environment, can be used to examine the authenticity of audio recordings, photographs, or videos.
Authentication of a person can be split into categories, namely: something a person knows, something a person has, and something a person is. Each authentication category covers a range of elements used to authenticate or verify a person's identity. The present invention lies in establishing a relationship of physical and personal characteristics of a person within the three mentioned categories in a first situation and in a second situation, and comparing the characteristics of the two situations, in order to establish if the person in the first situation is the same as in the second situation.
It is noted that for the second mentioned category, which typically relates to attributes of a person, comparison may be vulnerable to forgery. In general, it relies on the facts that creating a forgery indistinguishable from a genuine artifact requires (expert) knowledge, that mistakes are easily made, and that the amount of effort required to do so is considerably greater than the amount of profit that can be gained from the forgery. In order to prevent such forgery or abuse a watertight system needs to be developed.
It is considered that for a "positive" authentication, elements from at least two, and preferably all three, categories should be verified. Such is the aim of the present invention. Some examples of each category are: ownership (something a user has): ID card, security token, wrist band, software token, and cell phone; knowledge (something a user knows): personal identification number (PIN), a password, pass phrase, challenge response, a pattern; inherence (something a user is or does): DNA sequence, retinal pattern, fingerprint, signature, face, voice, unique bio-electric signals, other biometric identifier.
The present system is aimed at using all three categories (sometimes referred to as factors) for authentication, and to comparing authentication in a first situation to authentication in a second situation.
The present invention therefore relates to automated systems and methods for audio optical identity authentication, which overcome one or more of the above disadvantages, without jeopardizing functionality and advantages.
SUMMARY OF THE INVENTION
The present invention relates in a first aspect to an automated system according to claim 1 comprising various electronic elements. The automated system is typically at least partially implemented on a computer, an electronic equipment, or the like. The present system is suited for real-time authentication.
In a first means audio input is processed. Part of initial processing may relate to improving an audio signal, such as by removing noise, echo, flanger, unusual and atypical sound, phaser, etc. Further a signal may (partly) be attenuated or boosted to produce desired spectral characteristics (equalized). Also typically filtering of a signal is used, in order to emphasize or attenuate frequency ranges, such as by use of low-pass, high-pass, band-pass or band-stop filters. Compression may be used to reduce a dynamic range of a sound. A reversed pitch shift may be used to identify lower or upper harmonics and/or resonators. Time stretching, time shortening, and modulation may be used as a further characterization of the signal. Care is taken to process audio input in such a way that in any given situation more or less the same processed audio signal is provided.
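By way of a non-limiting illustration, the filtering, compression and level adjustment mentioned above may be sketched as follows. The function names, the first-order filter design, and the cutoff, threshold and ratio values are assumptions chosen for the example; the description does not prescribe any particular implementation.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC high-pass filter (assumed design): attenuates content below cutoff_hz."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Standard discrete RC high-pass recurrence.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

def compress(samples, threshold=0.5, ratio=4.0):
    """Reduce dynamic range: the part of |x| above threshold is divided by ratio."""
    out = []
    for x in samples:
        magnitude = abs(x)
        if magnitude > threshold:
            magnitude = threshold + (magnitude - threshold) / ratio
        out.append(math.copysign(magnitude, x))
    return out

def peak_normalize(samples, target_peak=1.0):
    """Scale the signal so that its largest absolute sample equals target_peak."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [x * target_peak / peak for x in samples]
```

Applying the same chain with the same parameters to every recording is one way to approach the stated aim that more or less the same processed signal is obtained in any given situation.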
Typically at least two means for receiving audio inputs are provided, as input may be received in different locations. However, if input is received at a same location in a consecutive mode, e.g. at different times, one means of course will be sufficient. A similar reasoning also applies for the means for receiving optical input.
In a second means optical input is processed. Similar processing techniques as above for the audio signals may be used for the optical signals. Care is taken to process optical input in such a way that in any given situation more or less the same processed optical signal is provided.
It is preferred to process and capture audio and optical input in combination, if possible. In a first situation the combined input may be captured by a computer of a user at home, in a second situation the combined input may be captured by a video camera observing a user in a public environment or in a confined environment, such as an assessment.
The processing of the input is typically performed by a processor, such as a chip, a computer, or a dedicated apparatus. The processor typically has sophisticated circuits for processing, and typically some software for controlling circuits. The processor provides output that may be further processed, such as by the present authentication comparator. A user of the system, which may or may not be present when receiving input, may use the output generated by the system, e.g. in order to authenticate a person, and to compare authentication in a first situation to authentication in a second situation. As such, with a certain uncertainty, it can be established if an authenticated person in the first situation is the same as an authenticated person in the second situation. The present system is not directly aimed at authentication of persons at substantially the same time in a given location, but rather at authentication of a large number of persons over time and in given locations, such as 1,000 persons, typically 10,000 persons or more.
In order to process the input an audio signal processing unit, an optical signal processing unit and an identity code capturer are provided. The two optical processing units may be one and the same. These units may be stored on a computer, may be present as such, may be in the form of software, and combinations thereof.
Part of the audio processing unit relates to automatic speech recognition (ASR). In general it is noted that ASR is already quite challenging for native speech, but it is even more challenging for non-native speech, since non-native speech deviates substantially from native speech in at least three aspects: the sounds, lexicon, and grammar differ. The present ASR technology differs in many ways from others. As a consequence e.g. a word error rate (incorrectly determined words) of the present system is 5%-20%, as has been established upon evaluating the system with a significant number of users.
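By way of illustration, a word error rate may be computed as the word-level edit distance between a reference text and the recognized text, divided by the number of reference words. This is the conventional definition and is assumed here; the description does not state how the quoted figures were computed, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from the hypothesis
            insertion = cur[j - 1] + 1  # extra word in the hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

For example, recognizing "the cat sit on mat" for the reference "the cat sat on the mat" involves one substitution and one deletion over six reference words.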
As explained above one or more audio analyzers, and likewise processors, optionally combined in one unit, are present.
In order to verify if audio input is "correct", e.g. as expected, at least one error detector is present. Such a detector may signal errors which preferably are not taken into account when authenticating.
The processed audio (and likewise optical) input relating to information of a person is stored on a data storage means, such as a memory. Data may be stored as such, as a representation of data, such as in an n-dimensional vector space, as characterizations of a person, and a combination thereof.
Similar to the audio analyzers, an optical processing unit is provided. Also an identity code capturer is provided, having a same functionality as the optical processing unit; it is noted that the identity code capturer is capable of processing an image, such as a bar code, a matrix code, a social security number, an identity number, a passport photo, and combinations thereof. Optical data may be stored separately from the audio data, or in the same data storage means.
Further an authentication comparator for mutual comparing and authenticating input of a first means for audio input and a first means for optical input, with input of a second means for audio input, and a second means for optical input, and with input of an identity code capturer, respectively, is provided. In an example the first audio and optical means are aimed at establishing ownership and inherence characteristics of a person, typically in a first situation, whereas the second audio and optical means are aimed at establishing knowledge, inherence and optionally ownership characteristics of a person, typically in a second situation. In the second situation knowledge characteristics are added to the system, whereas inherence and optionally ownership characteristics are used for authentication.
Thereto an authentication comparator is provided, for comparing e.g. input of a first situation with input of a second situation, the inputs being processed. An inherence characteristic that may be compared and scored is pronunciation.
The output is preferably provided in a visual manner, such as on a monitor.
The present system is provided with a means for receiving audio input. The input is typically provided by the user, the user reading out loud a (target) text, the text being provided by the present system, giving an answer to a question posed, etc., such as in the form of spoken language. The target text and the like may be provided by a virtual agent. The present system may provide prompts. As such a user may select to repeat an exercise, hear back his/her own input, be provided with an example input, continue, etc. The example input may also be provided as a randomly provided sequence of words, which require a user to return a correct syntax. A typical length of the present input is 10-250 phonemes, such as 50-100 phonemes.
The present (first and second phase) automated speech recognition software (ASR) may consist of a decoder (a search algorithm) and three 'knowledge sources': a language model, a lexicon, and at least one acoustic model. The language model (LM) contains probabilities of words and sequences of words. Acoustic models are models of how the sounds of a language are pronounced. The lexicon contains information on how the words are pronounced.
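By way of illustration only, two of the three knowledge sources may be represented as simple look-up tables. The sketch below uses a toy bigram language model and a toy lexicon; the acoustic model and the decoder are omitted, and all names, probabilities and phoneme symbols are assumptions made for the example.

```python
# Toy language model (assumed values): probability of a word given the previous word.
BIGRAM_LM = {
    ("<s>", "good"): 0.5,
    ("good", "morning"): 0.4,
}

# Toy lexicon (assumed phoneme symbols): how each word is pronounced.
LEXICON = {
    "good": ["g", "U", "d"],
    "morning": ["m", "O", "r", "n", "I", "N"],
}

def sentence_probability(words, language_model, floor=1e-6):
    """Multiply bigram probabilities; unseen word pairs receive a small floor value."""
    probability = 1.0
    previous = "<s>"
    for word in words:
        probability *= language_model.get((previous, word), floor)
        previous = word
    return probability

def phonemes_for(words, lexicon):
    """Concatenate the lexicon pronunciations of the given words."""
    phonemes = []
    for word in words:
        phonemes.extend(lexicon[word])
    return phonemes
```

A decoder would search over candidate word sequences, weighing such language model probabilities against acoustic model scores for the phonemes given by the lexicon.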
The present system may further comprise a first means for determining and analyzing input. In view of e.g. authenticating it may be important to determine what word(s) were actually spoken. The first means may relate to a first phase (automated) speech recognition software, which software typically determines input in a tolerant mode, e.g. globally checking given (or actual) input versus required (target) input (the provided target text). A goal thereof is to recognize words a user intended to pronounce, even though the non-native speech of a user may deviate in various ways. The ASR system is optimized for this phase, e.g. by tuning the three knowledge sources using non-native speech.
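By way of illustration, the tolerant checking of actual input against the target text may be mimicked at text level with approximate string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library; working on spelled words rather than on speech, the function name, and the cutoff value are all assumptions of the example.

```python
import difflib

def tolerant_match(target_text, recognized_text, cutoff=0.6):
    """Map each recognized word to the closest word of the target text, or None."""
    target_words = target_text.lower().split()
    matched = []
    for word in recognized_text.lower().split():
        # get_close_matches returns the best candidates scoring at least `cutoff`.
        close = difflib.get_close_matches(word, target_words, n=1, cutoff=cutoff)
        matched.append(close[0] if close else None)
    return matched
```

A lower cutoff makes the check more tolerant of deviating input, whereas a higher cutoff approaches exact matching.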
The output of the first phase speech recognition software may provide input to the second phase speech recognition software (or in an alternative, vice versa).
The system may further comprise a second means for determining input, such as second phase (automated) speech recognition software comprising a pronunciation quality evaluation unit for processing input to determine potential difference between target pronunciation and actual pronunciation, which unit functions in a detailed and strict manner. The manner may depend on the level of the user. The output of the first phase may be used as input, as well as the non-processed captured input. The differences identified, if any, may further be used to authenticate a person, as these differences characterize such a person further.
In the second phase the system is strict. Now a goal is to detect differences, such as large deviations between pronunciation received and target pronunciation. A further version of the ASR system is used which is optimized for this task. The ASR system then segments the non-native speech signal, it detects the position (begin and endpoint) of the words and the phonemes (sounds).
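By way of illustration, the output of such a strict second phase may be thought of as a list of phoneme segments, each with a begin point, an end point and a score, from which large deviations are selected. The data structure, the score range and the threshold below are assumptions of the example and are not taken from the description.

```python
from dataclasses import dataclass

@dataclass
class PhonemeSegment:
    phoneme: str    # target phoneme label
    begin_s: float  # begin point in seconds
    end_s: float    # end point in seconds
    score: float    # hypothetical pronunciation score in [0, 1], higher is closer to target

def detect_deviations(segments, threshold=0.5):
    """Return the segments whose score falls below the (assumed) strictness threshold."""
    return [segment for segment in segments if segment.score < threshold]

# Example segmentation of one word, with made-up timings and scores.
segments = [
    PhonemeSegment("th", 0.00, 0.08, 0.31),
    PhonemeSegment("r", 0.08, 0.15, 0.82),
    PhonemeSegment("ee", 0.15, 0.32, 0.90),
]
deviating = detect_deviations(segments)
```

Raising the threshold makes the evaluation stricter, which corresponds to adapting the manner of evaluation to the level of the user.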
The system may further comprise various error detectors. These detectors relate to one or more of sounds and phonemes, lexicon, grammar, and prosody. Examples are a pronunciation error detector, a prosody error detector, e.g. a word stress error detector and an intonation error detector, a respiration error detector, a formant error detector, and a grammar error detector, e.g. a morphology error detector and a syntax error detector, an interaction error detector, and a lexicon error detector. Typically these detectors are optimized, e.g. in view of first and second language, such as Dutch. The errors, or differences, identified, if any, may further be used to authenticate a person, as these differences characterize such a person further.
The system may further comprise a selector for selecting a first phase speech recognition software version and/or a second phase speech recognition software version, the version(s) being optimized for a group of users. As such a user or a teacher may set a software version being specifically adapted to a level of oral language proficiency of a user, adapted to a native language of a user, adapted to a variety or dialect of a user, and combinations thereof.
Such further optimization can be used for authentication, especially if (characterizing) differences between persons are otherwise relatively small.
The present software and detectors are stored. They may be stored in any means capable of storage of binary data, such as RAM, a ROM, a hard-disk, a CD, a DVD, etc., and combinations thereof. The stored data should be accessible to the present system, when in use. It is noted that various elements of the present system may be located within one location, even within one apparatus, such as a computer, wherein e.g. software is loaded on memory, or located at different locations, such as on the internet, on a mobile phone, on a computer, at a learning center, and combinations thereof. Within e.g. a combination a first element may function as a client to a further element, an element may function as a server, etc. For some applications it is preferred to use a server or a cloud. A user may interact with a server or cloud as a client, e.g. a browser based client. Preferably a broadband connection between client and server or cloud is used, enabling fast communication of data.
The present system may be accessible on internet, on a hard disk of a computer, on a DVD, a CD-ROM, etc.
Thereby the present invention provides a solution to one or more of the above mentioned problems, by providing an extended system, comprising various functionalities, wherein the functionalities are further optimized with respect to each other, thereby further improving functionality and user friendliness.
Advantages of the present description are detailed throughout the description.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates in a first aspect to an automated audio-optical system for user identity authentication according to claim 1.
In an example the present invention relates to a system wherein personal characteristics of a user are stored in a first separate domain, and/or wherein the authentication comparator is stored in a second separate domain, and/or wherein audio and optical information is stored in a third separate domain, wherein the first, second and third domain can be linked by a secret key, wherein the secret key is stored in a fourth domain, wherein the fourth domain is preferably only accessible to an administrator upon entering a user ID and a code. As such privacy of personal data is safeguarded.
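By way of illustration, one possible way to link records across separate domains by a secret key is to store each record under a keyed hash of the user identifier, so that records in different domains cannot be related without the key. The use of HMAC-SHA256, the function name and the domain labels are assumptions of the example; the description does not specify the linking mechanism.

```python
import hashlib
import hmac

def link_token(secret_key, user_id, domain):
    """Derive a domain-specific pseudonym for a user (assumed scheme: HMAC-SHA256)."""
    message = f"{domain}:{user_id}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()

# The key would be held in the fourth domain; this value is a placeholder.
secret_key = b"example-key-held-by-administrator"

personal_token = link_token(secret_key, "user-42", "personal")
media_token = link_token(secret_key, "user-42", "audio-optical")
```

In such a scheme only a party holding the key, e.g. the administrator, can recompute both tokens and thereby link the personal characteristics to the stored audio and optical information.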
In an example the present invention relates to a system further comprising stored on the system y1) first phase speech recognition software for determining audio input in a tolerant mode, wherein the input is in the form of a word, a sentence, and the like, wherein a typical length of the present input is preferably 10-250 phonemes, such as 50-100 phonemes, the first phase speech recognition software preferably providing input to second phase speech recognition software, and y2) second phase speech recognition software for determining audio input in a strict mode, comprising a pronunciation quality evaluation unit for processing input to determine potential difference between stored target pronunciation and actual audio input pronunciation, and for generating feedback output. As such especially audio input and processing thereof is improved, e.g. in a reduction of error rates of the present system.
In an example the present invention relates to a system further comprising one or more of e) v) a word stress error detector, vi) a morphology error detector, vii) a syntax error detector, viii) an interaction error detector, ix) an intonation error detector, x) a respiration error detector, xi) a formant error detector, and xii) a selector for selecting a first phase speech recognition software version and/or a second phase speech recognition software version, the version(s) being optimized for a group of users. The various error detectors and selector further contribute to the robustness of the present system.
In an example the present invention relates to a system wherein input and/or output are in a second language and the user being native in a first language, wherein the first and second language are selected from Indo-European languages, such as Spanish, English, Hindi, Portuguese, Bengali, Russian, German, Marathi, French, Italian, Punjabi, Urdu, and Dutch, Sino-Tibetan languages, such as Chinese, Austro-Asiatic languages, Austronesian languages, and Altaic languages, such as wherein the first and second language are Dutch and English, Dutch and German, Dutch and Spanish, Dutch and Chinese, German and English, French and English, or Chinese and English, preferably wherein the second language is a foreign language such as English, and vice versa.
In an example the first language may be selected from Dutch, German, French, Spanish, Italian, Polish, Chinese, Japanese, Korean, Afrikaans, and English.
In an example the second language may be selected from Dutch, German, French, Spanish, Italian, Polish, Chinese, Japanese, Korean, Afrikaans, and English.
In an example the present invention relates to a system wherein the authentication comparator further comprises an audio subtractor for subtracting a (part of a) first audio input from a (part of a) second audio input, an optical subtractor for subtracting a (part of a) first optical input from a (part of a) second optical input, and for subtracting a (part of a) first optical input from a (part of a) identity code input. By subtracting, a difference, if any, can be detected. Based on a difference found it can be determined if the difference is significant/measurable, or insignificant/not measurable. A significant difference indicates authentication has failed, whereas an insignificant difference indicates authentication has been successful.
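By way of illustration, a subtractor may operate on two processed inputs represented as feature vectors of equal length, with the magnitude of the difference compared against a threshold. The vector representation, the Euclidean norm and the threshold are assumptions of the example; the description does not fix how significance is determined.

```python
import math

def subtract(first, second):
    """Element-wise difference: second input minus first input."""
    if len(first) != len(second):
        raise ValueError("inputs must have the same length")
    return [b - a for a, b in zip(first, second)]

def difference_is_significant(first, second, threshold):
    """True if the Euclidean norm of the difference exceeds the (assumed) threshold."""
    difference = subtract(first, second)
    magnitude = math.sqrt(sum(d * d for d in difference))
    return magnitude > threshold

def authenticated(first, second, threshold):
    """Authentication succeeds when the difference is insignificant."""
    return not difference_is_significant(first, second, threshold)
```

The same pattern could be applied separately to audio features, optical features, and features taken from the identity code input.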
In an example the present invention relates to a system wherein the authentication comparator further comprises one or more of a pronunciation subtractor, a language proficiency subtractor, a communication ability subtractor, a correctness of answer subtractor, a user capability subtractor, a job experience subtractor, an education subtractor, a grade subtractor, etc. The various subtractors may be used to identify further differences or absences thereof. Therewith robustness of the comparator is further improved.
Also varieties of the above languages may be selected, such as British English, American English, Australian English, Canadian English, New Zealand English, Indian English, etc.
Further, the present system is also adapted to process dialects, such as Dutch dialects and varieties, such as wherein the pronunciation quality evaluation unit is adapted for one or more (language) varieties and/or dialects, such as British English, Limburgs, Brabants, Gronings, and Drenths. Clearly such can only be achieved after gathering data, analyzing data, ordering data, etc. as described throughout the description. The processing of dialects and/or varieties may further be used to authenticate a person, as the use of a specific dialect and/or variety characterizes such a person further.
In an example the present invention relates to a system wherein the pronunciation quality evaluation unit comprises software, wherein the software is preferably being stored on a computer.
In an example the present invention relates to a system further comprising one or more of a language model, a lexicon, a phoneme model, one or more thresholds, one or more probability criteria, one or more random number generators, a level adjustment set-up, and a decoder, wherein the decoder may comprise the previous elements.
In an example the present invention relates to a system further comprising one or more of a reference set of parameters, a fine-tuning mechanism, a self-learning algorithm, a self-improvement algorithm, and a selection means for selecting criteria. The parameters may for instance relate to one or more classifiers, as well as to (implementing) algorithms, e.g. for determining a probability.
In an example the present invention relates to a system further comprising a data base, wherein data is stored for one or more of pronunciation, word stress, intonation, and phoneme segmentation. It is noted that the present data base comprises an extensive amount of data, gathered throughout the years.
In an example the present system further comprises one or more decision trees, stored on the system, such as a decision tree being adapted to provide questions and responses thereto, a decision tree being adapted to provide purposive training in view of second phase speech recognition. An example for a decision tree is a job interview. A user is e.g. asked (general) questions relating to various aspects of the job and towards the user's background. An example may relate to a route to be followed, e.g. towards a museum in a city. In general the decision tree may relate to a Quest. As such a user "moves" through a decision tree and progress of a user can be monitored. The interaction becomes much more vivid.
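By way of illustration, a decision tree for a job interview may be stored as a mapping from nodes to a prompt and to the follow-up node for each recognized answer. The node names, prompts and answer labels below are invented for the example.

```python
# Hypothetical interview tree: each node has a prompt and answer-dependent follow-up nodes.
INTERVIEW_TREE = {
    "start": {
        "prompt": "Do you have experience in customer service?",
        "next": {"yes": "experience", "no": "motivation"},
    },
    "experience": {
        "prompt": "How many years of experience do you have?",
        "next": {},
    },
    "motivation": {
        "prompt": "Why do you want to work in customer service?",
        "next": {},
    },
}

def walk(tree, answers, start="start"):
    """Follow the answers through the tree and return the list of visited nodes."""
    path = [start]
    node = start
    for answer in answers:
        following = tree[node]["next"].get(answer)
        if following is None:
            break  # leaf reached or answer not recognized: stop moving
        node = following
        path.append(node)
    return path
```

The returned path records how a user "moved" through the tree, which is one way in which progress of a user could be monitored.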
The present invention relates in a second aspect to a method for automatic real time user authentication according to claim 10.
In an example of the present method further a normalized score of authentications may be provided, such as for monitoring and evaluating.
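By way of illustration, a normalized score may be obtained by rescaling raw scores to the interval from 0 to 1. Min-max normalization is an assumption of the example; the description does not state which normalization is used.

```python
def normalize_scores(scores):
    """Min-max normalization (assumed method): lowest score maps to 0.0, highest to 1.0."""
    if not scores:
        return []
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        # All scores are equal; no spread to normalize over.
        return [0.0 for _ in scores]
    return [(score - lowest) / (highest - lowest) for score in scores]
```

Normalized scores of this kind can be compared across sessions or groups of users for monitoring and evaluating.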
In an example the present method provides monitoring of scores of users and of the relation between one or more users in a sequence of users.
In an example the present technology is used in assessment, serious gaming, for ranking,
EXAMPLES
The invention is further detailed by the accompanying example, which is exemplary and explanatory in nature and does not limit the scope of the invention. To the person skilled in the art it may be clear that many variants, being obvious or not, may be conceivable falling within the scope of protection, defined by the present claims.
SUMMARY OF FIGURES
Figure 1 shows an example of a functional flow diagram of the present system.
DETAILED DESCRIPTION OF THE FIGURES
Figure 1 shows an example of a functional flow diagram of the present system. Therein two phases can be identified: (1) An enrolment phase, and (2) an assessment phase.
In the enrolment phase, a photo (ID) is scanned and a video is recorded. During video recording a user reads a text presented on a screen. Both the photo and the video recording are stored in separate databases. From the video recording a photo and an audio fragment may be extracted, respectively.
During the assessment phase, assessment items (exercises) are retrieved from a database, the items typically requiring a response (answer, repetition, etc.) from a user. A user may record his/her responses through a microphone. The responses as well as the audio input are stored. For every item an item score is calculated and stored automatically, typically in a separate (scoring) database. Furthermore, the audio recordings are stored in a (further and separate) database. When the assessment is finished, an overall assessment score (AS) is calculated.
Based on three data sources obtained so far, namely data source 1 (D1) the scanned photo (enrolment), source 2 (D2) the recorded video (enrolment), and source 3 (D3) the recorded audio (assessment), a similarity score (SS) is calculated. Such an SS can be calculated by combining (1) the similarity score of D1 and extracted stills from D2 using face verification technology, and (2) the similarity score of D3 and extracted audio from D2 using speaker verification technology.
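By way of illustration, the two partial similarity scores may be combined into one SS by a weighted average. The weighting scheme, the score range from 0 to 1 and the function name are assumptions of the example; the description only states that the scores are combined, and the face and speaker verification themselves are not shown.

```python
def similarity_score(face_similarity, speaker_similarity, face_weight=0.5):
    """Weighted average of face and speaker similarity (assumed combination rule)."""
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie between 0 and 1")
    return face_weight * face_similarity + (1.0 - face_weight) * speaker_similarity
```

Other combination rules, such as taking the minimum of the two scores, would penalize a mismatch in either modality more strongly.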
The SS and the AS are linked and stored in a database. Results are then sorted on AS in descending order. Further users with an SS below a certain threshold can be indicated. Also potential intruders, that were not identified before, are indicated. Any user is double checked, such as manually by inspecting and comparing D1, D2 and D3.
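By way of illustration, the sorting on AS and the indication of users with a low SS may be sketched as follows. The record layout, the field names and the threshold value are assumptions of the example.

```python
def rank_and_flag(results, ss_threshold):
    """Sort linked results on AS in descending order and flag users with SS below the threshold."""
    ranked = sorted(results, key=lambda record: record["AS"], reverse=True)
    return [dict(record, flagged=record["SS"] < ss_threshold) for record in ranked]

# Made-up linked scores for three users.
results = [
    {"user": "A", "AS": 72.0, "SS": 0.91},
    {"user": "B", "AS": 88.0, "SS": 0.42},
    {"user": "C", "AS": 65.0, "SS": 0.77},
]
report = rank_and_flag(results, ss_threshold=0.6)
```

In this made-up data user B has the highest AS but is flagged because of a low SS, which is the kind of case that would be checked manually against D1, D2 and D3.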

Claims (10)

1. An automated audio-optical system for user identity authentication comprising a) at least one means for receiving audio input, such as a microphone, and converting the audio input into an electrical audio signal, b) at least one means for receiving optical input, such as a camera, and converting the optical input into an electrical optical signal, wherein the input may be a combination of audio and optical input, such as video input, and on-line input, c) a processor for collecting and processing audio and optical input and for providing output, preferably a digital signal processor, such as a computer, the processor comprising d) an audio signal processing unit for providing an audio characterization comprising i) automatic speech recognition, ii) one or more audio analyzers, such as an audio spectrum analyzer, a bandwidth analyzer, a sound analyzer, and a power analyzer, iii) optionally an error detector, and iv) a data storage means for storing information of a person, and e) at least one of an optical signal processing unit and an identity code capturer for providing an optical characterization, each comprising 1) a biometric face analyzer for identifying a person, the biometric face analyzer comprising a set of parameters for characterizing facial features, such as eyes, nose, lips, hair, ears, and landmarks, thresholds for these parameters, and optionally a face database, and 2) a data storage means for storing information of a person, f) an authentication comparator for mutually comparing and authenticating audio input and/or optical input in a first situation with, respectively, audio input and/or optical input in a second situation, and/or, respectively, input of an identity code capturer, wherein the authentication comparator preferably comprises a pronunciation score, and g) at least one means for providing output to the user, such as a loudspeaker for providing audio feedback and a monitor for providing visual feedback.
2. An automated system according to claim 1, wherein personal characteristics of a user are stored in a first separate domain, and/or wherein the authentication comparator is stored in a second separate domain, and/or wherein audio and optical information is stored in a third separate domain, wherein the first, second and third domain can be linked by a secret key, wherein the secret key is stored in a fourth domain, wherein the fourth domain is preferably only accessible to an administrator upon entering a user name and a code.
3. An automated system according to claim 1 or 2, wherein the authentication comparator further comprises an audio subtractor for subtracting a (part of a) first audio input from a (part of a) second audio input, an optical subtractor for subtracting a (part of a) first optical input from a (part of a) second optical input, and for subtracting a (part of a) first or second optical input from a (part of a) identity code input.
4. A system according to any one of the preceding claims, further comprising one or more of a pronunciation subtractor, a language proficiency subtractor, a communication ability subtractor, a correctness of answer subtractor, a user capability subtractor, a job experience subtractor, an education subtractor, and an education level subtractor.
5. A system according to any one of the preceding claims, further comprising one or more of a reference set of parameters, a fine-tuning mechanism, a self-learning algorithm, a self-improvement algorithm, a selection means for selecting criteria, and a database, in which data is stored for one or more of pronunciation, word stress, intonation, and phoneme segmentation.
6. A system according to any one of the preceding claims, further comprising one or more decision trees, such as a decision tree adapted to provide questions and responses thereto.
7.
Werkwijze, gebruikmakend van een systeem volgens één der conclusies 1-6, voor automatische realtime gebruikersauthenticatie omvattende één of meer van i) verificatie van de mondelinge taalvaardigheid, ii) verificatie van de identiteit van een gebruiker, iii) verificatie van stemkarakteristieken van een gebruiker, iv) verificatie van het gezichtskenmerken van een gebruiker, v) authenticatie van communicatieve vaardigheden van een gebruiker, vi) verificatie van intellect van een gebruiker, vii) verificatie van karakter van een gebruiker, en viii) authenticatie van de motivatie van een gebruiker.A method, using a system according to any of claims 1-6, for automatic real-time user authentication comprising one or more of i) verification of oral language proficiency, ii) verification of a user's identity, iii) verification of voice characteristics of a user, iv) verification of a user's facial features, v) authentication of a user's communication skills, vi) verification of a user's intellect, vii) verification of a user's character, and viii) authentication of a user's motivation . 8. Werkwijze volgens conclusie 7, verder omvattend het verstrekken van een genormaliseerde score van authenticaties.The method of claim 7, further comprising providing a normalized score of authentications. 9. Werkwijze volgens één der conclusies 7-8, verder omvatten het volgen van scores van de gebruikers en de relatie tussen één of meer gebruikers in een reeks gebruikers.The method of any one of claims 7-8, further comprising monitoring user scores and the relationship between one or more users in a series of users. 10. Werkwijze volgens één der conclusies 7-9, voor gebruik in assessment, in serious gaming, en voor rangschikken.10. Method according to one of claims 7-9, for use in assessment, in serious gaming, and for ranking.
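As an illustration only, the subtractor-based comparison of claim 3 and the normalized score of claim 8 might be sketched as follows. The claims do not specify how the inputs are represented or how the score is normalized, so the numeric feature vectors, the Euclidean distance, the 1/(1 + d) normalization, the `threshold` value, and all function and class names below are assumptions made for this sketch.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


def subtract(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Subtract a first input from a second input, element by element."""
    if len(first) != len(second):
        raise ValueError("feature vectors must have the same length")
    return [b - a for a, b in zip(first, second)]


def normalized_score(first: Sequence[float], second: Sequence[float]) -> float:
    """Map the residual to a score in (0, 1]; 1.0 means identical inputs.

    The Euclidean distance and the 1 / (1 + d) mapping are assumptions.
    """
    residual = subtract(first, second)
    distance = sqrt(sum(d * d for d in residual))
    return 1.0 / (1.0 + distance)


@dataclass
class Decision:
    audio_score: float
    optical_score: float
    accepted: bool


def authenticate(
    first_audio: Sequence[float],
    second_audio: Sequence[float],
    first_optical: Sequence[float],
    second_optical: Sequence[float],
    threshold: float = 0.5,  # hypothetical acceptance threshold
) -> Decision:
    """Compare first-situation and second-situation inputs for both modalities."""
    audio = normalized_score(first_audio, second_audio)
    optical = normalized_score(first_optical, second_optical)
    # Accept only if the weaker of the two modalities still clears the threshold.
    return Decision(audio, optical, min(audio, optical) >= threshold)
```

Requiring the lower of the two scores to pass the threshold is one conservative way to combine modalities; a weighted average would be an equally plausible choice under the same assumptions.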
NL2012300A 2014-02-21 2014-02-21 Automated audio optical system for identity authentication. NL2012300C2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
NL2012300A NL2012300C2 (en) 2014-02-21 2014-02-21 Automated audio optical system for identity authentication.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NL2012300A NL2012300C2 (en) 2014-02-21 2014-02-21 Automated audio optical system for identity authentication.
NL2012300 2014-02-21

Publications (1)

Publication Number Publication Date
NL2012300C2 (en) 2015-08-25

Family

ID=50687583

Family Applications (1)

Application Number Title Priority Date Filing Date
NL2012300A NL2012300C2 (en) 2014-02-21 2014-02-21 Automated audio optical system for identity authentication.

Country Status (1)

Country Link
NL (1) NL2012300C2 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002044999A2 (en) * 2000-11-29 2002-06-06 Siemens Aktiengesellschaft Method and device for determining an error rate of biometric devices
US20040151348A1 (en) * 2003-02-05 2004-08-05 Shuji Ono Authentication apparatus
WO2005034025A1 (en) * 2003-10-08 2005-04-14 Xid Technologies Pte Ltd Individual identity authentication systems
WO2006128171A2 (en) * 2005-05-27 2006-11-30 Porticus Technology, Inc. Method and system for bio-metric voice print authentication
EP1962280A1 (en) * 2006-03-08 2008-08-27 BIOMETRY.com AG Method and network-based biometric system for biometric authentication of an end user
WO2010066269A1 (en) * 2008-12-10 2010-06-17 Agnitio, S.L. Method for verifying the identify of a speaker and related computer readable medium and computer
GB2493849A (en) * 2011-08-19 2013-02-20 Boeing Co A system for speaker identity verification

Similar Documents

Publication Publication Date Title
US10276152B2 (en) System and method for discriminating between speakers for authentication
EP3599606B1 (en) Machine learning for authenticating voice
Kamble et al. Advances in anti-spoofing: from the perspective of ASVspoof challenges
Hautamäki et al. Automatic versus human speaker verification: The case of voice mimicry
US20210327431A1 (en) 'liveness' detection system
WO2017215558A1 (en) Voiceprint recognition method and device
Vestman et al. Voice mimicry attacks assisted by automatic speaker verification
JP2006285205A (en) Speech biometrics system, method, and computer program for determining whether to accept or reject subject for enrollment
Tan et al. A survey on presentation attack detection for automatic speaker verification systems: State-of-the-art, taxonomy, issues and future direction
CN114677634B (en) Surface label identification method and device, electronic equipment and storage medium
US20140163986A1 (en) Voice-based captcha method and apparatus
Firc et al. The dawn of a text-dependent society: Deepfakes as a threat to speech verification systems
Shirvanian et al. Quantifying the breakability of voice assistants
Safavi et al. Comparison of speaker verification performance for adult and child speech
NL2012300C2 (en) Automated audio optical system for identity authentication.
Paul et al. Presence of speech region detection using vowel-like regions and spectral slope information
US20230419736A1 (en) Detection apparatus and spoofing detection method
Stewart et al. 'Liveness' detection system
Akbar A Overview of Spoof Speech Detection for Automatic Speaker Verification
Aljasem Secure Automatic Speaker Verification Systems
Muckenhirn Trustworthy speaker recognition with minimal prior knowledge using neural networks
Martinez et al. Voice Recognition
Girija et al. Multi-Biometric Person Authentication System Using Speech, Signature And Handwriting Features
Shirali-Shahreza et al. Realistic answer verification: An analysis of user errors in a sentence-repetition task

Legal Events

Date Code Title Description
MM Lapsed because of non-payment of the annual fee

Effective date: 20170301