Title: SYSTEM AND METHOD OF USER VERIFICATION
Inventor: Andrew R. Mark
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to biometric user verification in which an entered biometric feature is processed to yield an alphanumeric coded sequence representing its attributes. For increased security, this coded sequence may then be encrypted in a manner specific to both the user and the destination for which authorization is sought.
2. Description of Related Art
Biometrics is the science of identifying a person through the electronic examination of his or her physical characteristics (e.g., fingerprints, voice, or retina patterns). These methods are extraordinarily useful as protections against fraud as well as an impediment to unauthorized electronic access to data networks. Biometric systems allow only those persons possessing the biological characteristic equated with them to present themselves as the authentic person in a non-face-to-face transaction over the telephone or a computer network. Normally, the biometric process involves a comparison of a "live" personal characteristic with one that has been stored on a database. However, the existence of these databases provokes great concern. Not only can a biometric characteristic be used for authentication, it can also be used as a tool to track and monitor a person's movements and transactions. Such knowledge can lead to further information being obtained about the person's likes, dislikes, political viewpoints, sexual habits, and health records. The use of biometric systems can therefore potentially affect Constitutionally protected areas of a person's life.
Further, each type of biometric used brings with it its own special variances and features that must be taken into consideration. For example, many voice verification systems use "hidden Markov models" (HMMs) to identify the speech pattern of a particular person. HMMs relate to very detailed features or nuances of an individual's speech pattern. However, their use increases the rate of false rejections of an authentic user because a person's voice pattern changes according to, among other things, health and mood.
Therefore, for useful voice verification to take place, it becomes necessary to reduce the specificity in analysis of an entered biometric characteristic, but in a manner that does not diminish the system's security and effectiveness. Further, it is highly preferable to the users that a biometric system operate in a manner that can accommodate privacy interests. The present invention fulfills these goals.
SUMMARY OF THE INVENTION
The present invention performs cursory analysis of a user's inputted biometric characteristic for authentication. It compensates for any loss of security by incorporating a user device into its functioning that transmits dynamically changing device identity data to a platform. In the preferred embodiment, the invention authenticates a user's own special device as well as the user's voice pattern, thus reducing the need for high levels of specificity in voice verification.
These and other features of the invention will be more fully understood by reference to the following drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a graphic representation of the relationship of the desired level of security as a function of the number of security measures employed.
Fig. 2 is a flowchart depicting the present invention's processing of inputted speech for both the initial approval of a password and the subsequent use of that spoken password in verifying the speaker.
Fig. 3 is a front view of a control panel of the preferred embodiment of the present invention.
Figs. 4, 5 and 6 are charts each illustrating an utterance of a spoken password and the resulting code sequences generated by the preferred embodiment of the present invention.
Fig. 7 is a chart illustrating the determination of the identification number by the preferred embodiment of the present invention.
Figs. 8A, 8B and 8C are tables illustrating the determination of code parameters which are defined for ranges of three attributes of inputted speech.
Fig. 9 is a chart illustrating the correspondence between the phoneme identification number determined by the preferred embodiment of the present invention and the spoken phoneme.
Fig. 10 is a block diagram indicating an alternative embodiment of the present invention in which additional levels of encryption occur prior to the User ID being received by the ultimate destination.
DETAILED DESCRIPTION OF THE INVENTION
During the course of this description, like numbers will be used to identify like elements in the different figures which illustrate the invention.
It is well known in the security art to employ one or more of the following elements in designing a security system: (1) require the user seeking access to have some physical object (e.g., a door key), (2) require the user to have knowledge of a code or password (e.g., a PIN to access his bank account information), and (3) require that a biometric physical characteristic of the user match a stored model of that user's characteristic. By combining these elements, a high level of security can be attained.
An important feature of the present invention is that both the biometric element and the physical object (a user device) are converted into coded sequences. Accordingly, the only stored data that are used as models for the verification comparison are these codes. Thus, a high level of security is achieved and the user's privacy interests are protected.
In addition, this combination of two different types of data permits the present invention to offer different levels of security. Reliance on device data alone will provide an adequate level of reliability for authentication. A combination of the two types, with a greater portion coming from the user device, would provide medium security. Alternatively, a combination of the two, with a greater portion of data extracted from the biometric, would provide high level security. Variations on these combination levels could provide an increased number of security levels. Further, as depicted in Fig. 1 with respect to the preferred embodiment of the present invention, additional device features could be employed to further increase the level of security attained.
In the preferred embodiment, the present invention utilizes a user's voice as the biometric. The invention performs cursory analysis of a person's voice pattern for authentication. That is, this analysis is not as critical as conventional methods such as HMM analysis. Consequently, it is less likely in the present invention that a given user will be falsely rejected. The present invention compensates for any loss of security by incorporating a user device that transmits dynamically changing personal identification data to a platform. That is, the present invention authenticates a user's own special device as well as the user's voice pattern, thus reducing the need for high levels of specificity in voice verification.
The voice authentication process of the preferred embodiment begins with a registration phase which includes an analysis of an individual's utterances of a proposed pass-phrase. This step is performed before the phrase is ever used for authentication purposes. In one embodiment of the invention, this evaluation entails having the person speak the pass-phrase three times. As depicted in Fig. 2, the system (1) examines the utterance for its phonetic content and derives values based on those components; (2) normalizes the utterance based on "System Adjustment Tones" and derives values based on these modified components; and, (3) imposes wireline impairments on the normalized utterance and again derives the values.
A specific example of this analysis will now be discussed in which the word "SPAGHETTI" is uttered three times. As depicted in Fig. 4, the first utterance results in a matrix of numbers. The initial "S" sound is recognized and, using a table lookup such as the one depicted in Fig. 9, this sound is quantified as phoneme ID# 29. This "S" sound is also quantified with respect to other parameters. That is, the duration, frequency range, and average volume level are similarly quantified by use of range values similar to the ones depicted in Figs. 8C, 8B and 8A, respectively. This analysis is also performed on each of the remaining phonemes that appear in the uttered word. In an alternative embodiment, these frequency and volume level range values are settable by use of system switches, as depicted in Fig. 3.
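The range-based quantification described above can be sketched as a simple table lookup. This is only a minimal illustration: the range boundaries, the code labels, and the function names below are hypothetical, since the actual values belong to Figs. 8A, 8B and 8C and are not reproduced in the text.

```python
from bisect import bisect_right

# Hypothetical range boundaries and code labels. The real ones are defined
# in Figs. 8A-8C and may differ in number, units, and labeling.
DURATION_EDGES_MS = [50, 100, 200, 400]
DURATION_CODES = ["01", "02", "03", "04", "05"]
FREQUENCY_EDGES_HZ = [500, 1000, 2000, 4000]
FREQUENCY_CODES = ["A", "B", "C", "D", "E"]
VOLUME_EDGES_DB = [-30, -20, -10]
VOLUME_CODES = ["1", "2", "3", "4"]


def range_code(value, edges, codes):
    """Return the code of the range that contains value.

    A value equal to an edge is assigned to the higher range.
    """
    return codes[bisect_right(edges, value)]


def quantize_phoneme(phoneme_id, duration_ms, frequency_hz, volume_db):
    """Reduce one measured phoneme to a tuple of coarse codes."""
    return (
        f"{phoneme_id:02d}",
        range_code(duration_ms, DURATION_EDGES_MS, DURATION_CODES),
        range_code(frequency_hz, FREQUENCY_EDGES_HZ, FREQUENCY_CODES),
        range_code(volume_db, VOLUME_EDGES_DB, VOLUME_CODES),
    )
```

Because each attribute is reduced to a coarse range code, small day-to-day variations in a speaker's voice tend to fall within the same range and therefore yield the same code.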
In the preferred embodiment, the system next performs the same analysis for each phoneme after determined adjustments are applied to the speech signal ("Normalization"). Such adjustments compensate for factors such as background noise and type of microphone. Fig. 3 illustrates an alternative embodiment of the invention in which these parameters are either enabled or disabled by use of simple switches.
In the preferred embodiment, a third analysis is then performed for each phoneme based upon the above Normalized utterance further modified by wireline impairments. Such impairments include, for example, those that distinguish cellular from wireline communication. Fig. 3 again illustrates an alternative embodiment wherein these impairments are selected by use of switch mechanisms.
Fig. 4 illustrates the effects of the Normalization process and the addition of wireline impairments on both the speech pattern to be analyzed and the resulting quantified values obtained. It further illustrates an important feature of the present invention. The values obtained for certain phonemes change as the speech pattern to be analyzed is modified in the manner described above. Conversely, certain phonemes are resilient to these variations. These latter phonemes are candidates to be included within an identification number to be used to identify this user.
The above analysis, which related to the user's first utterance of the password "SPAGHETTI", is then repeated for the second and third utterances of this word, as depicted in Figs. 5 and 6, respectively. The summary of the results obtained for all of the utterances is displayed in Fig. 7. It can readily be seen that a phoneme that was characterized as resilient in one utterance (e.g., "E" in the 1st utterance) was not deemed so in another utterance (e.g., "E" in the 2nd utterance). The identification number for this user is obtained by utilizing only those phonemes which were deemed resilient in all three of the utterances.
That is, the system examines all the derived values and determines which values are consistent among each of the three versions and are therefore the most reliable information for authentication purposes. Inconsistent values will be ignored. Any remaining consistent values are strung together to form an identification number. Specifically, the codified phoneme ID#s, durations, and frequency values are appended to yield such an identification number, in this example, 27-B-05-01-A-01-15-E-08.
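The selection of resilient phonemes and their concatenation into an identification number can be sketched as follows. This is an illustrative sketch only. It assumes that every analysis produces the same number of phonemes in the same order, that each phoneme is represented by a tuple of code strings, and that a hypothetical threshold `min_resilient` decides whether enough phonemes survive; none of these details is specified in the text.

```python
def identification_number(utterances, min_resilient=3):
    """Build an identification number from the resilient phonemes.

    utterances: one entry per spoken repetition; each entry is a list of
    analyses (as spoken, normalized, impaired); each analysis is a list of
    code tuples, one tuple per phoneme position.

    Returns the hyphen-joined codes of the positions whose codes are
    identical in every analysis of every utterance, or None when fewer than
    min_resilient positions qualify (a different pass-phrase is needed).
    """
    analyses = [analysis for utterance in utterances for analysis in utterance]
    reference = analyses[0]
    kept = [
        code
        for position, code in enumerate(reference)
        if all(analysis[position] == code for analysis in analyses)
    ]
    if len(kept) < min_resilient:
        return None
    return "-".join(part for code in kept for part in code)
```

A position is discarded as soon as any one of the analyses disagrees, which mirrors the rule that a phoneme must be resilient in all three utterances to contribute to the identification number.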
Should the system not yield sufficient resilient phonemes, the user would be directed to select a different candidate pass-phrase. Once the system determines that the resulting number is sufficiently robust for identification purposes, it notifies the user that the chosen pass-phrase is acceptable for use as an identifier in an authentication. That is, the system determines that the distinctive elements present in the proposed pass-phrase will be discernible regardless of the alterations which may be imposed on it during normal usage (such as type of microphone, background noise, etc.).
In the preferred embodiment, once this evaluation is complete, a user may perform a voice authentication with any destination. As depicted in Fig. 2, when the user makes the connection to the State Machine platform of the present invention (a non-secure state machine), a voice prompt will ask the user to speak the same pass-phrase with which the user registered. When the platform hears the same utterances by the user, it should decode the utterance into the same bracketed results as during registration. The likelihood of the appearance of the same robust phoneme selections, cadences, frequency ranges and relative phoneme levels in combination with full word text recognition provides an excellent, high-security means of user specific verification.
After the person speaks the pass-phrase into a microphone or other input device, the platform, as during the evaluation process, shall break down the sentences into syllables and assign values to the phonetic components (phonemes) as it did during registration. In the embodiment depicted in this example, these components include: (1) an identification number for each syllable; (2) a value for the duration of each syllable; (3) values for the frequency ranges of the syllables; (4) values for average volume of the phonemes; and, (5) a ranking of the frequency levels. Alternative embodiments, both in the speech area and relating to other biometrics, permit variations in the number of such elements to be considered, thereby achieving corresponding variations in the level of security attained.
The result is a number string that represents a person's voice pattern as alphanumeric values. The State Machine (STI) then encrypts the number string with a special algorithm used only for that particular destination and that particular user. In the preferred embodiment, this result is then transmitted to the end destination, which then encrypts the number string a second time. This second encryption produces the identification values that the destination uses to authenticate a person as the person he or she claims to be.
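The two-stage, destination-specific transformation can be illustrated with a short sketch. The text does not disclose the "special algorithm", so HMAC-SHA256 (a keyed hash, not an encryption cipher) is swapped in here purely as a stand-in. The function names, the `user_key` and `destination_secret` parameters, and the key derivation are all hypothetical.

```python
import hashlib
import hmac


def platform_transform(voice_string: str, user_key: bytes, destination_id: str) -> str:
    """First stage, performed by the State Machine platform.

    Derives a key specific to this user and this destination, then applies
    it to the voice number string. HMAC-SHA256 is an assumed stand-in.
    """
    pair_key = hmac.new(user_key, destination_id.encode(), hashlib.sha256).digest()
    return hmac.new(pair_key, voice_string.encode(), hashlib.sha256).hexdigest()


def destination_transform(received: str, destination_secret: bytes) -> str:
    """Second stage, performed by the destination on the received value."""
    return hmac.new(destination_secret, received.encode(), hashlib.sha256).hexdigest()
```

With a construction of this kind, the same voice string yields unrelated values at different destinations, and the value a destination finally stores is never the value that travelled over the connection.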
If the destination system has no record of the identification values being transmitted, the destination will perform a manual authentication which requires the person to input personal information to identify the person as someone authorized to make any transaction. When the destination recognizes the person, it will equate the identification number with that person.
In the future, when the values delivered match what the destination has recorded as an authorized person's values, an authentication may take place. If they do not match, access to the destination's system would be denied to the user. If the values presented are notably similar to the values on record, yet not identical, the system could request personal information from the user via voice prompt (social security number, date of birth, etc.) which would provide the extra security to allow the transaction to be completed.
In the preferred embodiment, additional security is provided by having the individual access the State Machine through a user device which transmits a dynamic signature. Such a device is described in U.S. Patent 5,583,933 issued to Applicant, Andrew Mark, on December 10, 1996, which patent is hereby incorporated by reference. Such a device is designated "SmartKey" in Step 1 of Fig. 10. In the preferred embodiment this dynamic signature is combined with the alphanumeric voice string and the result, when encrypted for the intended destination, creates a device specific user identification number (DSUID). This DSUID provides a high level of security by minimizing the likelihood of a false verification occurring. Further, the DSUID makes it very difficult for the specific user to be monitored as to other transactions he conducts independent of those performed at this destination. This maintenance of user confidentiality is an important feature of the present invention.
The user device provides yet an additional feature. It generates specific tones and transmits these tones as a reference signal for use by the State Machine in normalizing the communication channel. That is, by analysis of a received reference signal, the system can adjust for various communication channel variations such as, but not limited to, type of microphone and type of communication path (e.g., cellular versus wireline).
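One simple way to picture normalization against reference tones is a per-tone gain correction. The sketch below is an assumption-laden illustration, not the method of the invention: it assumes the platform knows the amplitude at which each tone was sent, measures linear amplitudes, and corrects each speech band with the gain of the nearest reference tone. All names are hypothetical.

```python
def channel_gains(sent_levels, received_levels):
    """Compute a correction factor for each reference tone.

    Both arguments map tone frequency (Hz) to linear amplitude.
    """
    gains = {}
    for frequency, sent in sent_levels.items():
        received = received_levels[frequency]
        if received <= 0:
            raise ValueError(f"reference tone at {frequency} Hz was not received")
        gains[frequency] = sent / received
    return gains


def normalize_band_levels(band_levels, gains):
    """Scale each measured speech band by the gain of the nearest tone."""
    tones = sorted(gains)
    normalized = {}
    for frequency, level in band_levels.items():
        nearest = min(tones, key=lambda tone: abs(tone - frequency))
        normalized[frequency] = level * gains[nearest]
    return normalized
```

Because the correction is derived from a signal whose original form is known, the same utterance should produce similar band levels whether it arrives over a cellular or a wireline path.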
By way of summary, the preferred embodiment of the present invention contains the following elements:
A. A State Machine
a. that acts as a user-specific utterance evaluator which determines upon registration:
(i) If a proposed utterance can produce consistent and reliable values repeatedly derived from the phonetic composition of the utterance (i.e., it contains robust elements which can survive impairments caused by voice channel transmission and their subsequent normalization so that the same values may be derived from them reliably over time);
(ii) Whether the impaired iterations contain the same phonetically identifiable elements as the unimpaired elements; and,
(iii) If all the modified and unmodified utterances of the user's proposed pass-phrase derive the same values; AND
b. which during every authentication:
(i) Normalizes the communication channels to eliminate transmission (including microphone and line) variances;
(ii) Evaluates the utterances into phonetic elements (identifies phonemes, bracketed frequencies and duration levels); and
(iii) Converts identified elements into numerical coefficients;
B. A destination specific encryption of the derived device ID; and,
C. A numeric description of the user which is destination specific.
An alternative embodiment of the present invention uses the automatic number identification (ANI) capability of the phone system to identify the number of the calling party. Such a capability is well known and includes the ability to identify the particular phone used when it is serviced by a local or private telephone switching system. In this alternative embodiment, at registration a user can elect to have the ANI number of his home or business phone used in place of the code generated by his "SmartKey". The system simply combines the ANI number with the alphanumeric voice string to create the DSUID to be used for identification. In this embodiment, access to the system from a "foreign phone" would require use of the individual's SmartKey.
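The choice between the ANI number and the SmartKey code can be sketched as a small selection routine. The function and parameter names are hypothetical, and the sketch assumes the platform keeps a set of the phone numbers the user registered.

```python
def select_device_factor(calling_number, registered_numbers, smartkey_code=None):
    """Pick the device factor that will be combined into the DSUID.

    A call from a registered home or business phone uses its ANI number.
    A call from any other ("foreign") phone must supply a SmartKey code.
    """
    if calling_number in registered_numbers:
        return calling_number
    if smartkey_code is not None:
        return smartkey_code
    raise PermissionError("foreign phone: SmartKey required")
```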
Yet another alternative embodiment of the present invention is depicted in Fig. 10, in which an additional level of encryption occurs at Step 3. This additional encryption still further protects the identity of the user and the security of any transactions he performs at other destinations. That is, the encrypted user ID received in Step 4 identifies the user to that particular destination. Even if an interloper attains the actual identity of the user associated with that destination ID, without knowledge of the encryption which occurs at each level, he cannot use this destination ID to track or monitor transactions of the user at other destinations.
While the invention has been described with reference to the above alternative embodiments thereof, it will be appreciated by those of ordinary skill in the art that various modifications can be made to the structure and function of the individual parts of the system without departing from the spirit and scope of the invention as a whole.