WO2000039789A1 - Security and user convenience through voice commands - Google Patents
Security and user convenience through voice commands
- Publication number
- WO2000039789A1 (application PCT/US1999/030839)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- circuitry
- identifying
- user
- action
- verbal
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
Definitions
- TECHNICAL FIELD This invention relates in general to security systems and, more particularly, to using voice commands to improve security and user convenience.
- the sensitive data which may be stored on a computer includes technical data, business data and personnel data.
- Much of this data is accessible to select groups of employees.
- an employee may enter an employee number which is public and a password which is private. Once verification is complete, the user is allowed to view the data.
- This method has several shortcomings. First, if the user is not careful to log off before leaving for any reason, others could gain access to modify the data. Second, passwords can be obtained by others through observation or guesswork. Third, infrequently used passwords are often forgotten, creating administrative difficulties.
- Using speaker verification, a previously stored "voice print" provided by an authorized user at the time of enrollment is compared with a voice print obtained at the time access is requested. A correlation between the stored voice print and the currently spoken voice print provides a score indicating similarities between the speech patterns of the authorized user and the purported authorized user. If the score exceeds a predetermined threshold, it is assumed that the purported authorized user is, in fact, the authorized user. Tests have shown that speaker verification is very accurate. However, current methods of using speaker verification, described hereinbelow, have rendered it less than convenient in many situations.
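The score-and-threshold decision described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a voice print is a short list of numeric features and that the "correlation" is a cosine similarity. The function names, the feature values and the 0.9 threshold are hypothetical and do not come from the patent.

```python
import math

def similarity_score(stored_print, current_print):
    """Cosine similarity between two equal-length feature vectors."""
    if len(stored_print) != len(current_print):
        raise ValueError("voice prints must have the same length")
    dot = sum(a * b for a, b in zip(stored_print, current_print))
    norm = (math.sqrt(sum(a * a for a in stored_print))
            * math.sqrt(sum(b * b for b in current_print)))
    return dot / norm if norm else 0.0

def is_verified(stored_print, current_print, threshold=0.9):
    """Accept the claimed identity only if the score exceeds the threshold."""
    return similarity_score(stored_print, current_print) > threshold

# Made-up feature vectors: one enrollment print and two later utterances.
enrolled = [0.2, 0.7, 0.1, 0.9]
same_speaker = [0.21, 0.68, 0.12, 0.88]
other_speaker = [0.9, 0.1, 0.8, 0.05]
```

With these made-up vectors, `is_verified(enrolled, same_speaker)` accepts and `is_verified(enrolled, other_speaker)` rejects. A real system would derive the features from audio and choose the threshold empirically.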
- a set of one or more authorized users is identified by inherent criteria in response to an action by a purported authorized user.
- a verbal utterance is received from the purported authorized user indicating a desired action and speaker verification is performed on the verbal utterance to confirm the identity of the purported authorized user as one of the set of authorized users. Speech recognition is also performed on the verbal utterance to identify the desired action. If the purported authorized user is identified as one of the set of authorized users, the action is performed.
- the present invention provides significant advantages over the prior art.
- the invention reduces the inconvenience to the user by eliminating the step of explicit identification as provided by the prior art.
- the identity confirmation and identification of the desired action are combined into a single step.
- the system is more convenient for users, without sacrificing security or efficiency.
- Figure 1 illustrates a flow chart describing a prior art system for speaker verification and voice activated commands;
- Figure 2 illustrates a block diagram showing a preferred embodiment of the present invention;
- Figure 3 illustrates a first method of operation using the system of Figure 2;
- Figure 4 illustrates a second method of operation using the system of Figure 2.
- Figure 1 illustrates a prior art system used by a long distance telephone company for verifying authorization to use long distance services.
- the caller explicitly speaks an identifying code into the phone, typically an account number.
- the account number is decoded using speaker independent voice recognition (since the caller is not yet known) in block 12, using well known techniques. If the decoded information relates to a valid user (i.e., if the decoded account number is a valid account number) in decision block 14, the spoken identification is also compared to verification information (a voice print of the authorized user) previously stored in association with the account number (or other unique identification code) to verify that the speaker is, in fact, the person authorized to charge to the account number in block 16. Otherwise, if the decoded information in block 12 did not identify an authorized user, the caller is returned to block 10 and asked to repeat the identification code (the system may terminate after a predetermined number of failed attempts to log on).
- the speaker verification process of block 16 will produce a score indicating whether there is a high probability that the purported authorized user is in fact the claimed authorized user. If, in decision block 18, the system determines that the speaker is the person associated with the identification code of block 12, the caller is allowed to use the system to issue commands to use a service in block 20. In the case of a long distance telephone company, the caller could make the verbal command "call John Doe." This verbal command is identified using speaker dependent (sometimes in conjunction with speaker independent) voice recognition in block 22. The command would then be executed in block 24.
- the system would identify the command "call" (possibly using speaker independent voice recognition techniques) as a request for a telephone connection and would compare the utterance "John Doe" with a number of templates stored for the identified user (using speaker dependent voice recognition techniques). When the best match is found, the system will typically confirm the action by providing audio asking the user "would you like to call John Doe?" where the phrase "John Doe" is a recording made in the identified speaker's own voice. If the user says yes, the telephone number associated with the template is used to make a connection in block 24.
- a system of the type described above can be frustrating for users since it requires that the user memorize an identification code and because it uses multiple steps to verify the authority of the user prior to allowing access to services.
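To make the multi-step nature of the Figure 1 flow concrete, the following is a rough sketch of blocks 10 through 24. It assumes that speech recognition has already turned each utterance into a `(decoded_text, voice_print)` pair and that `verify` is any callable comparing two voice prints. The account layout, return strings and function name are hypothetical.

```python
def prior_art_session(accounts, utterances, verify, max_attempts=3):
    """Walk blocks 10-24 of Figure 1 for a sequence of caller utterances.

    accounts:   account number -> {"voice_print": ..., "names": {name: phone}}
    utterances: iterable of (decoded_text, voice_print) pairs, standing in
                for real speech recognition output (an assumption).
    verify:     callable(stored_print, current_print) -> bool
    """
    utterances = iter(utterances)
    for _ in range(max_attempts):
        code, code_print = next(utterances)       # block 10: caller speaks the code
        account = accounts.get(code)              # blocks 12-14: decode and validate
        if account is None:
            continue                              # invalid code: ask the caller again
        if not verify(account["voice_print"], code_print):  # blocks 16-18
            return "rejected"
        command, _ = next(utterances)             # blocks 20-22: separate command step
        verb, _, name = command.partition(" ")
        if verb == "call" and name in account["names"]:
            return "dialing " + account["names"][name]      # block 24
        return "unknown command"
    return "terminated"                           # too many failed attempts
```

The sketch shows the two separate spoken steps (identification code, then command) that the caller has to go through before a service is executed.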
- FIG. 2 illustrates a block diagram of a system which provides a more efficient manner of verifying authority and responding to requests for access to services.
- This system could be used in a variety of services such as physical access (door locks), electronic services (such as voice activated dialing) or access to data.
- the system 30 includes a microphone 32 for receiving utterances from a user 34.
- the microphone 32 could be, for example, the receiver of a telephone handset, a microphone connected to a computer, or a microphone in the housing of a locking mechanism.
- the microphone 32 is connected to gatekeeper 36, which includes speech recognition logic and speaker verification logic, both of which can be implemented using well known available techniques.
- the speech recognition and speaker verification processing can be either remote or local depending upon the needs of the application.
- Gatekeeper 36 is coupled to data bank 38, which stores information on a plurality of "verbal icons" 40. Each authorized user may have information relating to one or more verbal icons stored in data bank 38. Gatekeeper 36 also outputs service commands to other subsystems; these service commands vary with the environment in which system 30 is used. For example, for use in connection with a lock, the gatekeeper 36 may issue a command to open or close the lock. In a telephone system, the gatekeeper may issue a command to provide a connection to a specified telephone number. In a computer system or network, the gatekeeper 36 may issue a command to provide access to a database, or to allow the user to execute a program.
- Each verbal icon 40 contains the information necessary to perform speech recognition and speaker verification and to initiate a service. For example, when a user enrolls a new verbal icon such as "call John Doe", a template is generated using well known voice recognition techniques. This template is stored as part of the verbal icon and can be used with speech recognition techniques (to identify the command) and speaker verification techniques (to confirm the identity of the user). Further, the service information, indicating an action and/or data such as a telephone number or a database filename, is also stored as part of the verbal icon.
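One possible way to picture a verbal icon and the data bank is as a pair of small record types. This is only a sketch: the class names, field names and the list-of-floats template are assumptions made for illustration, not structures defined in the patent.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ServiceInfo:
    action: str       # e.g. "dial", "open_database", "unlock"
    data: str = ""    # e.g. a telephone number or a database filename

@dataclass
class VerbalIcon:
    owner: str            # the authorized user who enrolled the icon
    phrase: str           # human-readable label only; matching uses the template
    template: list        # features captured at enrollment (hypothetical format)
    service: ServiceInfo  # what the gatekeeper should initiate on a match

@dataclass
class DataBank:
    icons: list = field(default_factory=list)

    def enroll(self, icon):
        """Store a newly enrolled verbal icon."""
        self.icons.append(icon)

    def icons_for(self, owner):
        """Return every verbal icon enrolled by the given user."""
        return [icon for icon in self.icons if icon.owner == owner]
```

Keeping the template and the service information in one record reflects the point made above: a single stored item supports recognition, verification and service initiation.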
- Figure 3 illustrates a flow chart where a user is uniquely identified by inherent criteria.
- the user accesses the gatekeeper and, by doing so, inherently provides identification.
- a gatekeeper 36 could be accessed at a unique telephone number for a long distance service. In this case, the identification process would be transparent to the user.
- a request for access of a database coming from a computer on a network could identify the user through identification of the computer's network address.
- the user provides a verbal request in block 52 by speaking a verbal icon (for which a counterpart verbal icon 40 has already been entered into the data bank 38).
- this may entail stating "call John Doe" into the telephone receiver.
- the user may say “open the personnel database” or “open the accounts receivable database” into the microphone coupled to the computer.
- Templates from the data bank 38 are used in conjunction with the speaker dependent voice recognition system to identify which, if any, of the verbal icons was spoken in block 54.
- the same utterance which is used to identify the verbal icon is also used in conjunction with the template of the same spoken icon to perform speaker verification in block 56.
- If the speaker verification logic of the gatekeeper 36 determines that the user is an imposter in decision block 58, the request is rejected in block 60. Otherwise, the gatekeeper passes information to initiate the service in block 62.
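The Figure 3 flow, in which one utterance serves both recognition and verification, might be sketched as follows. As a simplification, a single cosine similarity with two thresholds stands in for what would really be separate speech recognition and speaker verification algorithms. The data layout, thresholds and function names are hypothetical.

```python
import math

def score(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def handle_request(data_bank, claimed_user, utterance,
                   recognition_floor=0.8, verification_threshold=0.95):
    """Blocks 52-62 of Figure 3 for a user already identified by the access point.

    data_bank: user -> list of (template, service) pairs (hypothetical layout).
    Returns the service to initiate, or None if the request is rejected.
    """
    icons = data_bank.get(claimed_user, [])
    if not icons:
        return None
    # Block 54: recognition picks the best match among the user's own icons.
    template, service = max(icons, key=lambda icon: score(icon[0], utterance))
    best = score(template, utterance)
    if best < recognition_floor:
        return None                  # none of the verbal icons was spoken
    # Blocks 56-58: the same utterance is verified against the same template.
    if best <= verification_threshold:
        return None                  # block 60: treated as an imposter
    return service                   # block 62: initiate the service
```

Note that the caller never states who they are: `claimed_user` comes from how the gatekeeper was reached, and the spoken command itself supplies the verification data.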
- a long distance service could benefit from this system.
- a caller could place a call to a personal toll or toll free destination telephone number which was uniquely identified with the caller. This number would serve to connect him or her to the gatekeeper system 36. Since the phone number is associated with the caller, it also provides the gatekeeper 36 with the caller's claimed identity. The caller then speaks a pre-established verbal icon to request a service, such as "call John Blake" (a voice activated dialing command which serves to place a call to the telephone number associated with the command). The gatekeeper 36 compares the utterance with verbal icons 40 enrolled in conjunction with the claimed user and determines which icon was spoken by the purported authorized user.
- verbal icons which may be used to initiate other services in the method of Figure 3 could include “get my calendar” or “show portfolio” to allow access to data stored on a computer, “retrieve voice mail” or “get dial tone.”
- a unique identifying code with an access by a user.
- a computer in the accounts receivable department of a company may be used by a number of individuals, each of which may have different rights to use various databases.
- Figure 4 illustrates a method using the system of Figure 2 wherein voice activated commands and verification are provided without resort to cumbersome security procedures on the part of the user.
- a user accesses the gatekeeper, and the act of accessing the gatekeeper identifies the user as one of a predetermined group of users, each of whom has pre-registered verbal icons.
- the use of the computer in the accounts receivable department could identify the user as one of the group of people authorized to use that computer.
- a group of people could be assigned a number through which long distance telephone service was available.
- a group of people could be authorized to use a locked conference room.
- the user speaks a verbal icon pre-established in the system 30 and stored in the data bank 38.
- the current voice print would be compared with the verbal icons 40 stored for each of the group of people determined to be authorized in block 70. Once the matching verbal icon is found in block 74, it identifies the user uniquely. The current utterance can then be used to perform speaker verification in block 76 along with the previously stored template associated with the matching verbal icon. If the user is not verified in decision block 78, the request associated with the verbal icon is rejected in block 80; otherwise, the request is performed in block 82.
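The group variant of Figure 4 can be sketched in the same toy model: the access point only narrows the caller down to a group, and the best-matching verbal icon across the whole group names the individual. The cosine score, the tuple layout and the threshold are illustrative assumptions rather than anything specified in the patent.

```python
import math

def score(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def handle_group_request(group_icons, utterance, verification_threshold=0.95):
    """Blocks 72-82 of Figure 4.

    group_icons: list of (user, template, service) tuples covering every
                 member of the group identified in block 70 (hypothetical layout).
    Returns (user, service) on success, or None if the request is rejected.
    """
    if not group_icons:
        return None
    # Block 74: search all members' icons; the best match identifies the user.
    user, template, service = max(
        group_icons, key=lambda icon: score(icon[1], utterance))
    # Blocks 76-78: verify the same utterance against that icon's template.
    if score(template, utterance) <= verification_threshold:
        return None                  # block 80: request rejected
    return user, service             # block 82: request performed
```

Two members can enroll the same phrase for the same service. Their templates differ because the voices differ, which is what lets the match identify the individual.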
- the limit on the number of people in a group using this method depends upon a number of factors which affect performance of the system.
- the time to match a verbal icon will depend upon the number of possible verbal icons which are compared. Therefore, if each potential user in a group uses a small number of verbal icons, then the group can have a relatively larger number of members.
- Another factor is the processing capabilities of the system; the faster the system, the more verbal icons can be compared in a given time interval.
- a faster system can handle groups with greater numbers of members or greater numbers of verbal icons per member relative to a slower system.
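The trade-off between group size, icons per member and processing speed can be put into a back-of-the-envelope calculation. The sketch assumes a simple linear search in which every stored icon costs one comparison of roughly constant duration. The function name and the figures used with it are hypothetical.

```python
def max_group_size(comparisons_per_second, time_budget_seconds, icons_per_member):
    """Largest group whose icons can all be compared within the time budget."""
    if icons_per_member <= 0:
        raise ValueError("icons_per_member must be positive")
    total_comparisons = int(comparisons_per_second * time_budget_seconds)
    return total_comparisons // icons_per_member
```

For example, at 200 comparisons per second and a half-second budget, `max_group_size(200, 0.5, 5)` allows 20 members with five icons each. Doubling the speed doubles the group size, and doubling the icons per member halves it.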
- the verification threshold employed for highly sensitive databases may be narrower (higher ratio of rejection) than the verification threshold for less sensitive information.
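A per-service verification threshold could be kept in a simple lookup table, as in the sketch below. The resource names, the numeric thresholds and the default value are invented for illustration; the patent does not give specific values.

```python
# Hypothetical thresholds: more sensitive resources demand a higher score.
THRESHOLDS = {
    "personnel database": 0.98,
    "calendar": 0.90,
}
DEFAULT_THRESHOLD = 0.95

def is_verified_for(resource, score):
    """Apply the threshold configured for the requested resource."""
    return score > THRESHOLDS.get(resource, DEFAULT_THRESHOLD)
```

With these values, a score of 0.93 is enough for "calendar" but not for "personnel database". The stricter setting rejects more imposters and also more genuine users.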
- the present invention provides significant advantages over the prior art.
- First, identification of a user's putative identity is completely transparent to the user, and the user does not need to perform a separate step to inform the gatekeeper of his or her identity.
- Second, the system uses the user's spoken request for service to transparently provide verification data to the system. Again, the caller does not knowingly provide verification information as the system is being used, nor is there a separate step to provide the information.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99967596A EP1147513A1 (en) | 1998-12-29 | 1999-12-27 | Security and user convenience through voice commands |
AU23857/00A AU2385700A (en) | 1998-12-29 | 1999-12-27 | Security and user convenience through voice commands |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22221498A | 1998-12-29 | 1998-12-29 | |
US09/222,214 | 1998-12-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000039789A1 true WO2000039789A1 (en) | 2000-07-06 |
Family
ID=22831344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/030839 WO2000039789A1 (en) | 1998-12-29 | 1999-12-27 | Security and user convenience through voice commands |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1147513A1 (en) |
AU (1) | AU2385700A (en) |
WO (1) | WO2000039789A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1063637A1 (en) * | 1999-06-21 | 2000-12-27 | Matsushita Electric Industrial Co., Ltd. | Voice-actuated control apparatus and method of control using the same |
EP1189206A2 (en) * | 2000-09-19 | 2002-03-20 | Thomson Licensing S.A. | Voice control of electronic devices |
EP1349146A1 (en) * | 2002-03-28 | 2003-10-01 | Fujitsu Limited | Method of and apparatus for controlling devices |
EP1203368B1 (en) * | 1999-06-21 | 2003-11-05 | Palux AG | Control device for controlling vending machines |
EP1426924A1 (en) * | 2002-12-03 | 2004-06-09 | Alcatel | Speaker recognition for rejecting background speakers |
EP1513136A1 (en) * | 2003-09-03 | 2005-03-09 | Samsung Electronics Co., Ltd. | Audio/video apparatus and method for providing personalized services through voice and speaker recognition |
DE102009051508A1 (en) * | 2009-10-30 | 2011-05-05 | Continental Automotive Gmbh | Apparatus, system and method for voice dialogue activation and / or management |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE29618130U1 (en) * | 1996-10-02 | 1996-12-19 | Holtek Microelectronics Inc., Hsinchu | Remote control device that can identify the language of a specific user |
US5717743A (en) * | 1992-12-16 | 1998-02-10 | Texas Instruments Incorporated | Transparent telephone access system using voice authorization |
US5832063A (en) * | 1996-02-29 | 1998-11-03 | Nynex Science & Technology, Inc. | Methods and apparatus for performing speaker independent recognition of commands in parallel with speaker dependent recognition of names, words or phrases |
-
1999
- 1999-12-27 EP EP99967596A patent/EP1147513A1/en not_active Withdrawn
- 1999-12-27 WO PCT/US1999/030839 patent/WO2000039789A1/en not_active Application Discontinuation
- 1999-12-27 AU AU23857/00A patent/AU2385700A/en not_active Abandoned
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5717743A (en) * | 1992-12-16 | 1998-02-10 | Texas Instruments Incorporated | Transparent telephone access system using voice authorization |
US5832063A (en) * | 1996-02-29 | 1998-11-03 | Nynex Science & Technology, Inc. | Methods and apparatus for performing speaker independent recognition of commands in parallel with speaker dependent recognition of names, words or phrases |
DE29618130U1 (en) * | 1996-10-02 | 1996-12-19 | Holtek Microelectronics Inc., Hsinchu | Remote control device that can identify the language of a specific user |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1063637A1 (en) * | 1999-06-21 | 2000-12-27 | Matsushita Electric Industrial Co., Ltd. | Voice-actuated control apparatus and method of control using the same |
EP1203368B1 (en) * | 1999-06-21 | 2003-11-05 | Palux AG | Control device for controlling vending machines |
EP1189206A2 (en) * | 2000-09-19 | 2002-03-20 | Thomson Licensing S.A. | Voice control of electronic devices |
EP1189206A3 (en) * | 2000-09-19 | 2002-08-21 | Thomson Licensing S.A. | Voice control of electronic devices |
KR100845476B1 (en) * | 2000-09-19 | 2008-07-14 | 톰슨 라이센싱 | Method and apparatus for the voice control of a device appertaining to consumer electronics |
US6842510B2 (en) | 2002-03-28 | 2005-01-11 | Fujitsu Limited | Method of and apparatus for controlling devices |
EP1349146A1 (en) * | 2002-03-28 | 2003-10-01 | Fujitsu Limited | Method of and apparatus for controlling devices |
KR100881243B1 (en) * | 2002-03-28 | 2009-02-05 | 후지쯔 가부시끼가이샤 | Method and apparatus for controlling devices |
EP1426924A1 (en) * | 2002-12-03 | 2004-06-09 | Alcatel | Speaker recognition for rejecting background speakers |
EP1513136A1 (en) * | 2003-09-03 | 2005-03-09 | Samsung Electronics Co., Ltd. | Audio/video apparatus and method for providing personalized services through voice and speaker recognition |
DE102009051508A1 (en) * | 2009-10-30 | 2011-05-05 | Continental Automotive Gmbh | Apparatus, system and method for voice dialogue activation and / or management |
US9020823B2 (en) | 2009-10-30 | 2015-04-28 | Continental Automotive Gmbh | Apparatus, system and method for voice dialogue activation and/or conduct |
DE102009051508B4 (en) * | 2009-10-30 | 2020-12-03 | Continental Automotive Gmbh | Device, system and method for voice dialog activation and guidance |
Also Published As
Publication number | Publication date |
---|---|
AU2385700A (en) | 2000-07-31 |
EP1147513A1 (en) | 2001-10-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5806040A (en) | Speed controlled telephone credit card verification system | |
JP3904608B2 (en) | Speaker verification method | |
US6389397B1 (en) | User identification system using improved voice print identification processing | |
EP0621532B1 (en) | Password verification system | |
AU673480B2 (en) | Voice command control and verification system and method | |
US7278028B1 (en) | Systems and methods for cross-hatching biometrics with other identifying data | |
US20030171930A1 (en) | Computer telephony system to access secure resources | |
US20030163739A1 (en) | Robust multi-factor authentication for secure application environments | |
US5717743A (en) | Transparent telephone access system using voice authorization | |
US6931375B1 (en) | Speaker verification method | |
US20060286969A1 (en) | Personal authentication system, apparatus and method | |
US20140350932A1 (en) | Voice print identification portal | |
US20060106605A1 (en) | Biometric record management | |
US20050273626A1 (en) | System and method for portable authentication | |
US20030182182A1 (en) | Biometrics-based voting | |
US9373325B2 (en) | Method of accessing a dial-up service | |
JP2001505688A (en) | Speech recognition for information system access and transaction processing | |
EP1147513A1 (en) | Security and user convenience through voice commands | |
WO2000007087A1 (en) | System of accessing crypted data using user authentication | |
Alver | Voice Biometrics in Financial Services | |
US9978373B2 (en) | Method of accessing a dial-up service | |
Shaw | Voice verification—Authenticating remote users over the telephone |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 23857/00 Country of ref document: AU |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1999967596 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 1999967596 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1999967596 Country of ref document: EP |