WO2015059365A1 - Audiovisual associative authentication method and related system - Google Patents
Audiovisual associative authentication method and related system
- Publication number
- WO2015059365A1 (PCT/FI2014/050807)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- authentication
- cues
- terminal
- service
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/36—User authentication by graphic or iconic representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0861—Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2103—Challenge-response
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2111—Location-sensitive, e.g. geographical location, GPS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0853—Network architectures or network communication protocols for network security for authentication of entities using an additional device, e.g. smartcard, SIM or a different communication terminal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/107—Network architectures or network communication protocols for network security for controlling access to devices or network resources wherein the security policies are location-dependent, e.g. entities privileges depend on current location or allowing specific operations only from locally connected terminals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/60—Context-dependent security
- H04W12/69—Identity-dependent
- H04W12/77—Graphical identity
Definitions
- the invention pertains to computers and related communications infrastructures.
- the invention concerns authentication relative to an electronic device or electronic service.
- Access control in conjunction with e.g. network services may imply user identification, which can generally be based on a variety of different approaches. For example, three categories may be considered: anonymous, standard and strong identification. In the anonymous case, the service users do not have to be, and are not, identified. Standard, or 'normal', identification may refer to what the requestor for access knows, such as a password, or bears, such as a physical security token. Such a token may include a password-generating device (e.g. SecurID™), a list of one-time passwords, a smart card and a reader, or a one-time password transmitted to a mobile terminal.
- strong identification may be based on a biometric property, particularly a biometrically measurable property, of a user, such as a fingerprint or retina, or on a security token the transfer of which between persons is difficult, such as a mobile terminal including a PKI (Public Key Infrastructure) certificate requiring entry of a PIN (Personal Identification Number) code upon each instance of use.
- network service -related authentication, i.e. reliable identification of the user, may be divided into weak and strong authentication.
- Weak authentication may refer to the use of a single standard-category identification means such as a user ID/password pair.
- strongish authentication may apply at least two standard identification measures utilizing different techniques. With strong authentication, at least one of the identification measures should be strong. Notwithstanding the various advancements that have taken place during recent years in the context of user and service identification, authentication, and related secure data transfer, some defects still remain and are next briefly and non-exhaustively reviewed together with useful general background information.
- access control methods to network services include push and pull methods.
- In pull methods, a user may first identify oneself anonymously to a network service providing a login screen in return. The user may then type in the user ID and a corresponding password, whereupon he/she may directly access the service or be funneled into a subsequent authentication phase.
- a network server may first transmit information to the e-mail address of the user in order to authorize accessing the service. Preferably only the user knows the password of the e-mail account. The users are often reluctant to manually manage a plurality of user IDs and corresponding passwords.
- a password is typically enabled by an access control management entity that may also store the password locally. If the security of the data repository is later jeopardized, third parties may acquire all the passwords stored therein. Also, if the user forgets the password or it has to be changed for some other reason, actions have to be taken by the user and optionally the service provider, and the user has to memorize the new password. Further, the adoption of a personal, potentially network service-specific token such as a smartcard (e.g. SecurID™) and a related reader device may require intensive training. The increase in the use of smart cards correspondingly raises the risk of thefts and the need to provide replacement cards.
- the objective is to at least alleviate one or more problems described hereinabove regarding the usability and security issues, such as authentication, associated with the contemporary remote computer systems and related electronic services such as online services.
- the objective is achieved by the system and method in accordance with the present invention.
- the suggested solution cleverly harnesses, among other factors, the associative memory of a user for electronic authentication as described in more detail hereinafter.
- a system for authenticating a user of an electronic service comprises at least one server apparatus preferably provided with a processing entity and a memory entity for processing and storing data, respectively, and a data transfer entity for receiving and sending data, the system being configured to: store, for a number of users, a plurality of personal voiceprints, each of which is linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication of the users, the cues being user-selected, provided or created; pick, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, and provide the cues for representation to the user as a challenge; receive sound data indicative of the voice response uttered by the user to the represented cues; determine, on the basis of the sound data, the represented cues and the linked voiceprints, whether the response has been uttered by the existing user of said number of users; and, provided that this seems to be the case, elevate the authentication status of the user accordingly.
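As a whole, the claimed flow reduces to: enroll voiceprints per cue, issue a cue challenge, score the uttered response, and elevate the authentication status on a match. The TypeScript sketch below illustrates that flow under stated assumptions; the `VoiceprintStore` interface, the scorer signature and the threshold value are hypothetical names for illustration, not elements of the patent.

```typescript
// Minimal sketch of the claimed challenge-response flow; all names below
// (VoiceprintStore, MatchScore, MATCH_THRESHOLD) are assumptions.
interface Voiceprint { userId: string; cueId: string; features: number[]; }

interface VoiceprintStore {
  cueIdsFor(userId: string): string[];                      // cues with stored voiceprints
  voiceprintFor(userId: string, cueId: string): Voiceprint | undefined;
}

type MatchScore = (print: Voiceprint, sound: Float32Array) => number; // similarity in [0, 1]
const MATCH_THRESHOLD = 0.8; // assumed acceptance threshold

// Pick a number of cues the user has enrolled voiceprints for (crude shuffle).
function pickChallenge(store: VoiceprintStore, userId: string, n = 3): string[] {
  return [...store.cueIdsFor(userId)].sort(() => Math.random() - 0.5).slice(0, n);
}

// Verify one sound segment per represented cue; on success the caller would
// elevate the user's authentication status, on failure keep or lower it.
function verifyResponse(
  store: VoiceprintStore, userId: string,
  challenge: string[], segments: Float32Array[], score: MatchScore
): boolean {
  return challenge.every((cueId, i) => {
    const print = store.voiceprintFor(userId, cueId);
    return !!print && score(print, segments[i]) >= MATCH_THRESHOLD;
  });
}
```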
- the sound data is received from a mobile (terminal) device.
- the mobile device advantageously incorporates a microphone for capturing voice and encoding it into digital format.
- the system maintains or has access to information linking service/application users, or user id's, and mobile devices or mobile identities, e.g. IMEI code or IMSI code (or other smart card), respectively, together.
- mobile phone number could be utilized for the purpose.
- the cues are indicated to the user via a first terminal device such as a laptop or desktop computer.
- Service data in general and/or the cues may be provided as browser data such as web page data.
- the first terminal device includes or is at least connected to a display, a projector and/or a loudspeaker with the necessary digital-to-analogue conversion means for the purpose.
- the sound data is then obtained via a second terminal device, preferably via the aforementioned mobile device like a cellular phone, typically a smartphone, or a communications-enabled PDA/tablet, configured to capture the sound signal incorporating the user's voice (uttering the response to the cues) and convert it into digital sound data forwarded towards the system.
- the mobile device may be provided with a message, such as an SMS message, triggered by the system in order to verify that the user requiring voice-based authentication has the mobile device with him/her.
- the user may have logged into an electronic service using a certain user id that is associated with the mobile device. Such an association may be dynamically controlled in the service settings by the user, for instance.
- the user has to trigger sending a reply, optionally via the same mobile device or via the first terminal, optionally provided with a secret such as a password, or other acknowledgement linkable by the system with the user (id).
- the cues may be represented visually and/or audibly utilizing e.g. a web browser at the first user terminal.
- the user provides the response using the second terminal such as a mobile terminal.
- the first terminal may refer to e.g. a desktop or laptop computer that may be personal or in a wider use.
- the second terminal particularly if being a mobile terminal such as a smartphone, is typically a personal device associated with a certain user only, or at least rather limited group of users.
- the system may be configured to link or associate the first and second terminals together relative to the ongoing session and authentication task.
- actions taken utilizing the second terminal may be linked with activity or response at the first terminal, e.g. browser thereat, by the system.
- the system may be configured to dynamically allocate a temporary id such as so-called session id to the first terminal.
- This id may comprise a socket id.
- the first terminal may then be configured to indicate the id to the user and/or second terminal.
- a visual, optionally coded, representation applying a QR (Quick Response) code, preferably also including other information such as the user id (to the service) and/or domain information, may be utilized.
- the second terminal may be then configured to wirelessly obtain the id.
- the second terminal may read or scan, e.g. via camera and associated code reader software, the visual representation and decode it.
- the same application, e.g. a Java application, which is applied for receiving voice input from the user, is utilized for delivering the obtained id back towards the system, which then associates the two terminals and the session running in the first terminal (via the browser) together.
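The terminal-linking step can be pictured with a short sketch in the spirit of the Node.js/socket.io entities named elsewhere in this text; the event names and the Map-based session bookkeeping below are assumptions for illustration, not the patent's actual protocol.

```typescript
// Sketch: linking a browser session and a mobile terminal via a dynamic id.
import { Server } from "socket.io";

const io = new Server(3000);
const sessions = new Map<string, { browserSocketId: string; mobileDeviceId?: string }>();

io.on("connection", (socket) => {
  // First terminal: the browser registers and receives a dynamic session id
  // (here simply its socket id), to be shown to the user, e.g. in a QR code.
  socket.on("register-browser", () => {
    sessions.set(socket.id, { browserSocketId: socket.id });
    socket.emit("session-id", socket.id);
  });

  // Second terminal: the mobile app returns the scanned id together with its
  // own identity, so the server can tie both terminals to one ongoing session.
  socket.on("link-mobile", (msg: { sessionId: string; deviceId: string }) => {
    const session = sessions.get(msg.sessionId);
    if (session) {
      session.mobileDeviceId = msg.deviceId;
      io.to(session.browserSocketId).emit("mobile-linked");
    }
  });
});
```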
- the determination tasks may include a number of mapping, feature extraction, and/or comparison actions according to predetermined logic by which the match between the obtained sound data and existing voiceprint data relative to the indicated existing user is confirmed, i.e. the authentication is considered successful in the light of such voice-based authentication factor. In the case of no match, i.e. failed voice-related authentication, the authentication status may remain as is or be lowered (or access completely denied).
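One plausible instantiation of the comparison action is a simple similarity measure between feature vectors, as sketched below. Production speaker-verification systems use statistical or neural models instead, so the cosine similarity here is illustrative only.

```typescript
// Illustrative comparison step: cosine similarity between the feature vector
// extracted from incoming sound data and the stored voiceprint vector.
function cosineSimilarity(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA > 0 && normB > 0 ? dot / (Math.sqrt(normA) * Math.sqrt(normB)) : 0;
}
// A score at or above a tuned threshold would count as a match; below it, the
// authentication status is left as is or lowered, as described above.
```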
- elevating the gained (current) authentication status in connection with successful voice-based authentication may include at least one action selected from the group consisting of: enabling service access, enabling a new service feature, enabling the use of a new application, enabling a new communication method, and enabling the (user) adjustment of service settings or preferences.
- a visual cue defines a graphical image that is rendered on a display device for perception and visual inspection by the user.
- the image may define or comprise a graphical pattern, drawing or e.g. a digital photograph.
- the image is complex enough so that the related (voice) association the user has also bears the necessary complexity and/or length in view of sound data analysis (too short or too simple a voice input/voiceprint makes reliable determinations difficult).
- audiovisual cue includes a video clip or video file with associated integral or separate sound file(s).
- audiovisual cue may incorporate at least one graphical image and related sound.
- video and audiovisual cues are indicated by e.g. a screenshot or other descriptive graphical image, and/or text, shown in the service UI.
- the image or a dedicated UI feature, e.g. a button symbol, may be utilized to trigger the reproduction of the cue.
- video cue(s) may playback automatically, optionally repeatedly.
- the audio cue includes sound typically in a form of at least one sound file that may be e.g. monophonic or stereophonic.
- the sound may represent music, sound scenery or landscape (e.g. desert sounds, waterfall, city or traffic sounds, etc.), various noises or e.g. speech.
- An audio cue may, despite its non-graphical/invisible nature, still be associated with an image represented via the service UI.
- the image used to indicate an audio cue is preferably at least substantially the same (i.e. non-unique) for all audio cues, but anyhow enables visualizing an audio cue in the UI among e.g. visual or audiovisual cues of the overall challenge, the cues optionally being rendered as a horizontal sequence of images (typically one image per cue).
- the image may be active and selecting, or 'clicking' it, advantageously then triggers the audible reproduction of the cue.
- a common UI feature such as an icon may be provided to trigger sequential reproduction of all audio, and optionally audiovisual, cues.
- basically all the cues may be indicated in a (horizontal) row or column, or using other configuration, via the service UI.
- Video, audiovisual and/or audio cues may at least have a representative, generic or characterizing, graphical image associated with them as discussed above, while graphical (image) cues are preferably shown as such.
- At least one cue is selected or provided, optionally created, by the user himself/herself.
- a plurality of predetermined cues may be offered by the system to the user for review via the service UI wherefrom the user may select one or more suitable, e.g. the most memorable, cues to be associated with voiceprints.
- a plurality of cues is associated with each user.
- a voiceprint, i.e. a voice-based fingerprint, is established for each selected cue based on the user's voice response thereto.
- a voiceprint of the present invention thus advantageously characterizes, or is used to characterize, both the user (utterer) and the spoken message (the cue or substantive personal association with the cue) itself. Recording may be effectuated using the audio input features available in a terminal device such as microphone, analogue-to-digital conversion means, encoder, etc.
- the established service connection is maintained based on a number of security measures the outcome of which is used to determine the future of the service connection, i.e. let it remain, terminate it, or change it, for example.
- fingerprint methodology may be applied.
- a user terminal may initially, upon service log-in, for instance, provide a fingerprint based on a number of predetermined elements, such as browser version data, OS version data, obtained Java entity version data, and/or obtained executable version data.
- Version data may include ID data such as version identifier or generally the identifier (application or software name, for example) of the associated element.
- the arrangement may be configured to request new fingerprint in response to an event such as a timer or other temporal event (timed requests, e.g. on a regular basis).
- the client may provide fingerprints independently based on timer and/or some other event, for instance.
- the arrangement may utilize the most recent fingerprint and a number of earlier fingerprints, e.g. the initial one, in a procedure such as a comparison procedure.
- the procedure may be executed to determine the validity of the current access (user). For example, if the compared fingerprints match, a positive outcome may be determined indicating no increased security risk and the connection may remain as is. A mismatch may trigger a further security procedure or terminating the connection.
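A minimal sketch of such a fingerprint comparison follows; the field names (`browser`, `os`, etc.) are illustrative assumptions rather than any format prescribed by the text.

```typescript
// Sketch of the client fingerprint check; field names are illustrative only.
interface ClientFingerprint {
  browser: string;      // e.g. browser name + version identifier
  os: string;           // e.g. OS name + version identifier
  javaEntity?: string;  // obtained Java entity version data, if any
  executable?: string;  // obtained executable version data, if any
}

function sameFingerprint(a: ClientFingerprint, b: ClientFingerprint): boolean {
  return a.browser === b.browser && a.os === b.os
    && a.javaEntity === b.javaEntity && a.executable === b.executable;
}

// Compare the most recent fingerprint against an earlier one (e.g. the initial
// log-in fingerprint); a mismatch triggers a further security procedure or
// termination rather than silently continuing the session.
function checkSession(initial: ClientFingerprint, latest: ClientFingerprint): "keep" | "challenge" {
  return sameFingerprint(initial, latest) ? "keep" : "challenge";
}
```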
- the system is location-aware advantageously in a sense it utilizes location information to authenticate the user.
- a number of predetermined allowed and/or non-allowed/blocked locations may be associated with each user of the arrangement.
- the location may refer to at least one element selected from the group consisting of: address, network address, sub-network, IP (Internet Protocol) address, IP sub-network, cell, cell-ID, street address, one or more coordinates, GPS coordinates, GLONASS coordinates, district, town, country, continent, distance to a predetermined location, and direction from a predetermined location.
- Failed location-based authentication may result in a failed overall authentication (denied access), or alternatively, a limited functionality such as limited access to the service may be provided.
- Each authentication factor may be associated with a characterizing weight (effect) in the authentication process.
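For instance, weighted factors could be combined into a single authentication level as sketched below; the factor names, weight values and the additive combination are assumptions, since the text only states that each factor may carry a characterizing weight.

```typescript
// Sketch: per-factor weights folded into one authentication level.
type Factor = "password" | "voice" | "location" | "deviceFingerprint";

const WEIGHTS: Record<Factor, number> = {
  password: 0.2,
  voice: 0.5,
  location: 0.15,
  deviceFingerprint: 0.15,
};

// Sum the weights of the factors that passed; a service feature may then
// require a certain minimum level before it is made available in the UI.
function authenticationLevel(passedFactors: Set<Factor>): number {
  let level = 0;
  for (const f of passedFactors) level += WEIGHTS[f];
  return level;
}
```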
- the system may be configured to transmit a code, preferably as browser data such as web page data, during a communication session associated with a predetermined user of the service for visualization and subsequent input by the user. Further, the system may be configured to receive data indicative of the inputted code and of the location of the terminal device applied for transmitting the data, determine on the basis of the data and predetermined locations associated with the user whether the user currently is in allowed location, and provided that this seems to be the case on the basis of the data, raise the gained authentication status of the user regarding at least the current communication session. Preferably the data is received from a mobile (terminal) device.
- the code is indicated to the user via a first terminal device such as a laptop or desktop computer.
- a code dedicated for the purpose, e.g. the aforesaid temporary id such as the socket id, may be utilized in this context as well.
- a certain location may be associated with a certain user by "knowing" the user, which may refer to optionally automatically profiling and learning the user via monitoring one's habits such as location and optionally movements.
- a number of common, or allowed, locations may be determined and subsequently utilized for authentication purposes.
- the user may manually register a number of allowed locations for utilizing the solution in the arrangement.
- by knowing the user and/or his/her gear and utilizing the related information, such as location information, in connection with access control, conducting automated attacks such as different dictionary attacks against the service may be rendered more futile.
- the location of the user (terminal) and/or the data route may be estimated, e.g. based on transmission delays: delays relating to data packets may be compared with delays associated with a number of e.g. location-wise known references such as reference network nodes, which may include routers, servers, switches, firewalls, terminals, etc.
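A crude sketch of this delay comparison follows, assuming a hypothetical list of reference nodes with known typical round-trip times.

```typescript
// Sketch: estimate position by comparing a measured packet round-trip time
// against location-wise known reference nodes (a crude illustrative proxy).
interface ReferenceNode { name: string; location: string; typicalRttMs: number; }

function nearestReference(
  measuredRttMs: number, refs: ReferenceNode[]
): ReferenceNode | undefined {
  let best: ReferenceNode | undefined;
  for (const r of refs) {
    if (!best || Math.abs(r.typicalRttMs - measuredRttMs)
               < Math.abs(best.typicalRttMs - measuredRttMs)) {
      best = r;
    }
  }
  return best; // the reference whose typical delay best matches the route
}
```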
- the electronic service is a cloud service (running in a cloud). Additionally or alternatively, the service may arrange virtual desktop and/or remote desktop to the user, for instance.
- an electronic device for authenticating a person comprises a voiceprint repository configured to store, for a number of users including at least one user, a plurality of personal voiceprints, each of which is linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication, the cues being user-selected, user-provided or user-created; an authentication entity configured to pick, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, and represent the number of selected cues to the person as a challenge; and a response provision means for obtaining sound data indicative of the voice response uttered by the person to the represented cues, whereupon the authentication entity is configured to determine, on the basis of the sound data, the represented cues and the voiceprints linked therewith, whether the response has been uttered by the existing user of said number of users, and, provided that this seems to be the case, to elevate the authentication status of the person acknowledged as the existing user according to the determination.
- a method for authenticating a subject person to be executed by one or more electronic devices comprising storing, for a number of users, a plurality of personal voiceprints each of which linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication of the users, cues being user-selected, provided or created, picking, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, to be represented as a challenge, receiving a response incorporating sound data indicative of the voice response uttered by the person to the represented cues, determining on the basis of the sound data, the represented cues and linked voiceprints, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case, elevating the authentication status of the person acknowledged as the existing user according to the determination, whereupon e.g. a responsive access control action may be executed.
- the utility of the present invention follows from a plurality of issues depending on each particular embodiment.
- voice is exploited as an authentication factor together with features of speech recognition (i.e. recognition of the voice input message content).
- other factors, e.g. location data indicative of the location of the user (terminal), may be applied for authentication purposes.
- device and/or service users may be provided with authentication challenge as a number of cues such as images, videos and/or sounds for which they will themselves initially determine the correct response they want to utilize in the future during authentication.
- the user may simply associate each challenge with the first associative, personal response that comes to mind and apply that memory image in forthcoming authentication events based on voice recognition: for each cue a voiceprint is recorded indicative of the correct response, whereupon the user is required to repeat the voice response upon authentication when the cue is represented to him/her as a challenge.
- security enhancing procedure is offered for linking a number of terminals together from the standpoint of electronic service, related authentication and ongoing session.
- electronic devices such as computers and mobile terminals may be cultivated with the suggested, optionally self-contained, authentication technology that enables authentication on operating system level, which is basically hardware manufacturer independent.
- the outcome of an authentication procedure may be mapped into a corresponding resource or feature (access) control action in the device.
- overall access to the device resources, e.g. the graphical UI and/or operating system, or access to specific application(s) or application feature(s), may be controlled by the authentication procedure, which may also be considered a next or new generation password mechanism, an identity based solution.
- the suggested authentication technology may be utilized to supplement or replace the traditional 'PIN' type numeric or character code based authentication/access control methods.
- the suggested solution may be applied to various access control devices and solutions in connection with, for example, gates, doors, containers, safes, diaries, or locking/unlocking mechanisms in general.
- data transfer may refer to transmitting data, receiving data, or both, depending on the role(s) of a particular entity under analysis relative to a data transfer action, i.e. a role of a sender, a role of a recipient, or both.
- Fig. 1a illustrates the concept of an embodiment of the present invention via both block and signaling diagram approaches.
- Fig. 1b is a block diagram representing an embodiment of selected internals of the system or device according to the present invention.
- Fig. 1c illustrates scenarios involving embodiments of an electronic device in accordance with the present invention.
- Fig. 2a represents one example of service or device UI view in connection with user authentication.
- Fig. 2b represents a further example of service or device UI view in connection with user authentication.
- Fig. 2c represents a further example of a service or device UI view in connection with user authentication.
- Fig. 2d represents a further example of a service or device UI view in connection with user authentication.
- Fig. 3 is a flow chart disclosing an embodiment of a method in accordance with the present invention.
- Figure 1a illustrates an embodiment of the present invention.
- the embodiment may be generally related, by way of example only, to the provision of a network- based or particularly online type electronic service such as a virtual desktop service or document delivery service, e.g. delivery of a bill including a notice of maturity regarding a bank loan.
- Entity 102 refers to the service user (recipient) and associated devices such as a desktop or laptop computer and/or a mobile device utilized for accessing the service in the role of a client, for instance.
- the device(s) preferably provide access to a network 108 such as the Internet.
- the mobile device may be e.g. a mobile phone, typically a smartphone.
- Entity 106 refers to a system or network arrangement of a number of at least functionally connected devices such as servers. The communication between the entities 102 and 106 may take place over the Internet and underlying technologies, for example. Preferably the entity 106 is functionally also connected to a mobile network.
- the user 102 of an electronic service 106 incorporating or at least utilizing an embodiment of the system in accordance with the present invention (these two terms being therefore used interchangeably hereinafter) is preferably associated with a first terminal device 102a such as a desktop or laptop computer, a thin-client or a tablet/hand-held computer provided with network 108 access, typically Internet access.
- the user 102 preferably has a second terminal 102b such as a mobile communications device with him/her, advantageously being a smartphone or a corresponding device with applicable mobile subscription or other wireless connectivity enabling the device to transfer data e.g. between local applications and the Internet.
- the potential users of the provided system include different network service providers, operators, cloud operators, virtual and/or remote desktop service providers, application/software manufacturers, financial institutions, companies, and individuals in the role of a service provider, intermediate entity, or end user, for example.
- the invention is thus generally applicable in a wide variety of different use scenarios and applications.
- the service 106 may include customer portal service and the service data may correspondingly include customer portal data.
- the user 102 may inspect the available general data, company or other organization-related data or personal data such as data about rental assets, estate or other targets.
- Service access in general, and the access of certain features or sections thereof, may require authentication.
- Multi-level authentication may be supported such that each level can be mapped to predetermined user rights regarding the service features.
- the rights may define the authentication level and optionally also user-specific rules for service usage and thereby allow feature usage, exclude feature usage, or limit the feature usage (e.g. allow related data inspection but prevent data manipulation), for instance.
- the system 106 may be ramped up and configured to offer predetermined service to the users, which may also include creation of user accounts, definition of related user rights, and provision of necessary authentication mechanism(s). Then, the user 102 may execute necessary registration procedures via his/her terminal(s) and establish a service user account cultivated with mandatory or optional information such as user id, service password, e-mail address, personal terminal identification data (e.g. mobile phone number, IMEI code, IMSI code), and especially voiceprints in the light of the present invention.
- Figure 2a visualizes the voiceprint creation in the light of possible related user experience.
- a number of potential cues, such as graphical elements 202, may first be indicated to the user via the service or device (mutatis mutandis) UI 200.
- the user naturally links at least some cues with certain associations based on e.g. his/her memories and potentially brainworms, so that the association is easy to recall and unambiguous (only one association per cue; for example, upon seeing a graphical representation of a cruise ship, the user always comes up with a memory relating to a trip to the Caribbean, whereupon the natural association is 'Caribbean', which is then that user's voice response to the cue of a cruise ship).
- Further information 204 such as the size of a captured sound file may also be shown.
- the user may optionally select a sub-set of all the indicated cues and/or provide (upload) cues of his/her own to the system for use during authentication in the future.
- the sound sample to be used for creating the voiceprint, and/or as at least part of a voiceprint, may be assigned a minimum acceptable duration in terms of e.g. seconds.
- the cues may be visual, audible, or a combination of both.
- the user may then input, typically utter, his/her voice response based on which the system determines the voiceprints, preferably at least one dedicated voiceprint corresponding to each cue in the subset.
- a voiceprint associated with a cue preferably characterizes both the voice and the spoken sound, or message, of the response. In other words, the same message later uttered by another user does not match the voiceprint of the first user during the voice authentication phase, even though the first user uttered the very same message to establish the voiceprint.
- likewise, a different message uttered by the first user does not match the voiceprint established based on another message uttered by the first user.
- the system may be configured to extract a number of parameters describing the properties of the user's vocal tract, for example, and e.g. related formant frequencies. Such frequencies typically indicate the personal resonance frequencies of the vocal tract of the speaker.
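As one concrete example of such a parameter, the fundamental frequency of a sample can be estimated by autocorrelation, sketched below. Real voiceprint generation would add formant/vocal-tract features among others, and the 70–400 Hz search band is an assumption roughly covering human speech.

```typescript
// Sketch: estimate the fundamental frequency (F0) of a voice sample by
// autocorrelation. sampleRate is in Hz; samples are normalized to [-1, 1].
function estimateF0(samples: Float32Array, sampleRate: number): number {
  const minLag = Math.floor(sampleRate / 400); // highest F0 considered
  const maxLag = Math.floor(sampleRate / 70);  // lowest F0 considered
  let bestLag = minLag;
  let bestCorr = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let corr = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      corr += samples[i] * samples[i + lag]; // correlation at this lag
    }
    if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
  }
  return sampleRate / bestLag; // estimated F0 in Hz
}
```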
- the user 102 may trigger the browser in the (first) terminal and control it to connect to the target electronic service 106. Accordingly, the user may now log into the service 106 using his/her service credentials or provide at least some means of identification thereto as indicated by items 132, 134.
- the system managing the service 106 may comprise e.g. a Node.js server entity whereto/relative to which the web browser registers itself, whereupon the service side allocates a dynamic id such as a socket id or other session id and delivers it at 136 to the browser that indicates the id and optionally other information such as domain, user id/name, etc. to the user via a display at item 138.
- Figure 2b illustrates an embodiment of potential service or device UI features at this stage through a snapshot of UI view 200B.
- the dynamic id is shown to the user at 208 both as a numeric code and as embedded in a QR code.
- Items 206 indicate the available authentication elements or factors, whereas the data at 210 implies the current authentication level of the session with related information. Instead of a QR code, some other matrix barcode or a completely different visual representation could be utilized.
- the code is read, by using a camera-based code reader application for instance, into the second terminal such as the mobile terminal of the user.
- the mobile device is configured, using the same or another predetermined application, to transfer an indication of the obtained data, such as the dynamic id and data identifying the terminal or an entity such as a smart card therein, to the system 106, wherein a target entity such as a socket.io entity, configured to operate as a (browser type) client to the Node.js server entity, forwards at least part of the data, including the dynamic id, to the Node.js server entity that can thus dynamically link, and preferably shall link, the particular first and second terminals to the same ongoing service (authentication) session at 142.
- the subsequent data transfer activities, e.g. transfer of the voice response from the second terminal to the system, may be at least partially implemented utilizing the same route and related technique(s).
- Different entities on the system side may, in practical circumstances, be implemented by one or more physical devices such as servers.
- a predetermined server device may implement the Node.js server whereas one other server device may implement the socket.io client entity.
- the system 106 fetches a number (potentially dynamically changing according to predetermined logic) of cues associated with the user account initially indicated/used in the session and for which voiceprints are available.
- the cues may be basically randomly selected (and order-wise also randomly represented to the user).
- the cues are indicated (transferred) to the browser in the terminal that then represents them to the user at 146 e.g. via a display and/or audio reproduction means depending on the nature of the cues.
- Ajax (Asynchronous JavaScript and XML) and/or PHP (Hypertext Preprocessor) technologies may be utilized for the related data transfer.
- the cues may be of the same or mixed type (e.g. one graphical image cue, one audio cue, and one video cue optionally with an audio track). As the user 102 perceives the cues as an authentication challenge, he/she provides the voice response, preferably via the second terminal at 148, to the service 106 via a client application that may be the same application used for transferring the dynamic id forward.
- the client side application for the task may be a purpose-specific Java application, for example.
- four graphical (image) cues are indicated at 212 in the service or device UI view 200C (browser view).
- Also visible in the figure is a plurality of service features at 214, some of which are greyed out, i.e. non-active features, due to the currently insufficient authentication level. Indeed, a service, or particularly a service application or UI feature, may potentially be associated with a certain minimum security level required for access.
- Automatic expiration time for the session may also be indicated via the UI.
- a session about to expire or expired may be renewed by repeated/new authentication.
- the service 106 analyzes the obtained user response relative to the cues against the voiceprints using predetermined matching technique(s) and/or algorithms.
- the input order of (sub-)responses corresponding to individual cues in the overall response should match the order in which cues were represented in the service UI (e.g. in a row, from left to right).
- the system 106 may, however, be configured to analyze whether the order of sub-responses matches the order of cues given, or at least to try different ordering(s).
- the system 106 may be configured to rearrange the sub-responses relative to the cues to obtain e.g. better voiceprint matching result during the analysis.
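The ordering logic might look like the following sketch: score the response segments in the represented order first, then fall back to a reassignment of segments to cues. The scorer signature and the greedy fallback strategy are assumptions for illustration.

```typescript
type Scorer = (cueId: string, segment: Float32Array) => number; // assumed voiceprint scorer

function matchesWithReordering(
  cueIds: string[], segments: Float32Array[], score: Scorer, threshold: number
): boolean {
  // First try the represented order (e.g. left to right), as the UI expects.
  if (cueIds.every((c, i) => score(c, segments[i]) >= threshold)) return true;
  // Otherwise greedily reassign: match each cue with its best unused segment.
  const used = new Set<number>();
  return cueIds.every((c) => {
    let best = -1, bestScore = -Infinity;
    segments.forEach((s, i) => {
      if (used.has(i)) return;
      const sc = score(c, s);
      if (sc > bestScore) { bestScore = sc; best = i; }
    });
    if (best < 0 || bestScore < threshold) return false;
    used.add(best);
    return true;
  });
}
```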
- the voice authentication procedure may be considered successful, and the authentication level may be scaled (typically raised) accordingly at 152.
- the voice-based authentication fails (non-match)
- the authentication status may be left intact or lowered, for instance.
- The outcome of such an authentication procedure is signaled to the user (preferably at least to the first terminal, potentially both) for review, e.g. as an authentication status message via the service UI at 154. New features may be made available to the user in the service UI.
- Figure 2d depicts a possible service or device UI view 200D after successful voice authentication. Explicit indication of the outcome of the authentication procedure is provided at 218 by authentication status message and as an implicit indication thereof, more service features 216 have been made available to the user (not greyed anymore, which the user will immediately recognize).
- location information may be optionally utilized in the authentication process as such.
- the server 106 and/or other entities external to the user's 102 terminal gear may be configured to locate one or more of the terminals the user 102 applies for communicating with the service 106.
- the terminal devices may bear an own role in the positioning process and execute at least part of the necessary positioning actions locally. Actions required to position a terminal may be shared between the terminal(s) and at least one external entity.
- address information may be used in the positioning process to deduce the location of the particular terminal in question (see Figures 2b-2d wherein IP location has been identified as one applied authentication/identification criterion).
- terminal or access network addresses such as IP addresses are at least loosely associated with physical locations so that the address-based locating is at least limitedly possible.
- Regarding signal and data transmission -based positioning, for example by checking the ID of the base station(s) the mobile device is communicating with, at least the approximate location of the mobile device may be obtained and the mobile device thereby located.
- a satellite navigation receiver such as a GPS (Global Positioning System) or GLONASS (GLObal Navigation Satellite System) in connection with a terminal device may be exploited.
- the terminal may share the locally received satellite information with external entities as such or in cultivated form (e.g. ready-determined coordinates based on received satellite signal(s)).
- data entity transit times, such as data packet transit times or TT times, may be monitored, if possible, in relation to both the monitored user/terminal and e.g. a number of known reference entities.
- the system 106 may then introduce a further factor, i.e. a location-based factor, to the authentication procedure and verify whether the current location of the terminal in question matches predetermined location information defining a number of allowed locations and/or banned locations in the light of the service and/or document access.
- the status of the location-based factor may be evaluated prior to the evaluation of the fulfillment of other authentication factors, in conjunction with them, or as a final check before authorizing the user to access the service and/or electronic document.
- Fig. 1c illustrates a few other scenarios involving embodiments of an electronic device 102c in accordance with the present invention.
- the device 102c may be self-contained in the sense that it can locally take care of the authentication procedure based on program logic and data stored thereat.
- the data, such as voiceprint data, may still be updated, e.g. periodically or upon fulfillment of some other triggering condition, from or to a remote source such as a server 106 via a communications connection possibly including one or more communication networks 108 in between.
- the device 102c is registered with a remote service, such as a network-based service, that maintains a database of devices, associated users and/or related voiceprints.
- the personal voiceprints may be generated at device 102c or network side (e.g. server 106) from the voice input provided by the user in question.
- the device 102c may incorporate a plurality of functionally (communications-wise) connected elements that are physically separate/separable and may even have their own dedicated housings, etc.
- e.g. an access control panel or terminal providing the UI to the person subject to authentication.
- the device 102c may include e.g. a computer device (e.g. laptop or desktop), or a portable user terminal device, such as a smartphone, tablet or other mobile or even wearable device.
- the device 102c may generally be designated as a personal device, or used, typically alternately, by several authorized persons, such as multiple family members or team members at work, and be thus configured to store personal voiceprints of each user, not just of a single user. Storing personal voiceprints of a single user only is often sufficient in the case of a truly personal device.
- the authentication procedure suggested herein may be utilized to control the provision of (further) access to the resources and feature(s) such as application(s) or application feature(s), for instance, in or at the device and/or at least accessible via the device by means of a communications connection to a remote party such as remote terminal or remote network-based service and related entities, typically incorporating at least one server.
- the device 102c may thus be, include or implement at least part of an access control device.
- the access control device 102c may include or be at least connected to a particular controllable physical asset or entity 140a, 140b, such as a door, fence, gate, window, or a latch providing access to a certain associated physical location, such as space (e.g. a room, compound or building) or physical, restricted resource such as container, safe, diary or even briefcase internals potentially containing valuable and/or confidential material.
- the device 102c and the suggested authentication logic provided thereat may be at least functionally connected to an (electrically controllable) locking or unlocking mechanism of such an asset/entity.
- Figure 1b shows, at 109A, a block diagram illustrating the selected internals of an embodiment of a device 102c or system 106 presented herein.
- the system 106 may incorporate a number of at least functionally connected servers, and typically indeed at least one device such as a server or a corresponding entity with necessary communications, computational and memory capacity is included in the system.
- terminal devices, such as a mobile terminal or a desktop type computer terminal, utilized in connection with the present invention could generally include the same or similar elements.
- a number of terminals, e.g. the aforesaid first and/or second terminal, may be included in the system 106 itself.
- devices 102c applied in connection with the present invention may in some embodiments be implemented as single-housing stand-alone devices, whereas in some other embodiments they may include or consist of two or more functionally connected elements potentially even provided with their own housings (e.g. an access control terminal unit at a door connected to a near-by or more distant access control computer via a wired and/or wireless communication connection).
- the utilized device(s), or generally the entities in question, are typically provided with one or more processing devices capable of processing instructions and other data, such as one or more microprocessors, micro-controllers, DSPs (digital signal processors), programmable logic chips, etc.
- the processing entity 120 may thus, as a functional entity, comprise a plurality of mutually co-operating processors and/or a number of sub-processors connected to a central processing unit, for instance.
- the processing entity 120 may be configured to execute the code stored in a memory 122, which may refer to instructions and data relative to the software logic and software architecture for controlling the device 102c or (device(s) of) system 106.
- the processing entity 120 may at least partially execute and/or manage the execution of the authentication tasks.
- the memory entity 122 may be divided between one or more physical memory chips or other memory elements.
- the memory 122 may store program code for authentication and potentially other applications/tasks, and other data such as voiceprint repository, user contact information, electronic documents, service data etc.
- the memory 122 may further refer to and include other storage media such as a preferably detachable memory card, a floppy disc, a CD-ROM, or a fixed storage medium such as a hard drive.
- the memory 122 may be nonvolatile, e.g. ROM (Read Only Memory), and/or volatile, e.g. RAM (Random Access Memory), by nature.
- Software (product) may be provided on a carrier medium such as a memory card, a memory stick, an optical disc (e.g. CD-ROM or DVD), or some other memory carrier.
- the UI (user interface) 124, 124B may comprise a display, a touchscreen, or a data projector 124, and keyboard/keypad or other applicable user (control) input entity 124B, such as a touch screen, a number of separate keys, buttons, knobs, switches, a touchpad, a joystick, or a mouse, configured to provide the user of the system with practicable data visualization/reproduction and input/device control means, respectively.
- the UI 124 may include one or more loudspeakers and associated circuitry, such as D/A (digital-to-analogue) converter(s), for sound output, and/or sound capturing elements 124B such as a microphone with an A/D converter for sound input (obviously the device capturing voice input from the user has at least one; alternatively, external loudspeaker(s), earphones and/or microphone(s) may be utilized, for which purpose the UI 124, 124B preferably contains suitable wired or wireless (e.g. Bluetooth) interfacing means).
- a printer may be included in the arrangement for providing more permanent output.
- the device 102/system 106 may further comprise a data interface 126 such as a number of wired and/or wireless transmitters, receivers, and/or transceivers for communication with other devices such as terminals and/or network infrastructure(s).
- an integrated or a removable network adapter may be provided.
- Non-limiting examples of the generally applicable technologies include WLAN (Wireless LAN, wireless local area network), LAN, WiFi, Ethernet, USB (Universal Serial Bus), GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), EDGE (Enhanced Data rates for Global Evolution), UMTS (Universal Mobile Telecommunications System), WCDMA (wideband code division multiple access), CDMA2000, PDC (Personal Digital Cellular), PHS (Personal Handy-phone System), and Bluetooth.
- Some technologies may be supported by the elements of the system as such whereas some others (e.g. cell network connectivity) are provided by external, functionally connected entities.
- the device 102c or system 106 may comprise numerous additional functional and/or structural elements for providing advantageous communication, processing or other features, whereupon this disclosure is not to be construed as limiting the presence of the additional elements in any manner.
- Entity 125 refers to such additional element(s) found useful depending on the embodiment.
- Profiler 110 may establish the cue-associated voiceprints for the users based on the voice input by the users.
- the input may include speech or generally voice samples originally captured by user terminal(s) and funneled to the profiler 110 for voiceprint generation including e.g. feature extraction.
- Element 112 refers to a voiceprint repository 112 that may, in practice, contain a number of databases or other data structures for maintaining the personal voiceprints determined for the cues based on voice input by the user(s).
- Voiceprint data is personal (user account or user id related) and characterizes correct voice response to each cue (in the cue sub-set used for authenticating that particular user).
- Voiceprint data may indicate, as already alluded to hereinbefore, e.g. fundamental frequency data, vocal tract resonance(s) data, duration/temporal data, loudness/intensity data, etc.
- Voiceprint data may indicate personal (physiological) properties of the user 102 and characteristics of received sample data (thus advantageously characterizing also the substance or message of the input) obtained during the voiceprint generation procedure. In that sense, the voice recognition engine used in accordance with the present invention may also incorporate characteristics of speech recognition.
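By way of a concrete, purely illustrative example, a voiceprint of the kind described above could be built from per-frame fundamental frequency and intensity estimates. The sketch below assumes plain NumPy and mono PCM input; all function and parameter names are invented here, and a production engine would use considerably richer features (e.g. vocal tract resonances):

```python
# Hypothetical sketch of voiceprint feature extraction: per-frame F0 and
# log energy from a mono PCM signal. Names and constants are assumptions.
import numpy as np

def extract_voiceprint_features(samples: np.ndarray, rate: int = 16000,
                                frame_ms: int = 30, hop_ms: int = 10):
    """Return per-frame (f0_hz, log_energy) rows for a mono PCM signal."""
    frame = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    feats = []
    for start in range(0, len(samples) - frame, hop):
        x = samples[start:start + frame].astype(np.float64)
        x -= x.mean()
        energy = float(np.log10(np.dot(x, x) + 1e-10))
        # Autocorrelation-based F0 estimate within a plausible speech band.
        ac = np.correlate(x, x, mode='full')[frame - 1:]
        lo, hi = rate // 400, rate // 60          # 60-400 Hz search range
        lag = lo + int(np.argmax(ac[lo:hi]))
        f0 = rate / lag if ac[lag] > 0.3 * ac[0] else 0.0  # 0.0 = unvoiced
        feats.append((f0, energy))
    return np.array(feats)
```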
- Analyzer 114 may take care of substantially real-time matching, or generally analysis, of voice input against already existing voiceprints during authentication. Such analysis may include a number of comparisons according to predetermined logic for figuring out whether the speaker/utterer really is the user initially indicated to the system.
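Continuing the illustration, a minimal matcher for such an analyzer could collapse the per-frame features above into a fixed-size vector and compare by cosine similarity; the summary statistics and the threshold value are assumptions, not the patent's prescribed logic:

```python
# Illustrative matching step: summarize feature tracks and compare them.
import numpy as np

def summarize(feats: np.ndarray) -> np.ndarray:
    """Collapse per-frame (f0, log_energy) rows into a fixed-size vector."""
    voiced = feats[feats[:, 0] > 0]
    if voiced.size == 0:
        return np.zeros(4)
    return np.array([voiced[:, 0].mean(), voiced[:, 0].std(),
                     feats[:, 1].mean(), feats[:, 1].std()])

def matches(stored_feats: np.ndarray, candidate_feats: np.ndarray,
            threshold: float = 0.92) -> bool:
    """True if the fresh utterance is close enough to the stored print."""
    a, b = summarize(stored_feats), summarize(candidate_feats)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return False
    return float(np.dot(a, b) / denom) >= threshold
```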
- profiler 110 and analyzer 114 may be logically implemented by a common entity due to e.g. similarities between the executed associated tasks.
- Authentication entity 116 may be such an entity or it may at least generally control the execution of authentication procedure(s), determine cues for an authentication task, raise/lower permanent or session-specific authentication levels based on the outcome thereof, and control e.g. data transfer with terminal devices and network infrastructure(s) including various elements.
- the system 106 may provide a dedicated location(ing) id, a 'geokey', to the user 102 preferably through browser data such as service view, e.g. a login/authentication view or a portal view.
- the user 102 may then notice the (visualized) ID among the service data as a numeric code or generally a string of optionally predetermined length.
- the ID may be dynamic such as session-specific and/or for one-time use only.
- the location id may be combined with the session id (or a common id be used) or generally with data provided by the system for voice authentication, e.g. via a machine-readable optical code like the QR code.
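A minimal sketch of issuing such a dynamic 'geokey' follows; the code length, the numeric alphabet and the in-memory store are assumptions, the only requirements taken from the text being session specificity and one-time use:

```python
# Hypothetical geokey issuance; names and storage model are illustrative.
import secrets

ISSUED = {}  # geokey -> session id; a real store would also expire entries

def issue_geokey(session_id: str, length: int = 8) -> str:
    """Allocate a one-time numeric code tied to the given service session."""
    code = ''.join(secrets.choice('0123456789') for _ in range(length))
    ISSUED[code] = session_id
    return code

def redeem_geokey(code: str):
    """Return the session id for a valid code, consuming it (one-time use)."""
    return ISSUED.pop(code, None)
```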
- the user 102 may input or read the code to the (second) terminal, after which the application installed thereat acquires location data according to predetermined logic based on available positioning options.
- the location data is acquired in real-time or near real-time fashion upon receipt of the id, so that it is current.
- the device may contain a satellite receiver such as a GPS or GLONASS receiver through which location data may be obtained.
- the device may utilize network and related signal(s) for obtaining location data such as data provided by cellular network and/or short-range wireless network, optionally WLAN. Network-assisted positioning may be used.
- the application may be configured to utilize available interfaces provided with the mobile operating system for acquiring the positioning data.
- Location data such as longitude information, latitude information, accuracy or error estimate, the id itself or data derived therefrom, and/or time code (or time stamp) may be then collected and transmitted to the system 106.
- at least part of the above data elements may be utilized for determining a hash by means of a secret or asymmetric key, for example, in which case at least the hash is transmitted.
- HTTPS may be utilized for the secured transfer.
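As a hedged illustration of the hash-protected transfer described above, the location report could be signed with an HMAC over the payload using a shared secret; the field names and the choice of HMAC-SHA256 are assumptions:

```python
# Sketch of signing the location payload with a shared secret key.
import hashlib, hmac, json, time

def signed_location_report(secret: bytes, geokey: str,
                           lat: float, lon: float, accuracy_m: float) -> dict:
    """Build the report and attach an HMAC over its canonical JSON form."""
    payload = {'id': geokey, 'lat': lat, 'lon': lon,
               'acc': accuracy_m, 'ts': int(time.time())}
    msg = json.dumps(payload, sort_keys=True).encode()
    payload['hash'] = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return payload  # transmitted to the system, e.g. over HTTPS
```

The system would recompute the HMAC over the received fields with the same secret and discard the report on mismatch.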
- the system 106 receives and optionally processes, e.g. decodes, the data. Subsequently, the system 106 may verify the current location of the user 102, as indicated by the obtained location data, against predetermined data indicative of e.g. allowed location(s).
- the resolution of the obtained data and/or related measurement error estimate may be utilized to adapt the decision-making. For example, in the case of a larger error/worse positioning accuracy, more tolerance may be allowed in the verification process, and vice versa.
- the system 106 is configured to maintain data about allowed (and/or rejected) user locations through utilization of polygon data, i.e. geo-referenced polygon data. For example, a number of allowed postal areas represented by the corresponding polygons may have been associated with each user. The obtained location data may be mapped to a corresponding postal area polygon that is then searched from the list of allowed postal area polygons. In such an embodiment, the aforesaid adaptation may be realized by stretching or shrinking the postal area polygon boundaries, for instance. In the case of a positive outcome (allowed location detected), the system 106 may again update the authentication, or generally 'security', status of the user 102 accordingly and typically raise it.
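The postal-area check described above reduces to a point-in-polygon test; below is a plain ray-casting sketch with invented names. The tolerance stretching would, in a fuller implementation, buffer the polygon boundaries by a distance derived from the reported accuracy estimate:

```python
# Illustrative allowed-location check against geo-referenced polygons.
def point_in_polygon(lon: float, lat: float, polygon) -> bool:
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):
            # Longitude where this polygon edge crosses the query latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def location_allowed(lon, lat, allowed_polygons) -> bool:
    """True if the reported position falls inside any allowed postal area."""
    return any(point_in_polygon(lon, lat, poly) for poly in allowed_polygons)
```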
- the user 102 may be provided with enhanced access rights to service features such as payment/finance components, higher security documents, etc. as reviewed above.
- Each user may be associated with session-based information such as a session record dynamically keeping track of, among other potential issues, the user rights emerging from the successful authentication actions.
- a notification of the raised access security level or failed authentication may be transmitted to the user via mobile application and/or through browser data.
- the system 106 may update the service parameters for the session automatically and provide an updated service view such as browser view to the user's terminal.
- Figure 3 discloses, by way of example only, a method flow diagram in accordance with an embodiment of the present invention.
- the device and/or system of the present invention is obtained and configured, for example through loading and execution of related software, for managing the electronic service and related authentication mechanism(s).
- the voiceprints shall be established as described earlier in this text.
- the device/system may be trained by the user such that the user utters the desired response (association) to each cue in his/her preferred and/or at least partially machine-selected (sub-)set of cues, whereupon the system extracts or derives the voiceprints based on the voice input.
- the user may be asked to provide some general or specific voice input that is not directly associated with any voiceprint. Using that voice input, the system may generally model the user-specific voice and/or speech parameters later applied in voice-based authentication and voiceprint matching, for example.
- an indication of a required authentication is received from a user via a feasible UI such as an access control terminal, digital service UI (e.g. browser-based UI) or e.g. via a dedicated application.
- the request may be associated with a certain user whose voiceprints are available.
- the request may identify such a user identity by a user ID, for example. Procedures potentially incorporating linking first and second terminals of the user relative to the current service session have been already discussed in this text.
- user identity indication is not necessary.
- a number of cues are determined or selected preferably from a larger group thereof.
- the selection may be random, alternating (subsequent selections preferably contain different cue(s)), and/or following some other logic.
- the number of cues per authentication operation may be dynamically selected by the system/device as well. For example, if a previous voice authentication procedure regarding the same user identity failed, the next one could contain more (or fewer) cues, and potentially vice versa.
- the status of other authentication factor(s) may be configured to affect the number. For example, if the user has already been authenticated using some other authentication factor or element, e.g. location, the number of cues could be scaled lower than in a situation wherein the overall authentication status of the user is weaker.
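A possible (assumed) policy for the dynamic cue count could look like the following, scaling the challenge up after a failed attempt and down when another factor has already succeeded:

```python
# Hypothetical cue-selection policy; the counts and offsets are invented.
import random

def pick_cues(available_cues, prev_failed: bool, other_factor_ok: bool,
              base_count: int = 3):
    """Select a challenge, scaling its size by recent outcomes and status."""
    count = base_count
    if prev_failed:
        count += 2          # harden after a failed voice authentication
    if other_factor_ok:
        count -= 1          # e.g. location already verified
    count = max(1, min(count, len(available_cues)))
    return random.sample(available_cues, count)
```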
- the cues are represented to the user via a user device utilized for service access, a stand-alone user device, or e.g. an access control (terminal) device.
- at least an indication of the cues may be transmitted by a remote system to the (first) user terminal potentially with instructions regarding visual and/or audible reproduction thereof e.g. via a browser.
- the cues are represented in an easily noticeable and recognizable order so that the response thereto may be provided as naturally as possible following the same order.
- graphical cues may be represented in a series extending from left to right via the service or application UI, and the user may provide the voice response acknowledging each cue in the same, natural order advantageously without a need to provide any separate, explicit control command for identifying the target cue during the voice input stage.
- the user may utter the response to each cue one after another by just keeping a brief pause in between so that cue-specific (sub-)responses may be distinguished from each other (and associated with the proper cue) in the overall response afterwards by the terminal or the system based on the pauses, as sketched below.
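A minimal sketch of such pause-based segmentation follows, using a relative frame-energy silence gate; frame length, pause duration and the threshold are invented constants rather than values given in the text:

```python
# Hypothetical splitter: cut the overall response at sufficiently long pauses.
import numpy as np

def split_on_pauses(samples: np.ndarray, rate: int = 16000,
                    frame_ms: int = 20, min_pause_ms: int = 300,
                    rel_threshold: float = 0.05):
    """Return one audio segment per uttered sub-response, in order."""
    x = samples.astype(np.float64)
    frame = int(rate * frame_ms / 1000)
    n = len(x) // frame
    if n == 0:
        return []
    energy = np.array([float(np.mean(x[i * frame:(i + 1) * frame] ** 2))
                       for i in range(n)])
    silent = energy < rel_threshold * energy.max()
    min_pause = max(1, min_pause_ms // frame_ms)
    segments, start, run = [], None, 0
    for i in range(n):
        if not silent[i]:
            if start is None:
                start = i          # a new sub-response begins
            run = 0
        elif start is not None:
            run += 1
            if run >= min_pause:   # pause long enough: close the segment
                segments.append(x[start * frame:(i - run + 1) * frame])
                start, run = None, 0
    if start is not None:
        segments.append(x[start * frame:n * frame])
    return segments
```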
- the user may explicitly indicate via the UI, through cue-specific icon/symbol selection, for instance, to which cue he/she is next providing the voice response.
- the voice response to the challenge formed by the cues is provided by the user and potentially forwarded via the terminal to a remote analyzing entity such as the authentication system.
- the sound data forwarded may include digital sound samples, such as so-called raw or PCM (pulse-code modulation) samples, or e.g. a more heavily parameterized compressed representation of the captured voice.
- the obtained voice response data is analyzed against the corresponding personal (user-specific) voiceprints of the represented cues.
- the analysis tasks may include different matching and comparison actions following a predetermined logic. For example, a preferred existing or new best-match type search algorithm may be exploited, potentially followed by additional quality checking rules determining whether even the best match was good enough to acknowledge the current user as the indicated one.
- the logic may apply fixed threshold(s) for making decisions (successful authentication, failed authentication), or alternatively dynamic criteria may be applied. For instance, if e.g. heavy background noise is detected in the obtained sound data, criteria could be loosened.
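As an illustration of such dynamic criteria, the acceptance threshold could be loosened as a function of an estimated noise floor; the decibel breakpoints and offsets below are invented for the example:

```python
# Assumed noise-adaptive acceptance rule; all constants are illustrative.
def acceptance_threshold(base: float, noise_floor_db: float) -> float:
    """Lower (loosen) the match threshold as the noise floor rises."""
    if noise_floor_db > -30.0:            # noticeably noisy capture
        return base - 0.05
    if noise_floor_db > -45.0:            # mildly noisy capture
        return base - 0.02
    return base

def decide(best_match_score: float, base_threshold: float,
           noise_floor_db: float) -> bool:
    """Successful authentication iff the best match clears the threshold."""
    return best_match_score >= acceptance_threshold(base_threshold,
                                                    noise_floor_db)
```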
- the authentication status or level associated with the user is updated accordingly (raised, lowered, or left as is).
- the user may be provided with access to new location(s) or resource(s) (typically takes place only if the authentication status is raised).
- a computer program comprising a code means adapted, when run on a computer, to execute an embodiment of the desired method steps in accordance with the present invention, may be provided.
- a carrier medium such as an optical disc, floppy disc, or a memory card, or other non-transitory carrier medium comprising the computer program may further be provided.
- the program may be further delivered over a communication network and generally over a communication channel.
- The HTML5 hypertext mark-up language standard includes application programming interfaces (APIs) for camera, voice recording and geolocation.
- These new HTML features could be utilized in connection with the present invention e.g. instead of a (native) client application, e.g. a Java application, and/or the QR reader described hereinbefore.
- An associated web link could be provided to a user terminal, e.g. to the mobile terminal, included in an SMS (e.g. OTP, one-time password type) message.
- the user could indeed be provided with a message (e.g. the aforesaid SMS or other applicable message type) including a dedicated HTML5 weblink, the selection of which switches to a voice input (uttering) page or view, finally followed by the more conventional transmission of the voice data e.g. as raw data towards a speaker verification engine (analyzer), optionally along with location data.
- Such HTML5 based or other similar approach could be considered as an authentication instance-specific native client as the URL in question preferably works only in conjunction with the related authentication event, and the necessary native or native-like feature(s) may be provided within a web page through APIs.
Abstract
Electronic system (106, 109A) for authenticating a user of an electronic service, said system preferably comprising at least one server apparatus, the system being configured to store (122, 112, 200), for a number of users, a plurality of personal voiceprints (204) each of which being linked with a dedicated visual, audiovisual or audio cue (202), for challenge-response authentication of the users, pick (116, 200C, 142, 144), upon receipt of an authentication request associated with an existing user of said number of users, a number of cues (212) for which there are voiceprints of the existing user stored, and provide the cues for representation (144, 126) to the user as a challenge, receive (126, 148) sound data indicative of the voice response uttered by the user to the represented cues, determine (114, 150) on the basis of the sound data, the represented cues and linked voiceprints, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case, elevate (116, 152, 200D, 218, 216) the authentication status of the user as the existing user, preferably regarding at least the current communication session. A corresponding method and device (102c) are presented.
Description
AUDIOVISUAL ASSOCIATIVE AUTHENTICATION METHOD AND RELATED SYSTEM

FIELD OF THE INVENTION
Generally the invention pertains to computers and related communications infrastructures. In particular, however not exclusively, the invention concerns authentication relative to an electronic device or electronic service.
BACKGROUND
Access control in conjunction with e.g. network services may imply user identification, which can be generally based on a variety of different approaches. For example, three categories may be considered, including anonymous, standard and strong identification. In the anonymous case, the service users do not have to be and are not identified. Standard, or 'normal', identification may refer to what the requestor for access knows, such as a password, or bears, such as a physical security token. Such a token may include a password-generating device (e.g. SecurID™), a list of one-time passwords, a smart card and a reader, or a one-time password transmitted to a mobile terminal. Further, strong identification may be based on a biometric property, particularly a biometrically measurable property, of a user, such as a fingerprint or retina, or a security token the transfer of which between persons is difficult, such as a mobile terminal including a PKI (Public Key Infrastructure) certificate requiring entering a PIN (Personal Identification Number) code upon each instance of use.
On the other hand, network service-related authentication, i.e. reliable identification, may also be implemented on several levels, e.g. on four levels, potentially including unnecessary, weak, strongish, and strong authentication, wherein the strongish authentication, being stronger than weak, thus resides between the weak and strong options. If the user may remain anonymous, authentication is unnecessary. Weak authentication may refer to the use of a single standard-category identification means such as a user ID/password pair. Instead, strongish authentication may apply at least two standard identification measures utilizing different techniques. With strong authentication, at least one of the identification measures should be strong.
Notwithstanding the various advancements that have taken place during the last years in the context of user and service identification, authentication, and related secure data transfer, some defects still remain therewith and are next briefly and non-exhaustively reviewed with useful general background information.
Roughly, access control methods to network services include push and pull methods. In pull methods, a user may first identify oneself anonymously to a network service providing a login screen in return. The user may then type in the user ID and a corresponding password, whereupon he/she may directly access the service or be funneled into the subsequent authentication phase. In push methods, a network server may first transmit information to the e-mail address of the user in order to authorize accessing the service. Preferably only the user knows the password of the e-mail account. The users are often reluctant to manually manage a plurality of user IDs and corresponding passwords. As a result, they may utilize the very same user ID and/or password in multiple services and/or use rather obvious and thus easy-to-crack words, numbers or expressions as passwords. Even if the access control management systems require using a strong password, i.e. a hard-to-remember password, a risk that the user writes the password down increases considerably and the authentication level turns ultimately weak.
Yet, the utilization of a password is typically enabled by an access control management entity that may also store the password locally. If the security of the data repository is later jeopardized, third parties may acquire all the passwords stored therein. Also, if the user forgets the password or it has to be changed for some other reason, actions have to be taken by the user and optionally the service provider. The user has to memorize the new password. Further, the adoption of a personal, potentially network service-specific token such as a smartcard, e.g. SecurID™, and a related reader device may require intensive training. The increase in the use of smart cards correspondingly raises the risk of thefts and provision of replacement cards. In case the personal tokens apply a common (distributed) secure algorithm, the theft of such algorithm would cause tremendous security issues and trigger massive update operations regarding the associated elements such as tokens in order to recover at least part of the original security.
For instance, in the context of cloud services such as cloud virtual-desktop services that may be regularly, e.g. daily, utilized by a user, the nowadays available access control procedures, especially identification and authentication solutions applied upon logging in to the service, are typically either inadequate in terms of the achieved data security or simply awkward from the standpoint of usability with reference to the aforesaid lengthy and strong, i.e. complex and thus hard-to-remember, passwords.
SUMMARY OF THE INVENTION
The objective is to at least alleviate one or more problems described hereinabove regarding the usability and security issues, such as authentication, associated with the contemporary remote computer systems and related electronic services such as online services.
The objective is achieved by the system and method in accordance with the present invention. The suggested solution cleverly harnesses, among other factors, the associative memory of a user for electronic authentication as described in more detail hereinafter.
In an aspect of the present invention, a system for authenticating a user of an electronic service, comprises at least one server apparatus preferably provided with a processing entity and a memory entity for processing and storing data, respectively, and a data transfer entity for receiving and sending data, the system being configured to store, for a number of users, a plurality of personal voiceprints each of which linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication of the users, the cues being user-selected, provided or created, pick, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, and provide the cues for representation to the user as a challenge, receive sound data indicative of the voice response uttered by the user to the represented cues,
determine on the basis of the sound data, the represented cues and linked voiceprints, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case, elevate the authentication status of the user as the existing user, preferably regarding at least the current communication session.
Preferably the sound data is received from a mobile (terminal) device. The mobile device advantageously incorporates a microphone for capturing voice and encoding it into digital format. Preferably, the system maintains or has access to information linking service/application users, or user id's, and mobile devices or mobile identities, e.g. IMEI code or IMSI code (or other smart card), respectively, together. Optionally, mobile phone number could be utilized for the purpose.
Optionally, the cues are indicated to the user via a first terminal device such as a laptop or desktop computer. Service data in general and/or the cues may be provided as browser data such as web page data. Preferably such first terminal device includes or is at least connected to a display, a projector and/or a loudspeaker with necessary digital-to-analogue conversion means for the purpose.
Advantageously, the sound data is then obtained via a second terminal device, preferably via the aforementioned mobile device like a cellular phone, typically a smartphone, or a communications-enabled PDA/tablet, configured to capture the sound signal incorporating the user's voice (uttering the response to the cues) and convert it into digital sound data forwarded towards the system.
In some embodiments, the mobile device may be provided with a message, such as an SMS message, triggered by the system in order to verify that the user requiring voice-based authentication has the mobile device with him/her. For example, the user may have logged in an electronic service using certain user id that is associated with the mobile device. Such association may be dynamically controlled in the service settings by the user, for instance. In response to the message, the user has to trigger sending a reply, optionally via the same mobile device or via the first terminal, optionally provided with a secret such as password, or other acknowledgement linkable by the system with the user (id).
In some embodiments, the cues may be represented visually and/or audibly utilizing e.g. a web browser at the first user terminal. Preferably, but not necessarily, the user provides the response using the second terminal such as a mobile terminal. The first terminal may refer to e.g. a desktop or laptop computer that may be personal or in a wider use. The second terminal, particularly if being a mobile terminal such as a smartphone, is typically a personal device associated with a certain user only, or at least rather limited group of users.
The system may be configured to link or associate the first and second terminals together relative to the ongoing session and authentication task. As a result, actions taken utilizing the second terminal may be linked with activity or response at the first terminal, e.g. browser thereat, by the system.
For example, the system may be configured to dynamically allocate a temporary id such as a so-called session id to the first terminal. This id may comprise a socket id. The first terminal may then be configured to indicate the id to the user and/or the second terminal. For example, a visual, optionally coded representation applying a QR (Quick Response) code, preferably including also other information such as user id (to the service) and/or domain information, may be utilized. The second terminal may then be configured to wirelessly obtain the id. Preferably, the second terminal may read or scan, e.g. via a camera and associated code reader software, the visual representation and decode it. Preferably the same application, e.g. a Java application, which is applied for receiving voice input from the user, is utilized for delivering the obtained id back towards the system, which then associates the two terminals and the session running in the first terminal (via the browser) together.
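The handshake just described could be sketched as follows; the JSON payload layout, the field names and the in-memory session store are assumptions, the essentials taken from the text being the dynamic id carried in the QR code and the association step on the system side:

```python
# Hypothetical terminal-linking handshake around the scanned QR payload.
import json

def qr_payload(session_id: str, user_id: str, domain: str) -> str:
    """Build the string to embed in the QR code shown on the first terminal."""
    return json.dumps({'sid': session_id, 'uid': user_id, 'domain': domain})

SESSIONS = {}  # session id -> {'user': ..., 'second_terminal': ...}

def link_second_terminal(scanned_payload: str, device_id: str) -> bool:
    """Associate the scanning (second) terminal with the browser session."""
    data = json.loads(scanned_payload)
    session = SESSIONS.get(data['sid'])
    if session and session['user'] == data['uid']:
        session['second_terminal'] = device_id   # terminals now linked
        return True
    return False
```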
In some embodiments, the determination tasks may include a number of mapping, feature extraction, and/or comparison actions according to predetermined logic by which the match between the obtained sound data and existing voiceprint data relative to the indicated existing user is confirmed, i.e. the authentication is considered successful in the light of such voice-based authentication factor. In the case of no match, i.e. failed voice-related authentication, the authentication status may remain as is or be lowered (or access completely denied).
In some embodiments, elevating the gained (current) authentication status in connection with successful voice-based authentication may include at least one
action selected from the group consisting of: enabling service access, enabling a new service feature, enabling the use of a new application, enabling a new communication method, and enabling the (user) adjustment of service settings or preferences.
In some embodiments, a visual cue defines a graphical image that is rendered on a display device for perception and visual inspection by the user. The image may define or comprise a graphical pattern, drawing or e.g. a digital photograph. Preferably, the image is complex enough so that the related (voice) association the user has, bears also necessary complexity and/or length in view of sound data analysis (too short or too simple voice input/voiceprint renders making reliable determinations difficult).
In some embodiments, audiovisual cue includes a video clip or video file with associated integral or separate sound file(s). Alternatively or additionally, audiovisual cue may incorporate at least one graphical image and related sound.
Generally, video and audiovisual cues are indicated by e.g. a screenshot or other descriptive graphical image, and/or text, shown in the service UI. The image or a dedicated UI feature (e.g. button symbol) may then be utilized to activate the video playback by the user through clicking or otherwise selecting the image/feature, for instance. Alternatively, e.g. video cue(s) may play back automatically, optionally repeatedly. In some embodiments, the audio cue includes sound typically in the form of at least one sound file that may be e.g. monophonic or stereophonic. The sound may represent music, sound scenery or landscape (e.g. jungle sounds, waterfall, city or traffic sounds, etc.), various noises or e.g. speech. An audio cue may, despite its non-graphical/invisible nature, still be associated with an image represented via the service UI. The image used to indicate an audio cue is preferably at least substantially the same (i.e. non-unique) for all audio cues, but anyhow enables visualizing an audio cue in the UI among e.g. visual or audiovisual cues, the cues being optionally rendered as a horizontal sequence of images (typically one image per cue) of the overall challenge. As with video or audiovisual cues, the image may be active, and selecting, or 'clicking', it advantageously triggers the audible reproduction of the cue.
Alternatively or additionally, a common UI feature such as icon may be provided to trigger sequential reproduction of all audio and optionally audiovisual, cues.
In some embodiments and in the light of foregoing, basically all the cues may be indicated in a (horizontal) row or column, or using other configuration, via the service UI.
Visually distinguishable, clear ordering of the cues is advantageous as the user may immediately realize also the corresponding, correct order of corresponding cue-specific (sub-)responses in his/her overall voice response.
Video, audiovisual and/or audio cues may at least have a representative, generic or characterizing, graphical image associated with them as discussed above, while graphical (image) cues are preferably shown as such.
In some embodiments, at least one cue is selected or provided, optionally created, by the user himself/herself. A plurality of predetermined cues may be offered by the system to the user for review via the service UI wherefrom the user may select one or more suitable, e.g. the most memorable, cues to be associated with voiceprints. Preferably, a plurality of cues is associated with each user.
A voiceprint, i.e. a voice-based fingerprint, may be determined for a cue based on a user's sound, or specifically voice, sample recorded and audibly exhibiting the user's association (preferably brainworm) relating to each particular cue. A voiceprint of the present invention thus advantageously characterizes, or is used to characterize, both the user (utterer) and the spoken message (the cue or substantive personal association with the cue) itself. Recording may be effectuated using the audio input features available in a terminal device such as microphone, analogue-to-digital conversion means, encoder, etc.
With different users, a number of same or similar cues may be generally utilized. Obviously, the voiceprints associated with them are personal.
In some embodiments, the established service connection (access) is maintained based on a number of security measures the outcome of which is used to determine the future of the service connection, i.e. let it remain, terminate it, or change it, for example. In some scenarios, fingerprint methodology may be applied. A user terminal may initially, upon service log-in, for instance, provide a
fingerprint based on a number of predetermined elements, such as browser data such as version data, OS data such as version data, obtained Java entity data such as version data, and/or obtained executable data such as version data. Version data may include ID data such as a version identifier or generally the identifier (application or software name, for example) of the associated element. The arrangement may be configured to request a new fingerprint in response to an event such as a timer or other temporal event (timed requests, e.g. on a regular basis). Alternatively or additionally, the client may provide fingerprints independently based on a timer and/or some other event, for instance.
In response to the received new fingerprint, the arrangement may utilize the most recent fingerprint and a number of earlier fingerprints, e.g. the initial one, in a procedure such as a comparison procedure. The procedure may be executed to determine the validity of the current access (user). For example, if the compared fingerprints match, a positive outcome may be determined indicating no increased security risk and the connection may remain as is. A mismatch may trigger a further security procedure or terminating the connection.
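A minimal sketch of such fingerprint comparison: hash the named version-data elements into a digest at log-in and again on each timed request, then compare; SHA-256 and the field separator are assumptions:

```python
# Illustrative client-fingerprint check over the version data named above.
import hashlib

def client_fingerprint(browser_ver: str, os_ver: str,
                       java_ver: str, exe_ver: str) -> str:
    """Collapse the element identifiers into a single comparable digest."""
    blob = '|'.join([browser_ver, os_ver, java_ver, exe_ver]).encode()
    return hashlib.sha256(blob).hexdigest()

def session_still_valid(initial_fp: str, latest_fp: str) -> bool:
    # A mismatch would trigger a further security procedure or termination.
    return initial_fp == latest_fp
```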
In some embodiments, the system is location-aware advantageously in a sense it utilizes location information to authenticate the user. A number of predetermined allowed and/or non-allowed/blocked locations may be associated with each user of the arrangement. For example, the location may refer to at least one element selected from the group consisting of: address, network address, sub-network, IP (Internet Protocol) address, IP sub-network, cell, cell-ID, street address, one or more coordinates, GPS coordinates, GLONASS coordinates, district, town, country, continent, distance to a predetermined location, and direction from a predetermined location. Each of the aforesaid addresses may refer to an address range. Failed location-based authentication may result in a failed overall authentication (denied access), or alternatively, a limited functionality such as limited access to the service may be provided. The same applies to potential other authentication factors. Each authentication factor may be associated with a characterizing weight (effect) in the authentication process.
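The per-factor weighting could be realized, purely illustratively, as a weighted score; the factor names and weights below are invented for the example:

```python
# Assumed weighting scheme for combining authentication factors.
FACTOR_WEIGHTS = {'password': 0.3, 'voice': 0.5, 'location': 0.2}

def overall_auth_score(results: dict) -> float:
    """results: factor name -> True/False outcome of that factor."""
    return sum(w for f, w in FACTOR_WEIGHTS.items() if results.get(f))

# e.g. overall_auth_score({'password': True, 'voice': True}) == 0.8, which a
# policy could map to limited access while location remains unverified.
```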
In some embodiments, the system may be configured to transmit a code, preferably as browser data such as web page data, during a communication session associated with a predetermined user of the service for visualization and
subsequent input by the user. Further, the system may be configured to receive data indicative of the inputted code and of the location of the terminal device applied for transmitting the data, determine on the basis of the data and predetermined locations associated with the user whether the user currently is in allowed location, and provided that this seems to be the case on the basis of the data, raise the gained authentication status of the user regarding at least the current communication session. Preferably the data is received from a mobile (terminal) device. Optionally, the code is indicated to the user via a first terminal device such as a laptop or desktop computer. Instead of a code dedicated for the purpose, e.g. the aforesaid temporary id such as socket id may be utilized in this context as well.
A certain location may be associated with a certain user by "knowing" the user, which may refer to optionally automatically profiling and learning the user via monitoring one's habits such as location and optionally movements. As a result, a number of common, or allowed, locations may be determined and subsequently utilized for authentication purposes. Additionally or alternatively, the user may manually register a number of allowed locations for utilizing the solution in the arrangement. Generally, in various embodiments of the present invention, knowing the user and/or his/her gear and utilizing the related information such as location information in connection with access control, conducting automated attacks such as different dictionary attacks against the service may be made more futile. In some scenarios, the location of the user (terminal) and/or data route may be estimated, e.g. by the system, based on transit delay and/or round-trip delay. For example, delays relating to data packets may be compared with delays associated with a number of e.g. location-wise known references such as reference network nodes, which may include routers, servers, switches, firewalls, terminals, etc.
Yet in a further, either supplementary or alternative embodiment, the electronic service is a cloud service (running in a cloud). Additionally or alternatively, the service may arrange virtual desktop and/or remote desktop to the user, for instance.
In another aspect, an electronic device for authenticating a person, comprises
a voiceprint repository configured to store, for a number of users including at least one user, a plurality of personal voiceprints, each of which being linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication, the cues being user-selected, user-provided or user-created, an authentication entity configured to pick, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, and represent the number of selected cues to the person as a challenge, and a response provision means for obtaining sound data indicative of the voice response uttered by the person to the represented cues, whereupon the authentication entity is configured to determine, on the basis of the sound data, the represented cues and voiceprints linked therewith, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case, and to elevate the authentication status of the person as the existing user.
In a further aspect, a method for authenticating a subject person to be executed by one or more electronic devices, comprising storing, for a number of users, a plurality of personal voiceprints each of which linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication of the users, cues being user-selected, provided or created, picking, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, to be represented as a challenge, receiving a response incorporating sound data indicative of the voice response uttered by the person to the represented cues, determining on the basis of the sound data, the represented cues and linked voiceprints, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case,
elevating the authentication status of the person acknowledged as the existing user according to the determination, whereupon e.g. a responsive access control action may be executed. The previously presented considerations concerning the various embodiments of the system may be flexibly applied to the embodiments of the device or method mutatis mutandis, and vice versa, as being appreciated by a skilled person.
The utility of the present invention follows from a plurality of issues depending on each particular embodiment. Cleverly, the associative memory of users and also a phenomenon relating to a memory concept often referred to as brainworms, or earworms, regarding things and related associations one seems to remember, basically reluctantly but still with ease (e.g. songs that are stuck inside one's mind/one cannot get out of his/her head), can be harnessed into utilization in the context of authentication together with voice recognition. One rather fundamental biometric property, i.e. voice, is exploited as an authentication factor together with features of speech (i.e. voice input message content) recognition. Also other factors, e.g. location data indicative of the location of the user (terminal), may be applied for authentication purposes.
Rather regularly people manage to associate different things like sounds, images, videos, etc. together autonomously or automatically and recall such a potentially complex and/or lengthy (advantageous properties in connection with authentication, particularly if the related voice inputs and voiceprints exhibit similar characteristics in conjunction with the present invention) association easily after many years, even if the association as such was originally subconscious or on some occasions even undesired as the person in question sees it. By the present solution, device and/or service users may be provided with an authentication challenge as a number of cues such as images, videos and/or sounds for which they will themselves initially determine the correct response they want to utilize in the future during authentication. Instead of hard-to-remember numerical or character-based code strings, the user may simply associate each challenge with the first associative, personal response that comes to mind and apply that memory image in the forthcoming authentication events based on voice recognition, as for each cue a voiceprint is recorded indicative of the correct response, whereupon the user is required to repeat the voice response upon authentication when the cue is represented to him/her as a challenge.
Yet, a technically feasible, security enhancing procedure is offered for linking a number of terminals together from the standpoint of electronic service, related authentication and ongoing session. Still, electronic devices such as computers and mobile terminals may be cultivated with the suggested, optionally self-contained, authentication technology that enables authentication on operating system level, which is basically hardware manufacturer independent. The outcome of an authentication procedure may be mapped into a corresponding resource or feature (access) control action in the device. Generally, (further) overall access to the device resources (e.g. graphical UI and/or operating system) or access to specific application(s) or application feature(s) may be controlled by the authentication procedure, which may be also considered as a next or new generation password mechanism, identity based solution. Indeed, the suggested authentication technology may be utilized to supplement or replace the traditional 'PIN' type numeric or character code based authentication/access control methods.
Further, the suggested solution may be applied to various access control devices and solutions in connection with, for example, gates, doors, containers, safes, diaries, or locking/unlocking mechanisms in general.
The expression "a number of refers herein to any positive integer starting from one (1), e.g. to one, two, or three. The expression "a plurality of refers herein to any positive integer starting from two (2), e.g. to two, three, or four.
The expression "data transfer" may refer to transmitting data, receiving data, or both, depending on the role(s) of a particular entity under analysis relative a data transfer action, i.e. a role of a sender, a role of a recipient, or both.
The terms "electronic service" and "electronic application" are herein utilized interchangeably. The terms "a" and "an" do not denote a limitation of quantity, but denote the presence of at least one of the referenced item.
The terms "first" and "second" do not denote any order, quantity, or importance, but rather are used to distinguish one element from another.
Different embodiments of the present invention are disclosed in the dependent claims.
BRIEF DESCRIPTION OF THE RELATED DRAWINGS
Next the invention is described in more detail with reference to the appended drawings in which
Fig. 1a illustrates the concept of an embodiment of the present invention via both block and signaling diagram approaches.
Fig. 1b is a block diagram representing an embodiment of selected internals of the system or device according to the present invention.
Fig. 1c illustrates scenarios involving embodiments of an electronic device in accordance with the present invention.
Fig. 2a represents one example of service or device UI view in connection with user authentication.
Fig. 2b represents a further example of service or device UI view in connection with user authentication.
Fig. 2c represents a further example of service or device UI view in connection with user authentication.
Fig. 2d represents a further example of service or device UI view in connection with user authentication.
Fig. 3 is a flow chart disclosing an embodiment of a method in accordance with the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Figure 1a illustrates an embodiment of the present invention. The embodiment may be generally related, by way of example only, to the provision of a network-based or particularly online type electronic service such as a virtual desktop service or document delivery service, e.g. delivery of a bill including a notice of maturity regarding a bank loan. Entity 102 refers to the service user (recipient) and associated devices such as a desktop or laptop computer and/or a mobile device utilized for accessing the service in the role of a client, for instance. The device(s) preferably provide access to a network 108 such as the Internet. The
mobile device, such as a mobile phone (e.g. a smartphone) or a PDA (personal digital assistant) may preferably be wirelessly connected to a compatible network, such as a cellular network. Preferably the Internet may be accessed via the mobile device as well. The mobile device may comprise a browser. Entity 106 refers to a system or network arrangement of a number of at least functionally connected devices such as servers. The communication between the entities 102 and 106 may take place over the Internet and underlying technologies, for example. Preferably the entity 106 is functionally also connected to a mobile network.
Indeed, in the context of the shown embodiment of the present invention, the user 102 of an electronic service 106 incorporating or at least utilizing an embodiment of the system in accordance with the present invention (these two terms being therefore used interchangeably hereinafter) is preferably associated with a first terminal device 102a such as a desktop or laptop computer, a thin-client or a tablet/hand-held computer provided with network 108 access, typically Internet access. Yet, the user 102 preferably has a second terminal 102b such as a mobile communications device with him/her, advantageously being a smartphone or a corresponding device with applicable mobile subscription or other wireless connectivity enabling the device to transfer data e.g. between local applications and the Internet. Many contemporary and forthcoming higher end mobile terminals qualifying as smartphones bear necessary capabilities for both e-mail and web surfing purposes among various other sophisticated features including e.g. a camera with an optional optical code, e.g. QR code, reader application. In most cases, such devices support a plurality of wireless communication technologies such as cellular and wireless local area network (WLAN) type technologies. A number of different, usually downloadable or carrier provided, such as memory card provided, software solutions, e.g. client applications, may be run on these 'smart' terminal devices.
The potential users of the provided system include different network service providers, operators, cloud operators, virtual and/or remote desktop service providers, application/software manufacturers, financial institutions, companies, and individuals in the role of a service provider, intermediate entity, or end user, for example. The invention is thus generally applicable in a wide variety of different use scenarios and applications.
In some embodiments the service 106 may include customer portal service and the service data may correspondingly include customer portal data. Through the portal, the user 102 may inspect the available general data, company or other organization-related data or personal data such as data about rental assets, estate or other targets. Service access in general, and the access of certain features or sections thereof, may require authentication. Multi-level authentication may be supported such that each level can be mapped to predetermined user rights regarding the service features. The rights may define the authentication level and optionally also user-specific rules for service usage and thereby allow feature usage, exclude feature usage, or limit the feature usage (e.g. allow related data inspection but prevent data manipulation), for instance.
Initially, at 127 the system 106 may be ramped up and configured to offer predetermined service to the users, which may also include creation of user accounts, definition of related user rights, and provision of necessary authentication mechanism(s). Then, the user 102 may execute necessary registration procedures via his/her terminal(s) and establish a service user account cultivated with mandatory or optional information such as user id, service password, e-mail address, personal terminal identification data (e.g. mobile phone number, IMEI code, IMSI code), and especially voiceprints in the light of the present invention. This obviously bi-directional information transfer between the user/user device(s) 102 and the system/service 106, requiring performing related activities at both ends, is indicated by items 128, 130 in the figure.
Figure 2a visualizes the voiceprint creation in the light of possible related user experience. A number of potential cues, such as graphical elements, 202 may be first indicated to the user via the service or device (mutatis mutandis) UI 200. Advantageously, the user naturally links at least some cues with certain associations based on e.g. his/her memories and potentially brainworms so that the association is easy to recall and unambiguous (only one association per cue; for example, upon seeing a graphical representation of a cruise ship, the user always comes up with a memory relating to a trip to the Caribbean, whereupon the natural association is 'Caribbean', which is then that user's voice response to the cue of a cruise ship).
Further information 204 such as the size of a captured sound file may also be shown. The user may optionally select a sub-set of all the indicated cues and/or
provide (upload) cues of his/her own to the system for use during authentication in the future. There is preferably a minimum size defined for the sub-set, i.e. number of cues, each user should be associated with. That could be three, five, six, nine, ten, or twelve cues, for example. Further, the sound sample to be used for creating the voiceprint, and/or as at least part of a voiceprint, may be assigned a minimum acceptable duration in terms of e.g. seconds.
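An assumed validation sketch for these enrollment minima follows; the concrete values are picked for illustration from the ranges mentioned above:

```python
# Hypothetical enrollment check; both constants are illustrative choices.
MIN_CUES = 5               # minimum sub-set size per user (assumed value)
MIN_SAMPLE_SECONDS = 1.5   # assumed floor for a reliably analyzable sample

def enrollment_ok(cue_sample_seconds: dict) -> bool:
    """cue_sample_seconds: cue id -> duration of the recorded voice sample."""
    return (len(cue_sample_seconds) >= MIN_CUES and
            all(d >= MIN_SAMPLE_SECONDS
                for d in cue_sample_seconds.values()))
```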
As mentioned hereinbefore, the cues may be visual, audible, or a combination of both. Regarding the user-associated cues, the user may then input, typically utter, his/her voice response based on which the system determines the voiceprints, preferably at least one dedicated voiceprint corresponding to each cue in the subset. A voiceprint associated with a cue preferably characterizes both the voice and the spoken sound, or message, of the response. In other words, the same message later uttered by another user does not match the voiceprint of the first user during the voice authentication phase even though it is the very same message the first user uttered to establish the voiceprint. On the other hand, a message uttered by the first user does not match a voiceprint established based on another message uttered by the same user. For voice characterization, the system may be configured to extract a number of parameters describing the properties of the user's vocal tract, for example, and e.g. related formant frequencies. Such frequencies typically indicate the personal resonance frequencies of the vocal tract of the speaker. Next, reverting to Fig. 1a and switching over (indicated by the broken line in the figure) to a scenario in which the user has already set up a service account and wishes to authenticate to reach a desired authentication status within the service 106, at 132 the user 102 may trigger the browser in the (first) terminal and control it to connect to the target electronic service 106. Accordingly, the user may now log into the service 106 using his/her service credentials or provide at least some means of identification thereto as indicated by items 132, 134.
The system managing the service 106 may comprise e.g. a Node.js server entity whereto/relative to which the web browser registers itself, whereupon the service side allocates a dynamic id such as a socket id or other session id and delivers it at 136 to the browser that indicates the id and optionally other information such as domain, user id/name, etc. to the user via a display at item 138.
Figure 2b illustrates an embodiment of potential service or device UI features at this stage through a snapshot of UI view 200B. The dynamic id is shown to the user both as a numeric code and as embedded in a QR code at 208. Items 206 indicate the available authentication elements or factors, whereas data at 210 implies the current authentication level of the session with related information. Instead of a QR code, some other matrix barcode or completely different visual representation could be utilized.
With reference to Fig. 1a again, at 140 the code is read by using a camera-based code reader application, for instance, to the second terminal such as the mobile terminal of the user. Then the mobile device is configured, using the same or another predetermined application, to transfer an indication of the obtained data, such as the dynamic id and data identifying the terminal or an entity such as a smart card therein, to the system 106, wherein a target entity such as a socket.io entity that is configured to operate as a (browser type) client to the Node.js server entity forwards at least part of the data, including the dynamic id, to the Node.js server entity that can thus dynamically link, and preferably shall link, the particular first and second terminals to the same ongoing service (authentication) session at 142. Also the subsequent data transfer activities, e.g. transfer of the voice response, from the second terminal to the system may be at least partially implemented utilizing the same route and related technique(s). Different entities on the system side may, in practical circumstances, be implemented by one or more physical devices such as servers. For example, a predetermined server device may implement the Node.js server whereas another server device may implement the socket.io client entity.
Next, at 144 the system 106 fetches a number (potentially dynamically changing according to predetermined logic) of cues associated with the user account initially indicated/used in the session and for which voiceprints are available. The cues may be basically randomly selected (and order-wise also randomly represented to the user). The cues are indicated (transferred) to the browser in the terminal that then represents them to the user at 146 e.g. via a display and/or audio reproduction means depending on the nature of the cues. E.g. Ajax (Asynchronous JavaScript and XML) and PHP (Hypertext Preprocessor) may be utilized for terminal-side browser control. Mutually, the cues may be of the same or mixed type (e.g. one graphical image cue, one audio cue, and one video cue optionally with audio track).
As the user 102 perceives the cues as an authentication challenge, he/she provides the voice response preferably via the second terminal at 148 to the service 106 via a client application that may be the same application used for transferring the dynamic id forward. The client side application for the task may be a purpose-specific Java application, for example. In Figure 2c, four graphical (image) cues are indicated at 212 in the service or device UI view 200C (browser view). Also visible in the figure is a plurality of service features at 214, some of which are greyed out, i.e. non-active features, due to the currently insufficient authentication level. Indeed, a service or particularly a service application or UI feature may potentially be associated with a certain minimum security level required for access.
Automatic expiration time for the session may also be indicated via the UI. Preferably, a session about to expire or expired may be renewed by repeated/new authentication.
In Figure 1a, at 150 the service 106 analyzes the obtained user response relative to the cues against the voiceprints using predetermined matching technique(s) and/or algorithms. In primary embodiments, the input order of (sub-)responses corresponding to individual cues in the overall response should match the order in which the cues were represented in the service UI (e.g. in a row, from left to right). In some other embodiments, the system 106 may, however, be configured to analyze whether the order of sub-responses matches the order of cues given, or at least to try different ordering(s). Optionally, the system 106 may be configured to rearrange the sub-responses relative to the cues to obtain e.g. a better voiceprint matching result during the analysis.
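Where the system tries different orderings as just described, a brute-force sweep over permutations suffices for the small cue counts used per challenge; in this sketch, score() stands in for a matcher such as the one outlined earlier and all names are illustrative:

```python
# Hypothetical reordering analysis over the detected sub-responses.
from itertools import permutations

def best_ordering(segments, voiceprints, score):
    """Return (best total score, segment ordering) across permutations.

    Factorial in len(segments), hence practical only for the handful of
    cues typically represented in one challenge.
    """
    best_total, best_perm = float('-inf'), None
    for perm in permutations(segments):
        total = sum(score(seg, vp) for seg, vp in zip(perm, voiceprints))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm
```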
When the response (e.g. parameter(s) derived therefrom) matches the voiceprints sufficiently according to predetermined logic, the voice authentication procedure may be considered successful, and the authentication level may be scaled (typically raised) accordingly at 152. On the other hand, if the voice-based authentication fails (non-match), the authentication status may be left intact or lowered, for instance. The outcome of the authentication procedure is signaled to the user (preferably at least to the first terminal, potentially to both) for review, e.g. as an authentication status message via the service UI at 154. New features may be made available to the user in the service UI.
Figure 2d depicts a possible service or device UI view 200D after successful voice authentication. An explicit indication of the outcome of the authentication procedure is provided at 218 by an authentication status message, and as an implicit indication thereof, more service features 216 have been made available to the user (no longer greyed out, which the user will immediately recognize).
In some embodiments, location information may optionally be utilized in the authentication process as such. In one embodiment, the server 106 and/or other entities external to the user's 102 terminal gear may be configured to locate one or more of the terminals the user 102 uses for communicating with the service 106. Alternatively or additionally, the terminal devices may play a role of their own in the positioning process and execute at least part of the necessary positioning actions locally. The actions required to position a terminal may be shared between the terminal(s) and at least one external entity.
For instance, address information may be used in the positioning process to deduce the location of the particular terminal in question (see Figures 2b-2d, wherein IP location has been identified as one applied authentication/identification criterion). Somewhat typically, terminal or access network addresses such as IP addresses are at least loosely associated with physical locations, so that address-based locating is at least limitedly possible. In connection with mobile devices, many other options are also available, including roaming signal and data transmission based positioning. For example, by checking the ID of the base station(s) the mobile device is communicating with, at least the approximate location of the mobile device may be obtained. Yet, through more comprehensive signal analysis, such as TOA (Time-Of-Arrival), OTD (Observed-Time-Difference), or AOA (Angle-Of-Arrival), the mobile device may be located more precisely. In some embodiments, a satellite navigation receiver, such as a GPS (Global Positioning System) or GLONASS (GLObal Navigation Satellite System) receiver, in connection with a terminal device may be exploited. The terminal may share the locally received satellite information with external entities as such or in cultivated form (e.g. ready-determined coordinates based on the received satellite signal(s)). Further, data entity transit times, such as data packet transit (TT) times, may be monitored, where possible, e.g. in relation to both the monitored user/terminal and location-wise known reference entities, as described hereinbefore, in order to assess the location of the user/terminal by associated comparison.
On the basis of the terminal location, the system 106 may then introduce a further, location-based factor to the authentication procedure and verify whether the current location of the terminal in question matches predetermined location information defining a number of allowed locations and/or banned locations in light of the service and/or document access. Depending on the embodiment, the location-based factor may be evaluated prior to the other authentication factors, in conjunction with them, or as a final check before authorizing the user to access the service and/or electronic document.
Fig. 1c illustrates a few other scenarios involving embodiments of an electronic device 102c in accordance with the present invention. The device 102c may be self-contained in the sense that it can locally take care of the authentication procedure based on program logic and data stored thereat. Optionally, data such as voiceprint data may still be updated, e.g. periodically or upon fulfillment of another triggering condition, from or to a remote source such as a server 106 via a communications connection, possibly including one or more communication networks 108 in between. Optionally, the device 102c is registered with a remote service, such as a network-based service, that maintains a database of devices, associated users and/or related voiceprints. Some feasible techniques for implementing user and/or device enrollment for authentication and/or other solutions have been provided e.g. in publication WO2012/045908 A1 "ARRANGEMENT AND METHOD FOR ACCESSING A NETWORK SERVICE", describing different features of the ZEFA™ authentication mechanism along with various supplementary security and communications related features. Depending on the embodiment, the personal voiceprints may be generated at the device 102c or on the network side (e.g. server 106) from the voice input provided by the user in question.
In some embodiments, the device 102c may incorporate a plurality of functionally (communications-wise) connected elements that are physically separate/separable and may even have their own dedicated housings, etc. For example, an access control panel or terminal providing a UI to a subject of authentication (a person) may be communications-wise connected, e.g. by a wired or wireless link, to an access control computer and/or actuator, which may also take care of one or more task(s) relating to the authentication and/or related access control procedures.
The device 102c may include e.g. a computer device (e.g. laptop or desktop), or a portable user terminal device, such as a smartphone, tablet or other mobile or even wearable device.
The device 102c may generally be designated as a personal device, or be used, typically alternately, by several authorized persons, such as multiple family members or team members at work, and thus be configured to store the personal voiceprints of each user, not just of a single user. Storing the personal voiceprints of a single user only is often sufficient in the case of a truly personal device. The authentication procedure suggested herein may be utilized to control the provision of (further) access to resources and feature(s), such as application(s) or application feature(s), in or at the device and/or at least accessible via the device by means of a communications connection to a remote party, such as a remote terminal or a remote network-based service and related entities, typically incorporating at least one server.
The device 102c may thus be, include or implement at least part of an access control device. In some embodiments, the access control device 102c may include or at least be connected to a particular controllable physical asset or entity 140a, 140b, such as a door, fence, gate, window, or latch providing access to a certain associated physical location, such as a space (e.g. a room, compound or building), or a physical, restricted resource such as a container, safe, diary or even briefcase internals potentially containing valuable and/or confidential material. Particularly, the device 102c and the suggested authentication logic provided thereat may be at least functionally connected to an (electrically controllable) locking or unlocking mechanism of such an asset/entity. Further, the asset/entity may have data transfer capability to communicate with external entities regarding e.g. the authentication task and/or its outcome, as already contemplated hereinbefore.
Figure 1b shows, at 109A, a block diagram illustrating selected internals of an embodiment of the device 102c or system 106 presented herein. The system 106 may incorporate a number of at least functionally connected servers; typically at least one device, such as a server or a corresponding entity with the necessary communications, computational and memory capacity, is included in the system. A skilled person will naturally realize that e.g. terminal devices, such as a mobile terminal or a desktop type computer terminal, utilized in connection with the present invention could generally include the same or similar elements. In some embodiments, a number of terminals, e.g. the aforesaid first and/or second terminal, may also be included in the system 106 itself. Correspondingly, devices 102c applied in connection with the present invention may in some embodiments be implemented as single-housing stand-alone devices, whereas in some other embodiments they may include or consist of two or more functionally connected elements, potentially even provided with their own housings (e.g. an access control terminal unit at a door connected to a nearby or more distant access control computer via a wired and/or wireless communication connection).
The utilized device(s), or generally the entities in question, are typically provided with one or more processing devices capable of processing instructions and other data, such as one or more microprocessors, micro-controllers, DSPs (digital signal processors), programmable logic chips, etc. The processing entity 120 may thus, as a functional entity, comprise a plurality of mutually co-operating processors and/or a number of sub-processors connected to a central processing unit, for instance. The processing entity 120 may be configured to execute the code stored in a memory 122, which may refer to instructions and data relative to the software logic and software architecture for controlling the device 102c or (device(s) of) the system 106. The processing entity 120 may at least partially execute and/or manage the execution of the authentication tasks.
Similarly, the memory entity 122 may be divided between one or more physical memory chips or other memory elements. The memory 122 may store program code for authentication and potentially other applications/tasks, and other data such as voiceprint repository, user contact information, electronic documents, service data etc. The memory 122 may further refer to and include other storage media such as a preferably detachable memory card, a floppy disc, a CD-ROM, or a fixed storage medium such as a hard drive. The memory 122 may be nonvolatile, e.g. ROM (Read Only Memory), and/or volatile, e.g. RAM (Random Access Memory), by nature. Software (product) may be provided on a carrier medium such as a memory card, a memory stick, an optical disc (e.g. CD-ROM or DVD), or some other memory carrier.
The UI (user interface) 124, 124B may comprise a display, a touchscreen, or a data projector 124, and a keyboard/keypad or other applicable user (control) input entity 124B, such as a touch screen, a number of separate keys, buttons, knobs, switches, a touchpad, a joystick, or a mouse, configured to provide the user of the system with practicable data visualization/reproduction and input/device control means, respectively. The UI 124 may include one or more loudspeakers and associated circuitry, such as D/A (digital-to-analogue) converter(s), for sound output, and/or sound capturing elements 124B, such as a microphone with an A/D converter, for sound input (obviously the device capturing voice input from the user has at least one such element; alternatively external loudspeaker(s), earphones and/or microphone(s) may be utilized, for which purpose the UI 124, 124B preferably contains suitable wired or wireless (e.g. Bluetooth) interfacing means). A printer may be included in the arrangement for providing more permanent output.
The device 102/system 106 may further comprise a data interface 126 such as a number of wired and/or wireless transmitters, receivers, and/or transceivers for communication with other devices such as terminals and/or network infrastructure(s). For example, an integrated or a removable network adapter may be provided. Non-limiting examples of the generally applicable technologies include WLAN (Wireless LAN, wireless local area network), LAN, WiFi, Ethernet, USB (Universal Serial Bus), GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), EDGE (Enhanced Data rates for Global Evolution), UMTS (Universal Mobile Telecommunications System), WCDMA (wideband code division multiple access), CDMA2000, PDC (Personal Digital Cellular), PHS (Personal Handy-phone System), and Bluetooth. Some technologies may be supported by the elements of the system as such whereas some others (e.g. cell network connectivity) are provided by external, functionally connected entities.
It is clear to a skilled person that the device 102c or system 106 may comprise numerous additional functional and/or structural elements for providing advantageous communication, processing or other features, whereupon this disclosure is not to be construed as limiting the presence of the additional elements in any manner. Entity 125 refers to such additional element(s) found useful depending on the embodiment.
At 109B, potential functional or logical entities implemented by the device 102c or system 106 (mostly by processing element(s) 120, memory element(s) 122 and communications element(s) 126) for voice authentication are indicated. Profiler 110 may establish the cue-associated voiceprints for the users based on the voice input by the users. The input may include speech, or generally voice samples, originally captured by user terminal(s) and funneled to the profiler 110 for voiceprint generation, including e.g. feature extraction. Element 112 refers to a voiceprint repository that may, in practice, contain a number of databases or other data structures for maintaining the personal voiceprints determined for the cues based on the voice input by the user(s).
Voiceprint data is personal (related to a user account or user id) and characterizes the correct voice response to each cue (in the cue sub-set used for authenticating that particular user). Voiceprint data may indicate, as already alluded to hereinbefore, e.g. fundamental frequency data, vocal tract resonance(s) data, duration/temporal data, loudness/intensity data, etc. Voiceprint data may indicate personal (physiological) properties of the user 102 and characteristics of the sample data received during the voiceprint generation procedure (thus advantageously characterizing also the substance or message of the input). In that sense, the voice recognition engine used in accordance with the present invention may also incorporate characteristics of speech recognition.
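Purely as an illustration of the kind of record such a repository entry might hold (the field names and units are assumptions, not taken from this description):

```typescript
// Illustrative per-cue voiceprint record along the lines described above.
interface Voiceprint {
  cueId: string;
  fundamentalFrequencyHz: { mean: number; variance: number };
  vocalTractResonancesHz: number[]; // e.g. formant estimates
  durationMs: number;               // temporal characteristics of the response
  intensityDb: { mean: number; variance: number };
  featureVector: number[];          // engine-specific embedding of the utterance
}
```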
Analyzer 114 may take care of the substantially real-time matching, or generally analysis, of voice input against already existing voiceprints during authentication. Such analysis may include a number of comparisons according to predetermined logic for figuring out whether the speaker/utterer really is the user initially indicated to the system. In some embodiments, profiler 110 and analyzer 114 may be logically implemented by a common entity due to e.g. similarities between the associated tasks. Authentication entity 116 may be such an entity, or it may at least generally control the execution of the authentication procedure(s), determine cues for an authentication task, raise/lower permanent or session-specific authentication levels based on the outcome thereof, and control e.g. data transfer with terminal devices and network infrastructure(s) including various elements.
Regarding certain embodiments with additional location-based authentication, the system 106 may, for example, provide a dedicated location(ing) id, a 'geokey', to the user 102, preferably through browser data such as a service view, e.g. a login/authentication view or a portal view. The user 102 may then notice the (visualized) id among the service data as a numeric code, or generally a string of optionally predetermined length. The id may be dynamic, such as session-specific and/or for one-time use only. In some embodiments, the location id may be combined with the session id (or a common id may be used), or generally with data provided by the system for voice authentication, e.g. via a machine readable optical code such as the QR code.
The user 102 may input or read the code to the (second) terminal, after which the application installed thereat acquires location data according to predetermined logic based on the available positioning options. Preferably, the location data is acquired in real-time or near real-time fashion upon receipt of the id, so as to be current. For example, the device may contain a satellite receiver, such as a GPS or GLONASS receiver, through which location data may be obtained. In addition, the device may utilize network(s) and related signal(s) for obtaining location data, such as data provided by a cellular network and/or a short-range wireless network, optionally WLAN. Network-assisted positioning may be used. The application may be configured to utilize the available interfaces provided with the mobile operating system for acquiring the positioning data.
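On a browser-based second terminal, a fresh fix could be requested via the standard Geolocation API, for example; the option values below are illustrative:

```typescript
// Request a current position; navigator.geolocation is a standard Web API.
function acquireLocation(): Promise<GeolocationPosition> {
  return new Promise((resolve, reject) =>
    navigator.geolocation.getCurrentPosition(resolve, reject, {
      enableHighAccuracy: true, // prefer GNSS where available
      maximumAge: 0,            // demand a fresh fix, not a cached one
      timeout: 10_000,
    }));
}
```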
Location data such as longitude information, latitude information, an accuracy or error estimate, the id itself or data derived therefrom, and/or a time code (or time stamp) may then be collected and transmitted to the system 106. Preferably at least part of the data is encrypted. Optionally, at least part of the above data elements may be utilized for determining a hash by means of a secret or asymmetric key, for example, in which case at least the hash is transmitted. HTTPS may be utilized for the secured transfer. The system 106 receives and optionally processes, e.g. decodes, the data. Subsequently, the system 106 may verify the current location of the user 102, as indicated by the obtained location data, against predetermined data indicative of e.g. allowed location(s). The resolution of the obtained data and/or a related measurement error estimate may be utilized to adapt the decision-making. For example, in the case of a larger error/worse positioning accuracy, more tolerance may be allowed in the verification process, and vice versa.
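The keyed-hash variant could be sketched as follows using the Node.js crypto module; the field set and key handling are assumptions:

```typescript
import { createHmac } from "crypto";

// Protect the location report with a keyed hash before transfer; the system
// side recomputes the HMAC over the body and compares.
function signLocationReport(secretKey: string, report: {
  geokey: string;
  latitude: number;
  longitude: number;
  accuracyM: number;
  timestamp: number;
}) {
  const body = JSON.stringify(report);
  const hash = createHmac("sha256", secretKey).update(body).digest("hex");
  return { body, hash }; // POST both over HTTPS
}
```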
In one embodiment, the system 106 is configured to maintain data about allowed (and/or rejected) user locations through utilization of polygon data, i.e. geo-referenced polygon data. For example, a number of allowed postal areas, represented by the corresponding polygons, may have been associated with each user. The obtained location data may be mapped to a corresponding postal area polygon that is then searched for in the list of allowed postal area polygons. In such an embodiment, the aforesaid adaptation may be realized by stretching or shrinking the postal area polygon boundaries, for instance.
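Such a polygon check could be implemented, for example, with a classic ray-casting point-in-polygon test, and the boundary stretching/shrinking as scaling about the polygon centroid; all names and the scaling approach are illustrative:

```typescript
type Point = { lat: number; lon: number };

// Ray-casting point-in-polygon test against a geo-referenced polygon.
function pointInPolygon(p: Point, polygon: Point[]): boolean {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const a = polygon[i], b = polygon[j];
    // Does the horizontal ray from p cross edge a-b?
    if ((a.lat > p.lat) !== (b.lat > p.lat) &&
        p.lon < ((b.lon - a.lon) * (p.lat - a.lat)) / (b.lat - a.lat) + a.lon) {
      inside = !inside;
    }
  }
  return inside;
}

// Stretch (factor > 1) or shrink (factor < 1) the boundary around the
// centroid, e.g. to tolerate a larger positioning error estimate.
function scalePolygon(polygon: Point[], factor: number): Point[] {
  const c: Point = {
    lat: polygon.reduce((s, p) => s + p.lat, 0) / polygon.length,
    lon: polygon.reduce((s, p) => s + p.lon, 0) / polygon.length,
  };
  return polygon.map((p) => ({
    lat: c.lat + (p.lat - c.lat) * factor,
    lon: c.lon + (p.lon - c.lon) * factor,
  }));
}
```

Centroid scaling is a crude but cheap approximation of true boundary buffering; a production system might instead buffer the polygon by a metric distance.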
In the case of a positive outcome (allowed location detected), the system 106 may again update the authentication, or generally 'security', status of the user 102 accordingly, typically raising it. In practice, the user 102 may be provided with enhanced access rights to service features such as payment/finance components, higher security documents, etc. as reviewed above. Each user may be associated with session-based information, such as a session record dynamically keeping track of, among potential other issues, the user rights emerging from successful authentication actions. A notification of the raised access security level or of failed authentication may be transmitted to the user via the mobile application and/or through browser data. The system 106 may update the service parameters for the session automatically and provide an updated service view, such as a browser view, to the user's terminal. Figure 3 discloses, by way of example only, a method flow diagram in accordance with an embodiment of the present invention.
At 302 the device and/or system of the present invention is obtained and configured, for example through loading and execution of related software, for managing the electronic service and related authentication mechanism(s). Further, for users willing or obliged to use voice authentication, the voiceprints shall be established as described earlier in this text. For example, the device/system may be trained by the user such that the user utters the desired response (association) to each cue in his/her preferred and/or at least partially machine-selected (sub-)set of cues, whereupon the system extracts or derives the voiceprints based on the voice input. Further, the user may be asked to provide some general or specific voice input that is not directly associated with any voiceprint. Using that voice input, the system may generally model the user-specific voice and/or speech parameters later applied in voice-based authentication and voiceprint matching, for example.
At 304, an indication of a required authentication, such as a voice authentication request, is received from a user via a feasible UI, such as an access control terminal, a digital service UI (e.g. browser-based UI) or e.g. a dedicated application. The request may be associated with a certain user whose voiceprints are available. The request may identify such a user identity by a user ID, for example. Procedures potentially incorporating the linking of the first and second terminals of the user relative to the current service session have already been discussed in this text. Naturally, e.g. in the case of a single-user personal, self-contained device comprising personal voiceprints only for the particular user, such a user identity indication is not necessary. At 306, a number of cues (for which voiceprints of the indicated user are available) are determined or selected, preferably from a larger group thereof. The selection may be random, alternating (subsequent selections preferably contain different cue(s)), and/or follow some other logic. The number of cues per authentication operation may be dynamically selected by the system/device as well. For example, if a previous voice authentication procedure regarding the same user identity failed, the next one could contain more (or fewer) cues, and potentially vice versa. Also the status of other authentication factor(s) may be configured to affect the number. For example, if the user has already been authenticated using some other authentication factor or element, e.g. location, the number of cues could be scaled lower than in a situation wherein the overall authentication status of the user is weaker.
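Under the stated logic, the dynamic cue-count selection might reduce to something like the following sketch; the adjustment amounts are invented for the example:

```typescript
// More cues after a failed attempt, fewer when other authentication factors
// (e.g. a verified location) already strengthen the user's status.
function cueCount(base: number, previousAttemptFailed: boolean, otherFactorsSatisfied: number): number {
  let n = base;
  if (previousAttemptFailed) n += 1;           // harden after a failure
  n -= Math.min(otherFactorsSatisfied, n - 1); // relax, but keep at least one cue
  return n;
}
```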
At 308, the cues are represented to the user via a user device utilized for service access, a stand-alone user device, or e.g. an access control (terminal) device. For example, at least an indication of the cues may be transmitted by a remote system to the (first) user terminal, potentially with instructions regarding visual and/or audible reproduction thereof, e.g. via a browser. Preferably, the cues are represented in an easily noticeable and recognizable order so that the response thereto may be provided as naturally as possible following the same order. For example, graphical cues may be represented in a series extending from left to right via the service or application UI, and the user may provide the voice response acknowledging each cue in the same, natural order, advantageously without any need to provide a separate, explicit control command for identifying the target cue during the voice input stage. The user may utter the response to each cue one after another, keeping just a brief pause in between, so that the cue-specific (sub-)responses may afterwards be distinguished from each other (and associated with the proper cue) in the overall response by the terminal or the system based on the pauses, as in the sketch below. Alternatively, the user may explicitly indicate via the UI, through cue-specific icon/symbol selection, for instance, to which cue he/she is next providing the voice response.
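The pause-based segmentation could be approximated with a simple frame-energy threshold; the threshold, frame size and minimum pause length are illustrative values only:

```typescript
// Split the overall utterance into cue-specific sub-responses at pauses,
// using mean energy over fixed 20 ms frames as a crude voice activity test.
function splitOnPauses(samples: Float32Array, sampleRate: number,
                       silenceThreshold = 0.01, minPauseMs = 400): Float32Array[] {
  const frame = Math.round(sampleRate * 0.02); // 20 ms frames
  const minPauseFrames = Math.ceil(minPauseMs / 20);
  const segments: Float32Array[] = [];
  let segStart = -1;
  let silentRun = 0;

  for (let f = 0; f * frame < samples.length; f++) {
    const start = f * frame;
    const chunk = samples.subarray(start, Math.min(start + frame, samples.length));
    let energy = 0;
    for (const s of chunk) energy += s * s;
    const silent = energy / chunk.length < silenceThreshold;

    if (!silent && segStart < 0) {
      segStart = start;           // a new sub-response begins
      silentRun = 0;
    } else if (silent && segStart >= 0 && ++silentRun >= minPauseFrames) {
      segments.push(samples.subarray(segStart, start)); // pause long enough: close it
      segStart = -1;
    } else if (!silent) {
      silentRun = 0;
    }
  }
  if (segStart >= 0) segments.push(samples.subarray(segStart));
  return segments; // ideally one segment per represented cue
}
```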
Indeed, at 310, the voice response to the challenge formed by the cues, such as graphical images, videos, and/or audio files, is provided by the user and potentially forwarded via the terminal to a remote analyzing entity such as the authentication system. The sound data forwarded may include digital sound samples, such as so-called raw or PCM (pulse-code modulation) samples, or e.g. a more heavily parameterized, compressed representation of the captured voice.
At 312, the obtained voice response data is analyzed against the corresponding personal (user-specific) voiceprints of the represented cues. The analysis tasks may include different matching and comparison actions following a predetermined logic. For example, a preferred existing or new best-match type search algorithm may be exploited, potentially followed by additional quality checking rules determining whether even the best match was good enough to acknowledge the current user as the indicated one. The logic may apply fixed threshold(s) for making decisions (successful authentication, failed authentication), or alternatively dynamic criteria may be applied. For instance, if heavy background noise is detected in the obtained sound data, the criteria could be loosened.
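A sketch of such a dynamic-threshold decision, where score(), estimateNoiseLevel() and the constants stand in for engine internals:

```typescript
// Best-match scoring with a threshold loosened under heavy background noise.
declare function score(response: Float32Array, voiceprint: unknown): number; // higher = closer
declare function estimateNoiseLevel(response: Float32Array): number;         // 0 (clean) .. 1 (noisy)

function authenticate(response: Float32Array, candidateVoiceprints: unknown[],
                      baseThreshold = 0.8): boolean {
  const best = Math.max(...candidateVoiceprints.map((vp) => score(response, vp)));
  const threshold = baseThreshold * (1 - 0.2 * estimateNoiseLevel(response));
  return best >= threshold; // success elevates the authentication status
}
```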
At 314, the authentication status or level associated with the user is updated accordingly (raised, lowered, or left as is). The user may be provided with access to new location(s) or resource(s) (typically takes place only if the authentication status is raised).
At 316, the method execution is ended. A computer program, comprising a code means adapted, when run on a computer, to execute an embodiment of the desired method steps in accordance with the present invention, may be provided. A carrier medium such as an optical disc, floppy disc, or a memory card, or other non-transitory carrier medium comprising the computer program may further be provided. The program may be further delivered over a communication network and generally over a communication channel.
Consequently, a skilled person may on the basis of this disclosure and general knowledge apply the provided teachings in order to implement the scope of the present invention as defined by the appended claims in each particular use case with necessary modifications, deletions, and additions.
For example, the HTML5 hypertext mark-up language standard includes application programming interfaces for camera, voice recording and geolocation. These HTML5 features could be utilized in connection with the present invention, e.g. instead of the (native) client application, e.g. a Java application, and/or the QR reader described hereinbefore. An associated web link could be provided to a user terminal, e.g. to the mobile terminal, included in an SMS (e.g. OTP, one-time password type) message. In more detail, the user could indeed be provided with a message (e.g. the aforesaid SMS or another applicable message type) including a dedicated HTML5 weblink (e.g. with a temporary id/session id etc.), whereupon weblink activation triggers a QR scanning mode; the QR code is then read using the camera, the HTML5 page switches to a voice input (uttering) page or view, finally followed by the more conventional transmission of the voice data, e.g. as raw data, towards a speaker verification engine (analyzer), optionally along with location data. Such an HTML5-based or other similar approach could be considered an authentication instance-specific native client, as the URL in question preferably works only in conjunction with the related authentication event, and the necessary native or native-like feature(s) may be provided within a web page through APIs.
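A browser-side sketch of that HTML5 flow using the standard getUserMedia/MediaRecorder and Geolocation APIs; the endpoint, field names and fixed recording length are assumptions:

```typescript
// Record the voice response, attach a geolocation fix, and POST both to the
// analyzer over HTTPS. "/api/voice-auth" is a hypothetical endpoint.
async function captureAndSendResponse(sessionId: string): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  const stopped = new Promise<void>((resolve) => (recorder.onstop = () => resolve()));
  recorder.start();
  await new Promise((r) => setTimeout(r, 5000)); // record for ~5 s (illustrative)
  recorder.stop();
  await stopped;

  const position = await new Promise<GeolocationPosition>((resolve, reject) =>
    navigator.geolocation.getCurrentPosition(resolve, reject));

  const form = new FormData();
  form.append("sessionId", sessionId);
  form.append("voice", new Blob(chunks, { type: recorder.mimeType }));
  form.append("lat", String(position.coords.latitude));
  form.append("lon", String(position.coords.longitude));
  await fetch("/api/voice-auth", { method: "POST", body: form });
}
```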
Claims
1. Electronic system (106, 109A) for authenticating a user of an electronic service, said system preferably comprising at least one server apparatus, the system being configured to
store (122, 112, 200), for a number of users, a plurality of personal voiceprints (204), each of which being linked with a dedicated visual, audiovisual or audio cue (202), for challenge-response authentication of the users, wherein the cues are user-selected, user-provided or user-created,
pick (116, 200C, 142, 144), upon receipt of an authentication request associated with an existing user of said number of users, a number of cues (212) for which there are voiceprints of the existing user stored, and provide the cues for representation (144, 126) to the user as a challenge,
receive (126, 148) sound data indicative of the voice response uttered by the user to the represented cues,
determine (114, 150), on the basis of the sound data, the represented cues and the linked voiceprints, whether the response has been uttered by the existing user of said number of users, and
provided that this seems to be the case, elevate (116, 152, 200D, 218, 216) the authentication status of the user as the existing user, preferably regarding at least the current communication session.
2. The system of claim 1, wherein at least one cue comprises a graphical image (202) or video to be shown to the user via a display of a terminal device.
3. The system of any preceding claim, wherein at least one cue comprises an audio file, optionally a music or sound scenery file, to be audibly reproduced to the user.
4. The system of any preceding claim, further configured to initially determine a personal voiceprint for a cue based on a voice response of the user to the cue (128, 130, 200).
5. The system of any preceding claim, configured to link a first user terminal (102a) and a second user terminal (102b) with the ongoing service session of the
user based on a dynamic id that is sent by the system (136) to the first terminal and returned by the second terminal (142).
6. The system of claim 5, comprising a Node.js server configured to remotely control a web browser based user interface (UI) of the service at the first user terminal (102a), and to allocate the dynamic id, optionally a socket id, thereto, wherein the system is further configured to instruct (136) the first terminal to display the socket id visually in the service UI, optionally via a two-dimensional graphical code.
7. The system of claim 6, wherein the code comprises a QR (Quick Response) code.
8. The system of claim 6 or 7, comprising a socket.io entity configured to act as a client to the Node.js server, to receive the socket id transmitted by the second user terminal, and to forward it to the Node.js server for linking the first and second user terminals and the current service session of the user together.
9. The system of any preceding claim, further comprising a first user terminal (102a) for accessing the service and reproducing the cues and optionally a dynamic id allocated by the system to the user.
10. The system of claim 9, further comprising a second user terminal (102b), preferably a mobile device, comprising an application, optionally a Java application, for capturing the voice response by the user.
11. The system of claim 10, wherein the second user terminal is further configured to obtain a dynamic id allocated to the first terminal, preferably a browser thereat, and signal it (140) to said at least one server of the system.
12. The system of claim 11, wherein the second user terminal is configured to optically read a two-dimensional code representation of the id shown on the display of the first terminal.
13. The system of any preceding claim, configured to further utilize the estimated location of the user as an authentication factor, wherein the location estimate is based on the location data obtained relative to a user terminal.
14. Electronic device (102c) for authenticating a person, comprising
a voiceprint repository (112) configured to store, for a number of users including at least one user, a plurality of personal voiceprints, each of which being linked with a dedicated visual, audiovisual or audio cue (202), for challenge-response authentication, wherein the cues are user-selected, user-provided or user-created,
an authentication entity configured to pick (116, 200C), upon receipt of an authentication request associated with an existing user of said number of users, a number of cues (212) for which there are voiceprints of the existing user stored, and represent (124) the number of selected cues to the person as a challenge, and
a response provision means (124B) for obtaining sound data indicative of the voice response uttered by the person to the represented cues,
whereupon the authentication entity is configured to determine (114), on the basis of the sound data, the represented cues and the voiceprints linked therewith, whether the response has been uttered by the existing user of said number of users, and provided that this seems to be the case, to elevate (116, 200D, 218, 216) the authentication status of the person as the existing user.
15. The electronic device of claim 14, being or comprising at least one element selected from the group consisting of: portable communications-enabled user device, computer, desktop computer, laptop computer, personal digital assistant, mobile terminal, smartphone, tablet, wristop computer, access control terminal or panel, smart goggles, and wearable user device.
16. The electronic device of claim 14 or 15, configured to control, responsive to the authentication status, access to a physical location or resource, preferably via a controllable locking or unlocking mechanism, optionally electrically controlled lock of a door, lid, or hatch.
17. The electronic device of claim 14 or 15, configured to control, responsive to the authentication status, further access to the device itself or a feature, such as application feature or UI feature, thereof or at least accessible therethrough.
18. A method for authenticating a subject person, to be executed by one or more electronic devices, comprising
storing, for a number of users, a plurality of personal voiceprints, each of which linked with a dedicated visual, audiovisual or audio cue, for challenge-response authentication of the users (302), the cues being user-selected, user-provided or user-created,
picking, upon receipt of an authentication request associated with an existing user of said number of users, a number of cues for which there are voiceprints of the existing user stored, to be represented as a challenge (304, 306, 308),
receiving a response incorporating sound data indicative of the voice response uttered by the person to the represented cues (310),
determining, on the basis of the sound data, the represented cues and the linked voiceprints, whether the response has been uttered by the existing user of said number of users (312), and
provided that this seems to be the case, elevating the authentication status of the person acknowledged as the existing user according to the determination.
19. The method of claim 18, further controlling access based on the authentication status.
20. The method of claim 19, wherein the access to an electronic resource, such as electronic service, device, or feature accessible using the service or device, is controlled.
21. The method of claim 19 or 20, wherein the access to a physical location or resource, optionally a space or container behind an electric lock, door, or latch, is controlled.
22. A computer program comprising code means adapted, when run on a computer, to execute the method of claim 18.
23. A carrier medium comprising the computer program according to claim 22.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1318876.8A GB2519571A (en) | 2013-10-25 | 2013-10-25 | Audiovisual associative authentication method and related system |
GB1318876.8 | 2013-10-25 | ||
GB1320287.4A GB2519609B (en) | 2013-10-25 | 2013-11-18 | Audiovisual associative authentication method and related system |
GB1320287.4 | 2013-11-18 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2015059365A1 (en) | 2015-04-30 |
WO2015059365A9 (en) | 2015-08-20 |
Family
ID=49767156
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2014/050807 WO2015059365A1 (en) | Audiovisual associative authentication method and related system | 2013-10-25 | 2014-10-27 |
Country Status (2)
Country | Link |
---|---|
GB (2) | GB2519571A (en) |
WO (1) | WO2015059365A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019218512A1 (en) * | 2018-05-14 | 2019-11-21 | 平安科技(深圳)有限公司 | Server, voiceprint verification method, and storage medium |
CN112346888A (en) * | 2020-11-04 | 2021-02-09 | 网易(杭州)网络有限公司 | Data communication method and device based on software application and server equipment |
US11443030B2 (en) * | 2019-06-10 | 2022-09-13 | Sherman Quackenbush Mohler | Method to encode and decode otherwise unrecorded private credentials, terms, phrases, or sentences |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11669602B2 (en) * | 2019-07-29 | 2023-06-06 | International Business Machines Corporation | Management of securable computing resources |
US11531787B2 (en) | 2019-07-29 | 2022-12-20 | International Business Machines Corporation | Management of securable computing resources |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6393305B1 (en) * | 1999-06-07 | 2002-05-21 | Nokia Mobile Phones Limited | Secure wireless communication user identification by voice recognition |
WO2013190169A1 (en) * | 2012-06-18 | 2013-12-27 | Aplcomp Oy | Arrangement and method for accessing a network service |
2013
- 2013-10-25: GB application GB1318876.8A filed, published as GB2519571A (status: not active, Withdrawn)
- 2013-11-18: GB application GB1320287.4A filed, published as GB2519609B (status: not active, Expired - Fee Related)

2014
- 2014-10-27: PCT application PCT/FI2014/050807 filed, published as WO2015059365A1 (status: active, Application Filing)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020080927A1 (en) * | 1996-11-14 | 2002-06-27 | Uppaluru Premkumar V. | System and method for providing and using universally accessible voice and speech data files |
US20050171851A1 (en) * | 2004-01-30 | 2005-08-04 | Applebaum Ted H. | Multiple choice challenge-response user authorization system and method |
US20120296651A1 (en) * | 2004-12-03 | 2012-11-22 | Microsoft Corporation | User authentication by combining speaker verification and reverse turing test |
US20070061865A1 (en) * | 2005-09-13 | 2007-03-15 | International Business Machines Corporation | Cued one-time passwords |
US20090116703A1 (en) * | 2007-11-07 | 2009-05-07 | Verizon Business Network Services Inc. | Multifactor multimedia biometric authentication |
US20130205387A1 (en) * | 2012-02-03 | 2013-08-08 | Futurewei Technologies, Inc. | Method and Apparatus to Authenticate a User to a Mobile Device Using Mnemonic Based Digital Signatures |
Non-Patent Citations (1)
Title |
---|
O'GORMAN, L.: "Comparing passwords, tokens, and biometrics for user authentication", PROC. OF THE IEEE, vol. 91, 12 December 2003 (2003-12-12), pages 2021 - 2040 * |
Also Published As
Publication number | Publication date |
---|---|
GB2519609A (en) | 2015-04-29 |
GB201318876D0 (en) | 2013-12-11 |
WO2015059365A9 (en) | 2015-08-20 |
GB2519609B (en) | 2017-02-15 |
GB2519571A (en) | 2015-04-29 |
GB201320287D0 (en) | 2014-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10146923B2 (en) | Audiovisual associative authentication method, related system and device | |
US20230129693A1 (en) | Transaction authentication and verification using text messages and a distributed ledger | |
US10200377B1 (en) | Associating a device with a user account | |
US10257179B1 (en) | Credential management system and peer detection | |
EP3256976B1 (en) | Toggling biometric authentication | |
US9730065B1 (en) | Credential management | |
US11308189B2 (en) | Remote usage of locally stored biometric authentication data | |
KR101431401B1 (en) | Method and apparatus for voice signature authentication | |
US9027085B2 (en) | Method, system and program product for secure authentication | |
US10110574B1 (en) | Biometric identification | |
US20240346124A1 (en) | System and methods for implementing private identity | |
US11140171B1 (en) | Establishing and verifying identity using action sequences while protecting user privacy | |
US11057372B1 (en) | System and method for authenticating a user to provide a web service | |
US11757870B1 (en) | Bi-directional voice authentication | |
US20150088760A1 (en) | Automatic injection of security confirmation | |
US9438597B1 (en) | Regulating credential information dissemination | |
CN104518876A (en) | Service login method and device | |
US10270774B1 (en) | Electronic credential and analytics integration | |
WO2015059365A1 (en) | 2015-04-30 | Audiovisual associative authentication method and related system |
Truong et al. | Using contextual co-presence to strengthen Zero-Interaction Authentication: Design, integration and usability | |
CN105898002A (en) | Application unlocking method and apparatus for mobile terminal and mobile terminal | |
WO2013190169A1 (en) | Arrangement and method for accessing a network service | |
WO2021244471A1 (en) | Real-name authentication method and device | |
CN105264817A (en) | Multi-factor authentication techniques | |
CN107231338A (en) | Method for connecting network, device and the device for network connection |
Legal Events
Code | Title | Description
---|---|---
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14855883; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | EP: PCT application non-entry in European phase | Ref document number: 14855883; Country of ref document: EP; Kind code of ref document: A1