Detailed Description of the Embodiments
To further explain the technical means adopted by the present invention to achieve its intended purpose, and the effects thereof, the specific implementations, structures, features, and effects of the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments.
The identity verification method provided by the embodiments of the present invention can be applied to managing application icons in a user terminal. The user terminal may include a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, a desktop computer, a vehicle-mounted computer, an all-in-one machine, and the like.
Fig. 1 shows a structural block diagram of a user terminal. As shown in Fig. 1, the user terminal 100 comprises a memory 102, a memory controller 104, one or more processors 106 (only one is shown in the figure), a peripheral interface 108, a radio frequency module 110, a positioning module 112, a camera module 114, an audio module 116, a touch screen 118, and a key module 120. These components communicate with one another through one or more communication buses/signal lines 122.
It can be understood that the structure shown in Fig. 1 is only illustrative; the user terminal 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 may be implemented in hardware, software, or a combination thereof.
The memory 102 may be used to store software programs and modules, such as the program instructions/modules corresponding to the identity verification method and device in a user terminal according to the embodiments of the present invention. The processor 106 runs the software programs and modules stored in the memory 102 to execute various functional applications and data processing, that is, to implement the above identity verification method in the user terminal.
The memory 102 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 102 may further include memory located remotely from the processor 106, and such remote memory may be connected to the user terminal 100 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. Access to the memory 102 by the processor 106 and other possible components may be performed under the control of the memory controller 104.
The peripheral interface 108 couples various input/output devices to the processor 106 and the memory 102. The processor 106 runs the various software and instructions in the memory 102 to perform the various functions of the user terminal 100 and to process data.
In some embodiments, the peripheral interface 108, the processor 106, and the memory controller 104 may be implemented in a single chip. In other examples, they may each be implemented by a separate chip.
The radio frequency module 110 is used to receive and transmit electromagnetic waves and to convert between electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices. The radio frequency module 110 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a subscriber identity module (SIM) card, and memory. The radio frequency module 110 may communicate with various networks such as the Internet, an intranet, or a wireless network, or communicate with other devices through a wireless network. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. The wireless network may use various communication standards, protocols, and technologies, including but not limited to Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (W-CDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Bluetooth, Wireless Fidelity (WiFi) (such as the Institute of Electrical and Electronics Engineers standards IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-Max), other protocols for mail, instant messaging, and short messages, and any other suitable communication protocol, even including protocols that have not yet been developed.
The positioning module 112 is used to obtain the current location of the user terminal 100. Examples of the positioning module 112 include, but are not limited to, the Global Positioning System (GPS) and positioning technologies based on a wireless local area network or a mobile communication network.
The camera module 114 is used to take photos or videos. The photos or videos taken may be stored in the memory 102 and may be sent through the radio frequency module 110.
The audio module 116 provides an audio interface to the user and may include one or more microphones, one or more speakers, and an audio circuit. The audio circuit receives audio data from the peripheral interface 108, converts the audio data into an electrical signal, and transmits the electrical signal to the speaker. The speaker converts the electrical signal into sound waves audible to the human ear. The audio circuit also receives electrical signals from the microphone, converts them into audio data, and transmits the audio data to the peripheral interface 108 for further processing. Audio data may be obtained from the memory 102 or through the radio frequency module 110. In addition, audio data may also be stored in the memory 102 or sent through the radio frequency module 110. In some examples, the audio module 116 may also include a headphone jack for providing an audio interface to headphones or other devices.
The touch screen 118 provides both an output interface and an input interface between the user terminal 100 and the user. Specifically, the touch screen 118 displays video output to the user, and the content of this video output may include text, graphics, video, and any combination thereof. Some of the output corresponds to user interface objects. The touch screen 118 also receives user input, such as tap and swipe gestures, so that the user interface objects respond to that input. The technology for detecting user input may be resistive, capacitive, or any other possible touch detection technology. Specific examples of the display unit of the touch screen 118 include, but are not limited to, a liquid crystal display or a light-emitting polymer display.
The key module 120 likewise provides an interface through which the user can provide input to the user terminal 100; the user can press different keys to make the user terminal 100 perform different functions.
First embodiment
Fig. 2 is a flowchart of the identity verification method in a user terminal provided by the first embodiment of the present invention. As shown in Fig. 2, the identity verification method of this embodiment comprises the following steps:
Step S11: obtain a to-be-verified voice signal input by a user, and obtain the voice feature coefficients in the to-be-verified voice signal.
Specifically, the user terminal uses the audio module to call the microphone to obtain the analog to-be-verified voice signal input by the user, performs analog-to-digital (A/D) conversion on this signal with an A/D converter to generate a digital voice signal, and obtains the voiceprint features in the digital voice signal. In the embodiments of the present invention, the voiceprint features may include one or more of the following voice feature coefficients: linear predictive coding (LPC) coefficients, cepstral (CEP) coefficients, Mel-frequency cepstral coefficients (MFCC), and perceptual linear prediction (PLP) coefficients.
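As a non-limiting illustration of one of the coefficient types named above, the following sketch estimates linear predictive coding coefficients for a single frame of digital samples using the autocorrelation method with the Levinson-Durbin recursion. The embodiment does not specify how the coefficients are computed; the function names `autocorrelation` and `lpc_coefficients`, the frame handling, and the choice of algorithm are assumptions made for illustration only.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame of samples."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(frame, order):
    """Estimate predictor coefficients a[1..order] such that x[n] is
    approximated by sum(a[k] * x[n - k]), via the Levinson-Durbin recursion.
    Hypothetical helper; not taken from the embodiment."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]             # prediction error energy
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for the current order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]
```

In practice a voice signal would be split into short overlapping frames and the coefficients of each frame collected into a feature sequence; that framing step is omitted here.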
Step S12: perform speech recognition on the to-be-verified voice signal to obtain the semantic information contained in the to-be-verified voice signal.
The semantic information is the meaning that the speech in the to-be-verified voice signal is intended to express. Specifically, the user terminal performs speech recognition on the to-be-verified voice signal, which has been converted into a digital voice signal, to obtain the text of the semantic information contained in it.
Step S13: match the obtained voice feature coefficients against preset standard voice feature coefficients, and match the obtained semantic information against preset standard semantic information.
In the embodiments of the present invention, the standard voice feature coefficients and the text of the standard semantic information that serve as the matching criteria may be obtained from the standard voice signal input by the user during identity registration, and stored. Specifically, the obtained voice feature coefficients are matched against the preset standard voice feature coefficients to obtain the similarity between them. Meanwhile, the text of the obtained semantic information is matched against the text of the preset standard semantic information to obtain the similarity between the two texts.
Step S14: when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, confirm that the identity verification of the user has passed.
Specifically, when the similarity between the voice feature coefficients and the standard voice feature coefficients is greater than a preset threshold, and the similarity between the content of the semantic text and the content of the standard semantic text is also greater than a preset threshold, it is confirmed that the identity verification of the user has passed.
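The two-condition decision of steps S13 and S14 can be sketched as follows. The embodiment does not name a similarity measure or threshold values; cosine similarity for the coefficient vectors, the `difflib.SequenceMatcher` ratio for the texts, the threshold of 0.9, and all function names are assumptions chosen only to make the example concrete.

```python
import math
from difflib import SequenceMatcher


def cosine_similarity(a, b):
    """Cosine similarity of two coefficient vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def text_similarity(a, b):
    """Similarity of two texts in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def verify_identity(features, standard_features, text, standard_text,
                    feature_threshold=0.9, text_threshold=0.9):
    """Pass only when BOTH similarities exceed their preset thresholds."""
    voice_ok = cosine_similarity(features, standard_features) > feature_threshold
    text_ok = text_similarity(text, standard_text) > text_threshold
    return voice_ok and text_ok
```

Because both conditions must hold, a speaker who imitates the voiceprint but says the wrong phrase is rejected, as is a speaker who knows the phrase but whose voice features do not match.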
In the identity verification method provided by this embodiment of the present invention, the voice feature coefficients and the semantic information in the to-be-verified voice signal input by the user are obtained; the obtained voice feature coefficients are matched against the preset standard voice feature coefficients, and the obtained semantic information is matched against the preset standard semantic information; and when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, it is confirmed that the identity verification of the user has passed. Because the voice feature coefficients are combined with the semantic information during identity verification, an unauthorized user can be prevented from passing identity verification by imitating the voiceprint features of a legitimate user, which improves the security of identity verification.
Second embodiment
Fig. 3 is a flowchart of the identity verification method in a user terminal provided by the second embodiment of the present invention. As shown in Fig. 3, the identity verification method of this embodiment comprises the following steps:
Step S21: receive a configuration instruction triggered by the user, and configure the standard voice feature coefficients and the standard semantic information according to the configuration instruction.
In the embodiments of the present invention, as shown in Fig. 4, this step may specifically include:
Step S211: receive a first configuration instruction triggered by the user.
Specifically, in this step, the first configuration instruction causes the user terminal to start the operation of configuring the standard voice feature coefficients and the standard semantic information. The user may trigger the first configuration instruction by performing a preset operation while registering an identity with the user terminal. The user may also trigger the first configuration instruction by performing a preset operation at any time after identity registration succeeds. Because the user terminal can configure the standard voice feature coefficients and the standard semantic information at any time according to the first configuration instruction triggered by the user, that is, the standard voice feature coefficients and the standard semantic information that form the basis of identity verification can be adjusted dynamically according to the first configuration instruction triggered by the user, the flexibility of identity verification management is improved, cracking becomes more difficult, and the security of identity verification is thereby improved.
Step S212: obtain, according to the first configuration instruction, the standard voice signal input by the user.
Specifically, the user terminal starts the configuration operation for the standard voice feature coefficients and the standard semantic information according to the first configuration instruction, uses the audio module to call the microphone to obtain multiple groups of analog standard voice signals input by the user that have the same standard voice feature coefficients and standard semantic information, converts the obtained groups of analog standard voice signals into digital standard voice signals with the A/D converter, and performs noise filtering on them.
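The conversion and filtering described in step S212 can be illustrated in software as follows. In the embodiment the A/D conversion is performed by a hardware converter and the noise filter is not specified; the 16-bit quantization, the moving-average filter, and the names `quantize_16bit` and `moving_average` are assumptions used only to show the order of operations.

```python
def quantize_16bit(analog_samples):
    """Map analog samples in [-1.0, 1.0] to signed 16-bit integers.
    Values outside that range are clipped (software stand-in for an A/D converter)."""
    digital = []
    for sample in analog_samples:
        value = round(sample * 32767)
        digital.append(max(-32768, min(32767, value)))
    return digital


def moving_average(samples, window=3):
    """Smooth samples with a centered window that is truncated at both ends.
    A deliberately simple noise filter chosen for illustration."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        segment = samples[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

A recorded group would be passed through `quantize_16bit` first and `moving_average` second before any feature coefficients are extracted.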
Step S213: obtain the voice feature coefficients in the standard voice signal, and configure those voice feature coefficients as the standard voice feature coefficients.
Specifically, the user terminal analyzes the multiple groups of digital standard voice signals that have undergone A/D conversion and noise filtering, obtains the voice feature coefficients in the standard voice signals, configures them as the standard voice feature coefficients, and stores them. The voice feature coefficients may include one or more of the following coefficients: linear predictive coding coefficients, cepstral coefficients, Mel-frequency cepstral coefficients, and perceptual linear prediction coefficients.
Step S214: perform speech recognition on the standard voice signal to obtain the semantic information contained in the standard voice signal, and configure that semantic information as the standard semantic information.
Specifically, the user terminal performs speech recognition on the standard voice signal to obtain the text of the semantic information contained in the standard voice signal, and configures the text of this semantic information as the text of the standard semantic information.
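Steps S211 to S214 can be summarized by the following sketch, which builds one stored record from several recorded groups. The embodiment states only that multiple groups are analyzed to obtain the standard voice feature coefficients; averaging the per-group coefficient vectors, requiring identical recognized text across groups, and the names `Enrollment` and `enroll` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Enrollment:
    standard_features: list  # standard voice feature coefficients
    standard_text: str       # text of the standard semantic information


def enroll(samples):
    """Build an Enrollment from (feature_vector, recognized_text) pairs,
    one pair per recorded group of the standard voice signal."""
    if not samples:
        raise ValueError("at least one standard voice signal is required")
    texts = {text for _, text in samples}
    if len(texts) != 1:
        raise ValueError("all recordings must carry the same semantic information")
    vectors = [features for features, _ in samples]
    dimension = len(vectors[0])
    if any(len(vector) != dimension for vector in vectors):
        raise ValueError("all feature vectors must have the same length")
    # Assumption: the stored template is the element-wise mean of the groups.
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dimension)]
    return Enrollment(mean, texts.pop())
```

Calling `enroll` again after a later first configuration instruction would simply replace the stored record, which corresponds to the dynamic adjustment described for step S211.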
In other embodiments of the present invention, as shown in Fig. 5, this step may specifically include:
Step S215: receive a first configuration instruction triggered by the user.
Specifically, the first configuration instruction may be used to configure the standard voice feature coefficients.
Step S216: obtain, according to the first configuration instruction, the standard voice signal input by the user.
Step S217: obtain the voice feature coefficients in the standard voice signal, and configure those voice feature coefficients as the standard voice feature coefficients.
For details of steps S216 and S217, refer to steps S211 to S213; they are not repeated here.
Step S218: receive a second configuration instruction triggered by the user.
Specifically, the second configuration instruction may be used to configure the standard semantic information. The user terminal may receive the second configuration instruction triggered by the user at any time, and configure the standard semantic information according to this second configuration instruction.
Step S219: according to the second configuration instruction triggered by the user, receive text information input by the user, and configure the text information as the standard semantic information.
Specifically, the user terminal may receive text information entered by the user through a keyboard; alternatively, according to a text information acquisition instruction from the user, it may obtain, from local memory, a network server, or another external storage device, the text file containing the text information to which the instruction points, and then extract the text information contained in that text file as the standard semantic information. The text file may be a file with a suffix such as .doc or .txt.
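For the file-based variant of step S219, a minimal sketch for a plain-text (.txt) file stored locally might look as follows. The function name `load_standard_text`, the UTF-8 default, and the whitespace trimming are assumptions; .doc files and files fetched from a network server or external storage device would need additional handling that is not shown.

```python
from pathlib import Path


def load_standard_text(path, encoding="utf-8"):
    """Read a local plain-text file and return its content as the text of the
    standard semantic information (hypothetical helper)."""
    text = Path(path).read_text(encoding=encoding).strip()
    if not text:
        raise ValueError("the file contains no text information")
    return text
```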
In this way, the user can trigger the second configuration instruction at any time and set any semantic content, for example a favorite color, the name of a flower, or even an idiom, as the standard semantic information used as the identity verification criterion, which improves both the flexibility and the security of identity verification.
Step S22: obtain a service request of the user, the service request containing the user account of the user and a service identifier.
Step S23: obtain the corresponding semantic prompt information according to the user account and display it to the user, the semantic prompt information being used to prompt the user with the semantic features of the semantic information to be verified.
Specifically, the semantic prompt information may be used to prompt the user with the semantic features of the semantic information to be verified, for example the semantic type, the number of words, and whether special symbols are included. When receiving the user's service request, the user terminal may obtain the semantic prompt information corresponding to the user account contained in the service request and display it to the user; the semantic prompt information may be preset according to information input by the user. Presetting semantic prompt information prevents identity verification from failing because the structure of the preset standard semantic information is too complex and the user has forgotten it, which improves the convenience of identity verification.
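The kind of hint displayed in step S23 can be illustrated with the sketch below, which describes the semantic type, the number of words, and whether special symbols are present without revealing the phrase itself. The embodiment says the hint may be preset according to information input by the user; deriving part of it automatically from the stored text, the dictionary layout, and the name `build_semantic_prompt` are assumptions for illustration.

```python
def build_semantic_prompt(standard_text, semantic_type="phrase"):
    """Return a hint about the standard semantic information.
    `semantic_type` is supplied by the user; the other fields are derived."""
    has_special = any(
        not (ch.isalnum() or ch.isspace()) for ch in standard_text
    )
    return {
        "semantic_type": semantic_type,
        "word_count": len(standard_text.split()),
        "has_special_symbols": has_special,
    }
```

For a stored phrase such as "red rose" with the type "flower name", the hint would report two words and no special symbols.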
Step S24: obtain a to-be-verified voice signal input by the user, and obtain the voice feature coefficients in the to-be-verified voice signal.
Step S25: perform speech recognition on the to-be-verified voice signal to obtain the semantic information contained in the to-be-verified voice signal.
Step S26: match the obtained voice feature coefficients against the preset standard voice feature coefficients, and match the obtained semantic information against the preset standard semantic information.
Step S27: when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, confirm that the identity verification of the user has passed, and process the corresponding service for the user according to the service identifier.
For details of steps S24 to S27, refer to the related content of the first embodiment; they are not repeated here.
It can be understood that this embodiment may also be applied in the application environment shown in Fig. 6. Specifically, the user terminal 100 may receive a configuration instruction triggered by the user, obtain the user account corresponding to the user, record the user's standard voice signal through the microphone according to the configuration instruction, and send the user account and the standard voice signal to the authentication server 300 through the access server 200. The authentication server 300 obtains the voice feature coefficients in the received standard voice signal and the semantic information contained in it, configures the voice feature coefficients as the standard voice feature coefficients corresponding to the received user account, and configures the semantic information as the standard semantic information corresponding to the received user account.
The user terminal 100 obtains the user's service request and the to-be-verified voice signal input by the user, the service request containing the user account of the user and a service identifier, and sends the user account and the to-be-verified voice signal to the access server 200. The access server 200 sends the received to-be-verified voice signal to the authentication server 300. The authentication server 300 performs speech recognition on the received to-be-verified voice signal to obtain the voice feature coefficients and the semantic information contained in it, matches the obtained voice feature coefficients against the standard voice feature coefficients corresponding to the received user account, and matches the obtained semantic information against the standard semantic information corresponding to the received user account. When the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, the authentication server 300 confirms that the identity verification of the user has passed; otherwise, it confirms that the identity verification of the user has failed. It then returns the matching result to the access server 200. When the matching result received by the access server 200 indicates that the user's identity verification has passed, the access server 200 sends the service identifier to the service server 400. The service server 400 processes the service corresponding to the received service identifier and returns the processing result to the access server 200.
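The message flow among the access server 200, the authentication server 300, and the service server 400 can be modeled in a single process as follows. This is only a sketch: network transport is omitted, the voice signal is represented by already extracted coefficients and recognized text, and the class names, the `features_close` matcher, and its tolerance are assumptions rather than part of the embodiment.

```python
def features_close(a, b, tolerance=0.1):
    """Hypothetical matcher: every coefficient differs by at most `tolerance`."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


class AuthenticationServer:
    """Stands in for authentication server 300; keeps one record per user account."""

    def __init__(self):
        self._records = {}

    def configure(self, account, standard_features, standard_text):
        self._records[account] = (standard_features, standard_text)

    def verify(self, account, features, text):
        record = self._records.get(account)
        if record is None:
            return False  # no standard data configured for this account
        standard_features, standard_text = record
        return features_close(features, standard_features) and text == standard_text


class AccessServer:
    """Stands in for access server 200; forwards the service identifier
    to the service server only after verification has passed."""

    def __init__(self, authentication_server, service_server):
        self._authentication_server = authentication_server
        self._service_server = service_server  # callable standing in for service server 400

    def handle(self, account, service_id, features, text):
        if not self._authentication_server.verify(account, features, text):
            return {"verified": False, "result": None}
        return {"verified": True, "result": self._service_server(service_id)}
```

Keeping the stored records inside the authentication server means the user terminal never holds the standard data, which is one possible motivation for the server-side arrangement of Fig. 6.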
It can be understood that the access server 200, the authentication server 300, and the service server 400 may also be deployed in a single server as modules with the same functions.
In the identity verification method provided by this embodiment of the present invention, the voice feature coefficients and the semantic information in the to-be-verified voice signal input by the user are obtained; the obtained voice feature coefficients are matched against the preset standard voice feature coefficients, and the obtained semantic information is matched against the preset standard semantic information; and when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, it is confirmed that the identity verification of the user has passed. Because the voice feature coefficients are combined with the semantic information during identity verification, an unauthorized user can be prevented from passing identity verification by imitating the voiceprint features of a legitimate user, which improves the security of identity verification.
Third embodiment
Fig. 7 is a schematic structural diagram of the identity verification device in a user terminal provided by the third embodiment of the present invention. The identity verification device provided by this embodiment may be used to carry out the identity verification method of the above embodiments. As shown in Fig. 7, the identity verification device 30 comprises a voice feature coefficient acquisition module 31, a semantic information acquisition module 32, a matching module 33, and a verification module 34.
The voice feature coefficient acquisition module 31 is used to obtain a to-be-verified voice signal input by a user and to obtain the voice feature coefficients in the to-be-verified voice signal.
The semantic information acquisition module 32 is used to perform speech recognition on the to-be-verified voice signal obtained by the voice feature coefficient acquisition module 31, to obtain the semantic information contained in the to-be-verified voice signal.
The matching module 33 is used to match the voice feature coefficients obtained by the voice feature coefficient acquisition module 31 against preset standard voice feature coefficients, and to match the semantic information obtained by the semantic information acquisition module 32 against preset standard semantic information.
The verification module 34 is used to confirm that the identity verification of the user has passed when the matching result of the matching module 33 is that the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information.
Each of the above modules may be implemented in software code, in which case the modules may be stored in the memory 102, as shown in Fig. 8. Each of the above modules may likewise be implemented in hardware, such as an integrated circuit chip.
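Where the modules are implemented in software, their cooperation might be organized as in the following sketch, in which each of the four modules is supplied as a callable. The class name `IdentityVerificationDevice`, the constructor parameters, and the representation of the voice signal are hypothetical; real acquisition and recognition modules would wrap signal processing and a speech recognizer.

```python
class IdentityVerificationDevice:
    """Software sketch of device 30: four cooperating modules."""

    def __init__(self, extract_features, recognize_text,
                 features_match, text_match,
                 standard_features, standard_text):
        self._extract_features = extract_features  # role of module 31
        self._recognize_text = recognize_text      # role of module 32
        self._features_match = features_match      # role of module 33 (voice part)
        self._text_match = text_match              # role of module 33 (semantic part)
        self._standard_features = standard_features
        self._standard_text = standard_text

    def verify(self, voice_signal):
        """Role of module 34: pass only when both matching results are positive."""
        features = self._extract_features(voice_signal)
        text = self._recognize_text(voice_signal)
        voice_ok = bool(self._features_match(features, self._standard_features))
        text_ok = bool(self._text_match(text, self._standard_text))
        return voice_ok and text_ok
```

Passing the modules in as callables keeps the decision logic independent of how feature extraction and speech recognition are actually implemented.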
For the specific process by which each functional module of the identity verification device 30 of this embodiment performs its function, refer to the embodiments shown in Fig. 1 to Fig. 6 described above; the details are not repeated here.
In the identity verification device provided by this embodiment of the present invention, the voice feature coefficients and the semantic information in the to-be-verified voice signal input by the user are obtained; the obtained voice feature coefficients are matched against the preset standard voice feature coefficients, and the obtained semantic information is matched against the preset standard semantic information; and when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, it is confirmed that the identity verification of the user has passed. Because the voice feature coefficients are combined with the semantic information during identity verification, an unauthorized user can be prevented from passing identity verification by imitating the voiceprint features of a legitimate user, which improves the security of identity verification.
Fourth embodiment
Fig. 9 is a schematic structural diagram of the identity verification device in a user terminal provided by the fourth embodiment of the present invention. The identity verification device provided by this embodiment may be used to carry out the identity verification method of the above embodiments. As shown in Fig. 9, the identity verification device 40 comprises a voice feature coefficient acquisition module 41, a semantic information acquisition module 42, a matching module 43, a verification module 44, a service request acquisition module 45, a semantic prompt module 46, a service data processing module 47, a first configuration module 48, and a second configuration module 49.
The voice feature coefficient acquisition module 41 is used to obtain a to-be-verified voice signal input by a user and to obtain the voice feature coefficients in the to-be-verified voice signal.
The semantic information acquisition module 42 is used to perform speech recognition on the to-be-verified voice signal obtained by the voice feature coefficient acquisition module 41, to obtain the semantic information contained in the to-be-verified voice signal.
The matching module 43 is used to match the voice feature coefficients obtained by the voice feature coefficient acquisition module 41 against preset standard voice feature coefficients, and to match the semantic information obtained by the semantic information acquisition module 42 against preset standard semantic information.
The verification module 44 is used to confirm that the identity verification of the user has passed when the matching result of the matching module 43 is that the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information.
The service request acquisition module 45 is used to obtain a service request of the user, the service request containing the user account of the user and a service identifier.
The semantic prompt module 46 is used to obtain the corresponding semantic prompt information according to the user account and display it to the user, the semantic prompt information being used to prompt the user with the semantic features of the semantic information to be verified.
The service data processing module 47 is used to process the corresponding service for the user according to the service identifier.
As shown in Fig. 10, the first configuration module 48 comprises a first configuration instruction receiving unit 481, a standard voice signal acquisition unit 482, a standard voice feature coefficient configuration unit 483, and a standard semantic information configuration unit 484. The first configuration instruction receiving unit 481 is used to receive the first configuration instruction triggered by the user. The standard voice signal acquisition unit 482 is used to obtain, according to the first configuration instruction, the standard voice signal input by the user. The standard voice feature coefficient configuration unit 483 is used to obtain the voice feature coefficients in the standard voice signal and configure them as the standard voice feature coefficients. The standard semantic information configuration unit 484 is used to perform speech recognition on the standard voice signal, obtain the semantic information contained in the standard voice signal, and configure that semantic information as the standard semantic information.
As shown in Fig. 11, the second configuration module 49 comprises a second configuration instruction receiving unit 491 and a standard semantic information configuration unit 492. The second configuration instruction receiving unit 491 is used to receive the second configuration instruction triggered by the user. The standard semantic information configuration unit 492 is used to receive, according to the second configuration instruction triggered by the user, text information input by the user, and to configure the text information as the standard semantic information.
Preferably, the voice feature coefficients include one or more of the following coefficients: linear predictive coding coefficients, cepstral coefficients, Mel-frequency cepstral coefficients, and perceptual linear prediction coefficients.
For the specific process by which each functional module of the identity verification device 40 of this embodiment performs its function, refer to the embodiments shown in Fig. 1 to Fig. 6 described above; the details are not repeated here.
In the identity verification device provided by this embodiment of the present invention, the voice feature coefficients and the semantic information in the to-be-verified voice signal input by the user are obtained; the obtained voice feature coefficients are matched against the preset standard voice feature coefficients, and the obtained semantic information is matched against the preset standard semantic information; and when the obtained voice feature coefficients match the standard voice feature coefficients and the obtained semantic information matches the standard semantic information, it is confirmed that the identity verification of the user has passed. Because the voice feature coefficients are combined with the semantic information during identity verification, an unauthorized user can be prevented from passing identity verification by imitating the voiceprint features of a legitimate user, which improves the security of identity verification.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant details, refer to the corresponding description of the method embodiments.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that comprises the element.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and do not limit the present invention in any form. Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications that result in equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the scope of the technical solution of the present invention.