CN102647521A - Method for removing lock of mobile phone screen based on short voice command and voice-print technology - Google Patents

Info

Publication number
CN102647521A
Authority
CN
China
Prior art keywords
voice password
release
voice
frame
presets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100970831A
Other languages
Chinese (zh)
Other versions
CN102647521B (en)
Inventor
刘德建
关胤
余志鹏
吴拥民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu com Times Technology Beijing Co Ltd
Original Assignee
FUZHOU BOYUAN WIRELESS NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FUZHOU BOYUAN WIRELESS NETWORK TECHNOLOGY Co Ltd
Priority to CN2012100970831A
Publication of CN102647521A
Application granted
Publication of CN102647521B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology. In a preset stage, the user inputs a preset voice password, a fast Fourier transform (FFT) is applied to it, and a pass threshold is determined. In an unlocking stage, the user inputs an unlock voice password, an FFT is applied to it, and a difference value between the frequency-domain signal of the unlock voice password and that of the preset voice password is computed; the phone is unlocked if the difference value is smaller than the pass threshold. The method is convenient and fast and keeps the phone secure. On this basis, the computation of the difference value is further specified: framing and windowing, MFCC (Mel-frequency cepstral coefficient) computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared accurately, improving the user experience in both convenience and security.

Description

Method for unlocking a mobile phone screen based on a short voice command and voiceprint technology
[Technical field]
The present invention relates to a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology.
[Background art]
Most existing mobile phones leave the locked-screen state by means of a touch gesture, an illumination check, or password protection. Unlocking by touch gesture or illumination check offers no security, since anyone can unlock the phone. Unlocking by password keeps unauthorized users out, but the operation is not convenient or fast enough.
The invention patent with publication number 102148899A, published on 2011-8-10, compares the waveform (that is, the time-domain signal) of the instruction input by the user with the waveform of the unlock sound instruction stored in the phone system, and decides whether to unlock according to whether the two match, requiring the compared waveforms to coincide or to be 80%-100% identical. This cannot be achieved in practice: when the same person says the same word or phrase at different times, the waveforms differ greatly. That invention is therefore not practicable.
[Summary of the invention]
The technical problem to be solved by the present invention is to provide a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology that is convenient and fast while keeping the phone secure. On this basis, the computation of the difference value is further specified: framing and windowing, MFCC coefficient computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.
The present invention solves the above technical problem through the following two technical schemes:
Scheme one: a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, comprising a preset stage and an unlocking stage. The preset stage comprises the following steps:
Step 1: the user inputs a preset voice password, which is stored in the phone as a time-domain signal;
Step 2: a fast Fourier transform (FFT) is applied to the audio data of the preset voice password, stored as a time-domain signal, transforming it into the frequency-domain signal of the preset voice password;
Step 3: the phone system provides a default pass threshold, or the user sets a pass threshold;
The unlocking stage comprises the following steps:
Step 4: the user inputs an unlock voice password, which is stored in the phone as a time-domain signal;
Step 5: an FFT is applied to the audio data of the unlock voice password, stored as a time-domain signal, transforming it into the frequency-domain signal (spectrum) of the unlock voice password;
Step 6: the difference value between the frequency-domain signal of the unlock voice password and that of the preset voice password is computed;
Step 7: it is determined whether the difference value is smaller than the pass threshold; if so, the locked-screen state is removed; if not, an unlock failure is reported.
Further, the difference value is obtained by computing the Euclidean distance.
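Scheme one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a naive DFT stands in for the FFT, the magnitude spectrum is used as the frequency-domain signal, and both recordings are assumed to have the same number of samples (the text does not say how recordings of different lengths are handled). The function names and the toy signals are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitude; stands in for the FFT of steps 2 and 5."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def try_unlock(preset_samples, unlock_samples, pass_threshold):
    """Steps 6-7: unlock only if the spectral distance is below the threshold."""
    difference = euclidean_distance(
        magnitude_spectrum(preset_samples), magnitude_spectrum(unlock_samples)
    )
    return difference < pass_threshold

# Toy signals (assumed): a preset tone, a slightly quieter repeat, a different tone.
n = 64
preset = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
same = [0.98 * s for s in preset]
other = [math.sin(2 * math.pi * 12 * t / n) for t in range(n)]
```

With these toy signals, `try_unlock(preset, same, 5.0)` accepts the near-identical recording and `try_unlock(preset, other, 5.0)` rejects the different tone. Real speech varies far more between utterances, which is the motivation for the frame-based features of scheme two.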
Scheme two: a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, comprising a preset stage and an unlocking stage. The preset stage comprises the following steps:
Step 10: the user inputs a preset voice password, which is stored in the phone as a time-domain signal;
Step 11: the audio data of the preset voice password, stored as a time-domain signal, is framed and windowed, and the number of frames N of the preset voice password is computed;
Step 12: a fast Fourier transform is applied to each frame of the preset voice password, transforming each frame into a frequency-domain signal of the preset voice password;
Step 13: each frequency-domain signal of the preset voice password is filtered in turn by X triangular window filters distributed linearly on the Mel frequency scale; after filtering, each frequency-domain signal yields X corresponding energy values, where X is a natural number, 1≤X≤128;
Step 14: the noise energy means of the preset voice password are computed from the X energy values of each of the first Y1 frames, where Y1 is a natural number, 1≤Y1≤N. Specifically, the X energy values of the first frame through the X energy values of the Y1-th frame are averaged arithmetically, position by position: the first energy values of frames 1 through Y1 are averaged to give the first noise energy mean, and so on for the remaining positions, giving X noise energy means of the preset voice password in total;
Step 15: in each of the remaining N-Y1 frames of the preset voice password, the X noise energy means are subtracted from the frame's X energy values, position by position, so that each frame yields X corresponding noise-reduced energy values; the remaining N-Y1 frames are the N frames of the preset voice password minus frames 1 through Y1, which were used to compute the noise energy means;
Step 16: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining N-Y1 frames, giving N-Y1 Z-dimensional MFCC coefficient vectors of the preset voice password in total, where Z is a natural number, 1≤Z≤128;
Step 17: the N-Y1 Z-dimensional MFCC coefficient vectors of the preset voice password are vector-quantized with the codebook length set to K, where K is a natural number, 1≤K≤128; this yields a quantization codebook consisting of K Z-dimensional MFCC vectors;
Step 18: the phone system provides a default pass threshold, or the user sets a pass threshold;
The unlocking stage comprises the following steps:
Step 20: the user inputs an unlock voice password, which is stored in the phone as a time-domain signal;
Step 21: the audio data of the unlock voice password, stored as a time-domain signal, is framed and windowed, and the number of frames M of the unlock voice password is computed;
Step 22: a fast Fourier transform is applied to each frame of the unlock voice password, transforming each frame into a frequency-domain signal of the unlock voice password;
Step 23: each frequency-domain signal of the unlock voice password is filtered in turn by the same triangular window filters; after filtering, each frequency-domain signal yields X corresponding energy values of the unlock voice password;
Step 24: the noise energy means of the unlock voice password are computed from the X energy values of each of the first Y2 frames. Specifically, the X energy values of the first frame through the X energy values of the Y2-th frame are averaged arithmetically, position by position: the first energy values of frames 1 through Y2 are averaged to give the first noise energy mean, and so on for the remaining positions, giving X noise energy means of the unlock voice password in total;
Step 25: in each of the remaining M-Y2 frames of the unlock voice password, the X noise energy means are subtracted from the frame's X energy values, position by position, so that each frame yields X corresponding noise-reduced energy values; the remaining M-Y2 frames are the M frames of the unlock voice password minus frames 1 through Y2, which were used to compute the noise energy means;
Step 26: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining M-Y2 frames, giving M-Y2 Z-dimensional MFCC coefficient vectors of the unlock voice password in total;
Step 27: each Z-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password. The unlock voice password has M-Y2 such vectors, so M-Y2 rounds of comparison are made. Because the codebook consists of K Z-dimensional MFCC vectors, each round yields K distance values, of which the smallest is kept, so each round produces one minimum distance value. After all rounds, the M-Y2 minimum distance values are summed and divided by M-Y2 to give the average minimum distance. The comparison is made by computing the Euclidean distance;
Step 28: it is determined whether the average minimum distance is smaller than the pass threshold; if so, the locked-screen state is removed; if not, an unlock failure is reported.
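The comparison in steps 27 and 28 can be sketched as follows. The function names and the two-dimensional toy vectors are hypothetical; in the method itself each vector would be a Z-dimensional MFCC vector and the codebook would hold K of them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_min_distance(unlock_vectors, codebook):
    """Step 27: one round per unlock vector, keeping the smallest of the
    K distances to the codebook entries, then averaging over all rounds."""
    minima = [min(euclidean(v, c) for c in codebook) for v in unlock_vectors]
    return sum(minima) / len(minima)

def should_unlock(unlock_vectors, codebook, pass_threshold):
    """Step 28: unlock only if the average minimum distance is below the threshold."""
    return average_min_distance(unlock_vectors, codebook) < pass_threshold

# Toy data (assumed): K=2 codebook entries and three unlock vectors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.0, 3.0], [10.0, 14.0], [4.0, 0.0]]
```

For the toy data the three minimum distances are 3, 4 and 4, so the average minimum distance is 11/3; a pass threshold of 4.0 would unlock and a threshold of 3.0 would not.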
Further, in step 11 the number of frames N of the preset voice password is obtained from the formula N=(L1-20)/10+1, rounded down, where L1 is the audio duration of the preset voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 indicates a frame overlap of 10 milliseconds.
Further, in step 21 the number of frames M of the unlock voice password is obtained from the formula M=(L2-20)/10+1, rounded down, where L2 is the audio duration of the unlock voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 indicates a frame overlap of 10 milliseconds.
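As a worked example of the frame-count formula, the sketch below evaluates it for a few durations. The function name is hypothetical, and returning 0 for recordings shorter than one frame is an assumption, since the text does not cover that case.

```python
def frame_count(duration_ms, frame_ms=20, step_ms=10):
    """N = floor((L - 20) / 10) + 1, with L in milliseconds.

    Assumption: a recording shorter than one frame yields 0 frames.
    """
    if duration_ms < frame_ms:
        return 0
    return (duration_ms - frame_ms) // step_ms + 1

one_second = frame_count(1000)   # (1000 - 20) // 10 + 1
single_frame = frame_count(20)   # exactly one 20 ms frame
odd_length = frame_count(35)     # the trailing 5 ms is dropped by rounding down
```

A one-second password therefore gives 99 frames, a 20 ms recording gives 1 frame, and a 35 ms recording gives 2 frames.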
Further, when the preset voice password and the unlock voice password are input, both are sampled at 16000 Hz.
Further, the number of triangular window filters on the Mel frequency scale satisfies 24≤X≤39.
Further, the triangular window filters are 24 triangular window filters distributed linearly on the Mel frequency scale, i.e. X=24. Their centre frequencies are, respectively: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964; their bandwidths are: 100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184, 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969. All values are in Hz.
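The sketch below shows one way the 24 filters could turn a frame's spectrum into 24 energy values. The text gives only centre frequencies and bandwidths, so the filter shape is an assumption: each filter is taken to be a symmetric triangle with weight 1 at its centre, falling to 0 at centre ± bandwidth. The input is assumed to be a power spectrum with bins evenly spaced from 0 Hz to the Nyquist frequency; the function names are hypothetical.

```python
CENTRES_HZ = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320,
              1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964]
BANDWIDTHS_HZ = [100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184,
                 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969]

def triangle_weight(freq_hz, centre_hz, bandwidth_hz):
    """Assumed shape: 1 at the centre, linearly down to 0 at centre +/- bandwidth."""
    return max(0.0, 1.0 - abs(freq_hz - centre_hz) / bandwidth_hz)

def filterbank_energies(power_spectrum, sample_rate=16000):
    """Weight the power spectrum by each triangular filter and sum, giving X=24 energies."""
    bin_hz = (sample_rate / 2) / (len(power_spectrum) - 1)
    return [
        sum(triangle_weight(k * bin_hz, centre, bw) * p
            for k, p in enumerate(power_spectrum))
        for centre, bw in zip(CENTRES_HZ, BANDWIDTHS_HZ)
    ]

# Toy spectrum (assumed): 161 bins of 50 Hz each, with all power at 150 Hz.
spectrum = [0.0] * 161
spectrum[3] = 2.0
energies = filterbank_energies(spectrum)
```

Under the assumed shape, power at 150 Hz is shared equally between the filters centred at 100 Hz and 200 Hz and does not reach the filter centred at 300 Hz.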
Further, the triangular window filters are 39 triangular window filters distributed linearly on the Mel frequency scale, i.e. X=39. Their centre frequencies are, respectively: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470; their bandwidths are: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060. All values are in Hz.
Further, the computing formula of the discrete cosine transform is:
[Formula image not reproduced in this text]
where En(j) denotes the j-th noise-reduced energy value, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers.
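Because the formula itself is not reproduced here, the sketch below substitutes the DCT-II form commonly used in MFCC computation, c(i) = Σ_{j=1..X} log(En(j)) · cos(π·i·(j − 0.5)/X). This is an assumption, not the patent's confirmed formula; in particular, whether the patent takes the logarithm of En(j) is unknown. The floor applied before the logarithm is a further assumption, added because noise subtraction can produce zero or negative energies. The function name is hypothetical.

```python
import math

def dct_coefficients(energies, num_coeffs, floor=1e-10):
    """Assumed textbook MFCC form: DCT-II over log energies, i = 1..Z, j = 1..X."""
    x = len(energies)
    # Assumption: clamp to a small positive floor so the logarithm is defined.
    logs = [math.log(max(e, floor)) for e in energies]
    return [
        sum(logs[j - 1] * math.cos(math.pi * i * (j - 0.5) / x)
            for j in range(1, x + 1))
        for i in range(1, num_coeffs + 1)
    ]

# Toy input (assumed): X=24 identical energies, Z=13 coefficients.
flat = dct_coefficients([2.0] * 24, 13)
```

With identical energies in every band, all 13 coefficients come out as zero up to rounding error, because for i ≥ 1 the cosine terms sum to zero over a constant input.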
Further, the windowing in step 11 and step 21 uses a Hamming window.
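Framing and Hamming windowing at the stated settings (16000 Hz, 20 ms frames, 10 ms overlap) can be sketched as below. The standard Hamming coefficients 0.54 and 0.46 are used; the text does not spell them out, so they are an assumption, as are the constant and function names.

```python
import math

SAMPLE_RATE = 16000
FRAME_LEN = SAMPLE_RATE * 20 // 1000    # 20 ms -> 320 samples
FRAME_STEP = SAMPLE_RATE * 10 // 1000   # 10 ms -> 160 samples

def hamming(n):
    """Standard Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def split_and_window(samples):
    """Cut the signal into overlapping frames and apply the window to each."""
    window = hamming(FRAME_LEN)
    frames = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_STEP):
        frame = samples[start:start + FRAME_LEN]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

# One second of a constant signal (assumed toy input).
frames = split_and_window([1.0] * SAMPLE_RATE)
```

One second of audio yields 99 frames of 320 samples each, which agrees with the frame-count formula N=(L1-20)/10+1 for L1=1000.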
The present invention has the following advantages: unlocking the phone screen through a short voice command and voiceprint authentication is convenient and fast while keeping the phone secure. In addition, framing and windowing, MFCC coefficient computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.
[Embodiments]
A specific embodiment of scheme one of the present invention is as follows:
A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology comprises a preset stage and an unlocking stage. The preset stage comprises the following steps:
Step 1: the user inputs a preset voice password, which is stored in the phone as a time-domain signal;
Step 2: a fast Fourier transform (FFT) is applied to the audio data of the preset voice password, stored as a time-domain signal, transforming it into the frequency-domain signal of the preset voice password;
Step 3: the phone system provides a default pass threshold, or the user sets a pass threshold;
The unlocking stage comprises the following steps:
Step 4: the user inputs an unlock voice password, which is stored in the phone as a time-domain signal;
Step 5: an FFT is applied to the audio data of the unlock voice password, stored as a time-domain signal, transforming it into the frequency-domain signal (spectrum) of the unlock voice password;
Step 6: the difference value between the frequency-domain signal of the unlock voice password and that of the preset voice password is computed; the difference value is obtained by computing the Euclidean distance;
Step 7: it is determined whether the difference value is smaller than the pass threshold; if so, the locked-screen state is removed; if not, an unlock failure is reported.
The first embodiment of scheme two of the present invention is as follows:
A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlocking stage. The preset stage comprises the following steps:
Step 10: the user inputs a preset voice password, which is stored in the phone as a time-domain signal; the preset voice password is sampled at 16000 Hz;
Step 11: the audio data of the preset voice password, stored as a time-domain signal, is framed and windowed, and the number of frames N of the preset voice password is computed. In this embodiment N is obtained from the formula N=(L1-20)/10+1, rounded down, where L1 is the audio duration of the preset voice password, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 indicates a frame overlap of 10 milliseconds; the windowing uses a Hamming window;
Step 12: a fast Fourier transform is applied to each frame of the preset voice password, transforming each frame into a frequency-domain signal of the preset voice password;
Step 13: each frequency-domain signal of the preset voice password is filtered in turn by X triangular window filters distributed linearly on the Mel frequency scale; after filtering, each frequency-domain signal yields X corresponding energy values, where X is a natural number, 1≤X≤128. More preferably, 24≤X≤39: this range gives a reasonable compromise between computational efficiency and the ability to describe voice characteristics. Clearly, the more filters there are, i.e. the larger X is, the finer the description of voice characteristics, but the lower the computational efficiency.
Step 14: the noise energy means of the preset voice password are computed from the X energy values of each of the first Y1 frames, where Y1 is a natural number, 1≤Y1≤N. Specifically, the X energy values of the first frame through the X energy values of the Y1-th frame are averaged arithmetically, position by position: the first energy values of frames 1 through Y1 are averaged to give the first noise energy mean, and so on for the remaining positions, giving X noise energy means of the preset voice password in total;
Step 15: in each of the remaining N-Y1 frames of the preset voice password, the X noise energy means are subtracted from the frame's X energy values, position by position, so that each frame yields X corresponding noise-reduced energy values; the remaining N-Y1 frames are the N frames of the preset voice password minus frames 1 through Y1, which were used to compute the noise energy means;
Step 16: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining N-Y1 frames, giving N-Y1 Z-dimensional MFCC coefficient vectors of the preset voice password in total, where Z is a natural number, 1≤Z≤128. The computing formula of the discrete cosine transform is:
[Formula image BDA0000150437610000081, not reproduced in this text]
where, when the discrete cosine transform is applied to the preset voice password, En(j) denotes the j-th noise-reduced energy value of the preset voice password, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers;
Step 17: the N-Y1 Z-dimensional MFCC coefficient vectors of the preset voice password are vector-quantized with the codebook length set to K, where K is a natural number, 1≤K≤128; this yields a quantization codebook consisting of K Z-dimensional MFCC vectors;
Step 18: the phone system provides a default pass threshold, or the user sets a pass threshold;
The unlocking stage comprises the following steps:
Step 20: the user inputs an unlock voice password, which is stored in the phone as a time-domain signal; the unlock voice password is sampled at 16000 Hz;
Step 21: the audio data of the unlock voice password, stored as a time-domain signal, is framed and windowed, and the number of frames M of the unlock voice password is computed. In this embodiment M is obtained from the formula M=(L2-20)/10+1, rounded down, where L2 is the audio duration of the unlock voice password, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 indicates a frame overlap of 10 milliseconds; the windowing uses a Hamming window;
Step 22: a fast Fourier transform is applied to each frame of the unlock voice password, transforming each frame into a frequency-domain signal of the unlock voice password;
Step 23: each frequency-domain signal of the unlock voice password is filtered in turn by the same triangular window filters; after filtering, each frequency-domain signal yields X corresponding energy values of the unlock voice password;
Step 24: the noise energy means of the unlock voice password are computed from the X energy values of each of the first Y2 frames. Specifically, the X energy values of the first frame through the X energy values of the Y2-th frame are averaged arithmetically, position by position: the first energy values of frames 1 through Y2 are averaged to give the first noise energy mean, and so on for the remaining positions, giving X noise energy means of the unlock voice password in total;
Step 25: in each of the remaining M-Y2 frames of the unlock voice password, the X noise energy means are subtracted from the frame's X energy values, position by position, so that each frame yields X corresponding noise-reduced energy values; the remaining M-Y2 frames are the M frames of the unlock voice password minus frames 1 through Y2, which were used to compute the noise energy means;
Step 26: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining M-Y2 frames, giving M-Y2 Z-dimensional MFCC coefficient vectors of the unlock voice password in total. The formula is the same as the discrete cosine transform formula used for the preset voice password, namely:
[Formula image BDA0000150437610000091, not reproduced in this text]
where, when the discrete cosine transform is applied to the unlock voice password, En(j) denotes the j-th noise-reduced energy value of the unlock voice password, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers;
Step 27: each Z-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password. The unlock voice password has M-Y2 such vectors, so M-Y2 rounds of comparison are made. Because the codebook consists of K Z-dimensional MFCC vectors, each round yields K distance values, of which the smallest is kept, so each round produces one minimum distance value. After all rounds, the M-Y2 minimum distance values are summed and divided by M-Y2 to give the average minimum distance. The comparison is made by computing the Euclidean distance;
Step 28: it is determined whether the average minimum distance is smaller than the pass threshold; if so, the locked-screen state is removed; if not, an unlock failure is reported.
In the present invention, both the preset voice password and the unlock voice password are sampled at 16000 Hz. This reduces the amount of audio data to be processed without affecting speech quality, and 16000 Hz is also a sampling frequency supported by most audio input devices.
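The noise estimation and subtraction of steps 14-15 (and likewise steps 24-25) can be sketched as follows. The function name and toy numbers are hypothetical; each inner list stands for the X energy values of one frame.

```python
def subtract_noise(frame_energies, noise_frames):
    """Average the first `noise_frames` frames position by position to get
    X noise energy means, then subtract them from every remaining frame."""
    x = len(frame_energies[0])
    head = frame_energies[:noise_frames]
    noise_mean = [sum(frame[j] for frame in head) / noise_frames for j in range(x)]
    return [
        [frame[j] - noise_mean[j] for j in range(x)]
        for frame in frame_energies[noise_frames:]
    ]

# Toy data (assumed): N=4 frames, X=2 energy values per frame, Y1=2 noise frames.
reduced = subtract_noise([[1, 2], [3, 4], [10, 20], [5, 5]], 2)
```

Here the noise means are 2 and 3, so the two remaining frames become [8, 17] and [3, 2]. The result contains N-Y1 frames, matching the count stated in step 15.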
The present invention program two second embodiment is following:
In the present embodiment, get X=24, Y 1=3, Y 2=3, Z=13, K=5
A kind of method based on voice short command and the screen locking of vocal print technology releasing mobile phone, it is characterized in that: comprise preset stage and release stage, said preset stage comprises the steps:
The voice password is preset in step 10, user input, and said to preset the preservation form of voice password in mobile phone be time-domain signal; The said signal sampling rate that presets the voice password is 16000Hz;
Step 11, be that the said voice data that presets the voice password of time-domain signal carries out the windowing process of branch frame, and calculate the number of frames N that presets the voice password the preservation form; In the present embodiment through formula N=(L 1-20)/10+1 rounds downwards and tries to achieve the said number of frames N that presets the voice password, wherein, L in the formula 1Represent the said audio frequency duration that presets the voice password, the expression of 20 in formula frame length is 20 milliseconds, and the expression of 10 in formula frame is superposed to 10 milliseconds; Said windowing process is handled for adding Hamming window;
Step 12, each frame is preset the voice password carry out fast Fourier transform, each frame presets the frequency-region signal that voice password correspondent transform becomes to preset the voice password;
Step 13, with the triangular window filter of linear distribution on 24 Mel frequency markings, to the frequency-region signal filtering successively of respectively presetting the voice password, after the filtering, each frequency-region signal that presets the voice password all obtains 24 corresponding energy values; The triangular window filter of linear distribution on said 24 Mel frequency markings, its centre frequency is respectively: 100,200,300,400,500,600,700,800,900,1000,1149,1320; 1516,1741,2000,2297,2639,3031,3482,4000,4595,5278,6063,6964, bandwidth is: 100; 100,100,100,100,100,100,100,100,124,160,184,211; 242,278,320,367,422,484,556,639,734,843,969, above numerical value unit is Hz;
Step 14, preceding 3 frames are preset each frame in the voice password preset 24 corresponding energy values of voice password and ask the noise energy average that presets the voice password; The said process that presets voice password noise energy average of asking is specially: first frame is preset 24 corresponding energy value to the 3 frames of voice password preset 24 corresponding energy values of voice password and ask the arithmetic mean value respectively; Obtain presetting 24 noise energy averages of voice password; The solution procedure of said arithmetic mean is specially: promptly earlier first frame is preset corresponding first energy value to the 3 frames of voice password and preset first corresponding energy value of voice password and ask the arithmetic mean value; Obtain presetting first noise energy average of voice password; Then and the like ask the arithmetic mean value, obtain presetting 3 noise energy averages of voice password after the completion altogether;
Step 15: For each of the remaining N−3 frames of the preset voice password, the 24 noise energy means of the preset voice password are subtracted, band by band, from the frame's 24 energy values, so that each frame yields 24 corresponding noise-reduced energy values. N−3 denotes the N frames of the preset voice password excluding frames 1 to 3, which were used to compute the noise energy means;
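Steps 14 and 15 amount to a per-band mean over the first 3 frames followed by a per-band subtraction on the remaining frames. A minimal sketch, with hypothetical function names, is given below; the patent does not say how negative results are handled, so none are clamped here.

```python
def noise_means(frame_energies, noise_frames=3):
    """Per-band arithmetic mean over the first `noise_frames` frames."""
    head = frame_energies[:noise_frames]
    n_bands = len(head[0])
    return [sum(frame[b] for frame in head) / len(head) for b in range(n_bands)]


def denoise(frame_energies, noise_frames=3):
    """Subtract the noise means from every frame after the first `noise_frames`."""
    means = noise_means(frame_energies, noise_frames)
    return [[e - m for e, m in zip(frame, means)]
            for frame in frame_energies[noise_frames:]]
```

For N input frames of 24 energy values each, `denoise` returns N−3 frames of 24 noise-reduced energy values.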
Step 16: A discrete cosine transform is applied to the 24 noise-reduced energy values of each of the remaining N−3 frames of the preset voice password, giving N−3 13-dimensional MFCC coefficient vectors of the preset voice password in total. The discrete cosine transform is computed as:
[Formula image BDA0000150437610000111 not reproduced]
where, when the discrete cosine transform is applied to the preset voice password, En(j) denotes the j-th noise-reduced energy value of the preset voice password, 1 ≤ j ≤ 24, 1 ≤ i ≤ 13, and i and j are natural numbers. In detail:
Take any one of the remaining N−3 frames of the preset voice password together with its 24 noise-reduced energy values. Setting i = 1 gives the first MFCC coefficient of that frame, and so on in sequence up to i = 13, which gives the 13th MFCC coefficient; letting i run from 1 to 13 thus yields the 13-dimensional MFCC coefficients of that frame. After every one of the remaining N−3 frames of the preset voice password has been processed with the discrete cosine transform formula, N−3 13-dimensional MFCC coefficient vectors of the preset voice password are obtained;
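The patent's own discrete cosine transform formula is contained in an image that is not reproduced in this text. The sketch below therefore uses a common DCT-II form as a stand-in, c_i = Σ_j En(j)·cos(π·i·(j − 0.5)/P) with P = 24 and i = 1..13; the actual formula may differ, for example in scaling or by taking a logarithm of En(j). The function name is hypothetical.

```python
import math


def dct_coefficients(energies, n_coeffs=13):
    """Assumed DCT-II: c_i = sum_j En(j) * cos(pi * i * (j - 0.5) / P), i = 1..n_coeffs."""
    p = len(energies)
    return [
        sum(e * math.cos(math.pi * i * (j - 0.5) / p)
            for j, e in enumerate(energies, start=1))
        for i in range(1, n_coeffs + 1)
    ]
```

Applied to the 24 noise-reduced energy values of one frame, this returns one 13-dimensional vector; applying it to all N−3 frames gives the N−3 vectors referred to in Step 16.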
Step 17: Vector quantization is applied to the N−3 13-dimensional MFCC coefficient vectors of the preset voice password. The length of the quantization codebook is set to 5, giving a quantization codebook composed of 5 13-dimensional MFCC vectors. Too large a codebook length increases computation time, while too small a codebook length cannot adequately characterize the voice features of the preset password; choosing a codebook length of K = 5 keeps computation time short while still characterizing the voice features of the preset voice password effectively;
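The patent does not name the vector quantization algorithm. As one possible reading, the sketch below builds a K = 5 codebook with plain k-means (Lloyd's algorithm) and a deterministic initialization; both choices, and the function names, are assumptions made for illustration. It expects at least K input vectors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k=5, iterations=20):
    """Assumed k-means codebook training; returns k codewords."""
    # Deterministic start: pick k vectors spread evenly through the input.
    step = len(vectors) / k
    codebook = [list(vectors[int(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: squared_distance(v, codebook[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[c] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook
```

The returned codebook plays the role of the 5 13-dimensional MFCC vectors stored for the preset voice password.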
Step 18: The mobile phone system provides a default pass threshold, or the user sets a pass threshold;
The unlocking stage comprises the following steps:
Step 20: The user inputs an unlock voice password, which is stored in the mobile phone as a time-domain signal. The unlock voice password is sampled at 16000 Hz;
Step 21: The audio data of the unlock voice password, stored as a time-domain signal, is divided into frames and windowed, and the number of frames M of the unlock voice password is calculated. In this embodiment, M is obtained from the formula M = (L2 − 20)/10 + 1, rounded down, where L2 is the audio duration of the unlock voice password, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 in the formula indicates a frame overlap of 10 milliseconds. The windowing uses a Hamming window;
Step 22: A fast Fourier transform is applied to each frame of the unlock voice password, converting each frame into a corresponding frequency-domain signal of the unlock voice password;
Step 23: The frequency-domain signal of each frame of the unlock voice password is filtered in turn by the said triangular window filters; after filtering, each frequency-domain signal of the unlock voice password yields 24 corresponding energy values;
Step 24: The noise energy means of the unlock voice password are computed from the 24 energy values of each of the first 3 frames of the unlock voice password. Specifically, the arithmetic mean is taken band by band over the 24 energy values of frames 1 to 3, giving 24 noise energy means of the unlock voice password: first the first energy values of frames 1 to 3 are averaged to give the first noise energy mean of the unlock voice password, then the remaining energy values are averaged in the same way, giving 24 noise energy means of the unlock voice password in total;
Step 25: For each of the remaining M−3 frames of the unlock voice password, the 24 noise energy means of the unlock voice password are subtracted, band by band, from the frame's 24 energy values, so that each frame yields 24 corresponding noise-reduced energy values. M−3 denotes the M frames of the unlock voice password excluding frames 1 to 3, which were used to compute the noise energy means;
Step 26: A discrete cosine transform is applied to the 24 noise-reduced energy values of each of the remaining M−3 frames of the unlock voice password, giving M−3 13-dimensional MFCC coefficient vectors of the unlock voice password in total. The discrete cosine transform formula is the same as that used for the preset voice password, namely:
[Formula image BDA0000150437610000131 not reproduced]
where, when the discrete cosine transform is applied to the unlock voice password, En(j) denotes the j-th noise-reduced energy value of the unlock voice password, 1 ≤ j ≤ 24, 1 ≤ i ≤ 13, and i and j are natural numbers;
Step 27: Each 13-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password. The unlock voice password has M−3 such vectors in total, so M−3 rounds of comparison are performed. Because the quantization codebook is composed of 5 13-dimensional MFCC vectors, each round yields 5 distance values, of which the minimum is kept; that is, each round produces one minimum distance. Once all rounds are complete, M−3 minimum distances have been obtained; these are summed and divided by M−3 to give the average minimum distance. The comparison consists of computing the Euclidean distance;
An example of the comparison process: suppose K = 5 and M−3 = 6. In the first round, one of the 6 13-dimensional MFCC coefficient vectors of the unlock voice password is selected, and its Euclidean distance to each of the 5 13-dimensional MFCC vectors in the quantization codebook of the preset voice password is computed, producing 5 distance values; the smallest of these 5 values is the minimum distance of the first round. In the second round, one of the remaining 5 vectors of the unlock voice password that have not yet been compared is selected, and its Euclidean distance to each of the 5 13-dimensional MFCC vectors in the quantization codebook is computed, producing 5 distance values; the smallest of these is the minimum distance of the second round. By analogy, since there are 6 13-dimensional MFCC coefficient vectors of the unlock voice password, 6 rounds are performed; each round yields 5 distance values, of which the minimum is kept, so that 6 minimum distances are obtained once all rounds are complete;
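The comparison of Step 27 and the example above can be sketched as follows: for each unlock vector, the smallest Euclidean distance to any codeword is kept, and the minima are then averaged. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_min_distance(unlock_vectors, codebook):
    """One round per unlock vector: keep the minimum distance, then average."""
    minima = [min(euclidean(v, c) for c in codebook) for v in unlock_vectors]
    return sum(minima) / len(minima)
```

With 6 unlock vectors and a 5-entry codebook, this performs 6 rounds of 5 distance computations each, matching the example.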
Step 28: It is judged whether the average minimum distance is less than the pass threshold; if so, the screen lock of the mobile phone is released; if not, an unlock failure is indicated.
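The decision in Step 28 is a strict comparison against the pass threshold. A minimal sketch, with a hypothetical function name:

```python
def try_unlock(average_min_distance, pass_threshold):
    """Return True (release the screen lock) only if the distance is strictly below the threshold."""
    return average_min_distance < pass_threshold
```

A distance exactly equal to the threshold does not unlock, since the text requires the distance to be less than the threshold.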
In the present invention, 39 triangular window filters distributed linearly on the Mel frequency scale may also be used, i.e. X = 39. The center frequencies of these triangular window filters are: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470; their bandwidths are: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060; all values are in Hz. When 39 triangular window filters distributed linearly on the Mel frequency scale are used, the principle is the same as in Embodiment 1 and Embodiment 2 of Scheme 2 of the present invention.
Releasing the screen lock of the mobile phone by means of a short voice command and voiceprint authentication is convenient and fast and also ensures the security of mobile phone use. In addition, framing and windowing, MFCC coefficient calculation and vector quantization are introduced, so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in terms of both convenience and security.
Although specific embodiments of the present invention have been described above, those skilled in the art should understand that the specific embodiments described are illustrative and are not intended to limit the scope of the present invention; equivalent modifications and variations made by those of ordinary skill in the art in accordance with the spirit of the present invention shall all fall within the scope protected by the claims of the present invention.

Claims (11)

1. A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlocking stage, the preset stage comprising the following steps:
Step 1: the user inputs a preset voice password, the voice password being stored in the mobile phone as a time-domain signal;
Step 2: a fast Fourier transform is applied to the audio data of the preset voice password stored as a time-domain signal, transforming the audio data of the preset voice password into a frequency-domain signal of the preset voice password;
Step 3: the mobile phone system provides a default pass threshold, or the user sets a pass threshold;
the unlocking stage comprising the following steps:
Step 4: the user inputs an unlock voice password, the voice password being stored in the mobile phone as a time-domain signal;
Step 5: a fast Fourier transform is applied to the audio data of the unlock voice password stored as a time-domain signal, transforming the audio data of the unlock voice password into a frequency-domain signal of the unlock voice password;
Step 6: a difference value between the frequency-domain signal of the unlock voice password and the frequency-domain signal of the preset voice password is calculated;
Step 7: it is judged whether the difference value is less than the pass threshold; if so, the screen lock of the mobile phone is released; if not, an unlock failure is indicated.
2. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 1, characterized in that the difference value is obtained by computing the Euclidean distance.
3. A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlocking stage, the preset stage comprising the following steps:
Step 10: the user inputs a preset voice password, the preset voice password being stored in the mobile phone as a time-domain signal;
Step 11: the audio data of the preset voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames N of the preset voice password is calculated;
Step 12: a fast Fourier transform is applied to each frame of the preset voice password, converting each frame into a corresponding frequency-domain signal of the preset voice password;
Step 13: the frequency-domain signal of each frame of the preset voice password is filtered in turn by X triangular window filters distributed linearly on the Mel frequency scale; after filtering, each frequency-domain signal of the preset voice password yields X corresponding energy values; X is a natural number, 1 ≤ X ≤ 128;
Step 14: the noise energy means of the preset voice password are computed from the X energy values of each of the first Y1 frames of the preset voice password, Y1 being a natural number, 1 ≤ Y1 ≤ N; specifically, the arithmetic mean is taken band by band over the X energy values of frames 1 to Y1, giving X noise energy means of the preset voice password: first the first energy values of frames 1 to Y1 are averaged to give the first noise energy mean of the preset voice password, then the remaining energy values are averaged in the same way, giving X noise energy means of the preset voice password in total;
Step 15: for each of the remaining N−Y1 frames of the preset voice password, the X noise energy means of the preset voice password are subtracted, band by band, from the frame's X energy values, so that each frame yields X corresponding noise-reduced energy values; N−Y1 denotes the N frames of the preset voice password excluding frames 1 to Y1, which were used to compute the noise energy means;
Step 16: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining N−Y1 frames of the preset voice password, giving N−Y1 Z-dimensional MFCC coefficient vectors of the preset voice password in total; Z is a natural number, 1 ≤ Z ≤ 128;
Step 17: vector quantization is applied to the N−Y1 Z-dimensional MFCC coefficient vectors of the preset voice password; the length of the quantization codebook is set to K, K being a natural number, 1 ≤ K ≤ 128; a quantization codebook composed of K Z-dimensional MFCC vectors is thereby obtained;
Step 18: the mobile phone system provides a default pass threshold, or the user sets a pass threshold;
the unlocking stage comprising the following steps:
Step 20: the user inputs an unlock voice password, the unlock voice password being stored in the mobile phone as a time-domain signal;
Step 21: the audio data of the unlock voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames M of the unlock voice password is calculated;
Step 22: a fast Fourier transform is applied to each frame of the unlock voice password, converting each frame into a corresponding frequency-domain signal of the unlock voice password;
Step 23: the frequency-domain signal of each frame of the unlock voice password is filtered in turn by the said triangular window filters; after filtering, each frequency-domain signal of the unlock voice password yields X corresponding energy values;
Step 24: the noise energy means of the unlock voice password are computed from the X energy values of each of the first Y2 frames of the unlock voice password; specifically, the arithmetic mean is taken band by band over the X energy values of frames 1 to Y2, giving X noise energy means of the unlock voice password: first the first energy values of frames 1 to Y2 are averaged to give the first noise energy mean of the unlock voice password, then the remaining energy values are averaged in the same way, giving X noise energy means of the unlock voice password in total;
Step 25: for each of the remaining M−Y2 frames of the unlock voice password, the X noise energy means of the unlock voice password are subtracted, band by band, from the frame's X energy values, so that each frame yields X corresponding noise-reduced energy values; M−Y2 denotes the M frames of the unlock voice password excluding frames 1 to Y2, which were used to compute the noise energy means;
Step 26: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining M−Y2 frames of the unlock voice password, giving M−Y2 Z-dimensional MFCC coefficient vectors of the unlock voice password in total;
Step 27: each Z-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password; the unlock voice password has M−Y2 such vectors in total, so M−Y2 rounds of comparison are performed; because the quantization codebook is composed of K Z-dimensional MFCC vectors, each round yields K distance values, of which the minimum is kept, that is, each round produces one minimum distance; once all rounds are complete, M−Y2 minimum distances have been obtained; these are summed and divided by M−Y2 to give the average minimum distance; the comparison consists of computing the Euclidean distance;
Step 28: it is judged whether the average minimum distance is less than the pass threshold; if so, the screen lock of the mobile phone is released; if not, an unlock failure is indicated.
4. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that in Step 11 the number of frames N of the preset voice password is obtained from the formula N = (L1 − 20)/10 + 1, rounded down, where L1 is the audio duration of the preset voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 in the formula indicates a frame overlap of 10 milliseconds.
5. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that in Step 21 the number of frames M of the unlock voice password is obtained from the formula M = (L2 − 20)/10 + 1, rounded down, where L2 is the audio duration of the unlock voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 in the formula indicates a frame overlap of 10 milliseconds.
6. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that, when the preset voice password and the unlock voice password are input, the signal sampling rate of the preset voice password and of the unlock voice password is 16000 Hz.
7. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the number X of triangular window filters on the Mel frequency scale satisfies 24 ≤ X ≤ 39.
8. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 7, characterized in that the triangular window filters are 24 triangular window filters distributed linearly on the Mel frequency scale, i.e. X = 24, whose center frequencies are: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964, and whose bandwidths are: 100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184, 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969; all values are in Hz.
9. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 7, characterized in that the triangular window filters are 39 triangular window filters distributed linearly on the Mel frequency scale, i.e. X = 39, whose center frequencies are: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470, and whose bandwidths are: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060; all values are in Hz.
10. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the discrete cosine transform is computed as:
where En(j) denotes the j-th noise-reduced energy value, 1 ≤ j ≤ X, 1 ≤ i ≤ Z, and i and j are natural numbers.
11. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the windowing in Step 11 and Step 21 uses a Hamming window.
CN2012100970831A 2012-04-05 2012-04-05 Method for removing lock of mobile phone screen based on short voice command and voice-print technology Active CN102647521B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012100970831A CN102647521B (en) 2012-04-05 2012-04-05 Method for removing lock of mobile phone screen based on short voice command and voice-print technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012100970831A CN102647521B (en) 2012-04-05 2012-04-05 Method for removing lock of mobile phone screen based on short voice command and voice-print technology

Publications (2)

Publication Number Publication Date
CN102647521A true CN102647521A (en) 2012-08-22
CN102647521B CN102647521B (en) 2013-10-09

Family

ID=46660091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012100970831A Active CN102647521B (en) 2012-04-05 2012-04-05 Method for removing lock of mobile phone screen based on short voice command and voice-print technology

Country Status (1)

Country Link
CN (1) CN102647521B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103280219A (en) * 2013-05-16 2013-09-04 中山大学 Android platform-based voiceprint recognition method
CN103760969A (en) * 2013-12-12 2014-04-30 宇龙计算机通信科技(深圳)有限公司 Mobile terminal and method for controlling application program through voice
CN103943110A (en) * 2013-01-21 2014-07-23 联想(北京)有限公司 Control method, device and electronic equipment
CN104219381A (en) * 2014-08-18 2014-12-17 上海天奕达电子科技有限公司 Intelligent unlocking method, terminal and system
CN104464039A (en) * 2013-09-18 2015-03-25 凌通科技股份有限公司 Door unlocking method, asset use right renting method and system using same
CN104813326A (en) * 2012-10-04 2015-07-29 谷歌公司 Limiting functionality of software program based on security model
CN104937603A (en) * 2013-01-10 2015-09-23 日本电气株式会社 Terminal, unlocking method, and program
CN104965724A (en) * 2014-12-16 2015-10-07 深圳市腾讯计算机系统有限公司 Working state switching method and apparatus
WO2016033988A1 (en) * 2014-09-04 2016-03-10 中兴通讯股份有限公司 Method and apparatus for processing service
CN106250742A (en) * 2016-07-22 2016-12-21 北京小米移动软件有限公司 The unlocking method of mobile terminal, device and mobile terminal
CN106601238A (en) * 2015-10-14 2017-04-26 阿里巴巴集团控股有限公司 Application operation processing method and application operation processing device
CN107147791A (en) * 2017-05-15 2017-09-08 上海与德科技有限公司 A kind of method, device and mobile terminal of speech unlocking
WO2017166832A1 (en) * 2016-03-31 2017-10-05 青岛歌尔声学科技有限公司 Unlocking method using sound password and combination lock
CN107644645A (en) * 2017-09-29 2018-01-30 联想(北京)有限公司 A kind of sound control method, device and electronic equipment
CN109313903A (en) * 2016-06-06 2019-02-05 思睿逻辑国际半导体有限公司 Voice user interface
CN111622616A (en) * 2020-04-15 2020-09-04 阜阳万瑞斯电子锁业有限公司 Personal voice recognition unlocking system and method for electronic lock

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1746971A (en) * 2004-09-09 2006-03-15 上海优浪信息科技有限公司 Speech key of mobile
CN101064043A (en) * 2006-04-29 2007-10-31 上海优浪信息科技有限公司 Sound-groove gate inhibition system and uses thereof
US20070288236A1 (en) * 2006-04-05 2007-12-13 Samsung Electronics Co., Ltd. Speech signal pre-processing system and method of extracting characteristic information of speech signal
CN102148899A (en) * 2011-03-29 2011-08-10 广东欧珀移动通信有限公司 Mobile phone acoustic-control unlocking method
CN102324232A (en) * 2011-09-12 2012-01-18 辽宁工业大学 Method for recognizing sound-groove and system based on gauss hybrid models

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1746971A (en) * 2004-09-09 2006-03-15 上海优浪信息科技有限公司 Speech key of mobile
US20070288236A1 (en) * 2006-04-05 2007-12-13 Samsung Electronics Co., Ltd. Speech signal pre-processing system and method of extracting characteristic information of speech signal
CN101064043A (en) * 2006-04-29 2007-10-31 上海优浪信息科技有限公司 Sound-groove gate inhibition system and uses thereof
CN102148899A (en) * 2011-03-29 2011-08-10 广东欧珀移动通信有限公司 Mobile phone acoustic-control unlocking method
CN102324232A (en) * 2011-09-12 2012-01-18 辽宁工业大学 Method for recognizing sound-groove and system based on gauss hybrid models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
胡益平: "《基于GMM的说话人识别技术研究与实现》", 《CNKI优秀硕士学位论文全文库》, 30 June 2008 (2008-06-30) *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104813326A (en) * 2012-10-04 2015-07-29 谷歌公司 Limiting functionality of software program based on security model
CN104937603B (en) * 2013-01-10 2018-09-25 日本电气株式会社 terminal, unlocking method and program
CN104937603A (en) * 2013-01-10 2015-09-23 日本电气株式会社 Terminal, unlocking method, and program
US10134392B2 (en) 2013-01-10 2018-11-20 Nec Corporation Terminal, unlocking method, and program
US10147420B2 (en) 2013-01-10 2018-12-04 Nec Corporation Terminal, unlocking method, and program
CN103943110A (en) * 2013-01-21 2014-07-23 联想(北京)有限公司 Control method, device and electronic equipment
CN103280219A (en) * 2013-05-16 2013-09-04 中山大学 Android platform-based voiceprint recognition method
CN104464039A (en) * 2013-09-18 2015-03-25 凌通科技股份有限公司 Door unlocking method, asset use right renting method and system using same
CN103760969A (en) * 2013-12-12 2014-04-30 宇龙计算机通信科技(深圳)有限公司 Mobile terminal and method for controlling application program through voice
CN104219381A (en) * 2014-08-18 2014-12-17 上海天奕达电子科技有限公司 Intelligent unlocking method, terminal and system
WO2016033988A1 (en) * 2014-09-04 2016-03-10 中兴通讯股份有限公司 Method and apparatus for processing service
CN105469791A (en) * 2014-09-04 2016-04-06 中兴通讯股份有限公司 Method and device for processing service
CN104965724A (en) * 2014-12-16 2015-10-07 深圳市腾讯计算机系统有限公司 Working state switching method and apparatus
CN106601238A (en) * 2015-10-14 2017-04-26 阿里巴巴集团控股有限公司 Application operation processing method and application operation processing device
WO2017166832A1 (en) * 2016-03-31 2017-10-05 青岛歌尔声学科技有限公司 Unlocking method using sound password and combination lock
CN109313903A (en) * 2016-06-06 2019-02-05 思睿逻辑国际半导体有限公司 Voice user interface
CN106250742A (en) * 2016-07-22 2016-12-21 北京小米移动软件有限公司 The unlocking method of mobile terminal, device and mobile terminal
CN107147791A (en) * 2017-05-15 2017-09-08 上海与德科技有限公司 A kind of method, device and mobile terminal of speech unlocking
CN107147791B (en) * 2017-05-15 2019-11-15 上海与德科技有限公司 A kind of method, device and mobile terminal of speech unlocking
CN107644645A (en) * 2017-09-29 2018-01-30 联想(北京)有限公司 A kind of sound control method, device and electronic equipment
CN111622616A (en) * 2020-04-15 2020-09-04 阜阳万瑞斯电子锁业有限公司 Personal voice recognition unlocking system and method for electronic lock

Also Published As

Publication number Publication date
CN102647521B (en) 2013-10-09


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160106

Address after: 100000, No. two, building 17, Zhongguancun Software Park, 8 northeast Wang Xi Road, Beijing, Haidian District, A2

Patentee after: BAIDU.COM TIMES TECHNOLOGY (BEIJING) Co.,Ltd.

Address before: 350000, 403A building, four floor, Torch Innovation Building, 8 star road, Fuzhou Development Zone, Fuzhou, Fujian, China

Patentee before: Fuzhou Boyuan Wireless Network Technology Co., Ltd.