CN102647521A - Method for unlocking a mobile phone screen based on a short voice command and voiceprint technology

Info

Publication number: CN102647521A
Application number: CN2012100970831A
Authority: CN
Inventors: 刘德建; 关胤; 余志鹏; 吴拥民
Original assignee: FUZHOU BOYUAN WIRELESS NETWORK TECHNOLOGY Co Ltd
Current assignee: Baidu com Times Technology Beijing Co Ltd
Priority date: 2012-04-05
Filing date: 2012-04-05
Publication date: 2012-08-22
Anticipated expiration: 2032-04-05
Also published as: CN102647521B

Abstract

The invention provides a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology. In a preset stage, the user inputs a preset voice password, a fast Fourier transform (FFT) is applied to it, and a pass threshold is determined. In an unlock stage, the user inputs an unlock voice password, an FFT is applied to it, and a difference value between the frequency-domain signal of the unlock voice password and that of the preset voice password is computed; the phone is unlocked if the difference value is smaller than the pass threshold. The method is convenient and fast and keeps the phone secure. On this basis, the computation of the difference value is further specified: framing and windowing, MFCC (Mel-frequency cepstral coefficient) computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.

Description

Method for unlocking a mobile phone screen based on a short voice command and voiceprint technology

[Technical Field]

The present invention relates to a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology.

[Background]

Most existing mobile phones release the screen lock by means of a touch gesture, illumination detection, password protection, or similar techniques. Unlocking by touch gesture or illumination detection offers no security, since anyone can unlock the phone. Unlocking by password prevents unauthorized users from using the phone, but the operation is not convenient or fast enough.

The invention patent with publication number 102148899A, published on 2011-08-10, compares the waveform (that is, the time-domain signal) of the command spoken by the user with the waveform of the unlock voice command stored in the phone system, and decides whether to unlock according to whether the two match, where a match means that the compared waveforms agree completely or to a degree of 80%-100%. This cannot work in practice: when the same person says the same word or phrase at different times, the resulting waveforms differ greatly. That invention is therefore not practicable.

[Summary of the Invention]

The technical problem to be solved by the present invention is to provide a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology that is both convenient and fast and keeps the phone secure. On this basis, the computation of the difference value is further specified: framing and windowing, MFCC coefficient computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.

The present invention solves the above technical problem through the following two technical schemes:

Scheme one: a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, comprising a preset stage and an unlock stage, wherein the preset stage comprises the following steps:

Step 1: the user inputs a preset voice password, which is stored in the mobile phone as a time-domain signal;

Step 2: a fast Fourier transform (FFT) is applied to the audio data of the preset voice password stored as a time-domain signal, converting it into the frequency-domain signal of the preset voice password;

Step 3: the mobile phone system provides a default pass threshold, or the user sets a pass threshold;

The unlock stage comprises the following steps:

Step 4: the user inputs an unlock voice password, which is stored in the mobile phone as a time-domain signal;

Step 5: an FFT is applied to the audio data of the unlock voice password stored as a time-domain signal, converting it into the frequency-domain signal of the unlock voice password;

Step 6: the difference value between the frequency-domain signal of the unlock voice password and the frequency-domain signal of the preset voice password is calculated;

Step 7: it is judged whether the difference value is less than the pass threshold; if so, the screen lock is released; if not, an unlock failure is reported.

Further, the difference value is obtained by computing the Euclidean distance.
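Steps 5 to 7 of scheme one can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes both recordings have the same number of samples, compares magnitude spectra (the text does not say whether magnitudes or complex values are compared), and uses a naive discrete Fourier transform for brevity, which yields the same values as an FFT. The function names and the threshold value are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of DFT bins 0..n/2 (naive DFT; an FFT gives the same values faster)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def should_unlock(preset_samples, unlock_samples, pass_threshold):
    """Transform both signals, compute the difference value, compare it with the threshold."""
    difference = euclidean_distance(
        magnitude_spectrum(preset_samples), magnitude_spectrum(unlock_samples)
    )
    return difference < pass_threshold
```

In this sketch an identical recording gives a difference value of zero and is accepted, while a signal with its energy in other frequency bins gives a large distance and is rejected.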

Scheme two: a method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, comprising a preset stage and an unlock stage, wherein the preset stage comprises the following steps:

Step 10: the user inputs a preset voice password, which is stored in the mobile phone as a time-domain signal;

Step 11: the audio data of the preset voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames N of the preset voice password is calculated;

Step 12: an FFT is applied to each frame of the preset voice password, so that each frame is converted into a corresponding frequency-domain signal of the preset voice password;

Step 13: X triangular window filters linearly distributed on the Mel frequency scale are applied in turn to each frequency-domain signal of the preset voice password; after filtering, each frequency-domain signal yields X corresponding energy values; X is a natural number, 1≤X≤128;

Step 14: the noise energy means of the preset voice password are computed from the X energy values of each of the first Y₁ frames, where Y₁ is a natural number, 1≤Y₁≤N. Specifically, the X energy values of frame 1 through frame Y₁ are averaged arithmetically, value by value: the first energy values of frames 1 to Y₁ are averaged to give the first noise energy mean of the preset voice password, and so on for the remaining energy values, giving X noise energy means of the preset voice password in total;

Step 15: for each of the remaining N-Y₁ frames of the preset voice password, the X noise energy means are subtracted from the frame's X energy values respectively, so that each frame yields X corresponding noise-reduced energy values; here N-Y₁ refers to the N frames of the preset voice password excluding frames 1 through Y₁, which were used to compute the noise energy means;

Step 16: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining N-Y₁ frames, giving N-Y₁ Z-dimensional MFCC coefficient vectors of the preset voice password in total; Z is a natural number, 1≤Z≤128;

Step 17: the N-Y₁ Z-dimensional MFCC coefficient vectors of the preset voice password are vector-quantized with a quantization codebook of length K, where K is a natural number and 1≤K≤128; this gives one quantization codebook consisting of K Z-dimensional MFCC vectors;

Step 18: the mobile phone system provides a default pass threshold, or the user sets a pass threshold;

The unlock stage comprises the following steps:

Step 20: the user inputs an unlock voice password, which is stored in the mobile phone as a time-domain signal;

Step 21: the audio data of the unlock voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames M of the unlock voice password is calculated;

Step 22: an FFT is applied to each frame of the unlock voice password, so that each frame is converted into a corresponding frequency-domain signal of the unlock voice password;

Step 23: the same triangular window filters are applied in turn to each frequency-domain signal of the unlock voice password; after filtering, each frequency-domain signal yields X corresponding energy values of the unlock voice password;

Step 24: the noise energy means of the unlock voice password are computed from the X energy values of each of the first Y₂ frames. Specifically, the X energy values of frame 1 through frame Y₂ are averaged arithmetically, value by value: the first energy values of frames 1 to Y₂ are averaged to give the first noise energy mean of the unlock voice password, and so on for the remaining energy values, giving X noise energy means of the unlock voice password in total;

Step 25: for each of the remaining M-Y₂ frames of the unlock voice password, the X noise energy means are subtracted from the frame's X energy values respectively, so that each frame yields X corresponding noise-reduced energy values; here M-Y₂ refers to the M frames of the unlock voice password excluding frames 1 through Y₂, which were used to compute the noise energy means;

Step 26: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining M-Y₂ frames, giving M-Y₂ Z-dimensional MFCC coefficient vectors of the unlock voice password in total;

Step 27: each Z-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password. The unlock voice password has M-Y₂ such vectors, so M-Y₂ rounds of comparison are performed. Because the codebook consists of K Z-dimensional MFCC vectors, each round produces K distance values, of which the minimum is kept. After all rounds, M-Y₂ minimum distance values have been obtained; their sum divided by M-Y₂ is the average minimum distance. The comparison is computed as the Euclidean distance;

Step 28: it is judged whether the average minimum distance is less than the pass threshold; if so, the screen lock is released; if not, an unlock failure is reported.
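The noise estimation and subtraction of steps 14-15 (and, identically, steps 24-25) can be sketched as below. This is an illustrative reading of the text under stated assumptions: `frame_energies` is a hypothetical list holding one list of X filter energies per frame, and negative results are left unclamped because the text does not say how they are handled.

```python
def subtract_noise_mean(frame_energies, noise_frames):
    """Average the first `noise_frames` frames value by value, then subtract
    those means from every remaining frame.

    Returns one list of noise-reduced energies per remaining frame.
    """
    if not 1 <= noise_frames <= len(frame_energies):
        raise ValueError("noise_frames must be between 1 and the number of frames")
    bands = len(frame_energies[0])
    head = frame_energies[:noise_frames]
    # One mean per filter output: X noise energy means in total.
    noise_mean = [sum(frame[b] for frame in head) / noise_frames for b in range(bands)]
    return [
        [energy - mean for energy, mean in zip(frame, noise_mean)]
        for frame in frame_energies[noise_frames:]
    ]
```

With N input frames and `noise_frames` set to Y₁, the function returns N-Y₁ frames, matching the frame counts used in steps 15 and 16.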

Further, in step 11, the number of frames N of the preset voice password is obtained from the formula N=(L₁-20)/10+1, rounded down, where L₁ is the audio duration of the preset voice password in milliseconds, the 20 in the formula denotes a frame length of 20 milliseconds, and the 10 in the formula denotes a 10-millisecond overlap between adjacent frames.

Further, in step 21, the number of frames M of the unlock voice password is obtained from the formula M=(L₂-20)/10+1, rounded down, where L₂ is the audio duration of the unlock voice password in milliseconds, the 20 in the formula denotes a frame length of 20 milliseconds, and the 10 in the formula denotes a 10-millisecond overlap between adjacent frames.
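As a worked example of the frame-count formula, the small helper below evaluates it with integer arithmetic. The function name is hypothetical, and returning 0 for recordings shorter than one frame is an assumption, since the text does not cover that case. With a 20 ms frame, a 10 ms overlap is the same as a 10 ms shift between frame starts, which is how the parameter is named here.

```python
def frame_count(duration_ms, frame_ms=20, shift_ms=10):
    """Number of frames: (L - 20) / 10 + 1, rounded down."""
    if duration_ms < frame_ms:
        return 0  # assumption: too short to hold a single frame
    return (duration_ms - frame_ms) // shift_ms + 1
```

For a one-second recording, `frame_count(1000)` gives (1000 - 20) // 10 + 1 = 99 frames.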

Further, when the preset voice password and the unlock voice password are input, both are sampled at a signal sampling rate of 16000Hz.

Further, the number X of triangular window filters on the Mel frequency scale satisfies 24≤X≤39.

Further, the triangular window filters are 24 triangular window filters linearly distributed on the Mel frequency scale, i.e. X=24, with the following values, all in Hz. Center frequencies: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964. Bandwidths: 100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184, 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969.

Further, the triangular window filters are 39 triangular window filters linearly distributed on the Mel frequency scale, i.e. X=39, with the following values, all in Hz. Center frequencies: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470. Bandwidths: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060.
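The filtering of step 13 with the X=24 table can be sketched as follows. Several things here are assumptions rather than statements from the text: each filter is modelled as a symmetric triangle, the listed "bandwidth" is taken to be the full width of the triangle's base, the input is the power spectrum of one frame covering 0 Hz up to half the sampling rate, and the function names are hypothetical.

```python
CENTERS_HZ = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320,
              1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964]
BANDWIDTHS_HZ = [100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184,
                 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969]


def triangular_weight(freq_hz, center_hz, bandwidth_hz):
    """Weight 1 at the center, falling linearly to 0 at center +/- bandwidth/2
    (assumption: 'bandwidth' is the full base width of the triangle)."""
    half_width = bandwidth_hz / 2.0
    return max(0.0, 1.0 - abs(freq_hz - center_hz) / half_width)


def filter_bank_energies(power_spectrum, sample_rate=16000):
    """Apply each triangular filter to one frame's power spectrum.

    power_spectrum holds bins 0..n/2 of an n-point transform, so the last bin
    sits at sample_rate / 2. Returns one energy value per filter.
    """
    bin_hz = (sample_rate / 2.0) / (len(power_spectrum) - 1)
    return [
        sum(triangular_weight(k * bin_hz, center, bandwidth) * power
            for k, power in enumerate(power_spectrum))
        for center, bandwidth in zip(CENTERS_HZ, BANDWIDTHS_HZ)
    ]
```

A 320-sample frame (20 ms at 16000 Hz) gives 161 bins spaced 50 Hz apart, so a component at exactly 100 Hz falls entirely into the first filter under these assumptions.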

Further, the discrete cosine transform is computed by the formula:

where En(j) denotes the j-th noise-reduced energy value, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers.
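The formula itself is not reproduced in this text, so the sketch below substitutes a form commonly used for MFCC computation, C(i) = Σ_{j=1..X} log(En(j)) · cos(π·i·(j-0.5)/X), and should not be read as the patent's exact formula. Taking the logarithm and flooring non-positive energies before it (noise subtraction can produce them) are both assumptions, as is the function name.

```python
import math


def mfcc_from_energies(noise_reduced_energies, num_coeffs, floor=1e-10):
    """Z cepstral coefficients from X noise-reduced energies via a DCT-II style sum.

    Assumed form: C(i) = sum_j log(En(j)) * cos(pi * i * (j - 0.5) / X),
    for i = 1..Z and j = 1..X.
    """
    x = len(noise_reduced_energies)
    # Assumption: clamp to a small positive floor so the logarithm is defined.
    log_energies = [math.log(max(e, floor)) for e in noise_reduced_energies]
    return [
        sum(log_energies[j - 1] * math.cos(math.pi * i * (j - 0.5) / x)
            for j in range(1, x + 1))
        for i in range(1, num_coeffs + 1)
    ]
```

With X=24 energies and `num_coeffs=13`, the call returns one 13-dimensional vector per frame; a frame whose energies are all equal yields coefficients that are numerically zero, since the cosine terms cancel for every i from 1 to 13.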

Further, the windowing in step 11 and step 21 uses a Hamming window.
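Framing with a Hamming window can be sketched as follows. The 320-sample frame and 160-sample shift correspond to 20 ms and 10 ms at the 16000 Hz sampling rate given in the text; the 0.54 and 0.46 coefficients are the conventional Hamming values, which the text does not spell out, and the function names are hypothetical.

```python
import math


def hamming_window(length):
    """Conventional Hamming window: 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def split_into_frames(samples, frame_len=320, shift=160):
    """Cut the signal into overlapping frames and multiply each by the window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming_window(frame_len)
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
        start += shift
    return frames
```

One second of audio (16000 samples) produces 99 frames here, which agrees with the frame-count formula N=(L₁-20)/10+1 for L₁=1000.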

The present invention has the following advantages: the screen lock is released by means of a short voice command and voiceprint authentication, which is both convenient and fast and keeps the phone secure. In addition, framing and windowing, MFCC coefficient computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.

[Embodiments]

A specific embodiment of scheme one of the present invention is as follows:

A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, comprising a preset stage and an unlock stage, wherein the preset stage comprises the following steps:

The unlock stage comprises the following steps:

Step 6: the difference value between the frequency-domain signal of the unlock voice password and the frequency-domain signal of the preset voice password is calculated; the difference value is obtained by computing the Euclidean distance;

The first embodiment of scheme two of the present invention is as follows:

A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlock stage, wherein the preset stage comprises the following steps:

Step 10: the user inputs a preset voice password, which is stored in the mobile phone as a time-domain signal; the signal sampling rate of the preset voice password is 16000Hz;

Step 11: the audio data of the preset voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames N of the preset voice password is calculated. In this embodiment, N is obtained from the formula N=(L₁-20)/10+1, rounded down, where L₁ is the audio duration of the preset voice password, the 20 in the formula denotes a frame length of 20 milliseconds, and the 10 in the formula denotes a 10-millisecond overlap between adjacent frames; the windowing uses a Hamming window;

Step 13: X triangular window filters linearly distributed on the Mel frequency scale are applied in turn to each frequency-domain signal of the preset voice password; after filtering, each frequency-domain signal yields X corresponding energy values; X is a natural number, 1≤X≤128. More preferably, 24≤X≤39: a filter count in this range gives a reasonable compromise between computational efficiency and the ability to describe speech characteristics. The more filters there are, that is, the larger X is, the finer the description of the speech characteristics, but the lower the computational efficiency.

Step 16: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining N-Y₁ frames of the preset voice password, giving N-Y₁ Z-dimensional MFCC coefficient vectors of the preset voice password in total; Z is a natural number, 1≤Z≤128; the discrete cosine transform is computed by the formula:

where, for the discrete cosine transform of the preset voice password, En(j) denotes the j-th noise-reduced energy value of the preset voice password, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers;

Step 17: the N-Y₁ Z-dimensional MFCC coefficient vectors of the preset voice password are vector-quantized with a quantization codebook of length K, where K is a natural number and 1≤K≤128; this gives one quantization codebook consisting of K Z-dimensional MFCC vectors;

The unlock stage comprises the following steps:

Step 20: the user inputs an unlock voice password, which is stored in the mobile phone as a time-domain signal; the signal sampling rate of the unlock voice password is 16000Hz;

Step 21: the audio data of the unlock voice password stored as a time-domain signal is divided into frames and windowed, and the number of frames M of the unlock voice password is calculated. In this embodiment, M is obtained from the formula M=(L₂-20)/10+1, rounded down, where L₂ is the audio duration of the unlock voice password, the 20 in the formula denotes a frame length of 20 milliseconds, and the 10 in the formula denotes a 10-millisecond overlap between adjacent frames; the windowing uses a Hamming window;

Step 26: a discrete cosine transform is applied to the X noise-reduced energy values of each of the remaining M-Y₂ frames of the unlock voice password, giving M-Y₂ Z-dimensional MFCC coefficient vectors of the unlock voice password in total; the discrete cosine transform formula is the same as the one used for the preset voice password, namely:

where, for the discrete cosine transform of the unlock voice password, En(j) denotes the j-th noise-reduced energy value of the unlock voice password, 1≤j≤X, 1≤i≤Z, and i and j are natural numbers;

In the present invention, the preset voice password and the unlock voice password are both sampled at 16000Hz. This reduces the amount of audio data to be processed without affecting speech quality, and it is also a sampling frequency supported by most audio input devices.
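As a small worked example of what the 16000 Hz rate implies for the 20 ms frame and 10 ms overlap used above, the conversion below computes the corresponding sample counts. The names are hypothetical; only the numeric inputs come from the text.

```python
SAMPLE_RATE_HZ = 16000


def ms_to_samples(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples covering duration_ms at the given sampling rate."""
    return sample_rate_hz * duration_ms // 1000


FRAME_SAMPLES = ms_to_samples(20)    # samples in one 20 ms frame
OVERLAP_SAMPLES = ms_to_samples(10)  # samples shared by adjacent frames
NYQUIST_HZ = SAMPLE_RATE_HZ / 2      # highest frequency the sampled signal can represent
```

Each frame therefore holds 320 samples, adjacent frames share 160 of them, and the representable band ends at 8000 Hz, which lies above the highest filter center frequency listed in the tables (7470 Hz).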

The second embodiment of scheme two of the present invention is as follows:

In this embodiment, X=24, Y₁=3, Y₂=3, Z=13, and K=5.

Step 13: 24 triangular window filters linearly distributed on the Mel frequency scale are applied in turn to each frequency-domain signal of the preset voice password; after filtering, each frequency-domain signal yields 24 corresponding energy values. The 24 filters have the following values, all in Hz. Center frequencies: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964. Bandwidths: 100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184, 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969;

Step 14: the noise energy means of the preset voice password are computed from the 24 energy values of each of the first 3 frames. Specifically, the 24 energy values of frame 1 through frame 3 are averaged arithmetically, value by value: the first energy values of frames 1 to 3 are averaged to give the first noise energy mean of the preset voice password, and so on for the remaining energy values, giving 24 noise energy means of the preset voice password in total;

Step 15: for each of the remaining N-3 frames of the preset voice password, the 24 noise energy means are subtracted from the frame's 24 energy values respectively, so that each frame yields 24 corresponding noise-reduced energy values; here N-3 refers to the N frames of the preset voice password excluding frames 1 through 3, which were used to compute the noise energy means;

Step 16: a discrete cosine transform is applied to the 24 noise-reduced energy values of each of the remaining N-3 frames, giving N-3 13-dimensional MFCC coefficient vectors of the preset voice password in total; the discrete cosine transform is computed by the formula:

where, for the discrete cosine transform of the preset voice password, En(j) denotes the j-th noise-reduced energy value of the preset voice password, 1≤j≤24, 1≤i≤13, and i and j are natural numbers. In detail:

Take any one of the remaining N-3 frames of the preset voice password, together with its 24 noise-reduced energy values. Setting i=1 gives the first MFCC coefficient of that frame, and so on, until setting i=13 gives its 13th MFCC coefficient; letting i run from 1 to 13 thus gives the frame's 13-dimensional MFCC coefficient vector. After every one of the remaining N-3 frames has been processed with the discrete cosine transform formula, N-3 13-dimensional MFCC coefficient vectors of the preset voice password are obtained;

Step 17: the N-3 13-dimensional MFCC coefficient vectors of the preset voice password are vector-quantized with a quantization codebook of length 5, giving one quantization codebook consisting of 5 13-dimensional MFCC vectors. A codebook that is too long increases computation time, while one that is too short cannot adequately characterize the speech features of the preset password. Choosing K=5 keeps the computation time short while still characterizing the speech features of the preset voice password effectively;
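The text does not name the vector quantization algorithm used in step 17. The sketch below uses plain k-means (Lloyd's algorithm) with a deterministic, evenly spaced initialization purely as an illustration; the LBG algorithm is another common choice. The function names, the iteration count, and the initialization are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, iterations=20):
    """Build a codebook of k centroids from the MFCC vectors with k-means.

    Assumption: centroids start at k evenly spaced input vectors.
    """
    if len(vectors) < k:
        raise ValueError("need at least k vectors to build a codebook of length k")
    step = len(vectors) / k
    codebook = [list(vectors[int(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for vector in vectors:
            nearest = min(range(k), key=lambda c: squared_distance(vector, codebook[c]))
            clusters[nearest].append(vector)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[c] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return codebook
```

In this embodiment the call would be `train_codebook(mfcc_vectors, 5)`, where `mfcc_vectors` (a hypothetical name) holds the N-3 13-dimensional vectors from step 16.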

The unlock stage comprises the following steps:

Step 23: the same triangular window filters are applied in turn to each frequency-domain signal of the unlock voice password; after filtering, each frequency-domain signal yields 24 corresponding energy values of the unlock voice password;

Step 24: the noise energy means of the unlock voice password are computed from the 24 energy values of each of the first 3 frames. Specifically, the 24 energy values of frame 1 through frame 3 are averaged arithmetically, value by value: the first energy values of frames 1 to 3 are averaged to give the first noise energy mean of the unlock voice password, and so on for the remaining energy values, giving 24 noise energy means of the unlock voice password in total;

Step 25: for each of the remaining M-3 frames of the unlock voice password, the 24 noise energy means are subtracted from the frame's 24 energy values respectively, so that each frame yields 24 corresponding noise-reduced energy values; here M-3 refers to the M frames of the unlock voice password excluding frames 1 through 3, which were used to compute the noise energy means;

Step 26: a discrete cosine transform is applied to the 24 noise-reduced energy values of each of the remaining M-3 frames, giving M-3 13-dimensional MFCC coefficient vectors of the unlock voice password in total; the discrete cosine transform formula is the same as the one used for the preset voice password, namely:

where, for the discrete cosine transform of the unlock voice password, En(j) denotes the j-th noise-reduced energy value of the unlock voice password, 1≤j≤24, 1≤i≤13, and i and j are natural numbers;

Step 27: each 13-dimensional MFCC coefficient vector of the unlock voice password is compared in turn with the quantization codebook of the preset voice password. The unlock voice password has M-3 such vectors, so M-3 rounds of comparison are performed. Because the codebook consists of 5 13-dimensional MFCC vectors, each round produces 5 distance values, of which the minimum is kept. After all rounds, M-3 minimum distance values have been obtained; their sum divided by M-3 is the average minimum distance. The comparison is computed as the Euclidean distance;

The comparison process is illustrated as follows. Suppose K=5 and M-3=6. In the first round, one of the 6 13-dimensional MFCC coefficient vectors of the unlock voice password is selected and its Euclidean distance to each of the 5 13-dimensional MFCC vectors in the quantization codebook of the preset voice password is computed; this produces 5 distance values, and the smallest of them is the minimum distance value of the first round. In the second round, one of the remaining 5 vectors that have not yet been compared is selected and its Euclidean distance to each of the 5 codebook vectors is computed; this again produces 5 distance values, and the smallest of them is the minimum distance value of the second round. Continuing in the same way, the 6 vectors of the unlock voice password require 6 rounds of comparison; each round produces 5 distance values, of which the minimum is kept, so that 6 minimum distance values are obtained in total;
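The comparison rounds of step 27 and the decision of step 28 can be sketched as follows. The function names are hypothetical and the vectors and threshold in the usage note are illustrative values, not data from the text.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_min_distance(unlock_vectors, codebook):
    """One round per unlock vector: keep the smallest of its K distances to the
    codebook entries, then average those minima over all rounds."""
    minima = [
        min(euclidean_distance(vector, entry) for entry in codebook)
        for vector in unlock_vectors
    ]
    return sum(minima) / len(minima)


def should_unlock(unlock_vectors, codebook, pass_threshold):
    """Unlock when the average minimum distance is below the pass threshold."""
    return average_min_distance(unlock_vectors, codebook) < pass_threshold
```

For example, with a two-entry codebook `[[0.0, 0.0], [3.0, 4.0]]` and unlock vectors `[[0.0, 0.0], [3.0, 0.0]]`, the round minima are 0 and 3, so the average minimum distance is 1.5.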

In the present invention, 39 triangular window filters linearly distributed on the Mel frequency scale may also be used, i.e. X=39, with the following values, all in Hz. Center frequencies: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470. Bandwidths: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060. When these 39 filters are used, the principle is the same as in the first and second embodiments of scheme two.

The screen lock is released by means of a short voice command and voiceprint authentication, which is both convenient and fast and keeps the phone secure. In addition, framing and windowing, MFCC coefficient computation, and vector quantization are introduced so that the user's voice characteristics can be extracted and compared more accurately, improving the user experience in both convenience and security.

Although specific embodiments of the present invention have been described above, those skilled in the art should understand that the embodiments described are illustrative only and are not intended to limit the scope of the present invention. Equivalent modifications and variations made by those of ordinary skill in the art in accordance with the spirit of the present invention shall all fall within the scope protected by the claims of the present invention.

Claims

1. A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlocking stage, the preset stage comprising the following steps:

The unlocking stage comprises the following steps:

2. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 1, characterized in that the difference value is obtained by computing a Euclidean distance.

3. A method for unlocking a mobile phone screen based on a short voice command and voiceprint technology, characterized in that it comprises a preset stage and an unlocking stage, the preset stage comprising the following steps:

The unlocking stage comprises the following steps:

4. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that: in step 11, the number of frames N of the preset voice password is obtained by the formula N = (L₁ - 20)/10 + 1, rounded down, where L₁ denotes the audio duration of the preset voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 in the formula indicates a frame overlap of 10 milliseconds.

5. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that: in step 21, the number of frames M of the unlock voice password is obtained by the formula M = (L₂ - 20)/10 + 1, rounded down, where L₂ denotes the audio duration of the unlock voice password in milliseconds, the 20 in the formula indicates a frame length of 20 milliseconds, and the 10 in the formula indicates a frame overlap of 10 milliseconds.

6. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that: when the preset voice password and the unlock voice password are input, the signal sampling rate of both the preset voice password and the unlock voice password is 16000 Hz.

7. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the number X of triangular window filters on the Mel frequency scale satisfies 24≤X≤39.

8. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 7, characterized in that: the triangular window filters are 24 triangular window filters distributed linearly on the Mel frequency scale, i.e. X=24; the center frequencies of the triangular window filters are, respectively: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031, 3482, 4000, 4595, 5278, 6063, 6964; the bandwidths are, respectively: 100, 100, 100, 100, 100, 100, 100, 100, 100, 124, 160, 184, 211, 242, 278, 320, 367, 422, 484, 556, 639, 734, 843, 969; all values are in Hz.

9. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 7, characterized in that: the triangular window filters are 39 triangular window filters distributed linearly on the Mel frequency scale, i.e. X=39; the center frequencies of the triangular window filters are, respectively: 50, 100, 150, 200, 260, 320, 390, 460, 530, 610, 700, 790, 890, 990, 1100, 1210, 1340, 1480, 1610, 1770, 1930, 2100, 2280, 2480, 2680, 2900, 3140, 3380, 3650, 3930, 4230, 4560, 4900, 5260, 5650, 6060, 6500, 6970, 7470; the bandwidths are, respectively: 100, 100, 100, 120, 127, 127, 148, 148, 148, 169, 190, 190, 233, 233, 254, 254, 296, 296, 275, 339, 339, 360, 381, 424, 424, 466, 508, 508, 572, 593, 636, 699, 720, 763, 826, 869, 932, 996, 1060; all values are in Hz.

10. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the calculation formula of the discrete cosine transform is:

11. The method for unlocking a mobile phone screen based on a short voice command and voiceprint technology according to claim 3, characterized in that the windowing in both step 11 and step 21 is Hamming windowing.