CN103297590B - Method and system for unlocking a device based on audio - Google Patents
- Publication number
- CN103297590B (application CN201210044261.4A)
- Authority
- CN
- China
- Prior art keywords
- rhythm
- unlocking
- signal
- audio password
- tone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0861—Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3226—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Electrophonic Musical Instruments (AREA)
- Telephone Function (AREA)
Abstract
The invention discloses a method and a system for unlocking a device based on audio. The device extracts at least one of the following from a received audio password: melody, rhythm, timbre. When the melody, rhythm and timbre respectively match a preset melody, rhythm and timbre, the device is unlocked. Because the unlocking technique takes specific features of the audio into account, it improves the security of device unlocking.
Description
Technical Field
The invention relates to an audio processing technology, in particular to a method and a system for unlocking equipment based on audio.
Background
As users' expectations of the mobile phone experience rise, existing unlocking functions can no longer meet their needs. Existing unlocking functions fall mainly into three types: ordinary password unlocking, fingerprint unlocking and face-image unlocking. None of these satisfies users' requirements. In particular, ordinary password unlocking offers very low security; fingerprint unlocking and face-image unlocking are both image-based, and their security is undermined by current fingerprint-mold replication and makeup techniques. Therefore, both mobile phones and other devices that must keep information confidential need an unlocking function with higher security, but no such function currently exists.
Disclosure of Invention
In view of this, the present invention provides a method and a system for unlocking a device based on audio to improve the security of unlocking the device.
To achieve this purpose, the technical solution of the invention is as follows:
A method for unlocking a device based on audio comprises the following steps:
the device extracts from the received audio password at least one of: melody, rhythm, timbre;
when the melody, the rhythm and the timbre respectively match the preset melody, rhythm and timbre, the device is unlocked.
The process of extracting the melody includes:
down-sampling the input signal x(n) to obtain y(n), performing endpoint detection on y(n) and determining the start and end points of the signal; dividing the signal into multiple frames according to the short-time stationarity of musical sound and extracting the fundamental frequency of the signal; and converting the extracted fundamental frequency into Musical Instrument Digital Interface (MIDI) notes according to the characteristics of twelve-tone equal temperament.
The fundamental frequency is extracted with an enhanced Specmurt (improved Mel cepstrum coefficient) algorithm; the enhanced Specmurt algorithm is either an enhanced Specmurt algorithm based on the complex wavelet transform, or an enhanced Specmurt algorithm based on the short-time Fourier transform (STFT) and the MDCT.
The process of extracting the rhythm includes:
down-sampling the input signal x(n) to obtain z(n), performing endpoint detection on z(n) and determining the start and end points of the signal;
performing an STFT on z(n) to obtain U(wk, ti), and performing an autocorrelation (ACF) on the signal divided into multiple frames to obtain A(l, ti); obtaining A(wl, ti) from the properties of the autocorrelation, where wl = l/fs and fs is the sampling frequency;
combining the STFT and the FM-ACF to obtain the joint function Y(wk, ti) = U(wk, ti) · A(wl, ti), and solving for the rhythm of the signal from Y(wk, ti).
The process of extracting the timbre includes:
down-sampling the input signal x(n) to obtain v(n), performing endpoint detection on v(n) and determining the start and end points q(n) of the signal; computing and storing the MFCC coefficients of v(n); computing the spectral envelope of q(n) and the amplitude envelope of v(n).
When the melody, the rhythm and the timbre respectively match the preset melody, rhythm and timbre, the process of unlocking the device includes:
obtaining a composite value of the matching distortion and path deviation of the audio password input for unlocking, and judging whether the melody of the audio password for unlocking differs from the melody of the preset audio password by less than a preset threshold; if it is below the threshold, comparing whether the rhythm of the audio password for unlocking is consistent with the rhythm of the preset audio password; if the rhythms are consistent, judging whether the timbre of the audio password for unlocking differs from the timbre of the preset audio password by less than a preset threshold; and unlocking the device when it is below the threshold.
A system for unlocking a device based on audio, the system comprising a musical tone feature decision module, and further comprising at least one of: a melody extraction module, a rhythm extraction module and a timbre extraction module; wherein,
the melody extraction module, the rhythm extraction module and the timbre extraction module, when present in the system, are each used for extracting the corresponding content from the received audio password;
and the musical tone feature decision module is used for unlocking the device when the melody, the rhythm and the timbre extracted by the melody extraction module, the rhythm extraction module and the timbre extraction module respectively match the preset melody, rhythm and timbre.
The melody extraction module is used for:
down-sampling the input signal x(n) to obtain y(n), performing endpoint detection on y(n) and determining the start and end points of the signal; dividing the signal into multiple frames according to the short-time stationarity of musical sound and extracting the fundamental frequency of the signal; and converting the extracted fundamental frequency into MIDI notes according to the characteristics of twelve-tone equal temperament.
The melody extraction module extracts the fundamental frequency with an enhanced Specmurt algorithm; the enhanced Specmurt algorithm is either an enhanced Specmurt algorithm based on the complex wavelet transform or an enhanced Specmurt algorithm based on the STFT and the MDCT.
The rhythm extraction module is used for:
down-sampling the input signal x(n) to obtain z(n), performing endpoint detection on z(n) and determining the start and end points of the signal;
performing an STFT on z(n) to obtain U(wk, ti), and performing an ACF on the signal divided into multiple frames to obtain A(l, ti); obtaining A(wl, ti) from this in combination with the properties of the autocorrelation, where wl = l/fs;
combining the STFT and the FM-ACF to obtain the joint function Y(wk, ti) = U(wk, ti) · A(wl, ti), and solving for the rhythm of the signal from Y(wk, ti).
The timbre extraction module is used for:
down-sampling the input signal x(n) to obtain v(n), performing endpoint detection on v(n) and determining the start and end points q(n) of the signal; computing and storing the MFCC coefficients of v(n); computing the spectral envelope of q(n) and the amplitude envelope of v(n).
The musical tone feature decision module, when unlocking the device, is used for:
obtaining a composite value of the matching distortion and path deviation of the audio password input for unlocking, and judging whether the melody of the audio password for unlocking differs from the melody of the preset audio password by less than a preset threshold; if it is below the threshold, comparing whether the rhythm of the audio password for unlocking is consistent with the rhythm of the preset audio password; if the rhythms are consistent, judging whether the timbre of the audio password for unlocking differs from the timbre of the preset audio password by less than a preset threshold; and unlocking the device when it is below the threshold.
The invention unlocks a device based on audio and takes specific features of the audio into account during unlocking, thereby improving the security of device unlocking.
Drawings
FIG. 1 is a flowchart of unlocking a mobile phone according to an embodiment of the present invention;
FIG. 2 is a flowchart of feature extraction according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the principle of melody extraction according to an embodiment of the present invention;
FIG. 4 is a diagram of the enhanced Specmurt algorithm implemented with the complex wavelet transform in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the enhanced Specmurt algorithm implemented with the MDCT (modified discrete cosine transform) and the DFT (discrete Fourier transform) in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the principle of rhythm extraction in an embodiment of the present invention;
FIG. 7 is a schematic diagram of the principle of timbre extraction in an embodiment of the present invention;
FIG. 8 is a flowchart of unlocking a mobile phone according to another embodiment of the present invention;
FIG. 9 is a simplified flowchart of unlocking a mobile phone according to an embodiment of the present invention.
Detailed Description
In practical applications, the flow shown in FIG. 1 may be performed, and its detailed steps are illustrated in FIGS. 2 to 8. For audio provided by the user (e.g., music input by the user), the process shown in FIG. 2 may be performed: the melody, rhythm and timbre are extracted to obtain the musical features.
It should be noted that both mobile phones and other devices that must keep information confidential need an unlocking function with higher security. The method and system of the invention are therefore not limited to mobile phones, and the unlocking technique is the same for mobile phones and other such devices.
The invention is described in detail below with reference to the drawings and embodiments, taking a mobile phone only as an example.
When the mobile phone is unlocked based on audio, the following three steps can be performed:
Step one: collecting the input source (including melody, rhythm and timbre);
Step two: the user inputs the audio password (comprising melody, rhythm and timbre) required for unlocking;
Step three: comparing the audio-based information to unlock the mobile phone.
In step one, the user's input source can be collected with an information acquisition device of the mobile phone or other terminal equipment. Specifically, the acquisition can take the following forms:
a. A microphone acquires the signal by capturing vibration waves in the air.
b. The signal is acquired from vibration of the throat.
c. The signal is acquired from vibration of the jaw bone.
d. Other signals are acquired by capturing the vibration of an object.
Whether the input source is the preset audio password or the audio password input for unlocking, the mobile phone can process the acquired signal (for example, analog-to-digital conversion and recording the sampling frequency) and store the processed signal as an input signal. Specifically, the following processing may be performed on the acquired signal: melody extraction as shown in FIG. 3; rhythm extraction as shown in FIG. 6; timbre extraction as shown in FIG. 7.
When melody extraction is performed, the following operations may be performed:
a. The input signal x(n) is down-sampled to obtain y(n); it is recommended to reduce the sampling rate to 22050 Hz or lower, but not below 10000 Hz.
b. The endpoint detection unit performs endpoint detection on y(n) and determines the start and end points of the signal; the framing unit divides the signal into multiple frames according to the short-time stationarity of musical sound and sends the frames to the fundamental frequency extraction unit.
c. The fundamental frequency extraction unit extracts the fundamental frequency of the signal using an enhanced Specmurt (modified Mel cepstral coefficient) algorithm.
d. The temperament conversion unit converts the extracted fundamental frequency into MIDI (Musical Instrument Digital Interface) notes according to the characteristics of twelve-tone equal temperament and stores the MIDI notes in a database.
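Step d can be illustrated with a minimal sketch of the standard equal-temperament mapping from frequency to MIDI note number. The reference pitch (A4 = 440 Hz = MIDI note 69) and the helper names `hz_to_midi` and `frames_to_notes` are assumptions made for this illustration; the patent text does not specify them.

```python
import math

A4_HZ = 440.0   # assumed reference pitch (not stated in the text)
A4_MIDI = 69    # MIDI note number conventionally assigned to A4


def hz_to_midi(f_hz: float) -> int:
    """Map a fundamental frequency to the nearest MIDI note (12 semitones per octave)."""
    if f_hz <= 0:
        raise ValueError("fundamental frequency must be positive")
    return int(round(A4_MIDI + 12 * math.log2(f_hz / A4_HZ)))


def frames_to_notes(f0_track):
    """Turn a per-frame f0 track into a note sequence.

    Frames with f0 <= 0 are treated as unvoiced and skipped; consecutive
    frames that map to the same note are collapsed into one entry.
    This is a simplification: a repeated note separated only by silence
    is also collapsed.
    """
    notes = []
    for f0 in f0_track:
        if f0 <= 0:
            continue
        note = hz_to_midi(f0)
        if not notes or notes[-1] != note:
            notes.append(note)
    return notes
```

For example, `frames_to_notes([0, 440, 441, 523.25])` yields the note sequence for A4 followed by C5. Collapsing repeated frames keeps the stored melody short, at the cost of discarding note durations.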
It should be noted that the enhanced Specmurt algorithm can be implemented in various ways, for example with the complex wavelet transform, or with the STFT (short-time Fourier transform) and the MDCT (modified discrete cosine transform).
When the enhanced Specmurt algorithm is implemented with the complex wavelet transform, the operations shown in FIG. 4 may be performed, specifically:
a) In implementing the Specmurt computation, two aspects are considered: the design of the harmonic structure and the computation of the spectrum at logarithmic frequency. The mapping fcent = 120 · log2(fHz / 2) is used to eliminate the problem in the Specmurt algorithm that, after the frequency fHz is converted to the logarithmic frequency fcent, the spacing between logarithmic frequencies is too small, which strongly affects the calculation of the fundamental frequency. The appropriate harmonic structure is selected experimentally.
b) When the Specmurt algorithm is implemented with the complex wavelet transform after short-time framing, noise interference is reduced by applying the complex wavelet transform and superimposing the energy at each scale.
c) The distribution of energy across scales is treated as a time-domain signal and low-pass filtered to remove interfering frequencies.
d) The result obtained in step c is deconvolved with the fundamental harmonic structure.
When the enhanced Specmurt algorithm is implemented with the STFT and the MDCT, the operations shown in FIG. 5 may be performed, specifically:
a) The linear spectrum is converted to a logarithmic spectrum.
b) The defect that arises when the linear spectrum is converted directly into the logarithmic spectrum is eliminated by extracting the envelope.
c) The result is deconvolved with the fundamental harmonic structure.
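The deconvolution step shared by both variants can be sketched as follows. The sketch assumes the usual Specmurt model, in which the observed log-frequency spectrum is the convolution of a fundamental-frequency distribution with a common harmonic structure, so that dividing their Fourier transforms recovers the distribution. The function names, the 12-bins-per-octave axis, the harmonic amplitudes and the regularisation constant `eps` are all hypothetical choices, and a plain DFT stands in for an FFT library to keep the example self-contained. It is not the enhanced algorithm of the patent, only the basic deconvolution.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def harmonic_structure(n_bins, bins_per_octave=12, amps=(1.0, 0.4, 0.2, 0.1)):
    """Common harmonic pattern on a log-frequency axis (amplitudes are assumed)."""
    h = [0.0] * n_bins
    for k, a in enumerate(amps, start=1):
        offset = int(round(bins_per_octave * math.log2(k)))  # k-th harmonic position
        if offset < n_bins:
            h[offset] += a
    return h


def specmurt_deconvolve(v, h, eps=1e-8):
    """Estimate the fundamental distribution u from v = h (*) u (circular convolution)."""
    if len(v) != len(h):
        raise ValueError("v and h must have the same length")
    V, H = dft(v), dft(h)
    # Regularised spectral division: V / H, stabilised where |H| is small.
    U = [vk * hk.conjugate() / (abs(hk) ** 2 + eps) for vk, hk in zip(V, H)]
    return [c.real for c in dft(U, inverse=True)]
```

Peaks in the returned list indicate candidate fundamental frequencies on the log-frequency axis. Because the division is done in the DFT domain, the axis is treated as circular; zero-padding the spectrum is one way to limit wrap-around.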
When rhythm extraction is performed, the following operations may be performed:
a. The down-sampling unit down-samples the input signal x(n) to obtain z(n); a sampling rate of 11025 Hz is recommended.
b. The endpoint detection unit performs endpoint detection on the signal z(n) and determines the start and end points of the signal; the framing unit then divides the signal into multiple frames according to the short-time stationarity of musical sound and stores them as p(n).
c. An ACF (autocorrelation) unit performs an ACF on the signal p(n) stored by the framing unit to obtain A(l, ti); the FM-ACF unit obtains A(wl, ti) from this in combination with the properties of the autocorrelation, where wl = l/fs (fs is the sampling frequency).
d. An STFT (DFT) is performed on the signal p(n) stored by the framing unit to obtain U(wk, ti); the formula is as follows:
e. The joint function processing unit combines the STFT and the FM-ACF to obtain the joint function Y(wk, ti) = U(wk, ti) · A(wl, ti).
f. The rhythm of the signal is solved from Y(wk, ti) and stored in a database.
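The idea of multiplying a spectrum by a frequency-mapped autocorrelation can be sketched for a single frame as follows. Several assumptions are made here that the text does not state: the frame is taken to be an energy (onset) envelope rather than the raw waveform, a lag of l samples is mapped to the frequency fs/l (the reciprocal of the lag duration l/fs) so that it can be aligned with a DFT bin, negative autocorrelation values are clipped to zero, and the rhythm is reported as beats per minute. The name `joint_tempo` is hypothetical.

```python
import cmath


def dft_magnitude(x):
    """Magnitude of the DFT for bins 0..n/2 (naive implementation)."""
    n = len(x)
    return [
        abs(sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)))
        for k in range(n // 2 + 1)
    ]


def autocorrelation(x):
    """Unnormalised autocorrelation A(l) for lags 0..n-1."""
    n = len(x)
    return [sum(x[m] * x[m + lag] for m in range(n - lag)) for lag in range(n)]


def joint_tempo(frame, fs):
    """Return the tempo (BPM) that maximises Y(k) = U(k) * A(lag(k)), or None."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove the DC component
    U = dft_magnitude(x)
    A = autocorrelation(x)
    best_k, best_y = None, 0.0
    for k in range(1, len(U)):
        lag = round(n / k)                 # lag whose frequency fs/lag is closest to bin k
        if lag >= n:
            continue
        y = U[k] * max(A[lag], 0.0)        # joint function for this bin
        if y > best_y:
            best_k, best_y = k, y
    if best_k is None:
        return None
    return 60.0 * best_k * fs / n          # bin frequency in Hz, converted to BPM
```

The product is useful because the spectrum of a periodic envelope peaks at the pulse rate and its multiples, while the autocorrelation peaks at the pulse period and its multiples; only the true pulse rate is strong in both.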
When timbre extraction is performed, the following operations may be performed:
a. The input signal x(n) is down-sampled to obtain v(n), preferably at 22050 Hz. Endpoint detection is performed on v(n) to determine the start and end points of the signal.
Specifically, the start/end-point algorithm based on energy (E) and zero-crossing rate (ZCR) relies on the fact that the short-time features of background noise and of speech are statistically clearly different. Let the time-domain waveform signal be x(l); windowing and framing yield the n-th frame of the speech signal. A first discrimination is made with the short-time energy, and on that basis a second discrimination is made with the short-time zero-crossing rate. When the short-time energy is used for the first discrimination, a double-threshold comparison is often adopted to prevent a local dip in speech energy from being mistaken for a start or end point.
b. The spectral envelope of v(n), with its endpoints determined, is computed and stored. For example: a short-time Fourier transform is applied to v(n), the maxima of each frame are found, and the local maxima are connected to obtain the spectral envelope.
c. The amplitude envelope of the signal is computed from v(n) and stored. For example, the envelope can be obtained with the Teager energy operator: T(v(n)) = [v(n)]^2 - v(n-1) × v(n+1); a low-pass filter can be used to remove the high-frequency components, leaving the low-frequency components as the envelope; local maxima are then found, an envelope threshold is set, and the local maxima are connected.
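The double-threshold endpoint detection of step a and the Teager energy operator of step c can be sketched as below. This is a simplified illustration, not the patent's implementation: frames are non-overlapping and unwindowed, and the threshold parameters `e_high`, `e_low` and `zcr_thresh` as well as the function names are hypothetical.

```python
import math


def frame_features(x, frame_len):
    """Return (short-time energy, zero-crossing rate) per non-overlapping frame."""
    feats = []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        frame = x[start:start + frame_len]
        energy = sum(s * s for s in frame)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_endpoints(x, frame_len, e_high, e_low, zcr_thresh):
    """Return (start_sample, end_sample) of the active region, or None."""
    feats = frame_features(x, frame_len)
    strong = [i for i, (e, _) in enumerate(feats) if e > e_high]
    if not strong:
        return None
    start, end = strong[0], strong[-1]
    # First discrimination: extend outward while energy stays above the low threshold.
    while start > 0 and feats[start - 1][0] > e_low:
        start -= 1
    while end < len(feats) - 1 and feats[end + 1][0] > e_low:
        end += 1
    # Second discrimination: extend further while the zero-crossing rate is high.
    while start > 0 and feats[start - 1][1] > zcr_thresh:
        start -= 1
    while end < len(feats) - 1 and feats[end + 1][1] > zcr_thresh:
        end += 1
    return start * frame_len, (end + 1) * frame_len


def teager_energy(v):
    """Teager energy operator T(v(n)) = v(n)^2 - v(n-1) * v(n+1) for interior samples."""
    return [v[n] ** 2 - v[n - 1] * v[n + 1] for n in range(1, len(v) - 1)]
```

Using two energy thresholds means a frame must clearly exceed `e_high` to count as signal, while the boundaries are pushed outward only as far as `e_low` allows, which makes the result less sensitive to brief energy dips. For a pure sinusoid the Teager output is constant, which is why its smoothed value tracks the amplitude envelope.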
In step two, the user inputs into the mobile phone the song required for unlocking; the mobile phone collects and processes the input according to the method of step one and stores the processed result in the database.
In step three, the musical tone feature decision module shown in FIG. 2 may perform the following operations:
a. Comparing the melodies: DTW (dynamic time warping) is used to combine time warping with distance measurement, reducing errors caused by differing durations. A composite value of the matching distortion and path deviation of the audio password input for unlocking is obtained, and it is judged whether the melody of the audio password for unlocking differs from the melody of the preset audio password by less than a preset threshold; if so, the next comparison is performed; otherwise, the user is informed that unlocking was unsuccessful.
b. Comparing the rhythm information: the rhythm of the audio password for unlocking is compared with the rhythm of the preset audio password; if they are consistent, the next comparison is performed; otherwise, the user is informed that unlocking was unsuccessful.
c. Comparing the timbre information: the DTW (dynamic time warping) method is applied to the spectral envelope, the amplitude envelope and the MFCC coefficients to obtain a final composite value, and it is judged whether the timbre of the audio password for unlocking differs from the timbre of the preset audio password by less than a preset threshold; if so, the mobile phone is unlocked and the user is informed that unlocking was successful; otherwise, the user is informed that unlocking was unsuccessful.
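The DTW comparison used for melody and timbre can be sketched as below for one-dimensional sequences such as MIDI note numbers. The text does not define how the matching distortion and the path deviation are combined, so the weighted sum in `composite_score`, the `deviation_weight` parameter and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw(a, b):
    """Classic dynamic time warping; returns (total cost, warping path)."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from the end to recover the alignment path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return D[n][m], path


def composite_score(a, b, deviation_weight=1.0):
    """Combine per-step distortion with the path's deviation from the diagonal."""
    cost, path = dtw(a, b)
    distortion = cost / len(path)
    deviation = sum(abs(i / len(a) - j / len(b)) for i, j in path) / len(path)
    return distortion + deviation_weight * deviation
```

A candidate would then be accepted for this stage when `composite_score(candidate, preset)` falls below the preset threshold. Including the path deviation penalises alignments that need heavy stretching, so a melody with the right notes but very different timing scores worse than a faithful rendition.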
In practical applications, only one or two of the melody, the rhythm and the timbre may be checked to decide whether to unlock; the specific checks are the same as described above.
As can be seen from the above description, the approach of the invention to unlocking a device based on audio can be represented by the flow shown in FIG. 9, which includes the following steps:
Step 910: the device extracts from the received audio password at least one of: melody, rhythm, timbre.
Step 920: when the melody, the rhythm and the timbre respectively match the preset melody, rhythm and timbre, the device is unlocked. A match may mean complete agreement, or agreement within the threshold requirements described above for the melody, rhythm or timbre comparison.
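The decision in step 920, including the case where only some of the three features are checked, can be sketched as a simple cascade. The `MatchResult` container, its field names and the two threshold parameters are hypothetical; the scores are assumed to be dissimilarity values for which lower means a closer match, in line with the "lower than a preset threshold" wording above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchResult:
    """Comparison results; a field left as None means that feature is not used."""
    melody_score: Optional[float] = None      # composite DTW value for the melody
    rhythm_consistent: Optional[bool] = None  # whether the rhythms agree
    timbre_score: Optional[float] = None      # composite DTW value for the timbre


def should_unlock(result: MatchResult, melody_threshold: float,
                  timbre_threshold: float) -> bool:
    """Check melody, then rhythm, then timbre; fail on the first mismatch."""
    used_any = False
    if result.melody_score is not None:
        used_any = True
        if result.melody_score >= melody_threshold:
            return False
    if result.rhythm_consistent is not None:
        used_any = True
        if not result.rhythm_consistent:
            return False
    if result.timbre_score is not None:
        used_any = True
        if result.timbre_score >= timbre_threshold:
            return False
    # Refuse to unlock if no feature was compared at all.
    return used_any
```

Checking the features in sequence lets the device reject a wrong password as soon as one feature fails, without computing the remaining comparisons.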
In summary, both the method and the system unlock the device based on audio and take specific features of the audio into account during unlocking, thereby improving the security of device unlocking.
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention.
Claims (10)
1. A method for unlocking a device based on audio, characterized by comprising the following steps:
the device extracts from the received audio password at least one of: melody, rhythm, timbre;
when the melody, the rhythm and the timbre respectively match the preset melody, rhythm and timbre, the device is unlocked;
when timbre extraction is performed, the following operations are specifically performed:
a. windowing and framing the input signal x(n) to obtain the n-th frame of the speech signal and thereby v(n), performing endpoint detection on v(n) using a start/end-point algorithm based on energy (E) and zero-crossing rate (ZCR) together with a double-threshold comparison, and determining the start and end points of the signal;
b. applying a short-time Fourier transform to v(n) with its endpoints determined, then finding the maxima of each frame, and connecting the local maxima to obtain the spectral envelope;
c. obtaining and storing the amplitude envelope of the signal v(n) with the Teager energy operator T(v(n)) = [v(n)]^2 - v(n-1) × v(n+1); specifically, a low-pass filter is used to remove the high-frequency components, leaving the low-frequency components as the envelope; local maxima are then found, an envelope threshold is set, and the local maxima are connected.
2. The method of claim 1, wherein the process of extracting the melody comprises:
down-sampling the input signal x(n) to obtain y(n), performing endpoint detection on y(n) and determining the start and end points of the signal; dividing the signal into multiple frames according to the short-time stationarity of musical sound and extracting the fundamental frequency of the signal; and converting the extracted fundamental frequency into Musical Instrument Digital Interface (MIDI) notes according to the characteristics of twelve-tone equal temperament.
3. The method according to claim 2, characterized in that the fundamental frequency is extracted with an enhanced Specmurt (improved Mel cepstrum coefficient) algorithm; the enhanced Specmurt algorithm is either an enhanced Specmurt algorithm based on the complex wavelet transform, or an enhanced Specmurt algorithm based on the short-time Fourier transform (STFT) and the MDCT.
4. The method of claim 1, wherein extracting the rhythm comprises:
down-sampling the input signal x(n) to obtain z(n), performing endpoint detection on z(n) and determining the start and end points of the signal;
performing an STFT on z(n) to obtain U(wk, ti), and performing an ACF on the signal divided into multiple frames to obtain A(l, ti); obtaining A(wl, ti) from the properties of the autocorrelation, where wl = l/fs and fs is the sampling frequency;
combining the STFT and the FM-ACF to obtain the joint function Y(wk, ti) = U(wk, ti) · A(wl, ti), and solving for the rhythm of the signal from Y(wk, ti).
5. The method according to any one of claims 1 to 4, wherein, when the melody, rhythm and timbre respectively match the preset melody, rhythm and timbre, the process of unlocking the device comprises:
obtaining a composite value of the matching distortion and path deviation of the audio password input for unlocking, and judging whether the melody of the audio password for unlocking differs from the melody of the preset audio password by less than a preset threshold; if it is below the threshold, comparing whether the rhythm of the audio password for unlocking is consistent with the rhythm of the preset audio password; if the rhythms are consistent, judging whether the timbre of the audio password for unlocking differs from the timbre of the preset audio password by less than a preset threshold; and unlocking the device when it is below the threshold.
6. A system for unlocking a device based on audio, the system comprising a musical tone feature decision module and at least one of the following modules: a melody extraction module, a rhythm extraction module and a timbre extraction module; wherein,
the melody extraction module, the rhythm extraction module and the timbre extraction module, when present in the system, are each used for extracting the corresponding content from the received audio password;
the musical tone feature decision module is used for unlocking the device when the melody, the rhythm and the timbre extracted by the melody extraction module, the rhythm extraction module and the timbre extraction module respectively match the preset melody, rhythm and timbre;
wherein the timbre extraction module is specifically configured to:
a. window and frame the input signal x(n) to obtain the n-th frame of the speech signal and thereby v(n), perform endpoint detection on v(n) using a start/end-point algorithm based on energy (E) and zero-crossing rate (ZCR) together with a double-threshold comparison, and determine the start and end points of the signal;
b. apply a short-time Fourier transform to v(n) with its endpoints determined, then find the maxima of each frame, and connect the local maxima to obtain the spectral envelope;
c. obtain and store the amplitude envelope of the signal v(n) with the Teager energy operator T(v(n)) = [v(n)]^2 - v(n-1) × v(n+1); specifically, a low-pass filter is used to remove the high-frequency components, leaving the low-frequency components as the envelope; local maxima are then found, an envelope threshold is set, and the local maxima are connected.
7. The system of claim 6, wherein the melody extraction module, when extracting the melody, is configured to:
down-sampling the input signal x (n) to obtain y (n), detecting the end points of the y (n), and judging the end points of the signal starting and ending; dividing the signal into multiple frames according to the short-time stationarity of the music sound and extracting the fundamental frequency of the signal; and converting the extracted fundamental frequency into MIDI notes according to the twelve-tone rhythm characteristics.
8. The system of claim 7, wherein the melody extraction module is configured to extract the fundamental frequency based on an enhanced Specmurt algorithm; the enhanced Specmurt algorithm comprises an enhanced Specmurt algorithm based on the complex wavelet transform or an enhanced Specmurt algorithm based on STFT and MDCT.
9. The system of claim 6, wherein the rhythm extraction module, when extracting the rhythm, is configured to:
down-sample the input signal x(n) to obtain z(n), perform endpoint detection on z(n), and determine the start and end points of the signal;
perform an STFT on z(n) to obtain $U(w_k, t_i)$, and compute the autocorrelation function (ACF) of the signal divided into multiple frames to obtain $A(l, t_i)$; from this, in combination with the characteristics of the autocorrelation, obtain $A(w_l, t_i)$, where $w_l = l/f_s$;
combine the STFT and the FM-ACF to obtain the combined function $Y(w_k, t_i) = U(w_k, t_i) \cdot A(w_l, t_i)$, and solve for the rhythm of the signal from $Y(w_k, t_i)$.
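The idea of multiplying a spectrum by a frequency-mapped autocorrelation can be sketched for a single analysis frame as below. Several points are assumptions of this sketch rather than statements of the claim: the input is taken to be an energy or onset envelope sampled at `fs_env`, the autocorrelation lag l is mapped to the periodicity frequency fs_env / l (the claim text itself writes w_l = l/f_s), negative autocorrelation values are clipped to zero, and the tempo is read off as the largest product within an assumed 40 to 240 BPM range.

```python
import numpy as np

def tempo_from_envelope(env, fs_env, bpm_range=(40.0, 240.0)):
    """Estimate tempo in BPM from one envelope frame via spectrum x mapped ACF."""
    env = np.asarray(env, dtype=float)
    env = env - env.mean()
    n = len(env)

    # U(w_k): magnitude spectrum of the windowed frame.
    spec = np.abs(np.fft.rfft(env * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs_env)

    # A(l): autocorrelation at non-negative lags, clipped at zero.
    acf = np.maximum(np.correlate(env, env, mode="full")[n - 1:], 0.0)

    # A(w_l): resample the ACF onto the spectrum's frequency axis,
    # assuming lag l corresponds to frequency fs_env / l.
    lags = np.arange(1, n)
    lag_freqs = fs_env / lags                      # decreasing in l
    acf_on_freqs = np.interp(freqs, lag_freqs[::-1], acf[1:][::-1])

    # Y = U * A: spectrum harmonics and ACF sub-harmonics cancel,
    # leaving the common periodicity.
    combined = spec * acf_on_freqs
    band = (freqs >= bpm_range[0] / 60.0) & (freqs <= bpm_range[1] / 60.0)
    return 60.0 * freqs[band][np.argmax(combined[band])]
```

The product is useful because a periodic pulse train shows spectral peaks at integer multiples of the beat frequency, while its autocorrelation, once mapped to frequency, peaks at the beat frequency and its integer fractions; only the beat frequency itself is strong in both.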
10. The system of any one of claims 6 to 9, wherein the musical tone characteristic decision module, when unlocking the equipment, is configured to:
obtain a composite value of the matching distortion and the path deviation of the input audio password for unlocking, and determine whether this value, for the melody of the audio password for unlocking against the melody of the preset audio password, is below a preset threshold; when it is below the threshold, compare whether the rhythm of the audio password for unlocking is consistent with the rhythm of the preset audio password; when the rhythms are consistent, determine whether the tone (timbre) of the audio password for unlocking against the tone of the preset audio password is below a preset threshold; and unlock the equipment when it is.
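The cascaded decision in claim 10 can be sketched as three successive checks. This is an illustration under stated assumptions, not the claimed implementation: dynamic time warping (DTW) is assumed as the source of the "matching distortion" and "path deviation", the first comparison is treated as a melody comparison on MIDI note sequences, rhythm is represented as a tempo in BPM, tone as a feature vector, and every threshold, weight and name below is hypothetical.

```python
import numpy as np

def dtw(a, b):
    """Classic DTW on two 1-D sequences; returns (total cost, warping path)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end to the start of both sequences.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((D[i - 1, j - 1], i - 1, j - 1),
                      (D[i - 1, j], i - 1, j),
                      (D[i, j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    return float(D[n, m]), path[::-1]

def melody_score(notes, ref, deviation_weight=0.5):
    """Composite of mean matching distortion and mean deviation from the diagonal."""
    cost, path = dtw(notes, ref)
    distortion = cost / len(path)
    ni, nj = max(len(notes) - 1, 1), max(len(ref) - 1, 1)
    deviation = float(np.mean([abs(i / ni - j / nj) for i, j in path]))
    return distortion + deviation_weight * deviation

def should_unlock(attempt, preset, melody_thr=1.0, rhythm_tol=5.0, tone_thr=0.2):
    """Melody check, then rhythm check, then tone check; unlock only if all pass."""
    if melody_score(attempt["melody"], preset["melody"]) >= melody_thr:
        return False
    if abs(attempt["rhythm"] - preset["rhythm"]) > rhythm_tol:
        return False
    diff = np.asarray(attempt["tone"], dtype=float) - np.asarray(preset["tone"], dtype=float)
    return float(np.linalg.norm(diff)) < tone_thr
```

Ordering the checks this way lets a failed melody comparison reject an attempt before the rhythm and tone comparisons are evaluated.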
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210044261.4A CN103297590B (en) | 2012-02-24 | 2012-02-24 | A kind of method and system realizing equipment unblock based on audio frequency |
PCT/CN2012/077371 WO2013123747A1 (en) | 2012-02-24 | 2012-06-21 | Method and system for achieving device unlocking based on audio |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210044261.4A CN103297590B (en) | 2012-02-24 | 2012-02-24 | A kind of method and system realizing equipment unblock based on audio frequency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103297590A (en) | 2013-09-11 |
CN103297590B (en) | 2016-12-14 |
Family
ID=49004965
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210044261.4A Expired - Fee Related CN103297590B (en) | 2012-02-24 | 2012-02-24 | A kind of method and system realizing equipment unblock based on audio frequency |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN103297590B (en) |
WO (1) | WO2013123747A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104219382B (en) * | 2014-08-18 | 2016-09-14 | 上海卓易科技股份有限公司 | A kind of solution lock control processing method, terminal and system |
CN106657554A (en) * | 2015-10-29 | 2017-05-10 | 中兴通讯股份有限公司 | Audio unlocking method and audio unlocking device |
CN105630372B (en) * | 2015-10-30 | 2018-11-06 | 东莞酷派软件技术有限公司 | A kind of unlocking method and device of terminal |
CN105893825A (en) * | 2016-04-25 | 2016-08-24 | 广东欧珀移动通信有限公司 | Display screen unlocking method, device and mobile terminal based on music identifier |
CN107316653B (en) * | 2016-04-27 | 2020-06-26 | 南京理工大学 | Improved empirical wavelet transform-based fundamental frequency detection method |
CN107527627A (en) * | 2016-06-21 | 2017-12-29 | 中兴通讯股份有限公司 | A kind of door lock safety instruction method and device |
CN106250742A (en) * | 2016-07-22 | 2016-12-21 | 北京小米移动软件有限公司 | The unlocking method of mobile terminal, device and mobile terminal |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101772015A (en) * | 2008-12-29 | 2010-07-07 | 卢中江 | Method for starting up mobile terminal through voice password |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001125599A (en) * | 1999-10-25 | 2001-05-11 | Mitsubishi Electric Corp | Voice data synchronizing device and voice data generator |
CN1221937C (en) * | 2002-12-31 | 2005-10-05 | 北京天朗语音科技有限公司 | Voice identification system of voice speed adaption |
CN100555412C (en) * | 2004-09-09 | 2009-10-28 | 上海优浪信息科技股份有限公司 | A kind of speech key of mobile |
JP4677548B2 (en) * | 2005-09-16 | 2011-04-27 | 株式会社国際電気通信基礎技術研究所 | Paralinguistic information detection apparatus and computer program |
CN101398827B (en) * | 2007-09-28 | 2013-01-23 | 三星电子株式会社 | Method and device for singing search |
CN101178897B (en) * | 2007-12-05 | 2011-04-20 | 浙江大学 | Speaking man recognizing method using base frequency envelope to eliminate emotion voice |
CN101568102A (en) * | 2008-04-24 | 2009-10-28 | 中兴通讯股份有限公司 | Interactive-type color ring back tone (CRBT) inquiring and customizing system and inquiring and customizing method thereof |
CN101345668A (en) * | 2008-08-22 | 2009-01-14 | 中兴通讯股份有限公司 | Control method and apparatus for monitoring equipment |
CN101364408A (en) * | 2008-10-07 | 2009-02-11 | 西安成峰科技有限公司 | Sound image combined monitoring method and system |
CN101697514B (en) * | 2009-10-22 | 2016-08-24 | 中兴通讯股份有限公司 | A kind of method and system of authentication |
CN101894552B (en) * | 2010-07-16 | 2012-09-26 | 安徽科大讯飞信息科技股份有限公司 | Speech spectrum segmentation based singing evaluating system |
CN201937690U (en) * | 2010-12-23 | 2011-08-17 | 上海华勤通讯技术有限公司 | Mobile terminal capable of realizing acoustic-controlled unlocking |
CN102148899A (en) * | 2011-03-29 | 2011-08-10 | 广东欧珀移动通信有限公司 | Mobile phone acoustic-control unlocking method |
CN102142071A (en) * | 2011-04-26 | 2011-08-03 | 汉王科技股份有限公司 | Method and device for verifying mobile terminal |
2012
- 2012-02-24: CN application CN201210044261.4A, patent CN103297590B (en), not active, Expired - Fee Related
- 2012-06-21: WO application PCT/CN2012/077371, publication WO2013123747A1 (en), active, Application Filing
Non-Patent Citations (4)
Title |
---|
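A music rhythm recognition algorithm based on spectral analysis; 刘卫; Journal of Qinghai Normal University (Natural Science Edition); 20080915; full text * |
A survey of content-based music information retrieval techniques; 孟宪巍, 徐蔚然, 潘兴德, 郭军; Proceedings of the 2008 Annual Academic Exchange Conference on Audio Engineering; 20081130; pp. 11-20 * |
Research on speech application system technology based on sound sample matching; 马永芬; China Excellent Master's Theses; 20091015; main text pp. 9-39 * |
An MFCC algorithm based on audio frequency envelope extraction; 李波, 王成友, 杨聪, 蔡宣平, 张尔扬; Journal of National University of Defense Technology; 20040825 (Vol. 26, No. 4); pp. 42-45 * |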
Also Published As
Publication number | Publication date |
---|---|
CN103297590A (en) | 2013-09-11 |
WO2013123747A1 (en) | 2013-08-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20161214; Termination date: 20210224 |