CN107039051B - Fundamental frequency detection method based on ant group optimization - Google Patents
Fundamental frequency detection method based on ant group optimization
- Publication number
- CN107039051B (application CN201610077857.2A)
- Authority
- CN
- China
- Prior art keywords
- peak
- signature waveform
- optimization
- fundamental frequency
- fundamental
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
The present invention provides a fundamental frequency detection method for extremely low signal-to-noise ratio environments. The method extracts the feature waveform of each speech frame using the PEFAC algorithm, then optimizes that feature waveform with an optimal optimization factor to construct a new feature waveform, and finally takes the frequency corresponding to the maximum peak of the new feature waveform as the fundamental frequency estimate. The optimal optimization factor is obtained by an ant colony optimization (ACO) search.
Description
Technical field
The present invention relates to fundamental frequency detection methods, and in particular to a fundamental frequency detection method for extremely low signal-to-noise ratio environments.
Background technique
The fundamental frequency is a basic parameter of speech and is widely used in speech processing fields such as speech analysis and synthesis and speech separation. Accurate and reliable estimation and extraction of the fundamental frequency is essential to speech signal processing. Fundamental frequency detection at high signal-to-noise ratios is very mature, but these methods struggle to achieve good results at low signal-to-noise ratios, and their detection performance is very poor at extremely low signal-to-noise ratios. In view of this, the present invention provides a fundamental frequency detection method for extremely low signal-to-noise ratio environments.
Summary of the invention
To address the obvious deficiencies of the prior art when detecting the fundamental frequency in extremely low signal-to-noise ratio environments, the present invention provides a fundamental frequency detection method for such environments. The method comprises the following steps:
1. Training process:
(1) The speech database is divided into frames in chronological order, {frm(1), frm(2), …, frm(N)}, and a standard algorithm is used to extract the fundamental frequency F0 of each speech frame as the true fundamental frequency, forming the sequence {F0(1), F0(2), …, F0(N)}, where N is the total number of speech frames;
(2) Noise is superimposed on the clean speech frames to produce a new frame sequence {frm_noise(1), frm_noise(2), …, frm_noise(N)}, and the PEFAC algorithm converts each frame signal into its corresponding feature waveform, forming a feature waveform sequence;
(3) The fitness function of an ant colony path is constructed from the optimization factor together with the feature waveforms, and a global search is carried out until the optimal optimization factor is obtained. The optimization factor is an unknown M-dimensional vector α = [α1, α2, …, αM]. The feature waveforms optimized by the optimization factor form a new feature waveform sequence. For each optimized feature waveform, the maximum peak peak_max and its corresponding frequency f_peak are extracted, with f_peak taken as the fundamental frequency estimate, forming the sequence {(peak_max(1), f_peak(1)), (peak_max(2), f_peak(2)), …, (peak_max(N), f_peak(N))}. An ant colony path directly determines the value of α, and the fitness function of a path is the probability that the error between the fundamental frequency estimate and the true value does not exceed 5%. The relevant parameters of the ant colony optimization (ACO) algorithm are then set and the search is run, finally yielding the optimal optimization factor α_optimal.
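The fitness described in step (3) can be illustrated with a short sketch. It assumes the 5% tolerance is a relative error measured against the true fundamental frequency and that every frame passed in is voiced (true value greater than zero); the function name `fitness` and the sample values are hypothetical.

```python
def fitness(estimates, truths, tolerance=0.05):
    """Fraction of frames whose F0 estimate lies within `tolerance`
    (relative error) of the true F0. Assumes all true values are > 0."""
    if len(estimates) != len(truths) or not truths:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(
        1
        for f_est, f_true in zip(estimates, truths)
        if abs(f_est - f_true) <= tolerance * f_true
    )
    return hits / len(truths)


# Hypothetical per-frame values in Hz, for illustration only.
f_true = [100.0, 120.0, 200.0, 220.0]
f_est = [103.0, 130.0, 200.0, 110.0]
score = fitness(f_est, f_true)  # two of the four frames are within 5%
```

A path that selects a better optimization factor yields more frames inside the tolerance band and therefore a higher score.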
The specific steps of the ant colony training process are as follows:
Step 1: Let α = [α1, α2, …, αi, …, αM], with per-dimension value range [x_down, x_up] and search precision prec. Then αi ∈ {x_down + prec, x_down + 2*prec, …, x_down + L*prec}, where the number of levels L is computed with floor(·), the round-down function. α is thus divided into M × L nodes. Node α_ij is associated with a pheromone τ_ij and heuristic information η_ij, where η_ij expresses the desirability of choosing αi = x_down + j*prec. The heuristic information is η_ij = 1/Δd_ij, where Δd_ij is the deviation between the pitch feature waveform under clean speech conditions and the optimized pitch feature waveform;
Step 2: Path construction: the probability that the k-th ant moves to node (i, j) is computed from the pheromone and heuristic information associated with that node;
Step 3: Pheromone update: once all ants have built their paths, the pheromone on each node is updated. The amount of pheromone that the k-th ant releases on the nodes it visits depends on the fitness value of its path T_k.
Step 4: The termination condition is reaching the maximum number of iterations; the α value corresponding to the best path at that point is α_optimal.
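The four steps can be sketched as a small search loop. The transition and update formulas are not reproduced in this text, so the sketch substitutes the textbook Ant System rules as an assumption: selection probability proportional to τ^a · η^b, evaporation by (1 − ρ), and a deposit equal to the path fitness. The count L = floor((x_up − x_down)/prec), the uniform default heuristic, and all names (`aco_search`, `toy_fitness`, `a`, `b`) are likewise hypothetical. The exponents are called `a` and `b` here to avoid a clash with the optimization factor α.

```python
import math
import random


def aco_search(fitness, m, x_down, x_up, prec, n_ants=20, n_iter=30,
               a=2.5, b=2.5, rho=0.5, eta=None, seed=0):
    """Search an m-dimensional grid for the vector that maximizes `fitness`.
    `fitness` must return non-negative values."""
    rng = random.Random(seed)
    # Step 1: candidate values x_down + j*prec for j = 1..L (L is assumed).
    n_levels = math.floor(round((x_up - x_down) / prec, 9))
    values = [x_down + (j + 1) * prec for j in range(n_levels)]
    tau = [[1.0] * n_levels for _ in range(m)]
    if eta is None:
        eta = [[1.0] * n_levels for _ in range(m)]  # uniform heuristic (assumed)
    best_path, best_fit = None, -math.inf
    for _ in range(n_iter):  # Step 4: stop after a fixed number of iterations
        paths = []
        for _ in range(n_ants):
            # Step 2: pick one node per dimension, weighted by tau^a * eta^b.
            path = []
            for i in range(m):
                weights = [(tau[i][j] ** a) * (eta[i][j] ** b)
                           for j in range(n_levels)]
                path.append(rng.choices(range(n_levels), weights=weights)[0])
            fit = fitness([values[j] for j in path])
            paths.append((path, fit))
            if fit > best_fit:
                best_path, best_fit = path, fit
        # Step 3: evaporate, then deposit each ant's fitness on visited nodes.
        for i in range(m):
            for j in range(n_levels):
                tau[i][j] *= 1.0 - rho
        for path, fit in paths:
            for i, j in enumerate(path):
                tau[i][j] += fit
    return [values[j] for j in best_path], best_fit


def toy_fitness(alpha):
    """Stand-in objective in (0, 1]; highest when every component equals 1.0."""
    return 1.0 / (1.0 + sum((v - 1.0) ** 2 for v in alpha))


best_alpha, best_score = aco_search(toy_fitness, m=3, x_down=0.5, x_up=1.5, prec=0.1)
```

In the method described here, `toy_fitness` would be replaced by a function that applies the candidate factor to every noisy feature waveform, picks each peak frequency, and scores the result against the true values.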
2. Test process:
(1) The speech signal under test is divided into frames, and the pitch feature waveform of each frame is extracted;
(2) The optimal optimization factor α_optimal is applied to form the optimized pitch feature waveform;
(3) The maximum peak of the optimized pitch feature waveform is located, and the frequency corresponding to it is taken as the fundamental frequency estimate.
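Step (3) of the test process amounts to an arg-max over the waveform bins. A minimal sketch, assuming the waveform is stored as a list of values with a parallel list giving the frequency of each bin; the function name `estimate_f0` and the four-bin example are hypothetical.

```python
def estimate_f0(waveform, freqs):
    """Return (frequency, peak value) at the largest value of `waveform`.
    `freqs[k]` is the frequency in Hz of bin k."""
    if len(waveform) != len(freqs) or not waveform:
        raise ValueError("need two non-empty sequences of equal length")
    peak_index = max(range(len(waveform)), key=waveform.__getitem__)
    return freqs[peak_index], waveform[peak_index]


# Hypothetical four-bin waveform, for illustration only.
bin_freqs = [60.0, 100.0, 140.0, 180.0]
optimized = [0.1, 0.7, 0.3, 0.2]
f_peak, peak_max = estimate_f0(optimized, bin_freqs)
```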
Compared with the prior art, the above technical solution of the present invention has the following advantages:
A. The feature waveform is extracted with the PEFAC algorithm, inheriting its noise suppression ability at low signal-to-noise ratios;
B. The optimal optimization factor is found with an ant colony optimization algorithm, so that the optimized fundamental frequency estimate is closer to the true fundamental frequency at low signal-to-noise ratios.
Brief description of the drawings
Fig. 1 is a system block diagram according to one embodiment of the present invention.
Specific embodiment
The fundamental frequency detection method for extremely low signal-to-noise ratio environments proposed by the present invention is further described below with reference to the drawings and embodiments:
The flow of the method is shown in Fig. 1 and comprises the following steps:
1. Training process:
(1) Divide the speech database into frames in chronological order;
(2) Use a standard algorithm to extract the fundamental frequency of each speech frame as the true fundamental frequency;
(3) Divide the noise-superimposed speech corpus signal into frames in chronological order, and use the PEFAC algorithm to convert each frame signal into its corresponding pitch feature waveform;
(4) Construct the ant colony fitness function from the optimization factor, treated as an unknown parameter, together with the pitch feature waveforms, and carry out a global search until the optimal optimization factor is obtained.
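The noise superposition in step (3) of the training process can be sketched as follows. The text does not state a target signal-to-noise ratio, so `snr_db` is a hypothetical parameter, as are the function names and the 200 Hz test tone.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return `signal` plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    clean_power = sum(c * c for c in clean) / len(clean)
    return 10.0 * math.log10(clean_power / noise_power)


fs = 16000  # sampling rate in Hz
clean = [math.sin(2 * math.pi * 200 * t / fs) for t in range(1600)]
noisy = add_white_noise(clean, snr_db=-10.0)
```

Because the noise is random, the measured SNR of a finite signal only approximates the requested value.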
2. Test process:
(1) Divide the speech signal under test into frames;
(2) Convert each frame signal into its corresponding pitch feature waveform;
(3) Optimize the pitch feature waveform with the trained optimal optimization factor to generate the optimized pitch feature waveform, and take the frequency corresponding to the maximum peak of the optimized pitch feature waveform as the fundamental frequency estimate.
The specific embodiments of each step of the above method are described in detail as follows:
In training step (1), the speech corpus in this embodiment is the TIMIT international standard database, with speech from 30 male and 30 female speakers, 20 minutes per speaker, for a total duration of 20 hours. Framing uses a sampling rate of 16 kHz with 160 samples per frame. In training step (2), the standard method for extracting the fundamental frequency of the corpus is the Praat tool.
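With 160 samples per frame at 16 kHz, each frame covers 10 ms. A minimal framing sketch, assuming non-overlapping frames and that a trailing partial frame is dropped (neither detail is stated here); `split_frames` is a hypothetical name.

```python
def split_frames(samples, frame_len=160):
    """Split `samples` into consecutive non-overlapping frames of `frame_len`.
    A trailing partial frame is discarded (assumption)."""
    n_frames = len(samples) // frame_len
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(n_frames)]


one_second = list(range(16000))  # stand-in for 1 s of audio at 16 kHz
frames = split_frames(one_second)
frame_ms = 1000.0 * 160 / 16000  # duration of one frame in milliseconds
```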
In training step (3), the superimposed noise is white Gaussian noise, and the framing method is the same as in step (1). The algorithm that converts the framed speech signal into the pitch feature waveform is the PEFAC algorithm, whose flow is as follows:
(a) The speech frame signal is mapped to the frequency domain by the short-time Fourier transform and normalized to give X't(q), where q is the log frequency, i.e. q = log(f).
(b) X't(q) is convolved with a filter h(q) to generate the pitch feature waveform. In the definition of the filter, β is chosen to satisfy ∫h(q)dq = 0, and γ is set to 1.8;
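The filter formula itself is not reproduced in this text. The sketch below assumes the form commonly given for PEFAC in the literature, h(q) = 1/(γ − cos(2π·e^q)) − β on log(0.5) < q < log(K + 0.5), and should be read as an illustration under that assumption. The support limits, the choice K = 10 harmonics, the grid size, and the function name are hypothetical. On a uniform grid, choosing β as the mean of the unshifted values makes the discrete sum zero, which approximates the condition ∫h(q)dq = 0.

```python
import math


def pefac_filter(n_points=200, gamma=1.8, n_harmonics=10):
    """Sample the assumed filter on a uniform log-frequency grid.
    Returns (q grid, filter values, beta)."""
    q_lo, q_hi = math.log(0.5), math.log(n_harmonics + 0.5)
    step = (q_hi - q_lo) / (n_points - 1)
    q = [q_lo + k * step for k in range(n_points)]
    # Peaks occur where e^q is an integer, i.e. at harmonic positions.
    raw = [1.0 / (gamma - math.cos(2.0 * math.pi * math.exp(v))) for v in q]
    beta = sum(raw) / len(raw)  # shift that makes the samples sum to zero
    return q, [r - beta for r in raw], beta


q_grid, h, beta = pefac_filter()
```

Since γ is greater than 1, the denominator never reaches zero and the filter stays finite everywhere.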
In training step (4), the optimization factor in this embodiment is a 10-dimensional vector, with each dimension taking values in the range 0.5 to 1.5. The pitch feature waveform is a 250-dimensional frequency-domain vector spanning 60 to 400 Hz, the maximum range of the pitch frequency;
The optimization factor α and the pitch feature waveform have different dimensions. Therefore, when performing the optimization operation, i.e. element-wise multiplication, α must be expanded to a 250-dimensional vector. The expansion divides the 250 dimensions into 10 segments, and the 25 dimensions within each segment share the same value.
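The expansion and element-wise multiplication can be sketched as below. The function names and the sample factor values are hypothetical; the 10-to-250 expansion with 25 identical entries per segment follows the description above.

```python
def expand_factor(alpha, target_dim=250):
    """Repeat each component of `alpha` so the result has `target_dim` entries."""
    if target_dim % len(alpha) != 0:
        raise ValueError("target_dim must be a multiple of len(alpha)")
    segment = target_dim // len(alpha)
    return [a for a in alpha for _ in range(segment)]


def apply_factor(waveform, alpha):
    """Element-wise product of the waveform with the expanded factor."""
    expanded = expand_factor(alpha, len(waveform))
    return [w * a for w, a in zip(waveform, expanded)]


alpha = [0.5 + 0.1 * i for i in range(10)]  # hypothetical 10-dimensional factor
flat_waveform = [1.0] * 250                 # stand-in 250-bin feature waveform
optimized = apply_factor(flat_waveform, alpha)
```

Sharing one weight across 25 neighbouring bins keeps the search space at 10 dimensions instead of 250.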
The ant colony training parameters are set as follows: per-dimension value range of the optimization factor [0.5, 1.5], search precision 0.01, α = 2.5, β = 2.5, ρ = 0.5, colony size 100 ants, maximum number of iterations 60.
Pheromone release function embodiment
The framing method in test step (1) is the same as in training step (1); the pitch feature waveform conversion method in test step (2) is the same as in training step (3);
The optimization of the pitch feature waveform in test step (3) is the same as in training step (4), with the optimization factor being the optimal optimization factor α_optimal produced by training.
Claims (3)
1. A fundamental frequency detection method for extremely low signal-to-noise ratio environments, characterized in that the method comprises the following steps:
A. Training process:
(1) The speech database is divided into frames in chronological order, {frm(1), frm(2), …, frm(N)}, and a standard algorithm is used to extract the fundamental frequency F0 of each speech frame as the true fundamental frequency, forming the sequence {F0(1), F0(2), …, F0(N)}, where N is the total number of speech frames;
(2) Noise is superimposed on the clean speech frames to produce a new frame sequence {frm_noise(1), frm_noise(2), …, frm_noise(N)}, and the PEFAC algorithm converts each frame signal into its corresponding feature waveform, forming a feature waveform sequence;
(3) The ant colony fitness function is constructed from the optimization factor together with the feature waveforms, and a global search is carried out until the optimal optimization factor is obtained, where the optimization factor is an unknown M-dimensional vector α = [α1, α2, …, αM]; the feature waveforms optimized by the optimization factor form a new feature waveform sequence; for each optimized feature waveform, the maximum peak peak_max and its corresponding frequency f_peak are extracted, with f_peak taken as the fundamental frequency estimate, forming the sequence {(peak_max(1), f_peak(1)), (peak_max(2), f_peak(2)), …, (peak_max(N), f_peak(N))};
the ant colony pheromone release function is the probability that the error between the fundamental frequency estimate and the true value does not exceed 5%; the relevant parameters of the ant colony algorithm are then set and the search is run, finally yielding the optimal optimization factor α_optimal;
B. Test process:
(1) The speech signal under test is divided into frames, and the pitch feature waveform of each frame is extracted;
(2) The optimal optimization factor α_optimal is applied to form the optimized pitch feature waveform;
(3) The maximum peak of the optimized pitch feature waveform is located, and the frequency corresponding to it is taken as the fundamental frequency estimate.
2. The fundamental frequency detection method according to claim 1, characterized in that the ant colony optimization algorithm in the method comprises the following steps:
Step 1: Let α = [α1, α2, …, αi, …, αM], with per-dimension value range [x_down, x_up] and search precision prec; then αi ∈ {x_down + prec, x_down + 2*prec, …, x_down + L*prec}, where the number of levels L is computed with floor(·), the round-down function; α is divided into M × L nodes; node α_ij is associated with a pheromone τ_ij and heuristic information η_ij, where η_ij expresses the desirability of choosing αi = x_down + j*prec; the heuristic information is η_ij = 1/Δd_ij, where Δd_ij is the deviation between the pitch feature waveform under clean speech conditions and the optimized pitch feature waveform;
Step 2: Path construction: the probability that the k-th ant moves to node (i, j) is computed from the pheromone and heuristic information associated with that node;
Step 3: Pheromone update: once all ants have built their paths, the pheromone on each node is updated; the amount of pheromone that the k-th ant releases on the nodes it visits depends on the fitness value of its path T_k;
Step 4: The termination condition is reaching the maximum number of iterations; the α value corresponding to the best path at that point is α_optimal.
3. The fundamental frequency detection method according to claim 1, characterized in that the parameters of the method are set as follows: the optimization factor is a 10-dimensional vector with a per-dimension value range of 0.5 to 1.5; the pitch feature waveform is a 250-dimensional frequency-domain vector spanning 60 to 400 Hz, the maximum range of the pitch frequency; the ant colony training parameters are: search precision 0.01, α = 2.5, β = 2.5, ρ = 0.5, colony size 100 ants, maximum number of iterations 60.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610077857.2A CN107039051B (en) | 2016-02-03 | 2016-02-03 | Fundamental frequency detection method based on ant group optimization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610077857.2A CN107039051B (en) | 2016-02-03 | 2016-02-03 | Fundamental frequency detection method based on ant group optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107039051A (en) | 2017-08-11 |
CN107039051B (en) | 2019-11-26 |
Family
ID=59532975
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610077857.2A Active CN107039051B (en) | 2016-02-03 | 2016-02-03 | Fundamental frequency detection method based on ant group optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107039051B (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101136199B (en) * | 2006-08-30 | 2011-09-07 | 纽昂斯通讯公司 | Voice data processing method and equipment |
CN100574062C (en) * | 2007-12-20 | 2009-12-23 | 中山大学 | Method of optimization for power electronic circuit based on ant group algorithm |
CN101567188B (en) * | 2009-04-30 | 2011-10-26 | 上海大学 | Multi-pitch estimation method for mixed audio signals with combined long frame and short frame |
CN103474074B (en) * | 2013-09-09 | 2016-05-11 | 深圳广晟信源技术有限公司 | Pitch estimation method and apparatus |
CN103903624B (en) * | 2014-03-31 | 2016-06-01 | 重庆工商职业学院 | Periodical pitch detection method under a kind of gauss heat source model environment |
CN104900235B (en) * | 2015-05-25 | 2019-05-28 | 重庆大学 | Method for recognizing sound-groove based on pitch period composite character parameter |
Also Published As
Publication number | Publication date |
---|---|
CN107039051A (en) | 2017-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112509564B (en) | End-to-end voice recognition method based on connection time sequence classification and self-attention mechanism | |
CN103971678B (en) | Keyword spotting method and apparatus | |
CN104732978B (en) | The relevant method for distinguishing speek person of text based on combined depth study | |
Barker et al. | Robust ASR based on clean speech models: an evaluation of missing data techniques for connected digit recognition in noise. | |
CN110852201B (en) | Pulse signal detection method based on multi-pulse envelope spectrum matching | |
Vogt et al. | Modelling session variability in text-independent speaker verification | |
CN104900235B (en) | Method for recognizing sound-groove based on pitch period composite character parameter | |
CN108831440A (en) | A kind of vocal print noise-reduction method and system based on machine learning and deep learning | |
CN110197665B (en) | Voice separation and tracking method for public security criminal investigation monitoring | |
CN102968990B (en) | Speaker identifying method and system | |
CN105632512B (en) | A kind of dual sensor sound enhancement method and device based on statistical model | |
CN103730121B (en) | A kind of recognition methods pretending sound and device | |
Venter et al. | Automatic detection of African elephant (Loxodonta africana) infrasonic vocalisations from recordings | |
CN108922541A (en) | Multidimensional characteristic parameter method for recognizing sound-groove based on DTW and GMM model | |
CN105280196B (en) | Refrain detection method and system | |
CN107767859A (en) | The speaker's property understood detection method of artificial cochlea's signal under noise circumstance | |
CN109767760A (en) | Far field audio recognition method based on the study of the multiple target of amplitude and phase information | |
CN109741759B (en) | Acoustic automatic detection method for specific bird species | |
CN102831431A (en) | Detector training method based on hierarchical clustering | |
CN109061591B (en) | Time-frequency line spectrum detection method based on sequential clustering | |
KR102406512B1 (en) | Method and apparatus for voice recognition | |
CN109920447B (en) | Recording fraud detection method based on adaptive filter amplitude phase characteristic extraction | |
CN107025911B (en) | Fundamental frequency detection method based on particle group optimizing | |
CN106251861A (en) | A kind of abnormal sound in public places detection method based on scene modeling | |
CN107039051B (en) | Fundamental frequency detection method based on ant group optimization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||