CN107039051B - Fundamental frequency detection method based on ant group optimization - Google Patents

Info

Publication number
CN107039051B
CN107039051B CN201610077857.2A CN201610077857A CN107039051B CN 107039051 B CN107039051 B CN 107039051B CN 201610077857 A CN201610077857 A CN 201610077857A CN 107039051 B CN107039051 B CN 107039051B
Authority
CN
China
Prior art keywords
peak
signature waveform
optimization
fundamental frequency
fundamental
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610077857.2A
Other languages
Chinese (zh)
Other versions
CN107039051A (en)
Inventor
张小恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Technology and Business Institute
Original Assignee
Chongqing Technology and Business Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Technology and Business Institute
Priority to CN201610077857.2A
Publication of CN107039051A
Application granted
Publication of CN107039051B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90 Pitch determination of speech signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The present invention provides a fundamental frequency detection method for extremely low signal-to-noise ratio (SNR) environments. The method extracts the signature waveform of each speech frame signal using the PEFAC algorithm, optimizes that signature waveform with an optimal optimization factor to construct a new signature waveform, and finally takes the frequency corresponding to the maximum peak of the new signature waveform as the estimate of the fundamental frequency. The optimal optimization factor is obtained by an ant colony optimization (ACO) search.

Description

Fundamental frequency detection method based on ant group optimization
Technical field
The present invention relates to fundamental frequency detection methods, and in particular to a fundamental frequency detection method for extremely low signal-to-noise ratio (SNR) environments.
Background art
The fundamental frequency is a basic parameter of speech and is widely used in speech processing fields such as speech analysis and synthesis and speech separation. Estimating and extracting the fundamental frequency accurately and reliably is essential to speech signal processing. Fundamental frequency detection at high SNR is very mature, but those methods struggle to perform well at low SNR, and their detection performance at extremely low SNR is very poor. In view of this, the present invention provides a fundamental frequency detection method for extremely low SNR environments.
Summary of the invention
To address the obvious shortcomings of the prior art in fundamental frequency detection at extremely low SNR, the present invention provides a fundamental frequency detection method for such environments. The method includes the following steps:
1. Training process:
(1) Split the speech database into frames in chronological order, $\{frm(1), frm(2), \ldots, frm(N)\}$, and use a standard algorithm to extract the fundamental frequency $F_0$ of each speech frame as the true value, forming the sequence $\{F_0(1), F_0(2), \ldots, F_0(N)\}$, where $N$ is the total number of speech frames;
(2) Superimpose noise on the clean speech frames to produce a new frame sequence $\{frm_{noise}(1), frm_{noise}(2), \ldots, frm_{noise}(N)\}$, and use the PEFAC algorithm to convert each speech frame signal into its corresponding signature waveform, forming a signature waveform sequence;
(3) Construct the fitness function of an ant colony path from the optimization factor together with the signature waveforms, and carry out a global search until the optimal optimization factor is obtained. The optimization factor is an unknown $M$-dimensional vector $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_M]$. The signature waveforms after optimization by this factor form a new signature waveform sequence. For each optimized signature waveform, extract the maximum peak $peak_{max}$ and its corresponding frequency $f_{peak}$, which serves as the fundamental frequency estimate, forming the sequence $\{(peak_{max}(1), f_{peak}(1)), (peak_{max}(2), f_{peak}(2)), \ldots, (peak_{max}(N), f_{peak}(N))\}$. An ant colony path directly determines the value of $\alpha$, so the fitness function of a path is the probability that the error between the fundamental frequency estimate and the true value does not exceed 5%. The relevant parameters of the ant colony optimization (ACO) algorithm are then set and the search is run, finally yielding the optimal optimization factor $\alpha_{optimal}$.
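The fitness described in step (3) can be illustrated with a short Python sketch. It assumes the 5% error is measured relative to the true fundamental frequency, which the text does not state explicitly; the function name `fitness` and the sample values are hypothetical.

```python
def fitness(f_est, f_true, tol=0.05):
    """Fraction of frames whose estimate lies within `tol` (relative) of the true F0."""
    if len(f_est) != len(f_true) or not f_true:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(1 for fe, ft in zip(f_est, f_true) if abs(fe - ft) <= tol * ft)
    return hits / len(f_true)


# Hypothetical per-frame values: reference F0 and peak-picked estimates, in Hz.
f_true = [100.0, 120.0, 200.0, 250.0]
f_est = [103.0, 130.0, 199.0, 125.0]
print(fitness(f_est, f_true))  # 0.5
```

Two of the four frames fall inside the 5% band (the second is 10 Hz off and the fourth is an octave error), so this path would score 0.5.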
The specific steps of the ant colony training process are as follows:
Step 1: Let $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_M]$, with per-dimension value range $[x_{down}, x_{up}]$ and search precision $prec$. Then $\alpha_i \in \{x_{down} + prec, x_{down} + 2 \cdot prec, \ldots, x_{down} + L \cdot prec\}$, with $L = \mathrm{floor}((x_{up} - x_{down}) / prec)$,
where floor() is the rounding-down function. $\alpha$ is thus divided into $M \times L$ nodes. Node $\alpha_{ij}$ is associated with a pheromone $\tau_{ij}$ and heuristic information $\eta_{ij}$, which express the desirability of $\alpha_i = x_{down} + j \cdot prec$. The heuristic information is $\eta_{ij} = 1 / \Delta d_{ij}$, where $\Delta d_{ij}$ [equation not reproduced in this text] is the deviation between the pitch signature waveform under clean speech and the pitch signature waveform after optimization;
Step 2: Path construction. The probability that the $k$-th ant moves to node $(i, j)$ is: [equation not reproduced in this text]
Step 3: Pheromone update. Once all ants have built their paths, the pheromone on each node is updated as follows: [equation not reproduced in this text]
The pheromone released by the $k$-th ant on the nodes it visits is [equation not reproduced in this text],
which is defined in terms of the fitness value of path $T_k$.
Step 4: The termination condition is reaching the maximum number of iterations; the $\alpha$ value corresponding to the best path at that point is $\alpha_{optimal}$.
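A minimal Python sketch of Steps 1 to 4, under stated assumptions. Because the transition and update equations are not reproduced in this text, the sketch substitutes the standard Ant System rules: selection probability proportional to τ^a · η^b, evaporation by a factor (1 − ρ), and a deposit equal to the path fitness. These rules are an assumption, not the patent's own formulas. All names (`aco_search`, `toy_fitness`, `a`, `b`, `tau_min`) are hypothetical; the exponents are called `a` and `b` to avoid a clash with the optimization factor α; the heuristic η is fixed at 1 because Δd_ij requires waveform data; and the toy fitness merely stands in for the 5% accuracy measure.

```python
import math
import random


def aco_search(fitness, m, x_down, x_up, prec, n_ants=20, n_iter=30,
               a=2.5, b=2.5, rho=0.5, tau_min=1e-6, seed=0):
    """Search an M x L grid of candidate values for the vector maximizing `fitness`."""
    rng = random.Random(seed)
    # Step 1: discretize each dimension into L candidate values.
    n_levels = math.floor((x_up - x_down) / prec + 1e-9)
    grid = [x_down + j * prec for j in range(1, n_levels + 1)]
    tau = [[1.0] * n_levels for _ in range(m)]   # pheromone per node
    eta = [[1.0] * n_levels for _ in range(m)]   # heuristic, assumed uniform here
    best_vec, best_fit = None, -math.inf

    for _ in range(n_iter):                      # Step 4: fixed iteration budget
        trips = []
        for _ in range(n_ants):
            # Step 2: pick one node per dimension by roulette-wheel selection.
            path = []
            for i in range(m):
                weights = [tau[i][j] ** a * eta[i][j] ** b for j in range(n_levels)]
                path.append(rng.choices(range(n_levels), weights=weights)[0])
            vec = [grid[j] for j in path]
            fit = fitness(vec)                   # must be non-negative
            trips.append((path, fit))
            if fit > best_fit:
                best_vec, best_fit = vec, fit
        # Step 3: evaporate, then let every ant deposit its path fitness.
        for i in range(m):
            for j in range(n_levels):
                tau[i][j] = max((1.0 - rho) * tau[i][j], tau_min)
        for path, fit in trips:
            for i, j in enumerate(path):
                tau[i][j] += fit
    return best_vec, best_fit


def toy_fitness(vec, target=(0.8, 1.2, 1.0)):
    """Stand-in objective in (0, 1]; larger when `vec` is closer to `target`."""
    return 1.0 / (1.0 + sum((v - t) ** 2 for v, t in zip(vec, target)))


best_vec, best_fit = aco_search(toy_fitness, m=3, x_down=0.5, x_up=1.5, prec=0.1)
print(best_vec, best_fit)
```

The lower bound `tau_min` is a practical safeguard added for the sketch so that selection weights never collapse to zero; it is not part of the described method.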
2. Test process:
(1) Split the test speech signal into frames and extract the pitch signature waveform of each frame;
(2) Optimize it with the optimal optimization factor $\alpha_{optimal}$, forming the optimized pitch signature waveform;
(3) Identify the maximum peak of the optimized pitch signature waveform and take its corresponding frequency as the estimate of the fundamental frequency.
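For a single frame, the three test steps can be sketched in Python as follows. The sketch assumes the optimization is the point-wise multiplication described in the embodiment and that the factor has already been expanded to the waveform length; `estimate_f0` and the sample numbers are hypothetical.

```python
def estimate_f0(waveform, freqs, factor):
    """Weight the pitch signature waveform point by point and return (peak frequency, peak value)."""
    if not (len(waveform) == len(freqs) == len(factor)) or not waveform:
        raise ValueError("inputs must be non-empty and of equal length")
    optimized = [w * a for w, a in zip(waveform, factor)]
    k = max(range(len(optimized)), key=optimized.__getitem__)
    return freqs[k], optimized[k]


# Hypothetical 4-bin waveform in which noise has raised a spurious peak at 200 Hz.
freqs = [100.0, 150.0, 200.0, 250.0]
waveform = [0.9, 0.4, 1.0, 0.2]
factor = [1.5, 1.0, 0.5, 1.0]
print(estimate_f0(waveform, freqs, [1.0] * 4)[0])  # 200.0 (no optimization)
print(estimate_f0(waveform, freqs, factor)[0])     # 100.0 (after optimization)
```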
Compared with the prior art, the above technical solution of the present invention has the following advantages:
A. The signature waveform is extracted with the PEFAC algorithm, inheriting its ability to suppress noise at low SNR;
B. The optimal optimization factor is found with the ant colony optimization algorithm, so that the fundamental frequency estimate obtained after optimization is closer to the true fundamental frequency at low SNR.
Brief description of the drawings
Fig. 1 is a system block diagram according to one embodiment of the present invention.
Specific embodiments
The fundamental frequency detection method for extremely low SNR environments proposed by the present invention is further described below with reference to the drawing and embodiments:
The method flow of the present invention is shown in Fig. 1 and includes the following steps:
1. Training process:
(1) Split the speech database into frames in chronological order;
(2) Extract the fundamental frequency of each speech frame with a standard algorithm as the true value;
(3) Split the noise-superimposed speech database signal into frames in chronological order and convert each speech frame signal into its corresponding pitch signature waveform using the PEFAC algorithm;
(4) Construct the ant colony fitness function from the optimization factor, treated as an unknown parameter, together with the pitch signature waveforms, and carry out a global search until the optimal optimization factor is obtained.
2. Test process:
(1) Split the speech signal under test into frames;
(2) Convert each speech frame signal into its corresponding pitch signature waveform;
(3) Optimize the pitch signature waveform with the trained optimal optimization factor to produce the optimized pitch signature waveform, and take the frequency corresponding to the maximum peak of the optimized waveform as the fundamental frequency estimate.
The specific embodiment of each step of the above method is described in detail as follows:
In training process step (1), the speech database is the TIMIT international standard database: speech from 30 men and 30 women, 20 minutes per speaker, 20 hours in total. The sampling rate for framing is 16 kHz and each frame contains 160 samples. In training process step (2), the standard method for extracting the fundamental frequency of the speech database is the Praat tool;
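As a rough illustration of the framing parameters, the Python sketch below splits a signal into 160-sample frames, which at 16 kHz corresponds to 10 ms per frame. Non-overlapping frames and dropping the incomplete tail are assumptions, since the text does not specify any overlap; `split_frames` is a hypothetical helper.

```python
SAMPLE_RATE_HZ = 16000
FRAME_LEN = 160


def split_frames(samples, frame_len=FRAME_LEN):
    """Split samples into consecutive non-overlapping frames, dropping an incomplete tail."""
    n_frames = len(samples) // frame_len
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(n_frames)]


signal = [0.0] * 1000                      # placeholder samples
frames = split_frames(signal)
frame_ms = 1000.0 * FRAME_LEN / SAMPLE_RATE_HZ
print(len(frames), frame_ms)               # 6 10.0
```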
The noise superimposed in training process step (3) is white Gaussian noise, and the framing method is the same as in step (1). The algorithm that converts frame-format speech signals into pitch signature waveforms is the PEFAC algorithm, whose flow is as follows:
(a) The speech frame signal is mapped to the frequency domain by the short-time Fourier transform and normalized to give $X_t'(q)$, where $q$ is the log-frequency, i.e. $q = \log(f)$.
(b) $X_t'(q)$ is convolved with a filter $h(q)$ to generate the pitch signature waveform, where the filter is defined as: [equation not reproduced in this text]
where $\beta$ is chosen so that $\int h(q)\,dq = 0$, and $\gamma$ is set to 1.8;
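The filter equation itself does not appear in this text. As a hedged sketch, the Python code below assumes the filter form from the published PEFAC algorithm, h(q) = 1/(γ − cos(2π e^q)) − β on log(0.5) < q < log(K + 0.5), which is consistent with the β and γ conditions stated here but is not taken from this document. The harmonic count K = 10, the grid size, and the name `pefac_filter` are likewise assumptions.

```python
import math


def pefac_filter(num_points=400, num_harmonics=10, gamma=1.8):
    """Sample the assumed PEFAC comb filter on a uniform log-frequency grid."""
    q_lo = math.log(0.5)
    q_hi = math.log(num_harmonics + 0.5)
    step = (q_hi - q_lo) / (num_points - 1)
    q = [q_lo + k * step for k in range(num_points)]
    # Peaks occur where exp(q) is an integer, i.e. at harmonic positions.
    raw = [1.0 / (gamma - math.cos(2.0 * math.pi * math.exp(v))) for v in q]
    # Discrete counterpart of choosing beta so that the integral of h(q) is zero.
    beta = sum(raw) / num_points
    h = [r - beta for r in raw]
    return q, h, beta


q, h, beta = pefac_filter()
print(abs(sum(h)) < 1e-9)  # True: the sampled filter has zero mean
```

Under this assumed form, the denominator never falls below γ − 1 = 0.8, so the filter is bounded, and γ controls how sharp the harmonic peaks are.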
In training process step (4), the optimization factor is a 10-dimensional vector whose value range in each dimension is 0.5 to 1.5. The pitch signature waveform is a 250-dimensional frequency-domain vector spanning 60 to 400 Hz, the maximum frequency range of the pitch;
The dimensions of the optimization factor $\alpha$ and the pitch signature waveform do not match. Therefore, when performing the optimization operation, i.e. point-wise multiplication, $\alpha$ must be expanded to a 250-dimensional vector: the 250 dimensions are divided into 10 segments, and the 25 dimensions within each segment share the same value.
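The expansion and point-wise multiplication can be sketched in Python as follows. The helper names `expand_factor` and `optimize_waveform` and the sample factor values are hypothetical; only the 10-to-250 segment scheme comes from the text.

```python
def expand_factor(alpha, target_dim=250):
    """Repeat each entry of alpha so the result has target_dim entries in equal segments."""
    if not alpha or target_dim % len(alpha) != 0:
        raise ValueError("target_dim must be a positive multiple of len(alpha)")
    seg_len = target_dim // len(alpha)
    return [a for a in alpha for _ in range(seg_len)]


def optimize_waveform(waveform, alpha):
    """Point-wise multiply the waveform by the expanded optimization factor."""
    weights = expand_factor(alpha, len(waveform))
    return [w * a for w, a in zip(waveform, weights)]


alpha = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5]
weights = expand_factor(alpha)
print(len(weights), weights[24], weights[25], weights[249])  # 250 0.5 0.6 1.5
```

Each factor entry therefore scales one contiguous band of 25 frequency bins, so the search has only 10 free parameters rather than 250.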
Ant colony training parameters are set, for example, as follows: per-dimension value range of the optimization factor [0.5, 1.5], search precision 0.01, α = 2.5, β = 2.5, ρ = 0.5, colony size 100, and maximum number of iterations 60.
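For a sense of scale, the Python sketch below works out the search grid implied by these settings. It assumes L = floor((x_up − x_down) / prec) candidate values per dimension, consistent with the candidate set in Step 1 of the training procedure; the variable names are illustrative.

```python
import math

x_down, x_up, prec = 0.5, 1.5, 0.01
m = 10                      # dimensions of the optimization factor
n_ants, n_iter = 100, 60

# A small epsilon guards against floating-point rounding in the division.
levels = math.floor((x_up - x_down) / prec + 1e-9)
nodes = m * levels
evaluations = n_ants * n_iter
candidates = [round(x_down + j * prec, 2) for j in range(1, levels + 1)]

print(levels, nodes, evaluations)        # 100 1000 6000
print(candidates[0], candidates[-1])     # 0.51 1.5
```

With 100 candidate values in each of 10 dimensions there are 100^10 = 10^20 possible paths, of which the colony evaluates at most 6000, so exhaustive enumeration is impractical. Note that under this candidate set the lower bound 0.5 itself is not a candidate value.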
Embodiment of the pheromone release function: [equation not reproduced in this text]
The framing method in test process step (1) is the same as in training process step (1); the pitch signature waveform conversion method in test process step (2) is the same as in training process step (3);
The optimization operation on the pitch signature waveform in test process step (3) is the same as in training process step (4), with the optimization factor being the optimal optimization factor $\alpha_{optimal}$ produced by training.

Claims (3)

1. A fundamental frequency detection method for extremely low signal-to-noise ratio (SNR) environments, characterized in that the method includes the following steps:
A. Training process:
(1) Split the speech database into frames in chronological order, $\{frm(1), frm(2), \ldots, frm(N)\}$, and use a standard algorithm to extract the fundamental frequency $F_0$ of each speech frame as the true value, forming the sequence $\{F_0(1), F_0(2), \ldots, F_0(N)\}$, where $N$ is the total number of speech frames;
(2) Superimpose noise on the clean speech frames to produce a new frame sequence $\{frm_{noise}(1), frm_{noise}(2), \ldots, frm_{noise}(N)\}$, and use the PEFAC algorithm to convert each speech frame signal into its corresponding signature waveform, forming a signature waveform sequence;
(3) Construct the ant colony fitness function from the optimization factor together with the signature waveforms, and carry out a global search until the optimal optimization factor is obtained, where the optimization factor is an unknown $M$-dimensional vector $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_M]$; the signature waveforms after optimization by this factor form a new signature waveform sequence; for each optimized signature waveform, extract the maximum peak $peak_{max}$ and its corresponding frequency $f_{peak}$, which serves as the fundamental frequency estimate, forming the sequence $\{(peak_{max}(1), f_{peak}(1)), (peak_{max}(2), f_{peak}(2)), \ldots, (peak_{max}(N), f_{peak}(N))\}$;
The ant colony pheromone release function is [equation not reproduced in this text], i.e. the probability that the error between the fundamental frequency estimate and the true value does not exceed 5%; the relevant parameters of the ant colony algorithm are then set and the search is run, finally yielding the optimal optimization factor $\alpha_{optimal}$;
B. Test process:
(1) Split the test speech signal into frames and extract the pitch signature waveform of each frame;
(2) Optimize it with the optimal optimization factor $\alpha_{optimal}$, forming the optimized pitch signature waveform;
(3) Identify the maximum peak of the optimized pitch signature waveform and take its corresponding frequency as the estimate of the fundamental frequency.
2. The fundamental frequency detection method according to claim 1, characterized in that the ant colony optimization algorithm in the method includes the following steps:
Step 1: Let $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_M]$, with per-dimension value range $[x_{down}, x_{up}]$ and search precision $prec$; then $\alpha_i \in \{x_{down} + prec, x_{down} + 2 \cdot prec, \ldots, x_{down} + L \cdot prec\}$, with $L = \mathrm{floor}((x_{up} - x_{down}) / prec)$,
where floor() is the rounding-down function; $\alpha$ is divided into $M \times L$ nodes; node $\alpha_{ij}$ is associated with a pheromone $\tau_{ij}$ and heuristic information $\eta_{ij}$, which express the desirability of $\alpha_i = x_{down} + j \cdot prec$; the heuristic information is $\eta_{ij} = 1 / \Delta d_{ij}$, where $\Delta d_{ij}$ [equation not reproduced in this text] is the deviation between the pitch signature waveform under clean speech and the pitch signature waveform after optimization;
Step 2: Path construction: the probability that the $k$-th ant moves to node $(i, j)$ is: [equation not reproduced in this text]
Step 3: Pheromone update: once all ants have built their paths, the pheromone on each node is updated as follows: [equation not reproduced in this text]
The pheromone released by the $k$-th ant on the nodes it visits is [equation not reproduced in this text],
which is defined in terms of the fitness value of path $T_k$;
Step 4: The termination condition is reaching the maximum number of iterations; the $\alpha$ value corresponding to the best path at that point is $\alpha_{optimal}$.
3. The fundamental frequency detection method according to claim 1, characterized in that the parameters of the method are set as follows: the optimization factor is a 10-dimensional vector with a per-dimension value range of 0.5 to 1.5; the pitch signature waveform is a 250-dimensional frequency-domain vector spanning 60 to 400 Hz, the maximum frequency range of the pitch; the ant colony training parameters are: search precision 0.01, α = 2.5, β = 2.5, ρ = 0.5, colony size 100, and maximum number of iterations 60.
CN201610077857.2A 2016-02-03 2016-02-03 Fundamental frequency detection method based on ant group optimization Active CN107039051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610077857.2A CN107039051B (en) 2016-02-03 2016-02-03 Fundamental frequency detection method based on ant group optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610077857.2A CN107039051B (en) 2016-02-03 2016-02-03 Fundamental frequency detection method based on ant group optimization

Publications (2)

Publication Number Publication Date
CN107039051A (en) 2017-08-11
CN107039051B (en) 2019-11-26

Family

ID=59532975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610077857.2A Active CN107039051B (en) 2016-02-03 2016-02-03 Fundamental frequency detection method based on ant group optimization

Country Status (1)

Country Link
CN (1) CN107039051B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101136199B (en) * 2006-08-30 2011-09-07 纽昂斯通讯公司 Voice data processing method and equipment
CN100574062C (en) * 2007-12-20 2009-12-23 中山大学 Method of optimization for power electronic circuit based on ant group algorithm
CN101567188B (en) * 2009-04-30 2011-10-26 上海大学 Multi-pitch estimation method for mixed audio signals with combined long frame and short frame
CN103474074B (en) * 2013-09-09 2016-05-11 深圳广晟信源技术有限公司 Pitch estimation method and apparatus
CN103903624B (en) * 2014-03-31 2016-06-01 重庆工商职业学院 Periodical pitch detection method under a kind of gauss heat source model environment
CN104900235B (en) * 2015-05-25 2019-05-28 重庆大学 Method for recognizing sound-groove based on pitch period composite character parameter

Also Published As

Publication number Publication date
CN107039051A (en) 2017-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant