CN109065070B - Kernel function-based audio characteristic signal dimension reduction method - Google Patents
- Publication number
- CN109065070B (application CN201810995309.7A)
- Authority
- CN
- China
- Prior art keywords
- dimension reduction
- audio
- signal
- kernel function
- dimension
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/45—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to a kernel function-based audio feature signal dimension reduction method, and belongs to the technical field of audio signal processing. The method reduces the dimension of the feature parameters of an audio signal, achieves the required reduction without discarding audio feature information, displays the reduced data visually, and compares the result with those obtained by other audio feature dimension reduction methods. The reduction is applied mainly to the linear prediction coefficients, linear prediction cepstrum coefficients and Mel frequency cepstrum coefficients of the audio signal, and the reduced data is displayed visually. The audio feature dimension reduction can be used for monitoring broadcast signals and for quickly identifying and processing audio signals. The algorithm is simple: it uses a nonlinear kernel function to represent the mapping between the Gaussian observation space and the hidden space, and so avoids the limited range of use and poor dimension reduction of linear mapping methods.
Description
Technical Field
The invention relates to a kernel function-based audio characteristic signal dimension reduction method, and belongs to the technical field of audio characteristic signal processing.
Background
To manage and control wireless audio broadcasting, and to monitor and discriminate audio broadcasts safely and efficiently in real time, audio information must be processed quickly, because its processing speed determines the speed of the whole workflow. Dimension reduction of audio feature signals is the core of audio information processing, so its efficiency and reliability are problems that currently need to be solved. Existing audio feature signal dimension reduction methods mainly include locality preserving projection, multidimensional scaling, locally linear embedding and principal component analysis. Most of these algorithms have high complexity, and reducing the dimension by discarding part of the feature signal can cause unpredictable errors in practical engineering applications.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a kernel function-based audio feature signal dimension reduction method that performs dimension reduction on the extracted Linear Prediction Coefficients (LPC), Linear Prediction Cepstrum Coefficients (LPCC) and Mel Frequency Cepstrum Coefficients (MFCC) of the audio, so as to reduce the data dimension and increase the information processing rate.
The technical scheme of the invention is a kernel function-based audio feature signal dimension reduction method with the following specific steps:
(1) Audio signal acquisition: acquire an audio signal to obtain an audio sample.
(2) Audio signal preprocessing: convert the analog signal in the collected audio sample into a digital signal and write it into a WAV file, then filter, pre-emphasize and frame the digital signal written into the WAV file.
(3) Feature parameter extraction: extract the high-dimensional feature parameters, namely the Linear Prediction Coefficients (LPC), Linear Prediction Cepstrum Coefficients (LPCC) and Mel Frequency Cepstrum Coefficients (MFCC), from the processed digital signal.
(4) Building the dimension reduction model: feed the extracted feature parameters into a dimension reduction model built with the kernel trick to obtain the low-dimensional hidden variables directly; these hidden variables are the dimension-reduced data. The core of the method is to use a Gaussian process regression (GPR) model to describe the nonlinear relation between the hidden variables and the observed variables.
(5) Dimension reduction analysis: display the dimension-reduced data visually (2D/3D) and compare it with the results obtained by other dimension reduction methods.
In the above kernel function-based audio feature signal dimension reduction method, the audio acquisition in step (1) obtains an audio sample through an audio acquisition device. When acquiring the audio signal, the device sets the sampling frequency (which must satisfy the Nyquist sampling theorem), the number of sampling channels and the quantization precision.
In the above method for reducing the dimension of the audio feature signal based on the kernel function, the audio signal preprocessing in step (2) includes the following steps:
(1) Filter the collected audio signal $x(n)$ with a rectangular window function $w(n)$ (the upper limit frequency is generally $f_H = 3400$ Hz and the lower limit frequency $f_L = 60$-$100$ Hz) to obtain the signal $y_a(n)$.
(2) Apply pre-emphasis to the filtered signal $y_a(n)$ by the difference method to obtain the signal $y_b(n)$, where $y_b(n) = y_a(n) - \alpha\, y_a(n-1)$ ($\alpha$ is the pre-emphasis coefficient, generally close to 1). This boosts the high-frequency part, suppresses the low-frequency part and flattens the spectrum of the signal.
(3) Framing: short-time analysis divides the speech signal into a number of segments, each called a frame, with a duration between 10 ms and 30 ms. To ensure a smooth transition between frames, adjacent frames partially overlap; the overlapped part is called the frame shift, and the frame shift is 1/2 or 1/3 of the frame length.
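The pre-emphasis and framing steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patent's implementation: the function names, the coefficient 0.97, the 8 kHz sampling rate and the 25 ms frame are all assumptions chosen only to fall inside the ranges stated in the text. With a half-frame overlap, the offset between frame starts (`hop` below) equals the overlap length.

```python
import numpy as np

def pre_emphasis(y_a, alpha=0.97):
    """First-order difference y_b(n) = y_a(n) - alpha * y_a(n-1); alpha=0.97 is an assumed value."""
    y_a = np.asarray(y_a, dtype=float)
    # The first sample has no predecessor, so it is passed through unchanged.
    return np.append(y_a[0], y_a[1:] - alpha * y_a[:-1])

def frame_signal(y_b, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one per row).

    Trailing samples that do not fill a whole frame are dropped.
    """
    y_b = np.asarray(y_b, dtype=float)
    if len(y_b) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(y_b) - frame_len) // hop
    # Row r holds samples r*hop ... r*hop + frame_len - 1.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return y_b[idx]

# Hypothetical usage: 1 s of a 440 Hz tone, 25 ms frames, half-frame overlap.
fs = 8000                       # assumed sampling rate
frame_len = int(0.025 * fs)     # 25 ms lies inside the 10-30 ms range
hop = frame_len // 2
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
frames = frame_signal(pre_emphasis(tone), frame_len, hop)
```

Each row of `frames` can then be passed to a feature extractor one frame at a time.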
In the above dimension reduction method for audio feature signals based on kernel functions, the step (3) of extracting feature parameters includes the following steps:
(1) Linear Prediction Coefficients (LPC): call an LPC function package in a program, set the frame length, frame shift, window function and LPC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 1.
(2) Linear Prediction Cepstrum Coefficients (LPCC): call an LPCC function package in a program, set the frame length, frame shift, window function and LPCC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 2.
(3) Mel Frequency Cepstrum Coefficients (MFCC): call an MFCC function package in a program, set the frame length, frame shift, window function and MFCC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 3.
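The text does not name the function packages it calls. As one possible reading, LPC can be computed per frame with the autocorrelation method and the Levinson-Durbin recursion, and LPCC can then be derived from the LPC values by the usual cepstral recursion. The sketch below assumes the predictor convention $x(n) \approx \sum_k a_k x(n-k)$; the function names and the test signal are illustrative only.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns a[0..order-1] holding a_1..a_order, with x(n) ~ sum_k a_k * x(n-k).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)     # a[0] is unused
    err = r[0]                  # prediction error energy
    for i in range(1, order + 1):
        if err <= 0:            # silent or degenerate frame: stop early
            break
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err   # reflection coefficient
        a_new = a.copy()
        a_new[i] = k
        a_new[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = a_new
        err *= 1.0 - k * k
    return a[1:]

def lpcc(a, n_ceps):
    """Cepstrum of the all-pole model 1 / (1 - sum_k a_k z^-k), coefficients c_1..c_n_ceps."""
    p = len(a)
    c = np.zeros(n_ceps + 1)    # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

# Hypothetical usage on a decaying exponential, which a first-order predictor fits well.
frame = 0.5 ** np.arange(200)
a = lpc(frame, order=1)
c = lpcc(a, n_ceps=3)
```

In practice the frame would be one row of the framed, pre-emphasized signal, and the order would be the "order parameter" mentioned in the steps above.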
In the above method for reducing the dimension of the audio characteristic signal based on the kernel function, the building of the dimension reduction model in the step (4) includes the following steps:
(1) To build the feature dimension reduction model, let the hidden space have dimension q and the observation space Y have dimension d (q < d). Assume that the observed values and the hidden space parameters are related by $y = f(z) + \varepsilon$, where the noise $\varepsilon$ follows a Gaussian distribution with mean 0 and variance $\beta$, and assume that the hidden function f is a Gaussian process with a squared exponential kernel function.
In the kernel function, $\sigma$ is the coefficient parameter of the squared exponential kernel, $l$ is the distance influence factor between two points $z$ and $z'$, $\beta$ is a hyperparameter of the model, $\delta(z, z')$ is the Kronecker delta function, and the parameters to be solved are $\theta = (\sigma, l, \beta)$. The kernel function takes its maximum when $z$ is close to $z'$ and its minimum when the two points are far apart. To simplify the subsequent derivation, the covariance matrix is computed from the kernel function.
(2) Assuming that the d dimensions of the observation space are sampled independently, the probability of the observations Y is expressed in terms of $Y_{:,i}$, the n elements of the i-th dimension of the observation space Y.
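The equations for the kernel, the covariance matrix and the observation probability did not survive in this text. A form that is consistent with the symbols defined above, and that is standard for Gaussian process latent variable models, is given here as an assumption rather than as the patent's exact notation. The symbol $Z$ for the matrix of hidden points $z_1, \dots, z_n$ is introduced only for this illustration.

```latex
% Assumed standard forms; not copied from the patent's equation images.
k(z, z') = \sigma^{2} \exp\!\left(-\frac{\lVert z - z' \rVert^{2}}{2 l^{2}}\right) + \beta\,\delta(z, z')

K_{ij} = k(z_i, z_j), \qquad i, j = 1, \dots, n

p(Y \mid Z, \theta) = \prod_{i=1}^{d} \mathcal{N}\!\left(Y_{:,i} \mid 0, K\right)
  = \prod_{i=1}^{d} \frac{1}{(2\pi)^{n/2} \lvert K \rvert^{1/2}}
    \exp\!\left(-\tfrac{1}{2}\, Y_{:,i}^{\top} K^{-1} Y_{:,i}\right)
```

Under this form the first term is largest at $z = z'$ and decays towards zero as the points move apart, which matches the behaviour described in the text.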
To obtain a better dimension reduction, the kernel function hyperparameters that maximize this probability must be found, and a particle swarm optimization algorithm is used to solve for them. Write $\theta = (\sigma, l, \beta)$ as $A = (a_1, a_2, a_3)$, the velocity of particle i as $v_i = (v_{i1}, v_{i2}, v_{i3})$, and the best position the particles have passed through as $p_g = (p_{g1}, p_{g2}, p_{g3})$. The particle swarm algorithm then updates the positions of the particles iteratively.
In the update, $w$ is a non-negative inertia factor, the acceleration constants $c_1$ and $c_2$ are non-negative, and $r_1$ and $r_2$ are random numbers in the range [0, 1]. The algorithm adjusts the state of each particle using its current position, its own best position and information from its neighbours. Applying this information exchange to the optimization of the kernel parameters means that each particle is influenced both by its own experience and by the experience of the swarm, which gives the algorithm good global optimization capability and convergence speed.
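A generic particle swarm minimizer of the kind described above can be sketched as follows. This is the textbook global-best variant, not necessarily the exact update used in the patent; the function name, the swarm size, the values of `w`, `c1`, `c2` and the clipping of positions to a search box are all assumptions. Maximizing a probability is handled by minimizing its negative.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best particle swarm optimization over the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))   # particle positions A
    vel = np.zeros((n_particles, dim))                          # particle velocities v_i
    p_best = pos.copy()                                         # each particle's own best
    p_best_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(p_best_val))
    g_best, g_best_val = p_best[g].copy(), p_best_val[g]        # swarm best p_g
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))                     # random numbers in [0, 1)
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (p_best - pos) + c2 * r2 * (g_best - pos)
        pos = np.clip(pos + vel, lower, upper)                  # keep particles in the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < p_best_val
        p_best[improved] = pos[improved]
        p_best_val[improved] = vals[improved]
        g = int(np.argmin(p_best_val))
        if p_best_val[g] < g_best_val:
            g_best, g_best_val = p_best[g].copy(), p_best_val[g]
    return g_best, g_best_val

# Hypothetical usage: a three-parameter toy objective standing in for the
# negative log-probability as a function of (sigma, l, beta).
toy = lambda a: float(np.sum((a - np.array([1.0, 2.0, 3.0])) ** 2))
best, best_val = pso_minimize(toy, lower=[-5, -5, -5], upper=[5, 5, 5])
```

For kernel hyperparameters, the lower bounds would normally be small positive numbers so that the length scale and noise variance stay valid.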
The kernel function used by the model is nonlinear. The solved kernel parameters $\theta = (\sigma, l, \beta)$ are substituted back into the model, and the extracted feature parameters are fed into the dimension reduction model to obtain the hidden parameters, which are the dimension-reduced data.
In the above kernel function-based audio feature signal dimension reduction method, in step (5) the dimension-reduced data is displayed visually in two or three dimensions, and is then analysed and compared with the results of other dimension reduction algorithms.
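For the comparison step, principal component analysis (one of the existing methods named in the Background) is a convenient baseline. The sketch below is an assumption about how such a baseline might be produced, not part of the patented method; it projects a feature matrix onto its first q principal directions using the SVD. The function name and the random stand-in data are hypothetical.

```python
import numpy as np

def pca_reduce(X, q=2):
    """Project the rows of X (n samples x d features) onto the first q principal directions."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                         # centre each feature
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:q].T

# Hypothetical usage: 100 frames of 12-dimensional features reduced to 2-D.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 12))
embedding = pca_reduce(features, q=2)
```

The two columns of `embedding` can be drawn as a scatter plot next to the hidden variables from the kernel-based model to compare the two results visually.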
Compared with existing audio feature signal dimension reduction methods, the kernel function-based method has the following advantages:
(1) The invention uses a nonlinear kernel function to represent the relation between the observation space data and the hidden space parameters directly, avoiding the poor dimension reduction that linear mapping gives for some audio feature data.
(2) The invention uses the particle swarm algorithm to solve for the hyperparameters of the kernel function. The global optimization capability of the swarm and the directed movement of its particles allow the optimal hyperparameters to be found quickly, and the approach makes it easy to substitute other kernel functions later.
(3) The proposed audio feature dimension reduction theory is simple and easy to program, is well suited to real engineering projects, and substantially increases the speed of audio information processing.
Drawings
FIG. 1 is a flow chart of a dimension reduction analysis of the present invention;
FIG. 2 is a flow chart of signal preprocessing according to the present invention;
FIG. 3 is a flow chart of feature parameter extraction and dimension reduction processing according to the present invention;
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
As shown in FIGS. 1-3, a kernel function-based audio feature signal dimension reduction method comprises the following specific steps:
(1) Audio signal acquisition: acquire an audio signal to obtain an audio sample.
(2) Audio signal preprocessing: convert the analog signal in the collected audio sample into a digital signal and write it into a WAV file, then filter, pre-emphasize and frame the digital signal written into the WAV file.
(3) Feature parameter extraction: extract the high-dimensional feature parameters, namely the Linear Prediction Coefficients (LPC), Linear Prediction Cepstrum Coefficients (LPCC) and Mel Frequency Cepstrum Coefficients (MFCC), from the processed digital signal.
(4) Building the dimension reduction model: feed the extracted feature parameters into a dimension reduction model built with the kernel trick to obtain the low-dimensional hidden variables directly; these hidden variables are the dimension-reduced data.
(5) Dimension reduction analysis: display the dimension-reduced data visually (2D/3D) and compare it with the results obtained by other dimension reduction methods.
The audio sample is acquired through an audio acquisition device. When acquiring the audio signal, the device sets the sampling frequency to 44.1 kHz (which satisfies the Nyquist sampling theorem), the number of sampling channels to one (mono) and the quantization precision to 16 bits.
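As an illustration of these acquisition settings, the sketch below writes samples to a WAV container as mono, 16-bit PCM at 44.1 kHz using Python's standard `wave` module. The function name and the synthetic 1 kHz tone are assumptions; a real system would take its samples from the acquisition device.

```python
import io
import math
import struct
import wave

def write_mono_wav(target, samples, fs=44100):
    """Write floats in [-1, 1] as mono, 16-bit PCM; target is a path or a binary file object."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))  # clip, then scale to int16
        for s in samples
    )
    with wave.open(target, "wb") as wf:
        wf.setnchannels(1)      # single channel
        wf.setsampwidth(2)      # 2 bytes = 16-bit quantization
        wf.setframerate(fs)     # sampling frequency in Hz
        wf.writeframes(pcm)

# Hypothetical usage: 0.1 s of a 1 kHz tone written to an in-memory buffer.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]
buffer = io.BytesIO()
write_mono_wav(buffer, tone)
```

Passing a file name instead of the buffer writes the same data to disk.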
The signal preprocessing comprises the following steps:
(1) Filter the collected audio signal $x(n)$ with a rectangular window function $w(n)$ (the upper limit frequency is generally $f_H = 3400$ Hz and the lower limit frequency $f_L = 60$-$100$ Hz) to obtain the signal $y_a(n)$.
(2) Apply pre-emphasis to the filtered signal $y_a(n)$ by the difference method to obtain the signal $y_b(n)$, where $y_b(n) = y_a(n) - \alpha\, y_a(n-1)$ ($\alpha$ is the pre-emphasis coefficient, generally close to 1).
(3) Divide the pre-emphasized signal $y_b(n)$ into a number of speech segments, each called a frame, with a duration of 10-30 ms. Adjacent frames partially overlap; the overlapped part is called the frame shift, and the frame shift is 1/2 or 1/3 of the frame length.
The characteristic parameter extraction comprises the following steps:
(1) Linear Prediction Coefficients (LPC): call an LPC function package in a program, set the frame length, frame shift, window function and LPC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 1.
(2) Linear Prediction Cepstrum Coefficients (LPCC): call an LPCC function package in a program, set the frame length, frame shift, window function and LPCC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 2.
(3) Mel Frequency Cepstrum Coefficients (MFCC): call an MFCC function package in a program, set the frame length, frame shift, window function and MFCC order, extract the feature values of the audio signal preprocessed in step (2), and store them in Table 3.
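The MFCC function package is not identified in the text. As a rough sketch of what such a package computes for a single frame, the code below applies a window, takes the power spectrum, passes it through a triangular Mel filterbank, and applies a DCT to the log filterbank energies. The Hamming window, the 26 filters, the 13 coefficients, the 512-point FFT and all function names are assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the Mel scale, shape (n_filters, n_fft//2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):               # rising edge
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):              # falling edge, peak of 1 at center
            fb[m - 1, k] = (right - k) / (right - center)
    return fb

def mfcc_frame(frame, fs, n_filters=26, n_ceps=13, n_fft=512):
    """MFCCs of one frame; the frame should not be longer than n_fft samples."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))       # assumed window function
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, fs) @ power
    log_e = np.log(energies + 1e-10)                # small offset avoids log(0)
    # Unnormalized DCT-II of the log energies.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return dct @ log_e

# Hypothetical usage: one 25 ms frame of a 1 kHz tone sampled at 16 kHz.
fs = 16000
frame = np.sin(2 * np.pi * 1000 * np.arange(400) / fs)
coeffs = mfcc_frame(frame, fs)
```

Stacking the coefficient vectors of all frames gives the high-dimensional feature matrix that the dimension reduction model takes as input.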
The construction of the dimension reduction model comprises the following steps:
(1) Let the hidden space have dimension q and the observation space Y have dimension d (q < d). Assume that the observed values and the hidden space parameters are related directly by $y = f(z) + \varepsilon$, where the noise $\varepsilon$ follows a Gaussian distribution with mean 0 and variance $\beta$, and assume that the hidden function f is a Gaussian process with a squared exponential kernel function.
In the kernel function, $\sigma$ is the coefficient parameter of the squared exponential kernel, $l$ is the distance influence factor between two points $z$ and $z'$, $\beta$ is a hyperparameter of the model, $\delta(z, z')$ is the Kronecker delta function, and the parameters to be solved are $\theta = (\sigma, l, \beta)$. The kernel function takes its maximum when $z$ is close to $z'$ and its minimum when the two points are far apart. The covariance matrix is computed from the kernel function.
(2) Assuming that the d dimensions of the observation space are sampled independently, the probability of the observations Y is expressed in terms of $Y_{:,i}$, the n elements of the i-th dimension of the observation space Y.
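To make the objective concrete, the sketch below evaluates the negative log-probability of the observations for given kernel parameters. It assumes the common parameterization $k(z, z') = \sigma^2 \exp(-\lVert z - z' \rVert^2 / 2l^2) + \beta\,\delta(z, z')$, with the delta term applied between sample indices, and it treats the hidden points `Z` as fixed inputs. Neither choice is confirmed by the text, and the function name is hypothetical.

```python
import numpy as np

def gplvm_neg_log_likelihood(theta, Z, Y):
    """Negative log of prod_i N(Y[:, i] | 0, K) for theta = (sigma, l, beta).

    Z: hidden points, shape (n, q). Y: observations, shape (n, d).
    """
    sigma, length, beta = theta
    n, d = Y.shape
    # Pairwise squared distances between hidden points.
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    # Squared exponential covariance plus noise variance on the diagonal.
    K = sigma ** 2 * np.exp(-d2 / (2.0 * length ** 2)) + beta * np.eye(n)
    L = np.linalg.cholesky(K)                       # fails if K is not positive definite
    log_det = 2.0 * np.sum(np.log(np.diag(L)))      # log|K| from the Cholesky factor
    alpha = np.linalg.solve(K, Y)                   # K^{-1} Y
    # sum(Y * alpha) equals trace(Y^T K^{-1} Y).
    return 0.5 * (d * n * np.log(2.0 * np.pi) + d * log_det + np.sum(Y * alpha))

# Hypothetical usage with random stand-in data.
rng = np.random.default_rng(0)
Z = rng.normal(size=(20, 2))
Y = rng.normal(size=(20, 5))
value = gplvm_neg_log_likelihood((1.0, 1.0, 0.1), Z, Y)
```

A particle swarm search over `theta` would call this function as its objective, since minimizing it is the same as maximizing the probability.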
The invention uses a particle swarm optimization algorithm to solve for the parameters. Write $\theta = (\sigma, l, \beta)$ as $A = (a_1, a_2, a_3)$, the velocity of particle i as $v_i = (v_{i1}, v_{i2}, v_{i3})$, and the best position the particles have passed through as $p_g = (p_{g1}, p_{g2}, p_{g3})$. The particle swarm algorithm then updates the particle positions iteratively.
In the update, $w$ is a non-negative inertia factor, the acceleration constants $c_1$ and $c_2$ are non-negative, and $r_1$ and $r_2$ are random numbers in the range [0, 1]. The solved kernel parameters $\theta = (\sigma, l, \beta)$ are substituted back into the model to obtain the kernel function-based dimension reduction model, and the extracted feature parameters are fed into this model to obtain the hidden parameters, which are the dimension-reduced data.
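The iterative update formula itself is missing from this text. The standard particle swarm update that uses exactly the quantities named above is shown here as an assumed reconstruction. The personal-best position $p_i$ of particle i and the iteration index $t$ are symbols introduced for this illustration; the source defines only $p_g$.

```latex
% Assumed standard PSO update; not copied from the patent's equation image.
v_i^{(t+1)} = w\, v_i^{(t)} + c_1 r_1 \left(p_i - a_i^{(t)}\right) + c_2 r_2 \left(p_g - a_i^{(t)}\right)

a_i^{(t+1)} = a_i^{(t)} + v_i^{(t+1)}
```

The first term keeps part of the previous velocity, the second pulls the particle towards its own best position, and the third pulls it towards the best position found by the swarm.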
In the dimension reduction analysis, because people live in three-dimensional space and cannot visualise spaces of higher dimension, reduction results for large data sets are difficult to analyse directly. The preprocessed audio signal is therefore fed into the dimension reduction model that has been built, and the resulting hidden parameter data is stored and displayed visually so that it can be compared with the results of other dimension reduction models. The invention is not limited to the above embodiment; within the knowledge of those skilled in the art, the dimension reduction algorithm can also be applied to other related fields.
Claims (3)
1. A kernel function-based audio feature signal dimension reduction method is characterized in that: the method comprises the following specific steps:
(1) audio signal acquisition: collecting an audio signal to obtain an audio sample;
(2) audio signal preprocessing: converting analog signals in the collected audio samples into digital signals, writing the digital signals into a WAV file, and performing filtering, pre-emphasis and framing processing on the digital signals written into the WAV file;
(3) characteristic parameter extraction: extracting characteristic parameters of a linear prediction coefficient, a linear prediction cepstrum coefficient and a Mel frequency cepstrum coefficient in the processed digital signal;
(4) building a dimension reduction model: feeding the extracted feature parameters into a dimension reduction model built with the kernel trick to obtain low-dimensional hidden variables directly, the low-dimensional hidden variables being the dimension-reduced data;
the dimension reduction model is specifically built as follows:
(1) let the hidden space have dimension q and the observation space Y have dimension d, q < d; assume that the observed values and the hidden space parameters are related by $y = f(z) + \varepsilon$, where the noise $\varepsilon$ follows a Gaussian distribution with mean 0 and variance $\beta$, and assume that the hidden function f is a Gaussian process with a squared exponential kernel function;
in the kernel function, $\sigma$ is the coefficient parameter of the squared exponential kernel, $l$ is the distance influence factor between $z$ and $z'$, $\beta$ is a hyperparameter of the model, $\delta(z, z')$ is the Kronecker delta function, and the parameters to be solved are $\theta = (\sigma, l, \beta)$; the kernel function takes its maximum value when $z$ and $z'$ are very close and its minimum value when they are very far apart; the covariance matrix is computed from the kernel function;
(2) assuming that the d dimensions of the observation space are sampled independently, the probability of the observations Y is expressed in terms of $Y_{:,i}$, the n elements of the i-th dimension of the observation space Y;
the parameters are solved by a particle swarm optimization algorithm: write $\theta = (\sigma, l, \beta)$ as $A = (a_1, a_2, a_3)$, the velocity of particle i as $v_i = (v_{i1}, v_{i2}, v_{i3})$, and the best position the particles have passed through as $p_g = (p_{g1}, p_{g2}, p_{g3})$, the positions being updated by the particle swarm iteration formula;
in the iteration formula, $w$ is a non-negative inertia factor; the acceleration constants $c_1$ and $c_2$ are non-negative; $r_1$ and $r_2$ are random numbers in the range [0, 1]; the information exchange of the particle swarm optimization algorithm is applied to the kernel parameter optimization process, the solved kernel parameters $\theta = (\sigma, l, \beta)$ are substituted back into the model to obtain the dimension reduction model, and the extracted feature parameters are fed into the dimension reduction model to obtain hidden parameters, the hidden parameters being the dimension-reduced data;
(5) dimension reduction result analysis: displaying the dimension-reduced data visually.
2. The kernel function-based audio feature signal dimension reduction method according to claim 1, wherein: the audio acquisition is performed by an audio acquisition device, and the audio acquisition device sets the sampling frequency, the number of sampling channels and the quantization precision when acquiring the audio signals.
3. The kernel function-based audio feature signal dimension reduction method according to claim 1, wherein: the audio signal pre-processing comprises the steps of:
(1) filtering the collected audio signal $x(n)$ with a rectangular window function $w(n)$ to obtain a signal $y_a(n)$;
(2) applying pre-emphasis to the filtered signal $y_a(n)$ by the difference method to obtain a signal $y_b(n)$, where $y_b(n) = y_a(n) - \alpha\, y_a(n-1)$, $\alpha$ being the pre-emphasis coefficient, generally close to 1;
(3) dividing the pre-emphasized signal $y_b(n)$ into a plurality of speech frames, adjacent frames partially overlapping, the overlapped part being called the frame shift.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810995309.7A CN109065070B (en) | 2018-08-29 | 2018-08-29 | Kernel function-based audio characteristic signal dimension reduction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109065070A | 2018-12-21 |
CN109065070B | 2022-07-19 |
Family
ID=64757611
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810995309.7A Active CN109065070B (en) | 2018-08-29 | 2018-08-29 | Kernel function-based audio characteristic signal dimension reduction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109065070B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN109065070A (en) | 2018-12-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||