CN105551503A - Audio matching tracking method based on atom pre-selection and system thereof - Google Patents

Audio matching tracking method based on atom pre-selection and system thereof

Info

Publication number
CN105551503A
Authority
CN
China
Prior art keywords
signal
atom
dictionary
atomic
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510982266.5A
Other languages
Chinese (zh)
Other versions
CN105551503B (en)
Inventor
胡瑞敏
姜林
胡霞
王晓晨
涂卫平
张茂胜
李登实
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Booslink Suzhou Information Technology Co ltd
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201510982266.5A priority Critical patent/CN105551503B/en
Publication of CN105551503A publication Critical patent/CN105551503A/en
Application granted granted Critical
Publication of CN105551503B publication Critical patent/CN105551503B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses an audio matching tracking method based on atom pre-selection and a system thereof. The method exploits the correlation between signal energy and auditory perception: it preprocesses the original signal by energy and extracts the parts of the signal where the energy is concentrated; it then performs matching pursuit on those parts to obtain sparse coefficients; finally, it reconstructs the signal from the sparse coefficients and the original dictionary. The invention greatly reduces computational complexity and greatly increases computation speed without degrading sound quality.

Description

Audio matching tracking method and system based on atom preselection
Technical Field
The invention belongs to the technical field of audio coding, and particularly relates to an audio matching tracking method and system based on atomic preselection.
Background
Sparse representation generally means that an original signal is accurately represented using the minimum number of basis functions, so that the main characteristics of the signal are captured and the signal processing cost is substantially reduced. Matching Pursuit (MP) is one of the more widely used sparse representation algorithms. Its basic idea is to select the optimal atoms from an overcomplete dictionary one at a time in an iterative process, so that the approximation of the signal improves with each iteration. The overcomplete dictionary used by the MP algorithm to represent the signal can be chosen flexibly and adaptively according to the characteristics of the signal, and the atom selection uses a greedy algorithm of repeated iterative approximation, so the number of atom coefficients finally obtained is small. For these reasons the MP algorithm is widely applied in many fields of signal analysis, such as image processing, biomedical signal processing and audio processing.
As people's requirements for streaming media quality and the number of mobile terminal users keep growing, the requirements for audio and video coding efficiency are also increasing. The traditional matching pursuit algorithm is not suitable for real-time processing because of its high computational complexity. Several fast matching pursuit algorithms have been proposed, such as the joint dictionary method of document 1 and the algorithm improvement and optimization method of document 2. However, these algorithms all involve time-consuming optimization or sacrifice sparse representation efficiency as compensation, and their calculation speed still struggles to meet the requirements of large-scale problems. The authors of document 3 propose a traversal algorithm based on short-time Gabor atoms, which traverses the signal from its start to its end using non-complete fixed-length atoms and iteratively selects the best matching atoms many times to obtain the final sparse coefficients. The dictionary of this algorithm has a very small data size, so the storage burden is effectively reduced along with the computational complexity.
Although this method has a slightly reduced computational complexity compared to other sparse representation algorithms, it is still difficult to use in real-time applications. One of the main approaches to reduce the computation complexity in the matching pursuit algorithm is to reduce the number of iterations, and when the used sparse dictionary is a short-term dictionary, the time consumption for locally performing the MP algorithm on a long-term signal is far less than that of the traversal MP algorithm.
The following references are referred to herein:
[1] Ravelli E, Richard G, Daudet L. Union of MDCT bases for audio coding [J]. Audio, Speech, and Language Processing, IEEE Transactions on, 2008, 16(8): 1361-1372.
[2] Gharavi-Alkhansari M, Huang T S. A fast orthogonal matching pursuit algorithm [C]// Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on. IEEE, 1998, 3: 1389-1392.
[3] Krstulović S, Gribonval R. MPTK: Matching pursuit made tractable [C]// Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on. IEEE, 2006, 3: III-III.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an audio matching tracking method and system based on atomic preselection according to the influence of energy on auditory perception.
The technical scheme adopted by the invention is as follows:
An audio matching tracking method based on atomic preselection comprises the following steps:
signal decomposition and signal reconstruction, wherein the signal decomposition comprises the steps of:
S1: selecting a short-time dictionary according to the type of the original signal, and taking the short-time dictionary as the sparse dictionary;
S2: calculating one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where $i$ takes 1, 2, …, length(S) − N + 1 in turn, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the original signal length;
S3: obtaining the maximum absolute value of the atom weights of the sparse-dictionary atoms on $S_{maxenergy}$, denoted $c_{i_{opt}max}$;
S4: calculating the signal residual $S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max}$, where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$; at the same time, recording $c_{i_{opt}max}$ in row $i_{opt}max$, column $j_{opt}max$ of the current sparse coefficient matrix, where $i_{opt}max$ is the atom number of $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$; the initial value of the current sparse coefficient matrix is a zero matrix;
S5: when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, ending signal decomposition and outputting the current sparse coefficient matrix; otherwise, taking the current signal residual $S'_{later}$ as the original signal and repeating steps S2 to S5;
the signal reconstruction includes:
S7: extracting the atom weights in the current sparse coefficient matrix and their corresponding row numbers and column numbers;
S8: multiplying each atom weight by its corresponding atom to obtain a recovery signal, and assigning each recovery signal to a zero vector $M_i$ of the same length as the original signal in step S1, with point $j_{opt}max$ of $M_i$ as the center of the recovery signal, where $j_{opt}max$ is the column number of the atom weight corresponding to the current recovery signal; and accumulating the assigned vectors in turn to obtain the reconstructed signal.
In step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the squares of the amplitudes of all samples in the consecutive samples.
Alternatively, in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the absolute values of the amplitudes of all samples in the consecutive samples.
Alternatively, in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the maximum of the amplitudes of all samples in the consecutive samples.
The system corresponding to the audio matching tracking method based on atomic preselection comprises:
a signal decomposition unit and a signal reconstruction unit, wherein the signal decomposition unit further comprises:
a dictionary establishing module 101, for selecting a short-time dictionary according to the type of the original signal and taking the short-time dictionary as the sparse dictionary;
a preprocessing module 102, for calculating one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where $i$ takes 1, 2, …, length(S) − N + 1 in turn, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the original signal length;
a weight comparison module 103, for obtaining the maximum absolute value of the atom weights of the sparse-dictionary atoms on $S_{maxenergy}$, denoted $c_{i_{opt}max}$;
a residual calculation module 104, for calculating the signal residual $S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max}$, where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$; at the same time, $c_{i_{opt}max}$ is recorded in row $i_{opt}max$, column $j_{opt}max$ of the current sparse coefficient matrix, where $i_{opt}max$ is the atom number of $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$; the initial value of the current sparse coefficient matrix is a zero matrix;
a threshold control module 105, for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal;
the signal reconstruction unit further includes:
a reconstruction coefficient extraction module 201, for extracting the atom weights in the current sparse coefficient matrix and the row numbers and column numbers corresponding to the atom weights;
a signal synthesis module 202, for multiplying each atom weight by its corresponding atom to obtain a recovery signal, and assigning each recovery signal to a zero vector $M_i$ of the same length as the original signal, with point $j_{opt}max$ of $M_i$ as the center of the recovery signal, where $j_{opt}max$ is the column number of the atom weight corresponding to the current recovery signal; and accumulating the assigned vectors in turn to obtain the reconstructed signal.
Compared with the prior art, the invention has the following characteristics:
The invention performs the MP algorithm with an incomplete dictionary only on the parts of the signal with higher short-time energy, which reduces the number of traversal calculations and therefore the computational complexity. In the dictionary construction, the frequency span of the atoms is increased, which reduces the constraint that the dictionary places on frequency components. The sparse representation calculation method is not limited by the length of the signal to be processed, and the data volume of the dictionary is small. Compared with other fast matching pursuit algorithms (such as the method of document 3), the invention obtains a faster calculation speed without degrading the sound quality of the reconstructed signal.
Drawings
FIG. 1 is a detailed flow chart of a signal decomposition section according to an embodiment of the present invention;
FIG. 2 is a detailed flow chart of a signal reconstruction section according to an embodiment of the present invention;
FIG. 3 is a block diagram of a signal decomposition subsystem according to an embodiment of the present invention;
FIG. 4 is a block diagram of a signal reconstruction subsystem according to an embodiment of the present invention;
FIG. 5 is a schematic of the atomic center position.
Detailed Description
To facilitate understanding and implementation, the technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the embodiments described herein only illustrate and explain the present invention and are not intended to limit it.
FIGS. 1-2 show the detailed process of the method of the present invention, which includes two major parts, signal decomposition and signal reconstruction.
The specific implementation of signal decomposition comprises the following steps:
step 1, selecting a short-time dictionary according to the type of an original signal.
This step is a conventional step in audio matching tracking methods. For a speech processing system, selecting a short-time dictionary having speech characteristics; for transient signal processing systems, a short-time dictionary of relative transients is selected. For systems where some features are not obvious or multiple types of signals need to be processed simultaneously, a short-term dictionary with strong universality is selected.
In this embodiment, the test sample includes types such as a speech signal and a music signal, and the short-time dictionary selects a Gabor dictionary with strong scalability. The atoms in the Gabor dictionary are constructed as follows:
$$g_{w,\mu,\sigma}(n) = \frac{\lambda_{w,\mu,\sigma}}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(n-\mu)^2}{2\sigma^2}\right\} \cos\left[2\pi w (n-\mu)\right] \tag{1}$$
In formula (1), $w$ represents the frequency scale; $\mu$ represents the time offset; $\sigma$ represents the time scale; $\lambda_{w,\mu,\sigma}$ represents the atom energy under $w$, $\mu$ and $\sigma$; $n$ represents a time-domain sample point of the Gabor atom; $g_{w,\mu,\sigma}(n)$ represents the atom amplitude at time-domain sample point $n$.
When choosing the time offset $\mu$, the traditional matching pursuit method based on the Gabor dictionary takes as many time offsets $\mu$ of various scales as the number of atoms in the dictionary allows. In this embodiment, $\mu$ is 0, so that every atom in the dictionary is aligned with the high-energy part of the signal selected in the preprocessing, with its energy located at its center. Assuming that $n$ ranges from 1 to N, and that the frequency scale $w$, the time scale $\sigma$ and the atom energy $\lambda$ have M combinations, the dictionary size is M × N. In this embodiment, M is 20 and N is 1001.
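The dictionary construction of formula (1) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes that the sample index $n$ is centered on the atom (so that $\mu = 0$ puts the energy peak in the middle of the N samples), that $\lambda_{w,\mu,\sigma}$ is the factor that scales each atom to unit energy, and the names `gabor_atom`, `build_dictionary` and the parameter grid are hypothetical.

```python
import math

def gabor_atom(w, sigma, N, mu=0.0):
    """Sample formula (1) at N points and scale the atom to unit energy."""
    half = (N - 1) / 2.0  # assumption: n runs symmetrically around 0
    raw = []
    for k in range(N):
        n = k - half
        envelope = math.exp(-((n - mu) ** 2) / (2.0 * sigma ** 2))
        envelope /= sigma * math.sqrt(2.0 * math.pi)
        raw.append(envelope * math.cos(2.0 * math.pi * w * (n - mu)))
    # Assumption: lambda is chosen so that the atom has unit l2 norm.
    lam = 1.0 / math.sqrt(sum(x * x for x in raw))
    return [lam * x for x in raw]

def build_dictionary(params, N):
    """Return an M x N dictionary with one row per (w, sigma) pair."""
    return [gabor_atom(w, sigma, N) for (w, sigma) in params]

# Hypothetical grid: 4 frequency scales x 5 time scales gives M = 20 atoms.
params = [(w, s) for w in (0.01, 0.05, 0.1, 0.2)
          for s in (20.0, 40.0, 80.0, 160.0, 320.0)]
dictionary = build_dictionary(params, N=1001)
```

With M = 20 and N = 1001 the dictionary holds only 20020 values, which is the small storage footprint the embodiment relies on.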
Step 2, preprocessing the original signal: calculate one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, and denote the consecutive samples with the highest energy value $S_{maxenergy}$. The length of the consecutive samples is the atom length N of the short-time dictionary selected in step 1.
Several energy calculation methods will be provided below.
(1) The Energy value Energy of successive samples is calculated from the Energy definition as follows:
$$Energy = \sum_{i=m+1}^{m+N} |S_i|^2 \tag{2}$$
In formula (2), $S_i$ is the i-th sample of the original signal S and also denotes the amplitude of that sample; $m$ is the translation amount, and $m$ takes 0, 1, …, length(S) − N in turn; length(S) is the length of the original signal S.
(2) The sum of the squared amplitudes of the samples is roughly proportional to the sum of the absolute amplitudes of the samples, while the sum of absolute amplitudes is much cheaper to compute. Therefore, the Energy value Energy of consecutive samples can be approximately calculated using formula (3):
$$Energy = \sum_{i=m+1}^{m+N} |S_i| \tag{3}$$
In formula (3), $S_i$ is the i-th sample of the original signal S and also denotes the amplitude of that sample; $m$ is the translation amount, and $m$ takes 0, 1, …, length(S) − N in turn; length(S) is the length of the original signal S.
(3) Different energy calculation modes can be selected according to the characteristics of the original signal. If the original signal is mostly a signal with relatively continuous amplitude, the maximum value of the amplitudes of all samples of the continuous samples is taken as the energy of the continuous samples. The method further reduces the computational complexity compared with (1) and (2).
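The three energy measures of step 2 can be compared with a short sketch. It is illustrative only: the function names are hypothetical, the "maximum amplitude" option is assumed to mean the maximum absolute amplitude, and the running-sum update is an implementation choice that the patent does not prescribe.

```python
def window_energies(signal, N, mode="square"):
    """Energy of every length-N window of `signal`, one value per offset m."""
    count = len(signal) - N + 1
    if mode == "max":
        # Option (3): peak amplitude (assumed to be the absolute amplitude).
        return [max(abs(x) for x in signal[m:m + N]) for m in range(count)]
    if mode == "square":      # formula (2)
        cost = [x * x for x in signal]
    elif mode == "abs":       # formula (3)
        cost = [abs(x) for x in signal]
    else:
        raise ValueError("unknown mode: " + mode)
    # Slide the window by adding the entering sample and dropping the leaving one.
    total = sum(cost[:N])
    energies = [total]
    for m in range(1, count):
        total += cost[m + N - 1] - cost[m - 1]
        energies.append(total)
    return energies

def select_max_energy_segment(signal, N, mode="square"):
    """Return (m, S_maxenergy): the 0-based offset and the highest-energy window."""
    energies = window_energies(signal, N, mode)
    m = max(range(len(energies)), key=energies.__getitem__)
    return m, signal[m:m + N]
```

For example, `select_max_energy_segment([0, 1, 0, 0, 5, -6, 4, 0, 1], 3)` selects offset 4 and the window `[5, -6, 4]`. When several windows tie, this sketch keeps the first one.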
Step 3, the short-time dictionary selected in step 1 is used as the sparse dictionary. The inner product of every atom $g_{i_{opt}}$ in the sparse dictionary with $S_{maxenergy}$ is taken in turn to obtain the atom weight of each atom $g_{i_{opt}}$ on $S_{maxenergy}$; the maximum absolute value of the atom weights is denoted $c_{i_{opt}max}$.
$c_{i_{opt}max}$ is calculated as follows:
$$c_{i_{opt}max} = \max\{\mathrm{abs}(\langle S', g_{i_{opt}} \rangle)\} \tag{4}$$
In formula (4), $S'$ denotes the consecutive samples $S_{maxenergy}$ selected in step 2; $i_{opt}$ represents the atom number in the sparse dictionary, $i_{opt} = 1, 2, \ldots, M$, where M is the number of atoms in the sparse dictionary; $g_{i_{opt}}$ is the $i_{opt}$-th atom in the sparse dictionary; $\mathrm{abs}(\langle S', g_{i_{opt}} \rangle)$ represents the absolute value of the inner product of $g_{i_{opt}}$ and $S'$.
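The atom selection of formula (4) can be sketched as below. The function names are hypothetical, and one detail is an assumption: the sketch returns the signed inner product of the winning atom rather than its absolute value, because the sign is needed when the atom is later subtracted from the signal.

```python
def inner_product(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def best_atom(segment, dictionary):
    """Return (atom_index, weight) for the atom with the largest |<segment, atom>|."""
    best_index, best_weight = 0, 0.0
    for index, atom in enumerate(dictionary):
        weight = inner_product(segment, atom)
        if abs(weight) > abs(best_weight):
            best_index, best_weight = index, weight
    return best_index, best_weight

# Toy dictionary of three unit-energy atoms of length N = 3.
toy_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
index, weight = best_atom([3.0, -4.0, 0.0], toy_dictionary)
```

Because only M inner products of length N are evaluated per iteration, the cost of this step does not depend on the length of the full signal.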
Step 4, calculating the signal residual $S'_{later}$, that is, the difference between $S_{maxenergy}$ and its component on the maximum atom of the sparse dictionary, see formula (5); and simultaneously updating the current sparse coefficient matrix.
$$S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max} \tag{5}$$
where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$, i.e. the maximum atom.
The current sparse coefficient matrix is updated as follows:
$$\alpha'_{i_{opt}} = \alpha_{i_{opt}} + c \tag{6}$$
The initial value of the sparse coefficient matrix is a zero matrix; the row number of the matrix represents the atom label, the column number represents the atom center position, and the element is the atom weight. The atom center position is the position of the central sample of the consecutive samples $S_{maxenergy}$ relative to the initial point of the original signal. As shown in FIG. 5, the initial point of the original signal is taken as 0 and the atom center position is denoted $m$.
In formula (6), $\alpha_{i_{opt}}$ is the sparse coefficient matrix before updating, $\alpha'_{i_{opt}}$ is the updated sparse coefficient matrix, and the weight matrix $c$ has the same size as the sparse coefficient matrix. The weight matrix $c$ is obtained as follows: the $c_{i_{opt}max}$ obtained in step 3 is assigned to row $i_{opt}max$, column $j_{opt}max$ of the weight matrix $c$, where $i_{opt}max$ is the label of the maximum atom $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$, i.e. the position of the central sample of $S_{maxenergy}$.
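Formulas (5) and (6) can be sketched together. Two points here are assumptions rather than statements of the patent: the residual of the selected window is written back into a copy of the full-length signal so that it can serve as the input of the next iteration, and the sparse coefficient matrix is stored as a dictionary keyed by (row, column) instead of a dense M × length(S) array. The name `apply_step4` and the 0-based center index `m + N // 2` are hypothetical.

```python
def apply_step4(signal, m, atom_index, weight, atom, coeffs):
    """Subtract weight * atom from the window at offset m and record the weight.

    `coeffs` maps (atom label, atom center position) to the accumulated weight.
    Returns the new full-length residual; `signal` itself is left untouched.
    """
    N = len(atom)
    residual = list(signal)
    for k in range(N):
        residual[m + k] -= weight * atom[k]          # formula (5) on the window
    center = m + N // 2                              # 0-based center sample
    key = (atom_index, center)
    coeffs[key] = coeffs.get(key, 0.0) + weight      # formula (6)
    return residual

coeffs = {}
residual = apply_step4([0.0, 0.0, 3.0, -4.0, 0.0, 0.0], m=2, atom_index=0,
                       weight=5.0, atom=[0.6, -0.8, 0.0], coeffs=coeffs)
```

Accumulating with `+=` rather than overwriting mirrors formula (6): if the same atom is selected again at the same center position, its weights add up.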
Step 5, when the signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, signal decomposition ends and the current sparse coefficient matrix is output; otherwise, the signal residual $S'_{later}$ is taken as the original signal in step 2 and steps 2 to 5 are repeated.
The matching pursuit method processes the signal through successive iterations, so that the original signal is represented as the sum of the atom weights multiplied by their corresponding atoms, plus the signal residual. The signal residual $S'_{later}$ is obtained from step 4. When $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, the iteration terminates and the current sparse coefficient matrix is output. The target SNR and the preset number of iterations are set manually according to experience and actual requirements.
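The iteration of steps 2 to 5 can be put together in one compact sketch. It rests on several assumptions that the text does not spell out: the window residual is written back into the full-length residual, the squared-amplitude energy is used for preselection, the SNR is evaluated with the residual as the error term using a factor of 20 on the ratio of squared norms, and the coefficient matrix is stored sparsely as a dictionary. All names are hypothetical.

```python
import math

def matching_pursuit_preselect(signal, dictionary, target_snr_db, max_iterations):
    """Decompose `signal`; return (coeffs, residual)."""
    N = len(dictionary[0])
    residual = [float(x) for x in signal]
    signal_power = sum(x * x for x in residual)
    coeffs = {}
    for _ in range(max_iterations):
        # Step 5: stop once the residual is small enough.
        error_power = sum(x * x for x in residual)
        if error_power == 0.0:
            break
        if 20.0 * math.log10(signal_power / error_power) >= target_snr_db:
            break
        # Step 2: find the length-N window with the highest energy.
        energies = [sum(x * x for x in residual[m:m + N])
                    for m in range(len(residual) - N + 1)]
        m = max(range(len(energies)), key=energies.__getitem__)
        segment = residual[m:m + N]
        # Step 3: pick the atom with the largest absolute inner product.
        weights = [sum(a * b for a, b in zip(segment, atom)) for atom in dictionary]
        i = max(range(len(weights)), key=lambda r: abs(weights[r]))
        c = weights[i]
        if c == 0.0:
            break  # no atom matches the selected window
        # Step 4: update the residual and the sparse coefficients.
        for k in range(N):
            residual[m + k] -= c * dictionary[i][k]
        key = (i, m + N // 2)
        coeffs[key] = coeffs.get(key, 0.0) + c
    return coeffs, residual
```

The window energies are recomputed from scratch in every iteration to keep the sketch short; a practical version would update only the windows that overlap the modified samples.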
The signal-to-noise ratio SNR is defined as follows:
$$SNR(S, S') = 20 \log_{10}\left(\frac{\|S\|_2^2}{\|S - S'\|_2^2}\right) \tag{7}$$
In formula (7), S represents the original signal, and S' is the signal recovered from the current sparse representation.
In this embodiment, for a signal segment with a sampling frequency of 48 kHz and a length of 500000 sample points (about 10 s), the preset number of iterations is 20000 and the target SNR is 20 dB.
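The stopping rule can be sketched as follows, using the SNR definition as printed in formula (7) and the thresholds of this embodiment as defaults. The function names are hypothetical. Note that many texts define SNR with a factor of 10 on the ratio of squared norms; the factor 20 below simply follows the formula above and should be adjusted if the other convention is intended.

```python
import math

def snr_db(original, recovered):
    """SNR of `recovered` against `original`, following formula (7) as printed."""
    signal_power = sum(x * x for x in original)
    error_power = sum((x - y) ** 2 for x, y in zip(original, recovered))
    if error_power == 0.0:
        return math.inf       # perfect recovery
    if signal_power == 0.0:
        return -math.inf      # silent original but non-zero error
    return 20.0 * math.log10(signal_power / error_power)

def should_stop(original, recovered, iteration,
                target_snr_db=20.0, max_iterations=20000):
    """True when the target SNR or the iteration budget has been reached."""
    return (snr_db(original, recovered) >= target_snr_db
            or iteration >= max_iterations)
```

For instance, `snr_db([3, 4], [3, 3])` compares a signal power of 25 with an error power of 1 and evaluates to roughly 27.96 under this definition.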
The signal reconstruction method comprises the following steps:
Step 6: extract from the current sparse coefficient matrix the atom weights to be used for the reconstructed signal, together with the atom label and atom center position corresponding to each atom weight.
Step 7: multiply each atom weight $c_{i_{opt}max}$ by its corresponding atom $g_{i_{opt}max}$ to obtain a recovery signal of length N. Assign each recovery signal to a zero vector $M_i$ of the same length as the original signal in step 1; when assigning, point $j_{opt}max$ of the zero vector $M_i$ is the center point of the recovery signal, where $j_{opt}max$ is the column number of the atom weight $c_{i_{opt}max}$ in the current sparse coefficient matrix. The assigned vectors $M_i$ are accumulated in turn to obtain the reconstructed signal S'.
The reconstructed signal synthesis formula is as follows:
$$S' = \sum_{i=1}^{k} M_i \tag{8}$$
where $k$ is the number of atom weights in the current sparse coefficient matrix.
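The reconstruction of formula (8) can be sketched as below. The sketch assumes the sparse coefficients are available as a mapping from (atom label, atom center position) to atom weight, with 0-based positions and the center taken at offset `N // 2` within the atom; these conventions and the name `reconstruct` are hypothetical. Instead of materializing every vector $M_i$, it adds each weighted atom directly into the running sum, which gives the same result.

```python
def reconstruct(coeffs, dictionary, length):
    """Sum weight * atom, centered at each recorded position, into one signal."""
    signal = [0.0] * length
    for (atom_index, center), weight in coeffs.items():
        atom = dictionary[atom_index]
        start = center - len(atom) // 2
        for k, value in enumerate(atom):
            position = start + k
            if 0 <= position < length:   # ignore samples outside the signal
                signal[position] += weight * value
    return signal

toy_dictionary = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
toy_coeffs = {(0, 3): 2.0, (1, 1): 1.0}
recovered = reconstruct(toy_coeffs, toy_dictionary, length=6)
```

Only the non-zero entries of the coefficient matrix are visited, so the reconstruction cost is proportional to k × N rather than to the size of the full matrix.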
Referring to fig. 3 to 4, the invention further provides an audio matching and tracking system based on atomic preselection, which includes a signal decomposition unit and a signal reconstruction unit. The signal decomposition unit further comprises a dictionary establishing module 101, a preprocessing module 102, a weight value comparison module 103, a residual error calculation module 104 and a threshold control module 105; the signal reconstruction unit further comprises a reconstruction coefficient extraction module 201 and a signal synthesis module 202. Wherein:
the dictionary establishing module 101 is used for selecting a short-time dictionary according to the original signal type and taking the short-time dictionary as a sparse dictionary.
The preprocessing module 102 is used for calculating one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where $i$ takes 1, 2, …, length(S) − N + 1 in turn, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the original signal length.
In the preprocessing module 102, the energy of consecutive samples can be calculated as follows:
(1) The sum of the squares of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples, see formula (2).
(2) The sum of the absolute values of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples, see formula (3).
(3) The maximum of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples.
The weight comparison module 103 is used for obtaining the maximum absolute value of the atom weights of the sparse-dictionary atoms on $S_{maxenergy}$, denoted $c_{i_{opt}max}$.
The residual calculation module 104 is used for calculating the signal residual $S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max}$, where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$; at the same time, $c_{i_{opt}max}$ is recorded in row $i_{opt}max$, column $j_{opt}max$ of the current sparse coefficient matrix, where $i_{opt}max$ is the atom number of $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$; the initial value of the current sparse coefficient matrix is a zero matrix.
The threshold control module 105 is used for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal.
The reconstruction coefficient extraction module 201 is used for extracting the atom weights in the current sparse coefficient matrix and the row numbers and column numbers corresponding to the atom weights.
The signal synthesis module 202 is used for multiplying each atom weight by its corresponding atom to obtain a recovery signal, and assigning each recovery signal to a zero vector $M_i$ of the same length as the original signal, with point $j_{opt}max$ of $M_i$ as the center of the recovery signal, where $j_{opt}max$ is the column number of the atom weight corresponding to the current recovery signal; the assigned vectors are accumulated in turn to obtain the reconstructed signal.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. An audio matching tracking method based on atom preselection, characterized by comprising the following steps:
signal decomposition and signal reconstruction, wherein the signal decomposition comprises the steps of:
S1: selecting a short-time dictionary according to the type of the original signal, and taking the short-time dictionary as the sparse dictionary;
S2: calculating one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where $i$ takes 1, 2, …, length(S) − N + 1 in turn, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the original signal length;
S3: obtaining the maximum absolute value of the atom weights of the sparse-dictionary atoms on $S_{maxenergy}$, denoted $c_{i_{opt}max}$;
S4: calculating the signal residual $S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max}$, where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$; at the same time, recording $c_{i_{opt}max}$ in row $i_{opt}max$, column $j_{opt}max$ of the current sparse coefficient matrix, where $i_{opt}max$ is the atom number of $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$; the initial value of the current sparse coefficient matrix is a zero matrix;
S5: when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, ending signal decomposition and outputting the current sparse coefficient matrix; otherwise, taking the current signal residual $S'_{later}$ as the original signal and repeating steps S2 to S5;
the signal reconstruction includes:
S7: extracting the atom weights in the current sparse coefficient matrix and their corresponding row numbers and column numbers;
S8: multiplying each atom weight by its corresponding atom to obtain a recovery signal, and assigning each recovery signal to a zero vector $M_i$ of the same length as the original signal in step S1, with point $j_{opt}max$ of $M_i$ as the center of the recovery signal, where $j_{opt}max$ is the column number of the atom weight corresponding to the current recovery signal; and accumulating the assigned vectors in turn to obtain the reconstructed signal.
2. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the squares of the amplitudes of all samples in the consecutive samples.
3. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the absolute values of the amplitudes of all samples in the consecutive samples.
4. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the maximum of the amplitudes of all samples in the consecutive samples.
5. An audio matching tracking system based on atomic preselection, comprising:
a signal decomposition unit and a signal reconstruction unit, wherein the signal decomposition unit further comprises:
a dictionary establishing module 101, for selecting a short-time dictionary according to the type of the original signal and taking the short-time dictionary as the sparse dictionary;
a preprocessing module 102, for calculating one by one the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where $i$ takes 1, 2, …, length(S) − N + 1 in turn, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the original signal length;
a weight comparison module 103, for obtaining the maximum absolute value of the atom weights of the sparse-dictionary atoms on $S_{maxenergy}$, denoted $c_{i_{opt}max}$;
a residual calculation module 104, for calculating the signal residual $S'_{later} = S_{maxenergy} - c_{i_{opt}max} \cdot g_{i_{opt}max}$, where $g_{i_{opt}max}$ is the atom corresponding to $c_{i_{opt}max}$; at the same time, $c_{i_{opt}max}$ is recorded in row $i_{opt}max$, column $j_{opt}max$ of the current sparse coefficient matrix, where $i_{opt}max$ is the atom number of $g_{i_{opt}max}$ and $j_{opt}max$ is the atom center position of $g_{i_{opt}max}$; the initial value of the current sparse coefficient matrix is a zero matrix;
a threshold control module 105, for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal;
the signal reconstruction unit further includes:
a reconstruction coefficient extraction module 201, for extracting the atom weights in the current sparse coefficient matrix and the row numbers and column numbers corresponding to the atom weights;
a signal synthesis module 202, for multiplying each atom weight by its corresponding atom to obtain a recovery signal, and assigning each recovery signal to a zero vector $M_i$ of the same length as the original signal, with point $j_{opt}max$ of $M_i$ as the center of the recovery signal, where $j_{opt}max$ is the column number of the atom weight corresponding to the current recovery signal; and accumulating the assigned vectors in turn to obtain the reconstructed signal.
CN201510982266.5A 2015-12-24 2015-12-24 Audio matching tracking method based on atom pre-selection and system thereof Active CN105551503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510982266.5A CN105551503B (en) 2015-12-24 2015-12-24 Audio matching tracking method based on atom pre-selection and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510982266.5A CN105551503B (en) 2015-12-24 2015-12-24 Audio matching tracking method based on atom pre-selection and system thereof

Publications (2)

Publication Number Publication Date
CN105551503A (en) 2016-05-04
CN105551503B (en) 2019-03-01

Family

ID=55830651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510982266.5A Active CN105551503B (en) 2015-12-24 2015-12-24 Audio matching tracking method based on atom pre-selection and system thereof

Country Status (1)

Country Link
CN (1) CN105551503B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2003003745A1 (en) * | 2001-06-29 | 2003-01-09 | Ntt Docomo, Inc. | Image encoder, image decoder, image encoding method, and image decoding method
US20030058943A1 (en) * | 2001-07-18 | 2003-03-27 | Tru Video Corporation | Dictionary generation method for video and image compression
CN102879818A (en) * | 2012-08-30 | 2013-01-16 | 中国石油集团川庆钻探工程有限公司地球物理勘探公司 | Improved method for decomposing and reconstructing seismic channel data
CN103474066A (en) * | 2013-10-11 | 2013-12-25 | 福州大学 | Ecological voice recognition method based on multiband signal reconstruction
CN103531199A (en) * | 2013-10-11 | 2014-01-22 | 福州大学 | Ecological sound identification method on basis of rapid sparse decomposition and deep learning
CN103116112B (en) * | 2013-01-06 | 2015-06-10 | 广东电网公司电力科学研究院 | Double-circuit on same tower double-circuit line fault distance measurement method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
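Wang Wenyan et al.: "Endpoint detection method using the matching pursuit time-frequency decomposition algorithm", Technical Acoustics (《声学技术》) *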

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106653061A (en) * | 2016-11-01 | 2017-05-10 | 武汉大学深圳研究院 | Audio matching tracking device and tracking method thereof based on dictionary classification
CN109507292A (en) * | 2018-12-26 | 2019-03-22 | 西安科技大学 | A kind of method for extracting signal
CN109507292B (en) * | 2018-12-26 | 2021-08-06 | 西安科技大学 | Signal extraction method

Also Published As

Publication number Publication date
CN105551503B (en) 2019-03-01

Similar Documents

Publication Publication Date Title
Wang et al. TSTNN: Two-stage transformer based neural network for speech enhancement in the time domain
CN111326168B (en) Voice separation method, device, electronic equipment and storage medium
CN113314140A (en) Sound source separation algorithm of end-to-end time domain multi-scale convolutional neural network
CN110767210A (en) Method and device for generating personalized voice
CN112767959B (en) Voice enhancement method, device, equipment and medium
CN110673222B (en) Magnetotelluric signal noise suppression method and system based on atomic training
CN112633175A (en) Single note real-time recognition algorithm based on multi-scale convolution neural network under complex environment
CN115116448B (en) Voice extraction method, neural network model training method, device and storage medium
CN117672176A (en) Rereading controllable voice synthesis method and device based on voice self-supervision learning characterization
CN105551503B (en) Audio matching tracking method based on atom pre-selection and system thereof
WO2011071560A1 (en) Compressing feature space transforms
CN116884438B (en) Method and system for detecting musical instrument training sound level based on acoustic characteristics
Raj et al. Multilayered convolutional neural network-based auto-CODEC for audio signal denoising using mel-frequency cepstral coefficients
CN106653061A (en) Audio matching tracking device and tracking method thereof based on dictionary classification
Narayanaswamy et al. Audio source separation via multi-scale learning with dilated dense u-nets
Joshi et al. Analysis of compressive sensing for non stationary music signal
Upadhyaya et al. Quality parameter index estimation for compressive sensing based sparse audio signal reconstruction
Upadhyaya et al. Basis & sensing matrix as key effecting parameters for compressive sensing
Xue et al. Low-latency speech enhancement via speech token generation
Qu et al. Noise-separated adaptive feature distillation for robust speech recognition
CN110830044B (en) Data compression method based on sparse least square optimization
CN111832596B (en) Data processing method, electronic device and computer readable medium
CN108322858B (en) Multi-microphone sound enhancement method based on tensor resolution
Bhadoria et al. Comparative analysis of basis & measurement matrices for non-speech audio signal using compressive sensing
CN113129920B (en) Music and human voice separation method based on U-shaped network and audio fingerprint

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210714

Address after: 215000 unit 01, 5 / F, building a, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Patentee after: BOOSLINK SUZHOU INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 430072 Hubei Province, Wuhan city Wuchang District of Wuhan University Luojiashan

Patentee before: WUHAN University