CN105551503B - Audio matching pursuit method and system based on atomic preselection - Google Patents
Audio matching pursuit method and system based on atomic preselection
- Publication number
- CN105551503B (application number CN201510982266.5A)
- Authority
- CN
- China
- Prior art keywords
- signal
- atom
- dictionary
- atomic
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses an audio matching pursuit method and system based on atomic preselection. The invention first exploits the correlation between signal energy and auditory perception: the original signal is preprocessed on the basis of energy, and the portion of the signal with the higher energy distribution is extracted. Matching pursuit is then performed on that portion to obtain sparse coefficients, and the signal is reconstructed from the sparse coefficients and the original dictionary. The invention greatly reduces computational complexity and increases calculation speed while ensuring that sound quality does not degrade.
Description
Technical Field
The invention belongs to the technical field of audio coding, and particularly relates to an audio matching tracking method and system based on atomic preselection.
Background
Sparse representation generally means representing an original signal accurately with the smallest possible number of basis functions, so that the main characteristics of the signal are captured and the cost of signal processing is substantially reduced. Matching Pursuit (MP) is one of the more widely used sparse representation algorithms. Its basic idea is to select the optimal atom from an overcomplete dictionary at each step of an iterative process, so that the approximation of the signal improves step by step. The overcomplete dictionary used by the MP algorithm can be chosen flexibly and adaptively according to the characteristics of the signal, and atom selection uses a greedy algorithm of repeated iterative approximation, so the number of atom coefficients finally obtained is small. For these reasons the MP algorithm is widely applied in many fields of signal analysis, such as image processing, biomedical signal processing and audio processing.
The number of mobile terminal and streaming media users is increasing continuously, and the demands on audio and video coding efficiency grow day by day. The traditional matching pursuit algorithm is not suitable for real-time processing because of its high computational complexity. A number of fast matching pursuit algorithms have been proposed, such as the joint dictionary method of document 1 and the algorithmic optimization of document 2. However, these algorithms either involve time-consuming optimization or sacrifice sparse representation efficiency as compensation, and their calculation speed still struggles to meet the requirements of large-scale problems. The authors of document 3 propose a traversal algorithm based on short-time Gabor atoms, which traverses the signal from its start to its end using a non-complete dictionary of fixed-length atoms and iteratively selects the best-matching atom many times to obtain the final sparse coefficients. The dictionary of this algorithm is very small, so the storage burden is effectively reduced along with the computational complexity.
Although the computational complexity of this method is somewhat lower than that of other sparse representation algorithms, it is still difficult to use in real-time applications. One of the main ways to reduce the computational complexity of a matching pursuit algorithm is to reduce the number of iterations: when the sparse dictionary is a short-time dictionary, running the MP algorithm locally on a long signal takes far less time than running the traversal MP algorithm.
The following references are referred to herein:
[1] Ravelli E, Richard G, Daudet L. Union of MDCT bases for audio coding[J]. Audio, Speech, and Language Processing, IEEE Transactions on, 2008, 16(8): 1361-1372.
[2] Gharavi-Alkhansari M, Huang T S. A fast orthogonal matching pursuit algorithm[C]//Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on. IEEE, 1998, 3: 1389-1392.
[3] Krstulović S, Gribonval R. MPTK: Matching pursuit made tractable[C]//Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on. IEEE, 2006, 3: III-III.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an audio matching tracking method and system based on atomic preselection according to the influence of energy on auditory perception.
The technical scheme adopted by the invention is as follows:
an audio matching tracking method based on atomic preselection comprises the following steps:
signal decomposition and signal reconstruction, wherein the signal decomposition comprises the steps of:
S1: selecting a short-time dictionary according to the type of the original signal, and taking the short-time dictionary as the sparse dictionary;
S2: calculating, window by window, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where i takes 1, 2, …, length(S)-N+1 in sequence, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the length of the original signal;
S3: obtaining the atom weights of the sparse dictionary atoms on $S_{maxenergy}$, the maximum absolute value of the atom weights being $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$;
S4: calculating the signal residual $S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$, where $g_{i_{optmax}}$ is the atom corresponding to the maximum weight; at the same time, recording the weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ in row $i_{optmax}$, column $j_{optmax}$ of the current sparse coefficient matrix, where $i_{optmax}$ is the atom number of $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$; the initial value of the current sparse coefficient matrix is a zero matrix;
S5: when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches a preset value, ending signal decomposition and outputting the current sparse coefficient matrix; otherwise, taking the current signal residual $S'_{later}$ as the original signal and repeating steps S2 to S5;
the signal reconstruction includes:
S7: extracting the atom weights in the current sparse coefficient matrix together with their corresponding row numbers and column numbers;
S8: multiplying each atom weight by its corresponding atom to obtain a recovered signal, and assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal in step S1, such that the $j_{optmax}$-th point of $M_i$ is the center of the recovered signal, where $j_{optmax}$ is the column number of the atom weight corresponding to the current recovered signal; and accumulating the assigned vectors in sequence to obtain the reconstructed signal.
In step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the squares of the amplitudes of all samples in the consecutive samples.
In step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the absolute values of the amplitudes of all samples in the consecutive samples.
In step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the maximum of the amplitudes of all samples in the consecutive samples.
The system corresponding to the audio matching and tracking method based on atomic preselection comprises:
a signal decomposition unit and a signal reconstruction unit, wherein the signal decomposition unit further comprises:
the dictionary establishing module 101 is used for selecting a short-time dictionary according to the type of the original signal and taking the short-time dictionary as a sparse dictionary;
a preprocessing module 102, used for calculating, window by window, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where i takes 1, 2, …, length(S)-N+1 in sequence, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the length of the original signal;
a weight comparison module 103, used for obtaining the atom weights of the sparse dictionary atoms on $S_{maxenergy}$, the maximum absolute value of the atom weights being $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$;
a residual calculation module 104, used for calculating the signal residual $S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$, where $g_{i_{optmax}}$ is the atom corresponding to the maximum weight; at the same time, recording the weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ in row $i_{optmax}$, column $j_{optmax}$ of the current sparse coefficient matrix, where $i_{optmax}$ is the atom number of $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$; the initial value of the current sparse coefficient matrix is a zero matrix;
a threshold control module 105, used for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches a preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal;
the signal reconstruction unit further includes:
a reconstruction coefficient extraction module 201, configured to extract atom weights in a current sparse coefficient matrix and row numbers and column numbers corresponding to the atom weights;
a signal synthesis module 202, used for multiplying each atom weight by its corresponding atom to obtain a recovered signal, and assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal, such that the $j_{optmax}$-th point of $M_i$ is the center of the recovered signal, where $j_{optmax}$ is the column number of the atom weight corresponding to the current recovered signal; and accumulating the assigned vectors in sequence to obtain the reconstructed signal.
Compared with the prior art, the invention has the following characteristics:
The invention performs the MP algorithm with a non-complete dictionary only on the portion of the signal with the higher short-time energy, which reduces the number of traversal computations and therefore the computational complexity. In constructing the dictionary, the frequency span of the atoms is increased, which reduces the constraint the dictionary places on frequency components. The sparse representation algorithm is not limited by the length of the signal to be processed, and the data volume of the dictionary is small. Compared with other fast matching pursuit algorithms (such as the method of document 3), the invention achieves a faster calculation speed without degrading the sound quality of the reconstructed signal.
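To make the reduction in traversal computations concrete, the following sketch counts inner products per iteration using the dictionary size and signal length reported in the embodiment (M = 20 atoms, N = 1001 samples per atom, 500000 signal samples). It assumes a naive traversal that evaluates every atom at every offset on every iteration; practical fast implementations avoid part of that work, and the count ignores the cost of the energy scan in the preselection step. The function name is hypothetical.

```python
def inner_products_per_iteration(signal_len, atom_len, num_atoms, preselect):
    """Count atom/window inner products needed in one MP iteration.

    preselect=True  : atoms are tested only on the single highest-energy window.
    preselect=False : atoms are tested at every possible window offset
                      (a naive traversal, assumed here for comparison).
    """
    if atom_len > signal_len:
        raise ValueError("atom longer than signal")
    positions = 1 if preselect else signal_len - atom_len + 1
    return positions * num_atoms


# Values taken from the embodiment: 500000 samples, N = 1001, M = 20.
traversal = inner_products_per_iteration(500000, 1001, 20, preselect=False)
preselected = inner_products_per_iteration(500000, 1001, 20, preselect=True)
print(traversal, preselected)
```

Under these assumptions the preselected variant needs M inner products per iteration, independent of the signal length.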
Drawings
FIG. 1 is a detailed flow chart of a signal decomposition section according to an embodiment of the present invention;
FIG. 2 is a detailed flow chart of a signal reconstruction section according to an embodiment of the present invention;
FIG. 3 is a block diagram of a signal decomposition subsystem according to an embodiment of the present invention;
FIG. 4 is a block diagram of a signal reconstruction subsystem according to an embodiment of the present invention;
FIG. 5 is a schematic of the atomic center position.
Detailed Description
To facilitate understanding and implementation, the technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein serve only to illustrate and explain the present invention and are not intended to limit it.
FIGS. 1-2 show the detailed process of the method of the present invention, which includes two major parts, signal decomposition and signal reconstruction.
The specific implementation of signal decomposition comprises the following steps:
Step 1, selecting a short-time dictionary according to the type of the original signal.
This step is a conventional step in audio matching pursuit methods. For a speech processing system, a short-time dictionary with speech characteristics is selected; for a transient signal processing system, a short-time dictionary with transient characteristics is selected. For systems whose signal features are not obvious, or which must process several types of signal at the same time, a short-time dictionary with strong universality is selected.
In this embodiment, the test sample includes types such as a speech signal and a music signal, and the short-time dictionary selects a Gabor dictionary with strong scalability. The atoms in the Gabor dictionary are constructed as follows:
In formula (1), $w$ represents the frequency scale; $\mu$ represents the time offset; $\sigma$ represents the time scale; $\lambda_{w,\mu,\sigma}$ represents the atomic energy under $w$, $\mu$ and $\sigma$; $n$ represents a time-domain sample point of the Gabor atom; and $g_{w,\mu,\sigma}(n)$ represents the atomic amplitude at time-domain sample point $n$.
When choosing the time offset $\mu$, the traditional matching pursuit method based on the Gabor dictionary takes as many time offsets of various scales as the number of atoms in the dictionary allows. In this embodiment $\mu$ is set to 0, so that every atom in the dictionary is aligned with the high-energy portion of the signal selected in preprocessing, with its energy located at the center of that portion. Assuming that $n$ ranges from 1 to N and that the frequency scale $w$, the time scale $\sigma$ and the atomic energy $\lambda$ have M combinations, the dictionary size is M × N. In this embodiment, M is 20 and N is 1001.
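As an illustration of a dictionary of this size, the sketch below builds M = 20 centered Gabor-type atoms of length N = 1001. The atom shape used here (a Gaussian envelope multiplied by a cosine and scaled to unit energy) is one common form of real Gabor atom and is an assumption, not the patent's formula (1). The particular frequency and scale values, and the reading of $\lambda$ as a normalization factor, are likewise hypothetical.

```python
import math


def gabor_atom(w, sigma, n_len):
    """Unit-energy, real-valued Gabor-type atom centered in a window of n_len samples.

    Assumed form: exp(-pi * ((n - c) / sigma)^2) * cos(w * (n - c)),
    where c is the window center (time offset 0 relative to the center).
    """
    centre = (n_len - 1) / 2.0
    raw = [
        math.exp(-math.pi * ((n - centre) / sigma) ** 2) * math.cos(w * (n - centre))
        for n in range(n_len)
    ]
    norm = math.sqrt(sum(v * v for v in raw))  # plays the role of 1 / lambda
    return [v / norm for v in raw]


def build_dictionary(params, n_len):
    """Return one atom per (w, sigma) pair; the result has M rows of length n_len."""
    return [gabor_atom(w, sigma, n_len) for (w, sigma) in params]


# Hypothetical parameter grid: 5 frequencies x 4 time scales = 20 combinations.
params = [(w, s) for w in (0.05, 0.1, 0.2, 0.4, 0.8) for s in (50.0, 100.0, 200.0, 400.0)]
dictionary = build_dictionary(params, 1001)
print(len(dictionary), len(dictionary[0]))
```

Because every atom has time offset 0 relative to the window center, only M atoms need to be stored rather than M atoms per possible offset.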
Step 2, preprocessing the original signal: the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is calculated window by window, and the consecutive samples with the highest energy value are denoted $S_{maxenergy}$. The length of the consecutive samples is the atom length N of the short-time dictionary selected in step 1.
Several energy calculation methods will be provided below.
(1) The energy value Energy of the consecutive samples is calculated from the definition of energy as follows:
$$Energy(m) = \sum_{i=m+1}^{m+N} S_i^2 \qquad (2)$$
In formula (2), $S_i$ is the i-th sample of the original signal S and also denotes the amplitude of that sample; m is the translation amount, taking the values 0, 1, …, length(S)-N in sequence; length(S) is the length of the original signal S.
(2) The sum of the squared sample amplitudes is quasi-proportional to the sum of the absolute sample amplitudes, and the latter is much cheaper to compute. Therefore, the energy value Energy of the consecutive samples can be approximated using formula (3):
$$Energy(m) = \sum_{i=m+1}^{m+N} |S_i| \qquad (3)$$
In formula (3), $S_i$ is the i-th sample of the original signal S and also denotes the amplitude of that sample; m is the translation amount, taking the values 0, 1, …, length(S)-N in sequence; length(S) is the length of the original signal S.
(3) Different energy calculation modes can be selected according to the characteristics of the original signal. If the original signal is mostly a signal with relatively continuous amplitude, the maximum value of the amplitudes of all samples of the continuous samples is taken as the energy of the continuous samples. The method further reduces the computational complexity compared with (1) and (2).
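The three energy measures and the selection of the highest-energy window can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, indices are 0-based (so the returned offset corresponds to the translation amount m), the "max" mode uses the absolute amplitude, and the sum-based modes use a running sum, which is an optimization not described in the source.

```python
def window_energies(signal, n_len, mode="square"):
    """Energy of every length-n_len window; index m is the window's start offset."""
    if n_len <= 0 or len(signal) < n_len:
        raise ValueError("need 0 < n_len <= len(signal)")
    count = len(signal) - n_len + 1
    if mode == "max":
        # Method (3): largest absolute amplitude in the window.
        return [max(abs(v) for v in signal[m:m + n_len]) for m in range(count)]
    if mode == "square":
        cost = lambda v: v * v      # method (1): sum of squared amplitudes
    elif mode == "abs":
        cost = abs                  # method (2): sum of absolute amplitudes
    else:
        raise ValueError("unknown mode: " + mode)
    # Running sum: add the sample entering the window, drop the one leaving it.
    total = sum(cost(v) for v in signal[:n_len])
    energies = [total]
    for m in range(1, count):
        total += cost(signal[m + n_len - 1]) - cost(signal[m - 1])
        energies.append(total)
    return energies


def highest_energy_window(signal, n_len, mode="square"):
    """Return (m_best, window) for the first window with the highest energy."""
    energies = window_energies(signal, n_len, mode)
    m_best = max(range(len(energies)), key=energies.__getitem__)
    return m_best, signal[m_best:m_best + n_len]


demo = [0, 1, -3, 2, 0, 0]
print(window_energies(demo, 2, "square"))
print(highest_energy_window(demo, 2, "square"))
```

When several windows tie for the highest energy, this sketch keeps the earliest one; the source does not specify a tie-breaking rule.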
Step 3, the short-time dictionary selected in step 1 is used as the sparse dictionary, and the inner product of each atom $g_{i_{opt}}$ in the sparse dictionary with $S_{maxenergy}$ is computed in turn to obtain the atom weight of $g_{i_{opt}}$ on $S_{maxenergy}$. The maximum absolute value of the atom weights is denoted $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$.
It is calculated as follows:
$$|\langle g_{i_{optmax}}, S_{maxenergy}\rangle| = \max_{i_{opt}} |\langle g_{i_{opt}}, S_{maxenergy}\rangle| \qquad (4)$$
In formula (4), $i_{opt}$ is the atom number in the sparse dictionary, $i_{opt} = 1, 2, \ldots, M$, where M is the number of atoms in the sparse dictionary; $g_{i_{opt}}$ is the $i_{opt}$-th atom in the sparse dictionary; and $|\langle g_{i_{opt}}, S_{maxenergy}\rangle|$ denotes the absolute value of the inner product of $g_{i_{opt}}$ and $S_{maxenergy}$.
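The atom selection of step 3 can be sketched as a search for the atom whose inner product with the highest-energy window has the largest absolute value. The function names are hypothetical, indices are 0-based, and the sketch assumes the atoms have unit energy so that the inner product can be used directly as the atom weight.

```python
def inner(a, b):
    """Inner product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return sum(x * y for x, y in zip(a, b))


def best_atom(dictionary, window):
    """Return (i_optmax, weight): the atom index with the largest |inner product|
    and the signed inner product itself."""
    weights = [inner(atom, window) for atom in dictionary]
    i_optmax = max(range(len(weights)), key=lambda i: abs(weights[i]))
    return i_optmax, weights[i_optmax]


# Toy dictionary of two unit-energy atoms of length 2.
toy_dictionary = [[1.0, 0.0], [0.0, 1.0]]
print(best_atom(toy_dictionary, [0.5, -2.0]))
```

The signed weight is returned, rather than its absolute value, so that a later reconstruction can restore the polarity of the signal.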
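Step 4, calculating the component of $S_{maxenergy}$ on the maximum atom of the sparse dictionary, $\langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$. The signal residual $S'_{later}$ is the difference between $S_{maxenergy}$ and this component, see formula (5); the current sparse coefficient matrix is updated at the same time.
$$S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}} \qquad (5)$$
Here $g_{i_{optmax}}$ is the atom corresponding to the maximum weight found in step 3, i.e., the maximum atom.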
The current sparse coefficient matrix is updated as follows:
$$C_{new} = C_{old} + c \qquad (6)$$
The initial value of the sparse coefficient matrix is a zero matrix; the row number of the matrix represents the atom number, the column number represents the atom center position, and each element is an atom weight. The atom center position is the position of the central sample of the consecutive samples $S_{maxenergy}$ relative to the initial point of the original signal, as shown in fig. 5: if the initial point of the original signal is set to 0, the atom center position is m.
In formula (6), $C_{old}$ is the sparse coefficient matrix before updating, $C_{new}$ is the updated sparse coefficient matrix, and the weight matrix c has the same size as the sparse coefficient matrix. The weight matrix c is obtained as follows: the atom weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ obtained in step 3 is assigned to row $i_{optmax}$, column $j_{optmax}$ of the weight matrix c, where $i_{optmax}$ is the number of the maximum atom $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$, i.e., the position of the central sample of $S_{maxenergy}$.
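The residual computation and the coefficient update of step 4 can be sketched together. Several choices here are assumptions made for illustration: the sparse coefficient matrix is stored as a Python dict keyed by (row, column) instead of a dense matrix, the residual is written back into the signal in place so that the next iteration sees it, indices are 0-based, and the atom center is taken as the window start plus half the atom length. The function name is hypothetical.

```python
def apply_atom(signal, m_best, atom, weight, i_optmax, coeffs):
    """Subtract weight * atom from the window starting at m_best and
    accumulate the weight at (atom number, atom center position)."""
    n_len = len(atom)
    if m_best < 0 or m_best + n_len > len(signal):
        raise ValueError("window outside signal")
    # Residual: window minus its component on the selected atom.
    for n in range(n_len):
        signal[m_best + n] -= weight * atom[n]
    # Column index: position of the window's central sample in the signal.
    j_optmax = m_best + n_len // 2
    key = (i_optmax, j_optmax)
    coeffs[key] = coeffs.get(key, 0.0) + weight
    return j_optmax


signal = [0.0, 1.0, 2.0, 1.0, 0.0]
coeffs = {}
centre = apply_atom(signal, 1, [0.0, 1.0, 0.0], 2.0, 0, coeffs)
print(signal, coeffs, centre)
```

Accumulating into an existing entry mirrors adding a weight matrix to the current coefficient matrix: if the same atom is selected again at the same position, its weights add up.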
Step 5, when the signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, signal decomposition ends and the current sparse coefficient matrix is output; otherwise, the signal residual $S'_{later}$ is taken as the original signal in step 2 and steps 2 to 5 are repeated.
The matching pursuit method processes the signal iteratively, representing the original signal as the sum of the atoms, each multiplied by its atom weight, plus the signal residual. The signal residual $S'_{later}$ is obtained in step 4. When $S'_{later}$ reaches the target SNR or the number of iterations reaches the preset value, the iteration terminates and the current sparse coefficient matrix is output. The target SNR and the preset number of iterations are set manually according to experience and actual requirements.
The signal-to-noise ratio SNR is defined as follows:
$$SNR = 10 \log_{10} \frac{\|S\|_2^2}{\|S - S'\|_2^2} \qquad (7)$$
In formula (7), S represents the original signal and S' is the signal recovered from the current sparse coefficients.
In this embodiment, for a signal segment with a sampling frequency of 48 kHz and a length of 500000 sample points (about 10 s), the number of iterations is preset to 20000 and the target SNR is 20 dB.
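The stopping rule of step 5 can be sketched with the embodiment's thresholds (20 dB target SNR, 20000 iterations) as defaults. The SNR is computed here in the usual way, as ten times the base-10 logarithm of the ratio of signal energy to error energy; this definition and the function names are assumptions for illustration.

```python
import math


def snr_db(original, recovered):
    """SNR in dB between the original signal and the currently recovered signal."""
    if len(original) != len(recovered):
        raise ValueError("length mismatch")
    signal_energy = sum(v * v for v in original)
    error_energy = sum((a - b) ** 2 for a, b in zip(original, recovered))
    if error_energy == 0:
        return math.inf          # perfect recovery
    if signal_energy == 0:
        return -math.inf         # silent original, non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)


def should_stop(snr, iteration, target_snr_db=20.0, max_iterations=20000):
    """Stop when the target SNR is reached or the iteration budget is used up."""
    return snr >= target_snr_db or iteration >= max_iterations


value = snr_db([3.0, 4.0], [3.0, 3.5])
print(value, should_stop(value, 5))
```

Because the error is the difference between the original and the recovered signal, it can be read directly from the running residual when the residual is kept in place of the signal.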
The signal reconstruction method comprises the following steps:
Step 6, extracting from the current sparse coefficient matrix the atom weights to be used for the reconstructed signal, together with the atom number and atom center position corresponding to each atom weight.
Step 7, multiplying each atom weight by its corresponding atom to obtain a recovered signal of length N; assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal in step 1, such that the $j_{optmax}$-th point of $M_i$ is the center point of the recovered signal, where $j_{optmax}$ is the column number of that atom weight in the current sparse coefficient matrix; the assigned vectors $M_i$ are accumulated in sequence to obtain the reconstructed signal S'.
The reconstructed signal synthesis formula is as follows:
$$S' = \sum_{i=1}^{k} M_i \qquad (8)$$
where k is the number of atom weights in the current sparse coefficient matrix.
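Reconstruction can be sketched as an overlap-add of weighted atoms. Instead of building one full-length zero vector per atom weight and summing them, this illustration adds each weighted atom directly into a single output buffer, which gives the same sum with less memory. The dict-based coefficient storage, the 0-based indexing, the center convention (window start plus half the atom length) and the function name are assumptions.

```python
def reconstruct(coeffs, dictionary, signal_len):
    """Rebuild a signal from {(atom number, center position): weight} entries."""
    out = [0.0] * signal_len
    for (i_atom, j_centre), weight in coeffs.items():
        atom = dictionary[i_atom]
        start = j_centre - len(atom) // 2   # first sample covered by this atom
        if start < 0 or start + len(atom) > signal_len:
            raise ValueError("atom placed outside the signal")
        for n, g in enumerate(atom):
            out[start + n] += weight * g
    return out


toy_dictionary = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
toy_coeffs = {(0, 2): 2.0, (1, 1): -1.0}
print(reconstruct(toy_coeffs, toy_dictionary, 5))
```

Only the coefficient entries and the dictionary are needed at the decoder side; the original signal is not used.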
Referring to fig. 3 to 4, the invention further provides an audio matching and tracking system based on atomic preselection, which includes a signal decomposition unit and a signal reconstruction unit. The signal decomposition unit further comprises a dictionary establishing module 101, a preprocessing module 102, a weight value comparison module 103, a residual error calculation module 104 and a threshold control module 105; the signal reconstruction unit further comprises a reconstruction coefficient extraction module 201 and a signal synthesis module 202. Wherein:
the dictionary establishing module 101 is used for selecting a short-time dictionary according to the original signal type and taking the short-time dictionary as a sparse dictionary.
The preprocessing module 102 is used for calculating, window by window, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where i takes 1, 2, …, length(S)-N+1 in sequence, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the length of the original signal.
In the preprocessing module 102, the energy of consecutive samples can be calculated as follows:
(1) The sum of the squares of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples, see formula (2).
(2) The sum of the absolute values of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples, see formula (3).
(3) The maximum of the amplitudes of all samples in the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ is taken as the energy of the consecutive samples.
The weight comparison module 103 is used for obtaining the atom weights of the sparse dictionary atoms on $S_{maxenergy}$, the maximum absolute value of the atom weights being $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$.
The residual calculation module 104 is used for calculating the signal residual $S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$, where $g_{i_{optmax}}$ is the atom corresponding to the maximum weight; at the same time, the weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ is recorded in row $i_{optmax}$, column $j_{optmax}$ of the current sparse coefficient matrix, where $i_{optmax}$ is the atom number of $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$; the initial value of the current sparse coefficient matrix is a zero matrix.
The threshold control module 105 is used for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches a preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal.
The reconstruction coefficient extraction module 201 is configured to extract the atom weights in the current sparse coefficient matrix and the row numbers and column numbers corresponding to the atom weights.
The signal synthesis module 202 is used for multiplying each atom weight by its corresponding atom to obtain a recovered signal, and assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal, such that the $j_{optmax}$-th point of $M_i$ is the center of the recovered signal, where $j_{optmax}$ is the column number of the atom weight corresponding to the current recovered signal; the assigned vectors are accumulated in sequence to obtain the reconstructed signal.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (5)
1. An audio matching tracking method based on atom preselection is characterized by comprising the following steps:
signal decomposition and signal reconstruction, wherein the signal decomposition comprises the steps of:
S1: selecting a short-time dictionary according to the type of the original signal, and taking the short-time dictionary as the sparse dictionary;
S2: calculating, window by window, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where i takes 1, 2, …, length(S)-N+1 in sequence, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the length of the original signal;
S3: obtaining the atom weights of the sparse dictionary atoms on $S_{maxenergy}$, the maximum absolute value of the atom weights being $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$;
S4: calculating the signal residual $S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$, where $g_{i_{optmax}}$ is the atom corresponding to the maximum weight; at the same time, recording the weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ in row $i_{optmax}$, column $j_{optmax}$ of the current sparse coefficient matrix, where $i_{optmax}$ is the atom number of $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$; the initial value of the current sparse coefficient matrix is a zero matrix;
S5: when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches a preset value, ending signal decomposition and outputting the current sparse coefficient matrix; otherwise, taking the current signal residual $S'_{later}$ as the original signal and repeating steps S2 to S5;
the signal reconstruction includes:
S7: extracting the atom weights in the current sparse coefficient matrix together with their corresponding row numbers and column numbers;
S8: multiplying each atom weight by its corresponding atom to obtain a recovered signal, and assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal in step S1, such that the $j_{optmax}$-th point of $M_i$ is the center of the recovered signal, where $j_{optmax}$ is the column number of the atom weight corresponding to the current recovered signal; and accumulating the assigned vectors in sequence to obtain the reconstructed signal.
2. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the squares of the amplitudes of all samples in the consecutive samples.
3. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the sum of the absolute values of the amplitudes of all samples in the consecutive samples.
4. The method for audio matching pursuit based on atomic preselection of claim 1, characterized by:
in step S2, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal is the maximum of the amplitudes of all samples in the consecutive samples.
5. An audio matching tracking system based on atomic preselection, comprising:
a signal decomposition unit and a signal reconstruction unit, wherein the signal decomposition unit further comprises:
the dictionary establishing module 101 is used for selecting a short-time dictionary according to the type of the original signal and taking the short-time dictionary as a sparse dictionary;
a preprocessing module 102, used for calculating, window by window, the energy of the consecutive samples $\{S_i, S_{i+1}, \ldots, S_{i+N-1}\}$ in the original signal, where i takes 1, 2, …, length(S)-N+1 in sequence, and extracting the consecutive samples with the highest energy, denoted $S_{maxenergy}$; N is the atom length of the short-time dictionary; length(S) is the length of the original signal;
a weight comparison module 103, used for obtaining the atom weights of the sparse dictionary atoms on $S_{maxenergy}$, the maximum absolute value of the atom weights being $|\langle g_{i_{optmax}}, S_{maxenergy}\rangle|$;
a residual calculation module 104, used for calculating the signal residual $S'_{later} = S_{maxenergy} - \langle g_{i_{optmax}}, S_{maxenergy}\rangle\, g_{i_{optmax}}$, where $g_{i_{optmax}}$ is the atom corresponding to the maximum weight; at the same time, recording the weight $\langle g_{i_{optmax}}, S_{maxenergy}\rangle$ in row $i_{optmax}$, column $j_{optmax}$ of the current sparse coefficient matrix, where $i_{optmax}$ is the atom number of $g_{i_{optmax}}$ and $j_{optmax}$ is the atom center position of $g_{i_{optmax}}$; the initial value of the current sparse coefficient matrix is a zero matrix;
a threshold control module 105, used for ending signal decomposition and outputting the current sparse coefficient matrix when the current signal residual $S'_{later}$ reaches the target SNR or the number of iterations reaches a preset value, and otherwise inputting the current signal residual $S'_{later}$ into the preprocessing module 102 as the original signal;
the signal reconstruction unit further includes:
a reconstruction coefficient extraction module 201, configured to extract atom weights in a current sparse coefficient matrix and row numbers and column numbers corresponding to the atom weights;
a signal synthesis module 202, used for multiplying each atom weight by its corresponding atom to obtain a recovered signal, and assigning each recovered signal to a zero vector $M_i$ of the same length as the original signal, such that the $j_{optmax}$-th point of $M_i$ is the center of the recovered signal, where $j_{optmax}$ is the column number of the atom weight corresponding to the current recovered signal; and accumulating the assigned vectors in sequence to obtain the reconstructed signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510982266.5A CN105551503B (en) | 2015-12-24 | 2015-12-24 | Based on the preselected Audio Matching method for tracing of atom and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105551503A (en) | 2016-05-04 |
CN105551503B (en) | 2019-03-01 |
Family
ID=55830651
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201510982266.5A (CN105551503B, en) | Active | 2015-12-24 | 2015-12-24 | Based on the preselected Audio Matching method for tracing of atom and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105551503B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106653061A (en) * | 2016-11-01 | 2017-05-10 | 武汉大学深圳研究院 | Audio matching tracking device and tracking method thereof based on dictionary classification |
CN109507292B (en) * | 2018-12-26 | 2021-08-06 | 西安科技大学 | Signal extraction method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003003745A1 (en) * | 2001-06-29 | 2003-01-09 | Ntt Docomo, Inc. | Image encoder, image decoder, image encoding method, and image decoding method |
CN102879818A (en) * | 2012-08-30 | 2013-01-16 | 中国石油集团川庆钻探工程有限公司地球物理勘探公司 | Improved method for decomposing and reconstructing seismic channel data |
CN103474066A (en) * | 2013-10-11 | 2013-12-25 | 福州大学 | Ecological voice recognition method based on multiband signal reconstruction |
CN103531199A (en) * | 2013-10-11 | 2014-01-22 | 福州大学 | Ecological sound identification method on basis of rapid sparse decomposition and deep learning |
CN103116112B (en) * | 2013-01-06 | 2015-06-10 | 广东电网公司电力科学研究院 | Double-circuit on same tower double-circuit line fault distance measurement method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7003039B2 (en) * | 2001-07-18 | 2006-02-21 | Avideh Zakhor | Dictionary generation method for video and image compression |
Non-Patent Citations (1)
Title |
---|
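Endpoint detection method using the matching pursuit time-frequency decomposition algorithm; Wang Wenyan et al.; 《声学技术》 (Technical Acoustics); 2007-02-28 |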
Also Published As
Publication number | Publication date |
---|---|
CN105551503A (en) | 2016-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | TSTNN: Two-stage transformer based neural network for speech enhancement in the time domain | |
CN110673222B (en) | Magnetotelluric signal noise suppression method and system based on atomic training | |
CN110767210A (en) | Method and device for generating personalized voice | |
CN112633175A (en) | Single note real-time recognition algorithm based on multi-scale convolution neural network under complex environment | |
JPH09181611A (en) | Signal coder and its method | |
Bao et al. | Learning a discriminative dictionary for single-channel speech separation | |
CN112767959A (en) | Voice enhancement method, device, equipment and medium | |
EP1513137A1 (en) | Speech processing system and method with multi-pulse excitation | |
Dendani et al. | Speech enhancement based on deep AutoEncoder for remote Arabic speech recognition | |
CN114495969A (en) | Voice recognition method integrating voice enhancement | |
CN110569728A (en) | Kernel signal extraction method based on dictionary training and orthogonal matching pursuit | |
CN105551503B (en) | Based on the preselected Audio Matching method for tracing of atom and system | |
WO2011071560A1 (en) | Compressing feature space transforms | |
US5664053A (en) | Predictive split-matrix quantization of spectral parameters for efficient coding of speech | |
CN106653061A (en) | Audio matching tracking device and tracking method thereof based on dictionary classification | |
CN116884438B (en) | Method and system for detecting musical instrument training sound level based on acoustic characteristics | |
CN107065006B (en) | A kind of seismic signal coding method based on online dictionary updating | |
CN109346104A (en) | A kind of audio frequency characteristics dimension reduction method based on spectral clustering | |
Xue et al. | Low-latency speech enhancement via speech token generation | |
CN104463245A (en) | Target recognition method | |
Narayanaswamy et al. | Audio source separation via multi-scale learning with dilated dense u-nets | |
Upadhyaya et al. | Basis & sensing matrix as key effecting parameters for compressive sensing | |
Nijhawan et al. | Real time speaker recognition system for hindi words | |
CN111832596B (en) | Data processing method, electronic device and computer readable medium | |
Andrew et al. | A unified approach to selecting optimal step lengths for adaptive vector quantizers |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 2021-07-14; Address after: 215000 unit 01, 5 / F, building a, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province; Patentee after: BOOSLINK SUZHOU INFORMATION TECHNOLOGY Co.,Ltd.; Address before: 430072 Hubei Province, Wuhan city Wuchang District of Wuhan University Luojiashan; Patentee before: WUHAN University |