CN1862661A - Nonnegative matrix decomposition method for speech signal characteristic waveform - Google Patents
- Publication number
- CN1862661A (application CNA2006100122964)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to a nonnegative matrix factorization method for speech signal characteristic waveforms and belongs to the field of speech signal processing. The method comprises the following steps: first, the speech characteristic waveforms are divided into 9 classes according to the pitch period of the speech signal, and for each class a basis matrix W is trained with the iterative method of standard nonnegative matrix factorization; then a given frame of characteristic waveform is classified by its pitch period, the trained basis matrix W corresponding to that class is retrieved, and the coding matrix H corresponding to the frame is obtained iteratively, so that the frame of characteristic waveform is approximately decomposed into the product of the basis matrix W and the coding matrix H.
Description
Technical field
The invention relates to a nonnegative matrix factorization method for speech signal characteristic waveforms and belongs to the field of speech signal processing.
Background technology
With the rapid development of wireless mobile communication, secure voice communication and network VoIP communication systems, demand for high-quality speech coding at rates below 4 kb/s is growing steadily. The main speech coding models currently used at rates below 4 kb/s are the binary-excitation linear prediction model (LPC-10, Linear Prediction Coding-10), the mixed excitation linear prediction model (MELP, Mixed Excitation Linear Prediction), the multi-band excitation model (MBE, Multi-Band Excitation) and the waveform interpolation model (WI, Waveform Interpolation). All of these models are based on the source-system model of speech production, in which an excitation source signal (simulating the airflow from the lungs) drives a linear time-varying filter (simulating the vocal tract) to produce the speech signal. Within this source-system framework, extraction of the vocal tract parameters is already highly successful; the current bottleneck for improving speech quality is how to decompose and quantize the excitation signal (also called the "characteristic waveform") with high precision according to its different characteristics. The WI speech coding algorithm is the most promising low-rate speech coding algorithm, and its key problem is likewise how to decompose and quantize the characteristic waveform effectively (in a WI coder the excitation source signal is called the "characteristic waveform").
In waveform interpolation speech coding, three methods currently exist for decomposing the characteristic waveform (CW, Characteristic Waveform): (1) decomposition by linear-phase filtering; (2) decomposition by wavelet transform; (3) singular value decomposition. Measured against three technical indicators (decomposition precision, computational complexity and additional delay), each of these methods has the following shortcomings:
(1) Linear-phase filtering: low decomposition precision; introduces 1 frame of additional delay.
(2) Wavelet transform: introduces 5 frames of additional delay.
(3) Singular value decomposition: very high computational complexity.
Summary of the invention
To solve the above problems, the invention provides a nonnegative matrix factorization method for speech signal characteristic waveforms. Its aim is to decompose the speech signal characteristic waveform in a waveform interpolation speech coder with high precision, low complexity and no additional delay.
Nonnegative matrix factorization (NMF, Non-negative Matrix Factorization) is widely used across signal processing. Its basic idea is: for any given nonnegative matrix V, NMF finds, in a finite number of iterations, a nonnegative matrix W and a nonnegative matrix H that satisfy the approximation V ≈ W × H, thereby decomposing a nonnegative matrix approximately into the product of a left and a right nonnegative matrix. The left matrix W (also called the "basis matrix") stores the local features of the kind of object represented by V, that is, the "parts" that make up V. A linear combination of these "parts" approximately resynthesizes the original matrix V, and the combination coefficients are stored in the right matrix H (also called the "coding matrix").
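As an illustration only, the relation between V, W and H can be checked on a toy example. The matrices below are hypothetical values chosen for readability, not data from the invention, and the helper names `matmul`, `column` and `combine` are introduced here. The sketch shows that each column of V = W × H is a nonnegative combination of the columns ("parts") of W, with the coefficients taken from the matching column of H.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def column(X, j):
    """Return column j of matrix X as a list."""
    return [row[j] for row in X]

def combine(W, coeffs):
    """Linear combination of the columns of W with the given coefficients."""
    return [sum(w * c for w, c in zip(row, coeffs)) for row in W]

# Hypothetical toy matrices (3 x 2 basis, 2 x 2 coding matrix).
W = [[1, 0],
     [0, 1],
     [1, 1]]
H = [[2, 0],
     [1, 3]]

V = matmul(W, H)

# Column j of V equals the columns of W weighted by column j of H.
for j in range(len(H[0])):
    assert column(V, j) == combine(W, column(H, j))
```

In a real factorization the equality is only approximate, because W and H are found iteratively for a given V rather than chosen first.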
The invention applies the widely used nonnegative matrix factorization technique from signal processing to decompose the characteristic waveform of the speech signal in a waveform interpolation speech coder. The basic idea is as follows. The basis matrices of the speech signal characteristic waveform are trained from about 10000 frames of experimental samples and then stored. Because training is performed offline, its computational cost is not counted when the complexity of the method is evaluated. Each given frame of characteristic waveform is then factorized: the trained basis matrix corresponding to the frame is retrieved, and the coding matrix corresponding to the frame is obtained iteratively. The frame of characteristic waveform is thereby approximately decomposed into the product of the basis matrix and the coding matrix, which completes the whole nonnegative matrix factorization process.
The following describes in turn how the basis matrix is trained with the standard nonnegative matrix factorization method and how the coding matrix is obtained.
A. Training the basis matrix W.
1) First, the speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the current speech frame; the classification is given in Table 1:
Class 1: 20 ≤ pitch < 30 | Class 2: 30 ≤ pitch < 40 | Class 3: 40 ≤ pitch < 50 |
Class 4: 50 ≤ pitch < 60 | Class 5: 60 ≤ pitch < 70 | Class 6: 70 ≤ pitch < 80 |
Class 7: 80 ≤ pitch < 90 | Class 8: 90 ≤ pitch < 100 | Class 9: 100 ≤ pitch ≤ 120 |
Table 1 Classification of characteristic waveforms
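The classification by pitch period can be sketched as a small lookup function. This is a minimal sketch using the nine ranges that the claims list for Table 1; the function name `pitch_class` and the decision to raise an error outside the range 20 to 120 are assumptions made here, since the text does not say how out-of-range pitch values are handled.

```python
def pitch_class(pitch):
    """Return the class index (1..9) for a pitch period, following Table 1.

    Classes 1..8 cover ten-wide ranges starting at 20; class 9 covers
    100 <= pitch <= 120. Rejecting other values is an assumption.
    """
    if not 20 <= pitch <= 120:
        raise ValueError("pitch period outside the range covered by Table 1")
    if pitch >= 100:
        return 9
    return int(pitch // 10) - 1

# Example: a frame with pitch period 55 falls in class 4 (50 <= pitch < 60).
frame_class = pitch_class(55)
```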
2) Next, about 10000 frames of experimental samples are selected for the class-1 characteristic waveforms to form the matrix V, and the basis matrix W is trained with the iterative method of standard nonnegative matrix factorization. The specific steps are as follows:
(1) Select about 10000 frames of experimental samples of this class of characteristic waveform to form the matrix V;
(2) Set the maximum number of iterations to N, where N is greater than 10; set the decomposition order to M, where M is an integer between 8 and 32. The term "decomposition order" follows the wording used by Seung and Lee when they proposed the standard nonnegative matrix factorization method; setting the decomposition order to M means setting the column dimension of the basis matrix W to M;
(3) Initialize all elements of the matrices W and H with random numbers uniformly distributed on [0, 1];
(4) Normalize each column of the matrix W, that is, divide each element of a column by the sum of all elements of that column;
(5) Set the current iteration count to 1;
(6) If the current iteration count is less than or equal to the maximum number of iterations N, go to (7); otherwise go to (8);
(7) Update the coding matrix H, then update the basis matrix W, using the update rules below; then increase the current iteration count by 1 and go to (6);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (1)$$
$$W_{ia} \leftarrow W_{ia} \sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \qquad (2)$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_j W_{ja}} \qquad (3)$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(b) $W_{ia}$ is the element in row i, column a of matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_i W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with column a of W;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with row a of H;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $\sum_j W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_j W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
3) Repeat the above procedure to obtain the basis matrices W for the characteristic waveforms of classes 2 to 9, giving 9 basis matrices in total, one for each class of characteristic waveform;
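The training loop of steps (1) to (8) can be sketched as follows. This is a minimal pure-Python sketch under stated assumptions, not the inventors' implementation: it uses the ratio-based multiplicative updates that the symbol notes for formulas (1) to (3) describe, applies the column normalization of formula (3) after every update of W, and adds a small constant `EPS` to denominators to avoid division by zero. The names `train_basis`, `update_H`, `update_W`, `normalize_columns` and `EPS` are all introduced here.

```python
import random

EPS = 1e-12  # assumption: numerical guard against division by zero, not in the source

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def update_H(V, W, H):
    """Formula (1): H[a][mu] <- H[a][mu] * sum_i W[i][a] * V[i][mu] / (WH)[i][mu]."""
    WH = matmul(W, H)
    m, n, M = len(V), len(V[0]), len(H)
    return [[H[a][mu] * sum(W[i][a] * V[i][mu] / (WH[i][mu] + EPS) for i in range(m))
             for mu in range(n)] for a in range(M)]

def update_W(V, W, H):
    """Formula (2): W[i][a] <- W[i][a] * sum_mu V[i][mu] / (WH)[i][mu] * H[a][mu]."""
    WH = matmul(W, H)
    m, n, M = len(V), len(V[0]), len(H)
    return [[W[i][a] * sum(V[i][mu] / (WH[i][mu] + EPS) * H[a][mu] for mu in range(n))
             for a in range(M)] for i in range(m)]

def normalize_columns(W):
    """Formula (3): divide every element by the sum of its column."""
    sums = [sum(col) + EPS for col in zip(*W)]
    return [[w / s for w, s in zip(row, sums)] for row in W]

def train_basis(V, M, N, seed=0):
    """Train a basis matrix W (m x M) for one class from a nonnegative matrix V (m x n)."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() for _ in range(M)] for _ in range(m)]   # step (3)
    H = [[rng.random() for _ in range(n)] for _ in range(M)]
    W = normalize_columns(W)                                    # step (4)
    for _ in range(N):                                          # steps (5) to (7)
        H = update_H(V, W, H)
        W = normalize_columns(update_W(V, W, H))
    return W, H                                                 # step (8): W is what gets stored
```

Calling `train_basis` once per class on that class's sample matrix would yield the 9 basis matrices of step 3); pure-Python loops are far too slow for about 10000 frames, so a practical version would use a numerical array library.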
B. Obtaining the coding matrix H of a given frame of characteristic waveform
a) The given frame of characteristic waveform is first assigned to one of the 9 classes according to its pitch period; the classification is the same as in step 1);
b) The basis matrix W corresponding to that class of characteristic waveform is then retrieved from the 9 trained basis matrices, and the coding matrix H is finally obtained with the existing nonnegative matrix factorization method. The specific steps for obtaining the coding matrix H are as follows:
(1) Treat this frame of characteristic waveform as the matrix V;
(2) Set the decomposition order to M, where M is an integer between 8 and 32, and set the maximum number of iterations to 10;
(3) Initialize all elements of the matrix H with random numbers uniformly distributed on [0, 1];
(4) Set the current iteration count to 1;
(5) If the current iteration count is less than or equal to the maximum number of iterations N, where N is greater than 10, go to (6); otherwise go to (7);
(6) Update the coding matrix H using the update rule below; then increase the current iteration count by 1 and go to (5);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (4)$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(b) $W_{ia}$ is the element in row i, column a of matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_i W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with column a of W;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of H;
(7) End the loop and save the coding matrix H. At this point the frame of characteristic waveform V has been approximately decomposed into the product of two nonnegative matrices W and H, that is, V ≈ W × H.
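Obtaining H for one frame with W held fixed can be sketched as below. This is a minimal sketch under assumptions: the update is the ratio-based rule described by the symbol notes for formula (4), W is assumed to be column-normalized as produced by training, and `EPS` is a numerical guard introduced here. The function `divergence` is not part of the method; it is a generalized Kullback-Leibler measure used only to check that the fit improves. All function names are hypothetical.

```python
import math
import random

EPS = 1e-12  # assumption: numerical guard against division by zero, not in the source

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def encode_frame(V, W, n_iter=10, seed=0):
    """Return the coding matrix H (M x n) for frame V (m x n) with basis W (m x M) fixed."""
    rng = random.Random(seed)
    m, n, M = len(V), len(V[0]), len(W[0])
    H = [[rng.random() for _ in range(n)] for _ in range(M)]   # step (3)
    for _ in range(n_iter):                                     # steps (4) to (6)
        WH = matmul(W, H)
        H = [[H[a][mu] * sum(W[i][a] * V[i][mu] / (WH[i][mu] + EPS) for i in range(m))
              for mu in range(n)] for a in range(M)]
    return H                                                    # step (7)

def divergence(V, W, H):
    """Generalized KL divergence between V and W x H (fit measure for this sketch only)."""
    WH = matmul(W, H)
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + EPS))
            total += wh - v
    return total

# Hypothetical column-normalized basis and a frame built from known coefficients.
W = [[0.5, 0.1],
     [0.3, 0.2],
     [0.2, 0.7]]
V = matmul(W, [[4.0, 1.0], [2.0, 5.0]])
H = encode_frame(V, W, n_iter=10)
```

Only the current frame enters the computation, which is consistent with the statement that the decomposition needs no future frames.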
Beneficial effects of the invention:
1. Because the speech signal characteristic waveform is decomposed by nonnegative matrix factorization, only the current speech frame needs to be decomposed and no future speech frames are involved, so the nonnegative matrix factorization of the characteristic waveform introduces no additional delay.
2. The decomposition precision of the nonnegative matrix factorization of the characteristic waveform is relatively high. The characteristic waveform resynthesized after nonnegative matrix factorization (Fig. 2) achieves a reconstruction quality close to that of the characteristic waveform synthesized from the second-order singular values after singular value decomposition (Fig. 3), and slightly better than that of the characteristic waveform synthesized from the first-order singular value after singular value decomposition (Fig. 4). The region enclosed by the dashed ellipse in Fig. 4 shows a larger reconstruction error than the same region of the original characteristic waveform (Fig. 1), so its reconstruction is worse than that of the characteristic waveform resynthesized after nonnegative matrix factorization (Fig. 2).
3. The computational complexity of the nonnegative matrix factorization of the characteristic waveform is relatively low. Table 2 compares the computational complexity of the four characteristic waveform decomposition methods: linear-phase filtering, wavelet transform, singular value decomposition and nonnegative matrix factorization:
Decomposition method | Linear-phase filtering | Wavelet transform | Singular value decomposition | Nonnegative matrix factorization |
Computational complexity | O(mn) | O(mn) | O(mn²) | O(mn) |
Table 2 Computational complexity of the four decomposition methods
In Table 2, m and n denote the number of rows and columns of the characteristic waveform matrix, respectively.
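One plausible reading of the O(mn) entry for nonnegative matrix factorization is the following operation count for obtaining H with W fixed. It is a sketch under assumptions: the decomposition order M and the number of iterations are treated as fixed constants, and the symbol K for the iteration count is introduced here.

$$\underbrace{mMn}_{\text{forming } WH} + \underbrace{mn}_{\text{ratios } V_{i\mu}/(WH)_{i\mu}} + \underbrace{mMn}_{\text{sums over } i} + \underbrace{Mn}_{\text{scaling } H} \approx 2mMn \ \text{ operations per iteration}$$

$$K \cdot 2mMn = O(mn) \quad \text{for fixed } M \text{ and } K$$

Under this reading the cost grows linearly in both dimensions of the characteristic waveform matrix, whereas the singular value decomposition entry grows with n².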
4. After nonnegative matrix factorization, the original high-dimensional characteristic waveform matrix can be approximately represented by the low-dimensional coding matrix H, so the nonnegative matrix factorization of the characteristic waveform also serves the purpose of data compression.
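The size reduction can be made explicit with a short count, assuming that the trained basis matrix W (m × M) is stored in advance on both sides and only the coding matrix H (M × n) has to represent a frame V (m × n):

$$\frac{\text{entries of } H}{\text{entries of } V} = \frac{Mn}{mn} = \frac{M}{m}$$

A reduction is therefore obtained only when M < m. For example, with a decomposition order M = 16 and a hypothetical row count m = 64 (a value chosen here for illustration, not given in the text), H holds one quarter as many values as V.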
Description of the drawings
Fig. 1: Characteristic waveforms of four frames of speech signal before decomposition
Fig. 2: Characteristic waveform resynthesized after nonnegative matrix factorization
Fig. 3: Characteristic waveform synthesized from the second-order singular values after singular value decomposition
Fig. 4: Characteristic waveform synthesized from the first-order singular value after singular value decomposition
Fig. 5: Flow chart for training the basis matrix
Fig. 6: Flow chart for obtaining the coding matrix
Embodiment
In the specific embodiment of the invention, the following describes in turn how the basis matrix is trained with the standard nonnegative matrix factorization method and how the coding matrix is obtained.
A. Training the basis matrix.
a) First, the speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the current speech frame; the classification is given in Table 1:
Class 1: 20 ≤ pitch < 30 | Class 2: 30 ≤ pitch < 40 | Class 3: 40 ≤ pitch < 50 |
Class 4: 50 ≤ pitch < 60 | Class 5: 60 ≤ pitch < 70 | Class 6: 70 ≤ pitch < 80 |
Class 7: 80 ≤ pitch < 90 | Class 8: 90 ≤ pitch < 100 | Class 9: 100 ≤ pitch ≤ 120 |
Table 1 Classification of characteristic waveforms
b) About 10000 frames of experimental samples are then selected for each class of characteristic waveform to form the matrix V, and the basis matrix W is trained with the iterative method of standard nonnegative matrix factorization (here the invention adopts the iterative scheme that minimizes the Euclidean distance between V and W × H). Because the characteristic waveforms have been divided into 9 classes according to pitch period, training the basis matrices means training a corresponding basis matrix W for each class of characteristic waveform. The specific steps for training the basis matrix W with the standard nonnegative matrix factorization method are as follows (see Fig. 5):
(1) Select about 10000 frames of experimental samples of this class of characteristic waveform to form the matrix V;
(2) Set the maximum number of iterations to 1000. In this embodiment the maximum number of iterations N is set to 1000, which ensures that both W and H have converged when the iteration ends. Set the decomposition order to 16. The term "decomposition order" follows the wording used by Seung and Lee when they proposed the standard nonnegative matrix factorization method; setting the decomposition order to 16 means setting the column dimension of the basis matrix W to 16;
(3) Initialize all elements of the matrices W and H with random numbers uniformly distributed on [0, 1];
(4) Normalize each column of the matrix W, that is, divide each element of a column by the sum of all elements of that column;
(5) Set the current iteration count to 1;
(6) If the current iteration count is less than or equal to the maximum number of iterations 1000, go to (7); otherwise go to (8);
(7) Update the coding matrix H, then update the basis matrix W, using the update rules below (this is the iterative method of the standard nonnegative matrix factorization method cited by the invention); then increase the current iteration count by 1 and go to (6);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (1)$$
$$W_{ia} \leftarrow W_{ia} \sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \qquad (2)$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_j W_{ja}} \qquad (3)$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(b) $W_{ia}$ is the element in row i, column a of matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_i W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with column a of W;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with row a of H;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $\sum_j W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_j W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
B. Obtaining the coding matrix of a given frame of characteristic waveform.
a) The given frame of characteristic waveform is first classified according to its pitch period (pitch); the classification is again the one given in Table 1;
b) The basis matrix corresponding to that class of characteristic waveform (produced by the training process) is then retrieved, and the coding matrix H is finally obtained with the existing nonnegative matrix factorization method. The specific steps for obtaining the coding matrix H are as follows (see Fig. 6):
(1) Treat this frame of characteristic waveform as the matrix V;
(2) Set the decomposition order to 16 and the maximum number of iterations to 10;
(3) Initialize all elements of the matrix H with random numbers uniformly distributed on [0, 1];
(4) Set the current iteration count to 1;
(5) If the current iteration count is less than or equal to the maximum number of iterations 10, go to (6); otherwise go to (7);
(6) Update the coding matrix H using the update rule below; then increase the current iteration count by 1 and go to (5);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (4)$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(b) $W_{ia}$ is the element in row i, column a of matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_i W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with column a of W;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of H.
(7) End the loop and save the coding matrix H. At this point the frame of characteristic waveform V has been approximately decomposed into the product of two nonnegative matrices W and H, that is, V ≈ W × H.
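Once H has been saved, the approximation V ≈ W × H can be evaluated by multiplying the stored basis matrix of the frame's class with H. The sketch below assumes a hypothetical storage layout, a dictionary `bases` mapping class index to basis matrix, and uses a relative Euclidean error as the quality measure; neither the layout nor the error measure is specified in the text, and the function names are introduced here.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def reconstruct_frame(bases, class_index, H):
    """Resynthesize a frame as W x H, where W is the stored basis of the given class.

    `bases` is a hypothetical dict {class index (1..9): basis matrix W}.
    """
    return matmul(bases[class_index], H)

def relative_error(V, V_hat):
    """Relative Euclidean error between a frame V and its reconstruction V_hat."""
    num = sum((v - vh) ** 2 for rv, rh in zip(V, V_hat) for v, vh in zip(rv, rh))
    den = sum(v ** 2 for rv in V for v in rv)
    return (num / den) ** 0.5

# Hypothetical toy values: a 2 x 2 basis for class 4 and a one-column coding matrix.
bases = {4: [[1.0, 0.0],
             [0.0, 1.0]]}
H = [[2.0],
     [3.0]]
V_hat = reconstruct_frame(bases, 4, H)
```

A small relative error between the original frame and `V_hat` indicates that the coding matrix captures the frame well for the chosen decomposition order.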
Claims (1)
1. A nonnegative matrix factorization method for speech signal characteristic waveforms, characterized in that the method is carried out according to the following steps:
A. Training the basis matrix W
1) First, the speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the current speech frame, as follows:
Class 1: 20 ≤ pitch < 30
Class 2: 30 ≤ pitch < 40
Class 3: 40 ≤ pitch < 50
Class 4: 50 ≤ pitch < 60
Class 5: 60 ≤ pitch < 70
Class 6: 70 ≤ pitch < 80
Class 7: 80 ≤ pitch < 90
Class 8: 90 ≤ pitch < 100
Class 9: 100 ≤ pitch ≤ 120
2) Next, about 10000 frames of experimental samples are selected for the class-1 characteristic waveforms to form the matrix V, and the basis matrix W is trained with the iterative method of standard nonnegative matrix factorization. The specific steps are as follows:
(1) Select about 10000 frames of experimental samples of this class of characteristic waveform to form the matrix V;
(2) Set the maximum number of iterations to N, where N is greater than 10; set the decomposition order to M, where M is an integer between 8 and 32. The term "decomposition order" follows the wording used by Seung and Lee when they proposed the standard nonnegative matrix factorization method; setting the decomposition order to M means setting the column dimension of the basis matrix W to M;
(3) Initialize all elements of the matrices W and H with random numbers uniformly distributed on [0, 1];
(4) Normalize each column of the matrix W, that is, divide each element of a column by the sum of all elements of that column;
(5) Set the current iteration count to 1;
(6) If the current iteration count is less than or equal to the maximum number of iterations N, go to (7); otherwise go to (8);
(7) Update the coding matrix H, then update the basis matrix W, using the update rules below; then increase the current iteration count by 1 and go to (6);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (1)$$
$$W_{ia} \leftarrow W_{ia} \sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \qquad (2)$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_j W_{ja}} \qquad (3)$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(b) $W_{ia}$ is the element in row i, column a of matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_i W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with column a of W;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_\mu \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the ratios $V_{i\mu}/(WH)_{i\mu}$ over all i and μ with row a of H;
(g) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of matrix W;
(b) $\sum_j W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_j W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) "←" assigns the matrix element on its right to the element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
3) Repeat the above procedure to obtain the basis matrices W for the characteristic waveforms of classes 2 to 9, giving 9 basis matrices in total, one for each class of characteristic waveform;
B. Obtaining the coding matrix H of a given frame of characteristic waveform
a) The given frame of characteristic waveform is first assigned to one of the 9 classes according to its pitch period; the classification is the same as in step 1);
b) The basis matrix W corresponding to that class of characteristic waveform is then retrieved from the 9 trained basis matrices, and the coding matrix H is finally obtained with the existing nonnegative matrix factorization method. The specific steps for obtaining the coding matrix H are as follows:
(1) Treat this frame of characteristic waveform as the matrix V;
(2) Set the decomposition order to M, where M is an integer between 8 and 32, and set the maximum number of iterations to 10;
(3) Initialize all elements of the matrix H with random numbers uniformly distributed on [0, 1];
(4) Set the current iteration count to 1;
(5) If the current iteration count is less than or equal to the maximum number of iterations N, where N is greater than 10, go to (6); otherwise go to (7);
(6) Update the encoding matrix H according to formula (4) below, then increase the current iteration count by 1 and go to (5);
$$H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (4)$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ denotes the element in row a, column μ of matrix H;
(b) $W_{ia}$ denotes the element in row i, column a of matrix W;
(c) $V_{i\mu}$ denotes the element in row i, column μ of matrix V;
(d) $(WH)_{i\mu}$ denotes the element in row i, column μ of the matrix obtained by multiplying matrix W by matrix H;
(e) $V_{i\mu}/(WH)_{i\mu}$ denotes the element in row i, column μ of matrix V divided by the element in row i, column μ of the product of matrix W and matrix H;
(f) $\sum_i W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}}$ denotes the inner product of column μ of the matrix formed by the elements $V_{i\mu}/(WH)_{i\mu}$ with column a of matrix W;
(g) the "←" symbol assigns the matrix element on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of matrix H;
(7) End the loop and save the encoding matrix H;
At this point, the characteristic waveform V of this frame has been approximately decomposed into the product of two non-negative matrices W and H, i.e. V ≈ W × H.
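Steps (1) to (7) can be sketched as below: the trained basis W is held fixed and only H is updated. This is a hedged illustration under stated assumptions. The update rule is the one implied by the symbol descriptions for formula (4) and relies on the columns of W summing to 1, as the normalization of formula (3) ensures. The names `encode_frame` and `kl_divergence`, the constant `EPS`, and the toy shapes are hypothetical and do not come from the patent.

```python
import numpy as np

EPS = 1e-12  # assumed numerical guard; not in the source


def encode_frame(V, W, n_iter=10, seed=0):
    """Estimate H >= 0 with V ~= W @ H while the trained basis W stays fixed."""
    rng = np.random.default_rng(seed)
    # Step (3): uniform random initialization on [0, 1].
    H = rng.uniform(0.0, 1.0, size=(W.shape[1], V.shape[1]))
    for _ in range(n_iter):            # steps (4) to (6)
        ratio = V / (W @ H + EPS)      # V_i,mu / (WH)_i,mu
        H = H * (W.T @ ratio)          # column a of W dotted with column mu of ratio
    return H                           # step (7)


def kl_divergence(V, WH):
    """Generalized Kullback-Leibler divergence, used here only to check progress."""
    return np.sum(V * np.log((V + EPS) / (WH + EPS)) - V + WH)


rng = np.random.default_rng(1)
W = rng.uniform(size=(8, 3))
W = W / W.sum(axis=0, keepdims=True)  # columns sum to 1, as after formula (3)
V = W @ rng.uniform(size=(3, 4))      # toy frame that W can represent exactly

H0 = encode_frame(V, W, n_iter=0)     # random initial H only
H10 = encode_frame(V, W, n_iter=10)
print(kl_divergence(V, W @ H0), kl_divergence(V, W @ H10))
```

The divergence printed for 10 iterations should be smaller than the one for the random initial H, which is a quick sanity check that the multiplicative update is moving W × H toward V.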
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2006100122964A CN1862661A (en) | 2006-06-16 | 2006-06-16 | Nonnegative matrix decomposition method for speech signal characteristic waveform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2006100122964A CN1862661A (en) | 2006-06-16 | 2006-06-16 | Nonnegative matrix decomposition method for speech signal characteristic waveform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1862661A (en) | 2006-11-15 |
Family
ID=37390074
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2006100122964A Pending CN1862661A (en) | 2006-06-16 | 2006-06-16 | Nonnegative matrix decomposition method for speech signal characteristic waveform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1862661A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101441872B (en) * | 2007-11-19 | 2011-09-14 | 三菱电机株式会社 | Denoising acoustic signals using constrained non-negative matrix factorization |
CN107610710A (en) * | 2017-09-29 | 2018-01-19 | 武汉大学 | A kind of audio coding and coding/decoding method towards Multi-audio-frequency object |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1199151C (en) | Speech coder | |
CN1150516C (en) | Vector quantizer method | |
CN1154283C (en) | Coding method and apparatus, and decoding method and apparatus | |
CN1210689C (en) | Improved spectral translation/folding in subband domain | |
CN1132153C (en) | Filter for speech modification or enhancement, and various apparatus, system and method using same | |
CN1154086C (en) | CELP transcoding | |
CN101044552A (en) | Sound encoder and sound encoding method | |
CN101044553A (en) | Scalable encoding apparatus, scalable decoding apparatus, and methods thereof | |
CN101031960A (en) | Scalable encoding device, scalable decoding device, and method thereof | |
CN101044554A (en) | Scalable encoder, scalable decoder,and scalable encoding method | |
CN1144179C (en) | Information decorder and decoding method, information encoder and encoding method and distribution medium | |
CN1681213A (en) | Lossless audio coding/decoding method and apparatus | |
CN1181150A (en) | Algebraic codebook with signal-selected pulse amplitudes for fast coding of speech | |
CN1265217A (en) | Method and appts. for speech enhancement in speech communication system | |
CN1906664A (en) | Audio encoder and audio decoder | |
CN1266671C (en) | Apparatus and method for estimating harmonic wave of sound coder | |
CN1863039A (en) | Hidden communication system and communication method based on audio frequency | |
CN1787383A (en) | Methods and apparatuses for transforming, adaptively encoding, inversely transforming and adaptively decoding an audio signal | |
CN1692402A (en) | Speech synthesis method and speech synthesis device | |
CN1787075A (en) | Method for distinguishing speek speek person by supporting vector machine model basedon inserted GMM core | |
CN1531348A (en) | Image encoder, encodnig method and programm, image decoder, decoding method and programm | |
CN1849648A (en) | Coding apparatus and decoding apparatus | |
CN1145925C (en) | Transmitter with improved speech encoder and decoder | |
CN1862661A (en) | Nonnegative matrix decomposition method for speech signal characteristic waveform | |
CN1795491A (en) | Method for analyzing fundamental frequency information and voice conversion method and system implementing said analysis method | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |