CN1862661A - Nonnegative matrix decomposition method for speech signal characteristic waveform - Google Patents


Info

Publication number
CN1862661A
Authority
CN
China
Prior art keywords
matrix
row
representing
pitch
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006100122964A
Other languages
Chinese (zh)
Inventor
鲍长春
张鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The invention relates to a non-negative matrix factorization (NMF) method for the characteristic waveforms of speech signals and belongs to the field of speech signal processing. The method proceeds as follows. First, characteristic waveforms are divided into 9 classes according to the pitch period of the speech signal, and for each class a basis matrix W is trained with the iterative procedure of standard NMF. Then, for a given frame, the characteristic waveform is assigned to a class according to its pitch period, the trained basis matrix W for that class is retrieved, and the encoding matrix H for the frame is obtained by iteration. The characteristic waveform of the frame is thereby approximately decomposed into the product of the basis matrix W and the encoding matrix H.

Description

A non-negative matrix factorization method for speech signal characteristic waveforms
Technical field
The present invention relates to a non-negative matrix factorization method for the characteristic waveforms of speech signals and belongs to the field of speech signal processing.
Background
With the rapid development of wireless mobile communication, secure voice communication, and VoIP network communication systems, demand for high-quality speech coding at rates below 4 kb/s is growing steadily. The main speech coding models currently used worldwide at rates below 4 kb/s are the binary-excitation linear prediction model (LPC-10, Linear Prediction Coding-10), the mixed excitation linear prediction model (MELP, Mixed Excitation Linear Prediction), the multi-band excitation model (MBE, Multi-Band Excitation), and the waveform interpolation model (WI, Waveform Interpolation). All of these models are based on the source-system model of speech production: an excitation source signal (modeling the airflow from the lungs) drives a linear time-varying filter (modeling the vocal tract) to produce the speech signal. Within this source-system framework, extraction of the vocal tract parameters is already very successful. The bottleneck for further improving speech quality is how to decompose and quantize the excitation signal (also called the "characteristic waveform") with high precision according to its different properties. The WI speech coding algorithm is the most promising low-rate speech coding algorithm, and its key problem is likewise how to decompose and quantize the characteristic waveform effectively (in a WI coder, the excitation source signal is called the "characteristic waveform").
In waveform interpolation speech coding, three methods are currently used to decompose the characteristic waveform (CW, Characteristic Waveform): (1) decomposition by linear-phase filtering; (2) decomposition by wavelet transform; (3) singular value decomposition. Measured against three technical criteria, namely decomposition precision, computational complexity, and additional delay, each method has a drawback:
(1) Linear-phase filtering: low decomposition precision; introduces 1 frame of additional delay.
(2) Wavelet transform: introduces 5 frames of additional delay.
(3) Singular value decomposition: very high computational complexity.
Summary of the invention
To solve the problems above, the invention provides a non-negative matrix factorization method for speech signal characteristic waveforms. Its aim is to decompose the characteristic waveform in a waveform interpolation speech coder with high precision, low complexity, and no additional delay.
Non-negative matrix factorization (NMF, Non-negative Matrix Factorization) is widely used across signal processing. Its basic idea is as follows: for any given non-negative matrix V, NMF finds, in a finite number of iterations, a non-negative matrix W and a non-negative matrix H that satisfy the approximation V ≈ W × H, so that one non-negative matrix is approximately factored into the product of a left and a right non-negative matrix. The left matrix W (also called the "basis matrix") stores the local features of the kind of object that V represents, that is, the "parts" that make up V. A linear combination of these "parts" approximately reconstructs the information in the original V, and the combination coefficients are stored in the right matrix H (also called the "encoding matrix").
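As a concrete reading of the relation V ≈ W × H, the element-wise form below may help. The dimension symbols m, n, and M are introduced here for illustration (m and n as the numbers of rows and columns of V, M as the number of columns of W); the paragraph above does not name them.

```latex
V \in \mathbb{R}_{\ge 0}^{m \times n},\quad
W \in \mathbb{R}_{\ge 0}^{m \times M},\quad
H \in \mathbb{R}_{\ge 0}^{M \times n},\qquad
V_{i\mu} \approx (WH)_{i\mu} = \sum_{a=1}^{M} W_{ia} H_{a\mu}
```

Each column of V is thus approximated by a non-negative combination of the M columns of W, with the weights taken from the corresponding column of H.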
The invention applies this widely used technique to decompose the characteristic waveform of the speech signal in a waveform interpolation speech coder. The basic approach is as follows. A basis matrix for speech characteristic waveforms is trained from about 10,000 frames of experimental samples and then stored. Because training is performed offline, its cost is not counted when the computational complexity of the method is assessed. Each given frame of characteristic waveform is then factored: the trained basis matrix corresponding to the frame is retrieved, and the encoding matrix for the frame is obtained by iteration. At that point the characteristic waveform of the frame has been approximately decomposed into the product of the basis matrix and the encoding matrix, which completes the factorization.
The following describes how the basis matrix is trained with the standard NMF method and how the encoding matrix is obtained.
A. Training the basis matrix.
1) First, speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the frame of speech, as listed in Table 1:
Class 1: 20 ≤ pitch < 30     Class 2: 30 ≤ pitch < 40     Class 3: 40 ≤ pitch < 50
Class 4: 50 ≤ pitch < 60     Class 5: 60 ≤ pitch < 70     Class 6: 70 ≤ pitch < 80
Class 7: 80 ≤ pitch < 90     Class 8: 90 ≤ pitch < 100    Class 9: 100 ≤ pitch ≤ 120
Table 1: Classification of characteristic waveforms
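The class boundaries in Table 1 can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `pitch_class` is hypothetical, and it assumes the pitch period is supplied in the same units as Table 1.

```python
def pitch_class(pitch):
    """Return the Table 1 class index (1 to 9) for a pitch period."""
    if not 20 <= pitch <= 120:
        raise ValueError("pitch period outside the range covered by Table 1")
    # Classes 1 to 8 are each 10 wide, starting at 20.
    # Class 9 covers 100 to 120 inclusive, hence the cap at 9.
    return min(int(pitch // 10) - 1, 9)


print(pitch_class(45))   # a pitch period of 45 falls in 40 <= pitch < 50
```

The returned index can then be used to select one of the 9 trained basis matrices.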
2) Next, about 10,000 frames of experimental samples are selected for the first class of characteristic waveforms to form the matrix V, and the basis matrix W is trained with the iterative procedure of standard NMF. The steps are as follows:
(1) Select about 10,000 frames of experimental samples of this class of characteristic waveform to form the matrix V.
(2) Set the maximum number of iterations to N, with N greater than 10. Set the decomposition order to M, an integer between 8 and 32. The term "decomposition order" follows the usage of Seung and Lee when they proposed the standard NMF method; setting the decomposition order to M means setting the number of columns of the basis matrix W to M.
(3) Initialize all elements of W and H with random numbers uniformly distributed on [0, 1].
(4) Normalize each column of W, that is, divide every element of a column by the sum of all elements of that column.
(5) Set the current iteration count to 1.
(6) If the current iteration count is less than or equal to the maximum number of iterations N, go to step (7); otherwise go to step (8).
(7) Update the encoding matrix H and then the basis matrix W according to the formulas below, increase the current iteration count by 1, and go to step (6).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \tag{1}$$
$$W_{ia} \leftarrow W_{ia} \sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \tag{2}$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_{j} W_{ja}} \tag{3}$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(b) $W_{ia}$ is the element in row i, column a of the matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{i} W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with column a of W;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with row a of H;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $\sum_{j} W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_{j} W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
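Steps (1) to (8) can be sketched in Python as follows. This is a minimal illustration using plain lists, not the inventors' implementation: the helper names (`matmul`, `transpose`, `normalize_columns`, `quotient`, `train_basis`), the fixed random seed, and the small constant `eps` that guards against division by zero are all assumptions that do not appear in the text.

```python
import random

def matmul(A, B):
    """Matrix product of two list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def normalize_columns(W):
    """Step (4) and formula (3): divide each element by its column sum."""
    sums = [sum(col) for col in zip(*W)]
    return [[w / s for w, s in zip(row, sums)] for row in W]

def quotient(V, W, H, eps):
    """Element-wise V / (WH); eps is an assumed guard against division by zero."""
    WH = matmul(W, H)
    return [[v / (p + eps) for v, p in zip(vr, pr)] for vr, pr in zip(V, WH)]

def train_basis(V, M, N, seed=0, eps=1e-12):
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    # Step (3): uniform random initialization; step (4): column normalization.
    W = normalize_columns([[rng.random() for _ in range(M)] for _ in range(m)])
    H = [[rng.random() for _ in range(n)] for _ in range(M)]
    for _ in range(N):  # steps (5) to (7)
        # Formula (1): H[a][mu] *= sum_i W[i][a] * V[i][mu] / (WH)[i][mu]
        G = matmul(transpose(W), quotient(V, W, H, eps))
        H = [[h * g for h, g in zip(hr, gr)] for hr, gr in zip(H, G)]
        # Formula (2): W[i][a] *= sum_mu V[i][mu] / (WH)[i][mu] * H[a][mu]
        G = matmul(quotient(V, W, H, eps), transpose(H))
        W = [[w * g for w, g in zip(wr, gr)] for wr, gr in zip(W, G)]
        W = normalize_columns(W)  # formula (3)
    return W, H  # step (8): W is the matrix to store


# Toy data standing in for a matrix of characteristic waveforms.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [0.5, 1.0, 1.2], [3.0, 1.0, 0.2]]
W, H = train_basis(V, M=2, N=30)
```

Plain lists keep the sketch free of dependencies; for about 10,000 frames an array library would be the practical choice.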
3) Repeat the procedure above to obtain the basis matrices W for the characteristic waveforms of classes 2 to 9, giving 9 basis matrices in total, one for each class of characteristic waveform.
B. Obtaining the encoding matrix H for a given frame of characteristic waveform
1) For a given frame, first assign the characteristic waveform to one of the 9 classes according to its pitch period, using the same classification as in step 1) of part A.
2) Then take the basis matrix W corresponding to that class from the 9 trained basis matrices, and finally obtain the encoding matrix H with the existing NMF method. The steps for obtaining H are as follows:
(1) Treat the characteristic waveform of the frame as the matrix V.
(2) Set the decomposition order to M, an integer between 8 and 32, and set the maximum number of iterations to 10.
(3) Initialize all elements of H with random numbers uniformly distributed on [0, 1].
(4) Set the current iteration count to 1.
(5) If the current iteration count is less than or equal to the maximum number of iterations N, where N is greater than 10, go to step (6); otherwise go to step (7).
(6) Update the encoding matrix H according to the formula below, increase the current iteration count by 1, and go to step (5).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \tag{4}$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(b) $W_{ia}$ is the element in row i, column a of the matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{i} W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with column a of W;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of H.
(7) End the loop and save the encoding matrix H. At this point the characteristic waveform V of the frame has been approximately decomposed into the product of two non-negative matrices W and H, that is, V ≈ W × H.
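The per-frame stage differs from training in that W stays fixed and only H is iterated. A minimal Python sketch of that stage follows, under the same caveats as any illustration: the names `matmul` and `encode_frame`, the fixed seed, the `eps` guard, and the toy matrices are assumptions, and the toy W merely stands in for a trained basis matrix whose columns sum to 1.

```python
import random

def matmul(A, B):
    """Matrix product of two list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def encode_frame(V, W, iterations=10, seed=0, eps=1e-12):
    """Obtain H for one frame V using a fixed basis matrix W."""
    rng = random.Random(seed)
    M, n = len(W[0]), len(V[0])
    Wt = [list(col) for col in zip(*W)]
    # Step (3): uniform random initialization of H only.
    H = [[rng.random() for _ in range(n)] for _ in range(M)]
    for _ in range(iterations):  # steps (4) to (6)
        WH = matmul(W, H)
        # eps is an assumed guard against division by zero.
        R = [[v / (p + eps) for v, p in zip(vr, pr)] for vr, pr in zip(V, WH)]
        G = matmul(Wt, R)  # sum over i of W[i][a] * V[i][mu] / (WH)[i][mu]
        H = [[h * g for h, g in zip(hr, gr)] for hr, gr in zip(H, G)]  # formula (4)
    return H  # step (7)


W = [[0.5, 0.1], [0.3, 0.2], [0.2, 0.7]]   # hypothetical basis, columns sum to 1
V = [[1.0, 0.5], [0.8, 0.4], [1.2, 2.0]]   # hypothetical frame
H = encode_frame(V, W)
V_approx = matmul(W, H)                     # reconstruction W x H
```

Because no future frame enters the computation, the function can be called as soon as the current frame is available.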
Beneficial effects of the invention:
1. When the characteristic waveform is decomposed by NMF, only the current frame of speech needs to be decomposed and no future frames are required, so the decomposition introduces no additional delay.
2. The decomposition precision is relatively high. The characteristic waveform resynthesized after NMF (Fig. 2) achieves a reconstruction close to that of the waveform synthesized from the second-order singular values after singular value decomposition (Fig. 3), and slightly better than that of the waveform synthesized from the first-order singular value (Fig. 4). The region circled by the dashed ellipse in Fig. 4 shows a larger reconstruction error relative to the same region of the original characteristic waveform (Fig. 1), so its reconstruction is worse than that of the waveform resynthesized after NMF (Fig. 2).
3. The computational complexity is low. Table 2 compares the computational complexity of four characteristic waveform decomposition methods: linear-phase filtering, wavelet transform, singular value decomposition, and NMF:
Decomposition method        Linear-phase filtering    Wavelet transform    Singular value decomposition    NMF
Computational complexity    O(mn)                     O(mn)                O(mn²)                          O(mn)
Table 2: Computational complexity of the four decomposition methods
In Table 2, m and n are the numbers of rows and columns of the characteristic waveform matrix.
4. After NMF, the original high-dimensional characteristic waveform matrix can be represented approximately by the low-dimensional encoding matrix H, so the factorization also serves as data compression.
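The O(mn) entry for NMF in Table 2 and the compression effect in item 4 can be made concrete with a rough count. The derivation below is a sketch under stated assumptions: M denotes the decomposition order, T the number of iterations used per frame, and the value m = 64 is purely hypothetical.

```latex
\begin{aligned}
\text{one update of } H:&\quad
  \underbrace{mMn}_{WH} + \underbrace{mn}_{V/(WH)} + \underbrace{mMn}_{W^{\mathsf{T}}(\cdot)} + \underbrace{Mn}_{\text{scaling}}
  \;\approx\; 2mMn \ \text{multiplications} \\
T \text{ updates}:&\quad \approx 2TM \cdot mn = O(mn) \quad \text{for fixed } M \text{ and } T \\
\text{values per frame}:&\quad mn \;\to\; Mn, \qquad
  \frac{mn}{Mn} = \frac{m}{M}, \quad \text{e.g. } \frac{64}{16} = 4
\end{aligned}
```

The ratio counts only H, since the basis matrix W is trained offline and stored in advance.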
Description of the drawings
Fig. 1: Characteristic waveforms of four frames of speech before decomposition
Fig. 2: Characteristic waveform synthesized after NMF
Fig. 3: Characteristic waveform synthesized from the second-order singular values after singular value decomposition
Fig. 4: Characteristic waveform synthesized from the first-order singular value after singular value decomposition
Fig. 5: Flow chart for training the basis matrix
Fig. 6: Flow chart for obtaining the encoding matrix
Detailed description
A specific embodiment of the invention is described below: how the basis matrix is trained with the standard NMF method and how the encoding matrix is obtained.
A. Training the basis matrix.
1) First, speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the frame of speech, as listed in Table 1:
Class 1: 20 ≤ pitch < 30     Class 2: 30 ≤ pitch < 40     Class 3: 40 ≤ pitch < 50
Class 4: 50 ≤ pitch < 60     Class 5: 60 ≤ pitch < 70     Class 6: 70 ≤ pitch < 80
Class 7: 80 ≤ pitch < 90     Class 8: 90 ≤ pitch < 100    Class 9: 100 ≤ pitch ≤ 120
Table 1: Classification of characteristic waveforms
2) Next, about 10,000 frames of experimental samples are selected for each class of characteristic waveform to form the matrix V, and the basis matrix W is trained with the iterative procedure of standard NMF (here the invention adopts the set of iteration rules that minimizes the Euclidean distance between V and W × H). Because the characteristic waveforms have been divided into 9 classes by pitch period, training amounts to training a separate basis matrix W for each class. The steps for training W with the standard NMF method are as follows (see Fig. 5):
(1) Select about 10,000 frames of experimental samples of this class of characteristic waveform to form the matrix V.
(2) Set the maximum number of iterations to 1000. In this embodiment the maximum number of iterations N is set to 1000, which ensures that both W and H have converged when the iteration ends. Set the decomposition order to 16. The term "decomposition order" follows the usage of Seung and Lee when they proposed the standard NMF method; setting the decomposition order to 16 means setting the number of columns of the basis matrix W to 16.
(3) Initialize all elements of W and H with random numbers uniformly distributed on [0, 1].
(4) Normalize each column of W, that is, divide every element of a column by the sum of all elements of that column.
(5) Set the current iteration count to 1.
(6) If the current iteration count is less than or equal to the maximum number of iterations, 1000, go to step (7); otherwise go to step (8).
(7) Update the encoding matrix H and then the basis matrix W according to the formulas below (this is the iterative procedure of the standard NMF method cited by the invention), increase the current iteration count by 1, and go to step (6).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \tag{1}$$
$$W_{ia} \leftarrow W_{ia} \sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \tag{2}$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_{j} W_{ja}} \tag{3}$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(b) $W_{ia}$ is the element in row i, column a of the matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{i} W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with column a of W;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with row a of H;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $\sum_{j} W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_{j} W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
B. Obtaining the encoding matrix for a given frame of characteristic waveform.
1) For a given frame, first classify the characteristic waveform according to its pitch period (pitch), again using Table 1.
2) Then take the basis matrix corresponding to that class (obtained in the training stage), and finally obtain the encoding matrix H with the existing NMF method. The steps for obtaining H are as follows (see Fig. 6):
(1) Treat the characteristic waveform of the frame as the matrix V.
(2) Set the decomposition order to 16 and the maximum number of iterations to 10.
(3) Initialize all elements of H with random numbers uniformly distributed on [0, 1].
(4) Set the current iteration count to 1.
(5) If the current iteration count is less than or equal to the maximum number of iterations, 10, go to step (6); otherwise go to step (7).
(6) Update the encoding matrix H according to the formula below, increase the current iteration count by 1, and go to step (5).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \tag{4}$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(b) $W_{ia}$ is the element in row i, column a of the matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{i} W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with column a of W;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of H.
(7) End the loop and save the encoding matrix H. At this point the characteristic waveform V of the frame has been approximately decomposed into the product of two non-negative matrices W and H, that is, V ≈ W × H.
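To judge whether 10 iterations are enough for a given frame, one option is to track a reconstruction measure across iterations. The Python sketch below is an illustration and not part of the described method. It records the generalized Kullback-Leibler divergence between V and W × H after each application of formula (4); choosing this measure is an assumption (the description itself refers to Euclidean distance), as are the function names, the fixed seed, and the toy matrices. It also assumes that V is strictly positive and that each column of W sums to 1, as formula (3) ensures for a trained basis matrix.

```python
import math
import random

def matmul(A, B):
    """Matrix product of two list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def divergence(V, P):
    """Generalized KL divergence: sum of v*log(v/p) - v + p over all elements."""
    return sum(v * math.log(v / p) - v + p
               for vr, pr in zip(V, P) for v, p in zip(vr, pr))

def divergence_trace(V, W, iterations=10, seed=0):
    """Apply formula (4) repeatedly and record the divergence after each step."""
    rng = random.Random(seed)
    Wt = [list(col) for col in zip(*W)]
    H = [[rng.random() for _ in V[0]] for _ in W[0]]
    trace = [divergence(V, matmul(W, H))]
    for _ in range(iterations):
        WH = matmul(W, H)
        R = [[v / p for v, p in zip(vr, pr)] for vr, pr in zip(V, WH)]
        G = matmul(Wt, R)
        H = [[h * g for h, g in zip(hr, gr)] for hr, gr in zip(H, G)]  # formula (4)
        trace.append(divergence(V, matmul(W, H)))
    return trace


W = [[0.5, 0.1], [0.3, 0.2], [0.2, 0.7]]   # hypothetical basis, columns sum to 1
V = [[1.0, 0.5], [0.8, 0.4], [1.2, 2.0]]   # hypothetical frame
for step, d in enumerate(divergence_trace(V, W)):
    print(step, d)
```

A trace that has flattened out by the last iteration suggests that further updates of H would change the reconstruction very little.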

Claims (1)

1. A non-negative matrix factorization method for speech signal characteristic waveforms, characterized in that the method is carried out in the following steps:
A. Training the basis matrix W
1) First, the speech characteristic waveforms are divided into 9 classes according to the pitch period (pitch) of the frame of speech, as follows:
Class 1: 20 ≤ pitch < 30
Class 2: 30 ≤ pitch < 40
Class 3: 40 ≤ pitch < 50
Class 4: 50 ≤ pitch < 60
Class 5: 60 ≤ pitch < 70
Class 6: 70 ≤ pitch < 80
Class 7: 80 ≤ pitch < 90
Class 8: 90 ≤ pitch < 100
Class 9: 100 ≤ pitch ≤ 120
2) Next, about 10,000 frames of experimental samples are selected for the first class of characteristic waveforms to form the matrix V, and the basis matrix W is trained with the iterative procedure of standard NMF. The steps are as follows:
(1) Select about 10,000 frames of experimental samples of this class of characteristic waveform to form the matrix V.
(2) Set the maximum number of iterations to N, with N greater than 10. Set the decomposition order to M, an integer between 8 and 32. The term "decomposition order" follows the usage of Seung and Lee when they proposed the standard NMF method; setting the decomposition order to M means setting the number of columns of the basis matrix W to M.
(3) Initialize all elements of W and H with random numbers uniformly distributed on [0, 1].
(4) Normalize each column of W, that is, divide every element of a column by the sum of all elements of that column.
(5) Set the current iteration count to 1.
(6) If the current iteration count is less than or equal to the maximum number of iterations N, go to step (7); otherwise go to step (8).
(7) Update the encoding matrix H and then the basis matrix W according to the formulas below, increase the current iteration count by 1, and go to step (6).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \tag{1}$$
$$W_{ia} \leftarrow W_{ia} \sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu} \tag{2}$$
$$W_{ia} \leftarrow \frac{W_{ia}}{\sum_{j} W_{ja}} \tag{3}$$
The symbols in formula (1) are as follows:
(a) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(b) $W_{ia}$ is the element in row i, column a of the matrix W;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{i} W_{ia} V_{i\mu}/(WH)_{i\mu}$ is the inner product of column μ of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with column a of W;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of H.
The symbols in formula (2) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $H_{a\mu}$ is the element in row a, column μ of the matrix H;
(c) $V_{i\mu}$ is the element in row i, column μ of the matrix V;
(d) $(WH)_{i\mu}$ is the element in row i, column μ of the product of W and H;
(e) $V_{i\mu}/(WH)_{i\mu}$ is the element in row i, column μ of V divided by the element in row i, column μ of the product of W and H;
(f) $\sum_{\mu} \frac{V_{i\mu}}{(WH)_{i\mu}} H_{a\mu}$ is the inner product of row i of the matrix formed by the quotients $V_{i\mu}/(WH)_{i\mu}$ for all i and μ with row a of H;
(g) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "update" of each element of W.
The symbols in formula (3) are as follows:
(a) $W_{ia}$ is the element in row i, column a of the matrix W;
(b) $\sum_{j} W_{ja}$ is the sum of all elements in column a of W;
(c) $W_{ia}/\sum_{j} W_{ja}$ is the element in row i, column a of W divided by the sum of all elements in column a of W;
(d) the symbol "←" assigns the value on its right to the matrix element at the corresponding position on its left, which completes the "normalization" of each element of W.
(8) End the loop and save the basis matrix W.
3) Repeat the procedure above to obtain the basis matrices W for the characteristic waveforms of classes 2 to 9, giving 9 basis matrices in total, one for each class of characteristic waveform.
B. Obtaining the encoding matrix H for a given frame of characteristic waveform
1) For a given frame, first assign the characteristic waveform to one of the 9 classes according to its pitch period, using the same classification as in step 1) of part A.
2) Then take the basis matrix W corresponding to that class from the 9 trained basis matrices, and finally obtain the encoding matrix H with the existing NMF method. The steps for obtaining H are as follows:
(1) Treat the characteristic waveform of the frame as the matrix V.
(2) Set the decomposition order to M, an integer between 8 and 32, and set the maximum number of iterations to 10.
(3) Initialize all elements of H with random numbers uniformly distributed on [0, 1].
(4) Set the current iteration count to 1.
(5) If the current iteration count is less than or equal to the maximum number of iterations N, where N is greater than 10, go to step (6); otherwise go to step (7).
(6) Update the encoding matrix H according to the formula below, increase the current iteration count by 1, and go to step (5).
$$H_{a\mu} \leftarrow H_{a\mu} \sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}} \qquad (4)$$
The symbols in formula (4) are as follows:
(a) $H_{a\mu}$ denotes the element in row $a$, column $\mu$ of matrix H;
(b) $W_{ia}$ denotes the element in row $i$, column $a$ of matrix W;
(c) $V_{i\mu}$ denotes the element in row $i$, column $\mu$ of matrix V;
(d) $(WH)_{i\mu}$ denotes the element in row $i$, column $\mu$ of the matrix obtained by multiplying matrix W by matrix H;
(e) $\frac{V_{i\mu}}{(WH)_{i\mu}}$ denotes the element in row $i$, column $\mu$ of matrix V divided by the element in row $i$, column $\mu$ of the matrix obtained by multiplying matrix W by matrix H;
(f) $\sum_{i} W_{ia} \frac{V_{i\mu}}{(WH)_{i\mu}}$ denotes the inner product of column $\mu$ of the matrix formed by the elements $\frac{V_{i\mu}}{(WH)_{i\mu}}$, computed for all $i$ and $\mu$, with column $a$ of matrix W;
(g) the "←" symbol assigns the matrix element on its right-hand side to the element at the corresponding position on its left-hand side, which completes the "update" of each element of matrix H;
(7) End the loop and save the encoding matrix H;
At this point, the frame's characteristic waveform V has been approximately decomposed into the product of two non-negative matrices W and H, i.e. V ≈ W × H.
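Steps (1) to (7) for obtaining H with a fixed, already trained W can be sketched in NumPy as follows. This is a sketch under stated assumptions rather than a definitive implementation: the function name `solve_H`, the `eps` guard against division by zero, the `seed` parameter, and treating one frame as a single column of V are all illustrative choices. The default of 10 iterations follows step (2), and W is assumed to have columns that sum to 1, as produced by the normalization in formula (3).

```python
import numpy as np

def solve_H(V, W, max_iter=10, eps=1e-12, seed=0):
    """Estimate the encoding matrix H for a fixed basis W using update (4).

    V: (n, m) non-negative frame data, W: (n, M) trained basis.
    eps and seed are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    M = W.shape[1]                          # decomposition order
    H = rng.random((M, V.shape[1]))         # step (3): uniform random values in [0, 1)
    for _ in range(max_iter):               # steps (4) to (6)
        ratio = V / (W @ H + eps)           # element-wise V_iμ / (WH)_iμ
        H = H * (W.T @ ratio)               # H_aμ <- H_aμ * sum_i W_ia * ratio_iμ
    return H                                # step (7)

# Hypothetical usage: a 40-sample frame and a basis of order M = 16.
rng = np.random.default_rng(1)
W = rng.random((40, 16))
W /= W.sum(axis=0, keepdims=True)           # columns sum to 1, as after formula (3)
V = rng.random((40, 1))                     # one frame as a column (an assumption)
H = solve_H(V, W)
approx = W @ H                              # approximates V
```

The product `W.T @ ratio` evaluates all the column-by-column inner products of item (f) in one step. In practice W would be the basis matrix trained for the frame's pitch class rather than random data.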
CNA2006100122964A 2006-06-16 2006-06-16 Nonnegative matrix decomposition method for speech signal characteristic waveform Pending CN1862661A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2006100122964A CN1862661A (en) 2006-06-16 2006-06-16 Nonnegative matrix decomposition method for speech signal characteristic waveform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2006100122964A CN1862661A (en) 2006-06-16 2006-06-16 Nonnegative matrix decomposition method for speech signal characteristic waveform

Publications (1)

Publication Number Publication Date
CN1862661A true CN1862661A (en) 2006-11-15

Family

ID=37390074

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006100122964A Pending CN1862661A (en) 2006-06-16 2006-06-16 Nonnegative matrix decomposition method for speech signal characteristic waveform

Country Status (1)

Country Link
CN (1) CN1862661A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101441872B (en) * 2007-11-19 2011-09-14 三菱电机株式会社 Denoising acoustic signals using constrained non-negative matrix factorization
CN107610710A (en) * 2017-09-29 2018-01-19 武汉大学 A kind of audio coding and coding/decoding method towards Multi-audio-frequency object

Similar Documents

Publication Publication Date Title
CN1199151C (en) Speech coder
CN1150516C (en) Vector quantizer method
CN1154283C (en) Coding method and apparatus, and decoding method and apparatus
CN1210689C (en) Improved spectral translation/folding in subband domain
CN1132153C (en) Filter for speech modification or enhancement, and various apparatus, system and method using same
CN1154086C (en) CELP transcoding
CN101044552A (en) Sound encoder and sound encoding method
CN101044553A (en) Scalable encoding apparatus, scalable decoding apparatus, and methods thereof
CN101031960A (en) Scalable encoding device, scalable decoding device, and method thereof
CN101044554A (en) Scalable encoder, scalable decoder,and scalable encoding method
CN1144179C (en) Information decoder and decoding method, information encoder and encoding method and distribution medium
CN1681213A (en) Lossless audio coding/decoding method and apparatus
CN1181150A (en) Algebraic codebook with signal-selected pulse amplitudes for fast coding of speech
CN1265217A (en) Method and appts. for speech enhancement in speech communication system
CN1906664A (en) Audio encoder and audio decoder
CN1266671C (en) Apparatus and method for estimating harmonic wave of sound coder
CN1863039A (en) Hidden communication system and communication method based on audio frequency
CN1787383A (en) Methods and apparatuses for transforming, adaptively encoding, inversely transforming and adaptively decoding an audio signal
CN1692402A (en) Speech synthesis method and speech synthesis device
CN1787075A (en) Method for distinguishing speek speek person by supporting vector machine model basedon inserted GMM core
CN1531348A (en) Image encoder, encoding method and program, image decoder, decoding method and program
CN1849648A (en) Coding apparatus and decoding apparatus
CN1145925C (en) Transmitter with improved speech encoder and decoder
CN1862661A (en) Nonnegative matrix decomposition method for speech signal characteristic waveform
CN1795491A (en) Method for analyzing fundamental frequency information and voice conversion method and system implementing said analysis method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication