CN102664021B - Low-rate speech coding method based on speech power spectrum - Google Patents
- Publication number
- CN102664021B
- Authority
- CN
- China
- Prior art keywords
- dictionary
- power spectrum
- sparse
- speech
- receiving end
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a low-rate speech coding method based on the speech power spectrum, and in particular a speech processing technique based on dictionary-learning sparse representation and reconstruction of signals. An efficient speech model whose main output parameter is the speech power spectrum serves as the low-rate speech coding model. At the sending end, the speech signal is processed to output the speech power spectrum; this parameter is then compressed using sparse theory and finally converted into a bit stream for wireless transmission. Dictionary learning is performed at the receiving end, which makes low-rate speech communication feasible, and the learning exploits as much information as possible from the previous frame of synthesized speech. Sparse coefficients are matched to dictionary atoms on the basis of energy, and a measurement matrix is constructed to improve the correctness of the match, so that the speech power spectrum is recovered optimally at the receiving end.
Description
Technical field
The present invention relates to a low-rate speech coding method, and in particular to a speech processing technique based on dictionary-learning sparse representation and reconstruction of signals.
Background technology
In recent years, with the development of data compression techniques, low-rate speech coding has also developed rapidly and is widely used in mobile communication, secure communication and underwater communication. Although CELP waveform coders can still produce high-quality speech at a bit rate of 4.8 kb/s, coder performance deteriorates sharply once the bit rate falls below 4 kb/s. At such rates the speech signal must be compressed efficiently, and the compression must meet the requirements of the communication link.
Boucheron, Laura E.; De Leon, Phillip L.; Sandoval, Steven. Low bit-rate speech coding through quantization of mel-frequency cepstral coefficients [J]. IEEE Transactions on Audio, Speech and Language Processing, 2012, 20(2): 610-619, shows that directly quantizing and coding the mel cepstral coefficients extracted by transforming the speech power spectrum improves speech coding quality more effectively. At a coding rate of 1.2 kbps, this method scores about 0.1 higher in PESQ evaluation than same-class enhanced mixed excitation linear prediction (MELP) coding, and the improvement is larger still, about 0.25, at the lower rate of 0.6 kbps. The coding of mel cepstral coefficients in that paper is in essence a data compression process from a data set in a large space (the speech power spectral domain) to a data set in a small space (the cepstral domain).
For effectively representing a large data set by a small one, signal sparse representation and reconstruction theory is an emerging means of signal representation that has appeared in recent years and can be applied in many fields such as data mining, pattern classification and compressed sensing. The theory does not require exact recovery of the original data. Instead, according to some criterion, it seeks the smallest number of sparse coefficients in the space of some basis set (the dictionary) that approximates the original data as closely as possible, and thereby reconstructs the data.
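As a minimal illustration of approximating data with a few dictionary atoms, the sketch below uses orthogonal matching pursuit (OMP). OMP is chosen here only because it is short; the patent does not say it is the algorithm used. The function name `omp`, the random Gaussian dictionary and the test signal are all assumptions made for the example.

```python
import numpy as np

def omp(dictionary, signal, n_atoms):
    """Orthogonal matching pursuit: approximate `signal` with at most n_atoms columns."""
    residual = signal.astype(float).copy()
    support = []
    solution = np.zeros(0)
    for _ in range(n_atoms):
        # Select the atom most correlated with what is still unexplained.
        best = int(np.argmax(np.abs(dictionary.T @ residual)))
        if best not in support:
            support.append(best)
        # Refit all selected atoms jointly by least squares.
        solution, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ solution
        if np.linalg.norm(residual) < 1e-10:
            break
    coefficients = np.zeros(dictionary.shape[1])
    coefficients[support] = solution
    return coefficients

# Hypothetical data: a 3-sparse signal in a random overcomplete dictionary.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((64, 256))
dictionary /= np.linalg.norm(dictionary, axis=0)      # unit-energy atoms
truth = np.zeros(256)
truth[[3, 70, 200]] = [1.5, -2.0, 0.8]
signal = dictionary @ truth
coefficients = omp(dictionary, signal, n_atoms=10)
relative_error = np.linalg.norm(signal - dictionary @ coefficients) / np.linalg.norm(signal)
```

Only the few nonzero entries of `coefficients` would need to be transmitted, provided both ends hold the same dictionary.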
Summary of the invention
The object of the invention is to apply dictionary-learning sparse representation and reconstruction theory to speech signal processing, and to build a low-rate speech compression coding system on that theory to realize speech communication.
The technical scheme of the invention is considered from two aspects. 1. Sparse theory: a speech signal has a certain information structure of its own, such as harmonic information. If the dictionary can capture this characteristic structure through learning, the sparse coefficients obtained on that dictionary express the signal's features to the greatest extent, which greatly reduces the redundancy of the output coefficients and achieves efficient data compression. 2. Communication: to reduce the coding rate, the sending and receiving ends must hold the same dictionary, and only the sparse coefficients are transmitted. At the sending end, the original data are decomposed on the dictionary to obtain the sparse coefficients; at the receiving end, the original data are recovered from the received sparse coefficients using the agreed dictionary, which completes the communication. The invention therefore combines the two: dictionary learning is performed at the receiving end to achieve data compression, and the sending end obtains the same dictionary as the receiving end by an analysis-by-synthesis method to meet the communication requirement. The result is a low-rate speech coding method with receiving-end dictionary learning.
The invention takes the speech power spectrum as the main parameter and compresses it using dictionary-learning sparse representation and reconstruction theory. This gives a high-quality vocoder design at a rate of 1.2 kbps that can achieve better synthesized speech quality than vocoders of the same class at the same coding rate.
A low-rate speech coding method based on the speech power spectrum, characterized by comprising the following steps:
(1) Encoding at the sending end: the speech signal passes through a speech model that outputs parameters; the output parameters undergo data processing to produce sparse coefficients, which are converted into a bit stream;
(2) Decoding at the receiving end: the received parameters undergo data processing to recover the relevant parameters, and the final synthesized speech is obtained through a speech synthesis model based on the speech power spectrum.
The speech model at the sending end outputs three parameters: pitch, normalized speech power spectrum, and energy gain.
The sending end converts three parameters into a bit stream for transmission: the pitch, the energy gain, and the sparse coefficients produced from the normalized speech power spectrum by sparse theory.
Data processing at the sending end is implemented by several modules: a sparse decomposition module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary.
The receiving end receives three parameters: pitch, energy gain, and sparse coefficients.
Data processing at the receiving end is implemented by several modules: a sparse reconstruction module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary.
The sending-end dictionary learning module implements, at the sending end, the same dictionary learning algorithm as the receiving end, and produces the dictionary of the current frame solely from the sparse coefficients and dictionaries of the preceding frames.
The receiving-end dictionary learning module builds the dictionary of the current frame from the sparse coefficients and dictionaries of the preceding frames. Learning of the current-frame dictionary is divided into learning in two spaces: the space corresponding to the signals received in the preceding frames, and its complementary space.
The receiving-end dictionary learning method makes low-rate speech communication feasible, and the learning exploits as much information as possible from the previous frame of synthesized speech.
The space corresponding to the signals received in the preceding frames is learned as follows: at time $n$, the dictionary subspace of the previous-frame received signal is learned by
[formula missing from the source text]
where $\|\cdot\|_p$ is the $p$-norm, $D_i$ and $a_i$ are respectively the dictionary and the sparse coefficients of the previous frame at the receiving end, the quantities being learned are a dictionary and its corresponding sparse coefficients, and $\lambda_1$ is the Lagrangian coefficient.
Complementary-space learning uses the average speech power spectrum as auxiliary information to obtain the dictionary of the complementary space by
[formula missing from the source text]
where the quantities being learned are the complementary-space dictionary and its corresponding sparse coefficients. For each pitch period, the training speech power spectra are divided into $K$ classes; the average speech power spectrum of class $i$ enters the formula, and $\lambda_2$ is the Lagrangian coefficient.
In the sparse reconstruction module, the data are rebuilt with a sparse reconstruction algorithm, namely data reconstruction by dictionary-atom matching: the reconstructed data are obtained by multiplying the dictionary $D_n$ at time $n$ by the corresponding sparse coefficients $a_n$, while the sparse positions of the receiving-end coefficients $a_n$ are obtained by matching from the standpoint of energy. The normalized speech power spectrum is recovered by
[formula missing from the source text]
where $D_n$ and $a_n$ are respectively the dictionary and the corresponding sparse coefficients at time $n$. The invention uses this formula to perform the match from the standpoint of energy.
The dictionary $D_n$ at time $n$ is constructed by
[formula missing from the source text]
where $\phi$ is a random matrix generated at the receiving and sending ends from the same agreed seed. Each atom of $D_n$ (that is, each column of the matrix $D_n$) is then energy-normalized.
The beneficial effect that the present invention reaches:
The invention uses an efficient speech model whose main output parameter is the speech power spectrum as the low-rate speech coding model. At the sending end, the speech signal is processed to output the speech power spectrum; this parameter is then compressed using sparse theory and finally converted into a bit stream for wireless transmission. Dictionary learning at the receiving end makes low-rate speech communication feasible, and the learning exploits as much information as possible from the previous frame of synthesized speech. Sparse coefficients are matched to dictionary atoms on the basis of energy, and a measurement matrix is constructed to improve the correctness of the match, so that the speech power spectrum is recovered optimally at the receiving end.
Description of drawings
Fig. 1 is the low-rate speech coding framework of the invention;
Fig. 2A is the sending-end encoding block diagram of the low-rate speech codec scheme based on signal sparse representation and reconstruction theory;
Fig. 2B is the receiving-end decoding block diagram of the low-rate speech codec scheme based on signal sparse representation and reconstruction theory;
Fig. 3A is the dictionary learning framework of the receiving end;
Fig. 3B shows a concrete implementation of dictionary learning at the receiving end.
Embodiment
The low-rate speech coding and decoding method of the invention is described further below with reference to the accompanying drawings.
The low-rate speech coding framework designed by the invention is shown in Fig. 1. The speech signal first passes through the speech model, which outputs the model parameters; these are then compressed using sparse theory and finally converted into a bit stream for wireless transmission.
Fig. 2A and Fig. 2B show the structure of the sending-end encoder and the receiving-end decoder, respectively.
In the sending-end encoding block diagram of Fig. 2A, the speech signal (8 kHz sampling rate) is first divided into frames of 25 ms. The speech model then outputs three parameters: pitch, normalized speech power spectrum, and energy gain. The power spectrum can be computed, for example, with a 512-point FFT. To make quantization and the subsequent sparse representation and reconstruction easier, the power spectrum parameter is split into a gain and a normalized power spectrum, which are processed separately. The sending end passes the pitch, the energy gain, and the sparse coefficients produced from the normalized speech power spectrum by sparse theory through the quantizing encoder, which converts them into a bit stream for transmission.
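The framing and gain/normalized-spectrum split can be sketched as follows. The 8 kHz rate, 25 ms frame and 512-point FFT come from the text; the definition of the gain as the total spectral energy, the absence of an analysis window, and the sine-wave stand-in for speech are assumptions made for this example.

```python
import numpy as np

FS = 8000                          # sampling rate in Hz (from the text)
FRAME_LEN = FS * 25 // 1000        # 25 ms frame -> 200 samples
N_FFT = 512                        # FFT size (from the text)

def analyze_frame(frame):
    """Return (gain, normalized power spectrum) for one frame."""
    spectrum = np.fft.rfft(frame, n=N_FFT)     # zero-pads 200 samples to 512
    power = np.abs(spectrum) ** 2              # 257 non-negative bins
    gain = power.sum()                         # assumed gain: total energy
    normalized = power / gain if gain > 0 else power
    return gain, normalized

# Hypothetical input: one second of a 200 Hz tone standing in for speech.
t = np.arange(FS) / FS
speech = np.sin(2 * np.pi * 200 * t)
n_frames = len(speech) // FRAME_LEN
frames = speech[: n_frames * FRAME_LEN].reshape(n_frames, FRAME_LEN)

gain, normalized = analyze_frame(frames[0])
restored = gain * normalized       # the decoder-side product recovers the spectrum
```

With this definition the normalized spectrum sums to one, so the gain carries the level and the normalized part carries only the spectral shape.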
The pitch parameter can be obtained by the autocorrelation method; the pitch period lies in the range of 20 to 147 samples. The normalized power spectrum is processed by the sparse-theory-based signal sparse representation and reconstruction part shown in the dashed box of the figure. This processing is implemented by three modules: a sparse decomposition module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary. The invention implements at the sending end (the signal analysis end) the same dictionary learning algorithm as at the receiving end (the signal synthesis end), and obtains the sparse coefficients on this dictionary by a suitable algorithm such as basis pursuit. In the quantizing encoder, the pitch is uniformly quantized with 7 bits, and the energy gain is uniformly quantized with 6 bits in the logarithmic domain. For the sparse coefficients, the first 10 coefficients that minimize the decomposition error energy of the normalized power spectrum are chosen as the quantizer input. These 10 parameters are first sorted from largest to smallest and then vector-quantized with 17 bits using the LBG algorithm before transmission.
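The bit allocation above adds up to the stated rate: 7 + 6 + 17 = 30 bits per 25 ms frame, that is 1200 bit/s. The sketch below checks this arithmetic and illustrates autocorrelation pitch estimation and uniform quantization. The helper names, the unnormalized autocorrelation, and the log-gain range of -2 to 6 are assumptions for the example and are not specified in the patent.

```python
import numpy as np

FRAME_MS = 25
PITCH_MIN, PITCH_MAX = 20, 147            # lag range in samples (from the text)
BITS = {"pitch": 7, "gain": 6, "sparse_vq": 17}

def pitch_by_autocorrelation(x):
    """Return the lag in [20, 147] with the largest autocorrelation."""
    x = x - x.mean()
    lags = np.arange(PITCH_MIN, PITCH_MAX + 1)
    scores = [np.dot(x[:-lag], x[lag:]) for lag in lags]
    return int(lags[int(np.argmax(scores))])

def uniform_quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer index in [0, 2**bits - 1]."""
    step_count = 2 ** bits - 1
    return int(round((np.clip(value, lo, hi) - lo) / (hi - lo) * step_count))

def uniform_dequantize(index, lo, hi, bits):
    """Inverse of uniform_quantize, up to the quantization step."""
    return lo + index * (hi - lo) / (2 ** bits - 1)

bits_per_frame = sum(BITS.values())                 # 7 + 6 + 17
bit_rate = bits_per_frame * 1000 // FRAME_MS        # bits per second

# Hypothetical frame: 200 samples of a tone whose period is 50 samples.
frame = np.sin(2 * np.pi * np.arange(200) / 50)
pitch = pitch_by_autocorrelation(frame)
pitch_index = uniform_quantize(pitch, PITCH_MIN, PITCH_MAX, BITS["pitch"])
gain_index = uniform_quantize(np.log10(123.0), -2.0, 6.0, BITS["gain"])  # assumed range
```

The 128 possible lags from 20 to 147 fit exactly into 7 bits, so the pitch index is simply the lag minus 20.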
In the receiving-end decoding block diagram of Fig. 2B, the receiving end first decodes the three received parameters: pitch, energy gain, and sparse coefficients. The received sparse coefficients are then processed by the sparse-theory-based signal sparse representation and reconstruction part shown in the dashed box of the figure. This processing is implemented by three modules: a sparse reconstruction module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary.
The dictionary learning module builds the dictionary of the current frame from the buffered previous-frame sparse coefficients and previous-frame dictionary. Learning of the current-frame dictionary is divided into learning in two spaces: the space corresponding to the previous-frame received signal, and its complementary space. The current-frame dictionary is generated jointly from the previous-frame sparse coefficients and dictionary and from auxiliary information (such as the pitch and the trained average power spectra). The normalized power spectrum is then synthesized from the current-frame sparse coefficients and dictionary by the sparse reconstruction algorithm. The power spectrum is obtained by multiplying the recovered normalized power spectrum by the energy gain. Finally, the phase information and the power spectrum are passed to the speech synthesis model based on the speech power spectrum to obtain the final synthesized speech. The phase information is obtained by the least-square-error inverse short-time Fourier transform (LSE-ISTFT) method, which yields the final synthesized speech at the same time.
The receiving end dictionary study of Fig. 3 B is concrete implements among the figure the sparse coefficient a of the preceding frame of buffer memory
iWith preceding frame dictionary D
iCarry out dictionary study, obtain the dictionary of learning out
Sparse coefficient with correspondence.In to the study of the dictionary of complementary space, supplementary average speech power spectrum is as training data, dictionary
As a part of dictionary, learn, obtain the dictionary of complementary space
Sparse coefficient with correspondence.Above two parts dictionary is merged, and the dictionary D that multiplies each other and construct this frame with stochastic matrix
nAnd the sparse coefficient of this frame and this frame dictionary can synthesize the normalized power spectrum by sparse restructing algorithm.
The key techniques of the invention are:
1. Dictionary learning based on different subspaces
If dictionary learning were performed on the original signal at the sending end and only the sparse coefficients were sent, the receiving end would hold the sparse coefficients without the corresponding dictionary, could not reconstruct the data, and communication would fail. The invention therefore performs dictionary learning at the receiving end, and the sending end obtains the receiving end's dictionary by an analysis-by-synthesis method, so that both ends hold identical dictionaries at every instant and communication is achieved. Receiving-end dictionary learning is shown in Fig. 3A.
For receiving-end dictionary learning, the invention learns the dictionaries corresponding to different subspaces at the receiving end so that the dictionary space covers the whole data space.
Learning of the current dictionary at the receiving end is divided into learning in two spaces: the space corresponding to the previous-frame received signal, and its complementary space.
At time $n$, learning of the dictionary subspace of the previous-frame received signal can in general be expressed by the following formula:
[formula (1) missing from the source text]
In formula (1), $\|\cdot\|_p$ is the $p$-norm, $D_i$ and $a_i$ are respectively the dictionary and the sparse coefficients of the previous frame at the receiving end, and the quantities being learned are a dictionary and its corresponding sparse coefficients. The learning can be carried out with algorithms such as K-SVD or the inverse-matrix iteration of recursive least squares. $\lambda_1$ is the Lagrangian coefficient; in general $\lambda_1$ lies between 0 and 1.
The invention adds auxiliary information to learn the dictionary of the complementary space. With the average speech power spectrum as auxiliary information, the dictionary is learned by the following formula:
[formula (2) missing from the source text]
In formula (2), the quantities being learned are the complementary-space dictionary and its corresponding sparse coefficients. If the dictionary learned for the first subspace is a matrix of $m$ rows and $n$ columns, the space spanned by its columns necessarily lies within the $m$-dimensional full space spanned by $m$ orthogonal basis columns, so it is only a subspace of the full space. The part of the full space that remains after removing the space occupied by this dictionary is called its complementary space with respect to the full space. For each pitch period (pitch periods range from 20 to 147 samples), the invention divides the training speech power spectra into $K$ classes ($K = 64$), and the average speech power spectrum of class $i$ enters formula (2). The complementary-space dictionary can be obtained in the same way as the first-subspace dictionary. $\lambda_2$ is the Lagrangian coefficient; in general $\lambda_2$ lies between 0 and 1.
The dictionary $D_n$ at time $n$ is constructed by the following formula:
[formula (3) missing from the source text]
In formula (3), $\phi$ is a random matrix generated at the receiving end and the sending end from the same agreed seed.
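The role of the shared seed can be sketched as follows: both ends merge the two learned dictionaries, multiply by a random matrix drawn from the same seed, and energy-normalize each atom, so they arrive at the same $D_n$ without transmitting it. Because formula (3) is not available, the square Gaussian shape of $\phi$, the left multiplication, the seed value and the function name are assumptions made for this example.

```python
import numpy as np

def build_frame_dictionary(d_prev, d_comp, seed):
    """Merge two sub-dictionaries, mix with a seeded random matrix, normalize atoms."""
    merged = np.hstack([d_prev, d_comp])                 # m x (n1 + n2)
    m = merged.shape[0]
    phi = np.random.default_rng(seed).standard_normal((m, m))   # assumed shape and law
    d_n = phi @ merged
    d_n /= np.linalg.norm(d_n, axis=0, keepdims=True)    # unit energy per column
    return d_n

# Hypothetical learned dictionaries: 257 spectral bins, 40 + 24 atoms.
rng = np.random.default_rng(1)
d_prev = rng.standard_normal((257, 40))
d_comp = rng.standard_normal((257, 24))

AGREED_SEED = 2012                                       # hypothetical shared seed
sender_dictionary = build_frame_dictionary(d_prev, d_comp, AGREED_SEED)
receiver_dictionary = build_frame_dictionary(d_prev, d_comp, AGREED_SEED)

# Reconstruction as described in the text: dictionary times sparse coefficients.
a_n = np.zeros(64)
a_n[[5, 17, 42]] = [0.6, 0.3, 0.1]
reconstructed = receiver_dictionary @ a_n
```

Since the generator is deterministic for a given seed, the two dictionaries are bit-for-bit identical as long as both ends start from the same learned sub-dictionaries.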
2. Data reconstruction based on dictionary-atom matching
In the sparse reconstruction module, the data are rebuilt with a sparse reconstruction algorithm. Specifically, the normalized speech power spectrum is recovered by the following formula, which matches the dictionary atoms correctly with their sparse coefficients:
[formula (4) missing from the source text]
In formula (4), $D_n$ and $a_n$ are respectively the dictionary and the corresponding sparse coefficients at time $n$. The invention uses formula (4) to perform the match from the standpoint of energy. Because the random matrix $\phi$ introduced when constructing $D_n$ destroys the dictionary's orthogonality, the energy matching is guaranteed to be achievable.
The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make improvements and variations without departing from the technical principle of the invention, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention.
Claims (4)
1. A low-rate speech coding method based on the speech power spectrum, characterized by comprising the following steps:
(1) encoding at the sending end: the speech signal passes through a speech model that outputs parameters; the output parameters undergo data processing to produce sparse coefficients, which are converted into a bit stream;
(2) decoding at the receiving end: the received parameters undergo data processing to recover the relevant parameters, and the final synthesized speech is obtained through a speech synthesis model based on the speech power spectrum;
data processing at the sending end is implemented by several modules, comprising a sparse decomposition module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary;
data processing at the receiving end is implemented by several modules, comprising a sparse reconstruction module, a dictionary learning module, and a module that buffers the previous-frame sparse coefficients and the previous-frame dictionary;
the sending-end dictionary learning module implements, at the sending end, the same dictionary learning algorithm as the receiving end, and produces the dictionary of the current frame solely from the sparse coefficients and dictionaries of the preceding frames;
the receiving-end dictionary learning module builds the dictionary of the current frame from the sparse coefficients and dictionaries of the preceding frames; learning of the current-frame dictionary is divided into learning in two spaces: the space corresponding to the signals received in the preceding frames, and its complementary space;
the space corresponding to the signals received in the preceding frames is learned as follows: at time $n$, the dictionary subspace of the previous-frame received signal is learned by
[formula missing from the source text]
where $\|\cdot\|_p$ is the $p$-norm, $D_i$ and $a_i$ are respectively the dictionary and the sparse coefficients of the previous frame at the receiving end, the quantities being learned are a dictionary and its corresponding sparse coefficients, and $\lambda_1$ is the Lagrangian coefficient;
complementary-space learning uses the average speech power spectrum as auxiliary information to obtain the dictionary of the complementary space by
[formula missing from the source text]
where the quantities being learned are the complementary-space dictionary and its corresponding sparse coefficients; for each pitch period, the training speech power spectra are divided into $K$ classes, the average speech power spectrum of class $i$ enters the formula, and $\lambda_2$ is the Lagrangian coefficient;
the dictionary $D_n$ at time $n$ is constructed by
[formula missing from the source text]
where $\phi$ is a random matrix generated at the receiving end and the sending end from the same agreed seed;
finally, each atom of the dictionary $D_n$, that is, each column of the matrix $D_n$, is energy-normalized;
in the sparse reconstruction module, the data are rebuilt with a sparse reconstruction algorithm: the reconstructed data are obtained by multiplying the dictionary $D_n$ at time $n$ by the corresponding sparse coefficients $a_n$, while the sparse positions of the receiving-end coefficients $a_n$ are obtained by matching from the standpoint of energy, and the normalized speech power spectrum is recovered by the following formula:
[formula missing from the source text]
2. The low-rate speech coding method based on the speech power spectrum according to claim 1, characterized in that the speech model at the sending end outputs three parameters: pitch, normalized speech power spectrum, and energy gain.
3. The low-rate speech coding method based on the speech power spectrum according to claim 2, characterized in that the sending end converts three parameters into a bit stream for transmission: the pitch, the energy gain, and the sparse coefficients produced from the normalized speech power spectrum by sparse theory.
4. The low-rate speech coding method based on the speech power spectrum according to claim 1, characterized in that the receiving end receives three parameters: pitch, energy gain, and sparse coefficients.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012101195671A CN102664021B (en) | 2012-04-20 | 2012-04-20 | Low-rate speech coding method based on speech power spectrum |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012101195671A CN102664021B (en) | 2012-04-20 | 2012-04-20 | Low-rate speech coding method based on speech power spectrum |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102664021A CN102664021A (en) | 2012-09-12 |
CN102664021B (en) | 2013-10-02 |
Family
ID=46773486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012101195671A Expired - Fee Related CN102664021B (en) | 2012-04-20 | 2012-04-20 | Low-rate speech coding method based on speech power spectrum |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102664021B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102881293A (en) * | 2012-10-10 | 2013-01-16 | 南京邮电大学 | Over-complete dictionary constructing method applicable to voice compression sensing |
CN103280221B (en) * | 2013-05-09 | 2015-07-29 | 北京大学 | Lossless audio compression encoding and decoding method and system based on basis pursuit |
CN103345920B (en) * | 2013-05-29 | 2015-07-15 | 河海大学常州校区 | Self-adaptation interpolation weighted spectrum model voice conversion and reconstructing method based on Mel-KSVD sparse representation |
CN103474067B (en) * | 2013-08-19 | 2016-08-24 | 科大讯飞股份有限公司 | speech signal transmission method and system |
CN107622777B (en) * | 2016-07-15 | 2020-04-14 | 公安部第三研究所 | High-code-rate signal acquisition method based on over-complete dictionary pair |
CN107874783A (en) * | 2017-11-23 | 2018-04-06 | 西安电子科技大学 | Intravascular ultrasound imaging equipment with WiFi-based wireless transmission |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101199005A (en) * | 2005-06-17 | 2008-06-11 | 松下电器产业株式会社 | Post filter, decoder, and post filtering method |
CN102034478A (en) * | 2010-11-17 | 2011-04-27 | 南京邮电大学 | Voice secret communication system design method based on compressive sensing and information hiding |
CN102332268A (en) * | 2011-09-22 | 2012-01-25 | 王天荆 | Speech signal sparse representation method based on self-adaptive redundant dictionary |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080219466A1 (en) * | 2007-03-09 | 2008-09-11 | Her Majesty the Queen in Right of Canada, as represented by the Minister of Industry, through | Low bit-rate universal audio coder |
CN102576531B (en) * | 2009-10-12 | 2015-01-21 | 诺基亚公司 | Method and apparatus for processing multi-channel audio signals |
Also Published As
Publication number | Publication date |
---|---|
CN102664021A (en) | 2012-09-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102664021B (en) | Low-rate speech coding method based on speech power spectrum | |
CN103778919B (en) | Based on compressed sensing and the voice coding method of rarefaction representation | |
CN101083076B (en) | Method and apparatus to encode and/or decode signal using bandwidth extension technology | |
CN103325375B (en) | One extremely low code check encoding and decoding speech equipment and decoding method | |
CN102341849B (en) | Pyramid vector audio coding | |
CN101577605B (en) | Speech LPC hiding and extraction algorithm based on filter similarity | |
US11594236B2 (en) | Audio encoding/decoding based on an efficient representation of auto-regressive coefficients | |
CN105070293A (en) | Audio bandwidth extension coding and decoding method and device based on deep neural network | |
CN1552059A (en) | Method and apparatus for speech reconstruction in a distributed speech recognition system | |
CN103531205A (en) | Asymmetrical voice conversion method based on deep neural network feature mapping | |
CN110491400B (en) | Speech signal reconstruction method based on depth self-encoder | |
CN110473557B (en) | Speech signal coding and decoding method based on depth self-encoder | |
CN103081006A (en) | Method and device for processing audio signals | |
CN103714822A (en) | Sub-band coding and decoding method and device based on SILK coder decoder | |
CN107274883B (en) | Voice signal reconstruction method and device | |
CN103946918A (en) | Voice signal encoding method, voice signal decoding method, and apparatus using the same | |
CN102918590B (en) | Encoding method and device, and decoding method and device | |
CN109616129B (en) | Mixed multi-description sinusoidal coder method for improving voice frame loss compensation performance | |
CN103236262A (en) | Transcoding method for code streams of voice coder | |
CN102982807B (en) | Method and system for multi-stage vector quantization of speech signal LPC coefficients | |
CN103854655A (en) | Low-bit-rate voice coder and decoder | |
CN101604524B (en) | Stereo coding method, stereo coding device, stereo decoding method and stereo decoding device | |
CN103400582B (en) | Towards decoding method and the system of multisound path three dimensional audio frequency | |
US9524727B2 (en) | Method and arrangement for scalable low-complexity coding/decoding | |
CN102903365A (en) | Method for refining parameter of narrow band vocoder on decoding end |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20131002 Termination date: 20160420 |