KR20130047630A - Apparatus and method for coding signal in a communication system - Google Patents
Apparatus and method for coding signal in a communication system
- Publication number
- KR20130047630A
- Authority
- KR
- South Korea
- Prior art keywords
- residual signal
- vector
- frequency
- signal
- order
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0016—Codebook for LPC parameters
Abstract
The present invention relates to an apparatus and method for encoding speech and audio signals using a Code Excited Linear Prediction (CELP) coding scheme in a communication system. A residual signal for the speech and audio signal is calculated, the residual signal is converted into the frequency domain, the frequency energy of the residual signal is calculated using the frequency coefficients of the residual signal, an energy concentration rate for each vector order of the residual signal is calculated from the frequency energy, and the target vector order of the residual signal is determined by comparing the energy concentration rates for the respective vector orders.
Description
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a communication system, and more particularly, to an apparatus and method for encoding speech and audio signals using a Code Excited Linear Prediction (CELP) coding scheme in a communication system.
In communication systems, active research is under way to provide users with services of various quality of service (QoS) levels at high transmission speeds. Methods have been proposed for rapidly transmitting data with various QoS requirements over limited resources. In recent years, as networks have developed and user demand for high-quality services has grown, schemes for compressing and restoring pulse code modulation (PCM) signals have been proposed as a way to compress and transmit voice and audio signals over a network, and many voice/audio codecs have been developed for this purpose.
As examples of such voice/audio codecs, recent codecs such as ITU-T G.729.1 and G.718 support multiple bit rates in an embedded structure. At the low bit rates, a high compression ratio is achieved with the CELP technique, which models the generation of the speech signal. At the high bit rates, the residual signal of the speech and audio signal is transformed from the time domain into the frequency domain by a Modified Discrete Cosine Transform (MDCT) or a Discrete Fourier Transform (DFT) and then quantized.
Here, CELP is designed to suit voice better than audio, so the characteristics of the residual signal, which is the difference between the original sound and the synthesized signal encoded by CELP, differ between the two. In the case of voice, CELP can properly represent formants and pitch, which have large frequency magnitudes; in the case of music, it cannot, and many large frequency components remain in the residual signal. Moreover, even for voice, CELP may yield either a residual whose frequency distribution is even, when the formants and pitch are represented accurately, or a residual containing large frequency coefficients, when they are not.
However, in current communication systems, no specific method has been proposed for properly processing the residual signal when voice and audio signals are encoded with the CELP coding scheme as described above. Because the residual signal is not properly processed, the encoding performance for voice and audio signals under the CELP coding scheme degrades, which makes it difficult to provide a high-quality service to users.
Accordingly, in order to provide high quality voice and audio services in a communication system, a method of encoding voice and audio signals using a CELP encoding method is required.
Accordingly, an object of the present invention is to provide an apparatus and method for encoding a signal in a communication system.
Another object of the present invention is to provide an apparatus and method for encoding a speech and audio signal using a Code Excited Linear Prediction (CELP) coding scheme in a communication system.
Another object of the present invention is to provide a signal encoding apparatus and method that, when encoding speech and audio signals using a CELP coding scheme in a communication system, determine the quantization vector dimension according to the distribution of frequency coefficients of the residual signal for the speech and audio signals and thereby process the residual signal properly.
Still another object of the present invention is to provide a signal encoding apparatus and method that, when encoding speech and audio signals using the CELP coding scheme in a communication system, analyze the frequency characteristics of the residual signal for the speech and audio signals, determine the quantization vector order according to the energy concentration, and process the residual signal properly, thereby improving the quality of voice and audio services.
According to an aspect of the present invention, there is provided a signal encoding apparatus in a communication system, the apparatus comprising: an encoder that encodes speech and audio signals by a Code Excited Linear Prediction (CELP) coding scheme; a residual signal calculator that calculates a residual signal for the speech and audio signals; a frequency converter that converts the residual signal into the frequency domain; an energy calculator that calculates frequency energy of the residual signal using frequency coefficients of the residual signal; an energy concentration ratio calculator that calculates an energy concentration ratio for each vector order of the residual signal from the frequency energy of the residual signal; and a vector order determiner that determines a target vector order of the residual signal by comparing the energy concentration ratios for the respective vector orders.
In accordance with another aspect of the present invention, there is provided a method of encoding a signal in a communication system, the method comprising: encoding speech and audio signals by a Code Excited Linear Prediction (CELP) coding scheme; calculating a residual signal for the speech and audio signals; converting the residual signal into the frequency domain; calculating frequency energy of the residual signal using the frequency coefficients of the residual signal; calculating an energy concentration ratio for each vector order of the residual signal from the frequency energy of the residual signal; and determining a target vector order of the residual signal by comparing the energy concentration ratios for the respective vector orders.
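The first two steps of the method, calculating the residual and converting it into the frequency domain, can be illustrated with a minimal sketch. The function names `residual` and `mdct` are hypothetical, the CELP-synthesized signal is assumed to be already available, and the transform is a direct, unwindowed MDCT written for readability rather than speed; none of these details are taken from the patent text.

```python
import math


def residual(original, synthesized):
    """Time-domain residual: original minus CELP-synthesized signal, per sample."""
    if len(original) != len(synthesized):
        raise ValueError("signals must have the same length")
    return [o - s for o, s in zip(original, synthesized)]


def mdct(frame):
    """Direct O(N^2) MDCT of a 2N-sample frame, returning N coefficients.

    No analysis window is applied; a real codec would window and overlap frames.
    """
    if len(frame) % 2:
        raise ValueError("frame length must be even")
    n_half = len(frame) // 2
    return [
        sum(
            x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(frame)
        )
        for k in range(n_half)
    ]
```

For a frame of 2N residual samples, `mdct(residual(original, synthesized))` yields the N frequency coefficients on which the later energy calculations operate.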
According to the present invention, when speech and audio signals are encoded using a CELP coding scheme in a communication system, the quantization vector dimension is determined according to the distribution of frequency coefficients of the residual signal; in particular, the frequency characteristics of the residual signal are analyzed and the quantization vector order is determined according to the energy concentration. The residual signal is thereby processed properly, which improves the encoding performance for speech and audio signals under the CELP coding scheme and makes it possible to provide high-quality voice and audio services.
FIG. 1 is a view schematically showing a structure of a signal encoding apparatus in a communication system according to an embodiment of the present invention.
FIG. 2 is a diagram schematically illustrating an encoding process of a signal encoding apparatus in a communication system according to an embodiment of the present invention.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. It should be noted that in the following description, only parts necessary for understanding the operation according to the present invention will be described, and descriptions of other parts will be omitted so as not to distract from the gist of the present invention.
The present invention proposes a signal encoding apparatus and method in a communication system. An embodiment of the present invention is described using, as an example, an apparatus and method for encoding voice and audio signals to provide services of various quality of service (QoS) levels, for example voice and audio services, in a communication system; however, the proposed signal encoding scheme may equally be applied to encoding signals for other services.
In addition, an embodiment of the present invention proposes an apparatus and method for encoding speech and audio signals using a Code Excited Linear Prediction (CELP) coding scheme in a communication system. When voice and audio signals are encoded with the CELP coding scheme, the quantization vector dimension is determined according to the distribution of frequency coefficients of the residual signal, so that the residual signal is processed properly and the encoding performance for speech and audio signals improves. Specifically, the frequency characteristics of the residual signal are analyzed and the quantization vector order is determined according to the energy concentration, so that the residual signal is processed properly and high-quality voice and audio services can be provided.
Here, when speech and audio signals are encoded with the CELP coding scheme, the frequency characteristics of the residual signal for a speech signal are uniformly distributed, whereas the residual signal for a music signal, that is, an audio signal, shows strong peak components. The quantization vector order is therefore determined in consideration of the distribution of the frequency characteristics of the residual signal, so that the residual signal is quantized efficiently with a limited number of bits. That is, when CELP coding is applied to speech and audio signals and the resulting residual signal is encoded, the target vector order is determined according to the distribution of the frequency coefficients of the residual signal, the residual signal is processed properly, and a high-quality voice and audio service is provided.
According to an embodiment of the present invention, in a voice/audio codec using multiple bit rates, the spectral distribution of the residual signal that is structurally generated by the CELP coding scheme is analyzed, and the dimension of the target vector to be coded is adjusted accordingly, so that the residual signal is processed properly and high-quality voice and audio services are provided. When the spectral distribution shows that the residual signal has a small number of large frequency components, the order of the target vector to be coded is reduced so that those large components are quantized more precisely with the limited bits, which improves the quality of speech and audio services. When the frequency components of the residual signal are evenly distributed, the order of the target vector to be coded is made large and vector quantization is performed.
That is, in an embodiment of the present invention, the frequency distribution of the residual signal for the speech and audio signals is analyzed to determine the target vector order, which reduces the quantization error and improves the sound quality of the voice/audio codec. The frequency distribution is analyzed by calculating the energy concentration ratio of the residual signal: when the energy concentration ratio is high, the target vector order is made small, and when it is low, the target vector order is made large. The more important frequency coefficients are thereby quantized more precisely, which improves the quality of voice and audio services.
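The relationship between energy concentration and target vector order can be illustrated with a small sketch. The text available here does not give the formula for the energy concentration ratio or the exact comparison rule, so the code below uses a stand-in: the ratio for a candidate order is taken to be the share of total energy held by that many largest components, and the smallest candidate whose share reaches a threshold is chosen. The names `concentration` and `choose_order`, the candidate orders, and the threshold of 0.8 are all assumptions made for illustration.

```python
def concentration(energies, order):
    """Share of total energy held by the `order` largest energy values (assumed definition)."""
    total = sum(energies)
    if total <= 0:
        return 0.0
    top = sorted(energies, reverse=True)[:order]
    return sum(top) / total


def choose_order(energies, candidates=(2, 4, 8), threshold=0.8):
    """Pick the smallest candidate order that captures `threshold` of the energy.

    Peaky spectra (high concentration) give a small order; flat spectra
    fall through to the largest candidate.
    """
    for order in sorted(candidates):
        if concentration(energies, order) >= threshold:
            return order
    return max(candidates)
```

With these assumptions, a residual dominated by two strong components, such as `[100, 90, 1, 1, 1, 1, 1, 1]`, selects order 2, while a flat residual such as `[1] * 8` selects order 8, matching the qualitative behaviour described above.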
In other words, in existing voice/audio codecs the target vector order of the residual signal is fixed to be equal to the number of frequency coefficients. When the residual signal contains a few large frequency components, that is, strong tonal components, as the residual signal for a music (audio) signal does, the large frequency coefficients are not quantized precisely, and the quality of voice and audio services degrades. In an embodiment of the present invention, when voice and audio signals are encoded with a CELP coding scheme, the target vector order is determined according to the frequency distribution of the residual signal, so that audio signals such as music in particular are quantized more precisely and the quality of voice and audio services improves. Next, an apparatus for encoding voice and audio signals in a communication system according to an embodiment of the present invention will be described in more detail with reference to FIG. 1.
FIG. 1 is a diagram schematically illustrating a structure of a signal encoding apparatus in a communication system according to an embodiment of the present invention.
Referring to FIG. 1, the apparatus for encoding a signal includes a
The signal encoding apparatus includes a
In more detail, the
The
The
The residual
The
For example, the frequency
In Equation 1, e(n) denotes subband energy for the four MDCT coefficients, and n denotes a subband index.
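Equation 1 itself is not reproduced in this text. As a hedged illustration of a subband energy over four MDCT coefficients, the sketch below assumes that e(n) is the sum of the squares of the four coefficients in subband n; the function name `subband_energies` and the `band_size` parameter are hypothetical.

```python
def subband_energies(coeffs, band_size=4):
    """Return e(n) for each subband n, assuming e(n) = sum of squared coefficients.

    Coefficients are grouped in consecutive runs of `band_size`; a shorter
    final run, if any, is treated as its own subband.
    """
    if band_size <= 0:
        raise ValueError("band_size must be positive")
    return [
        sum(c * c for c in coeffs[start:start + band_size])
        for start in range(0, len(coeffs), band_size)
    ]
```

For example, eight coefficients `[1, 2, 3, 4, 0, 0, 0, 2]` give two subband energies, 30 for n = 0 and 4 for n = 1.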
The energy
In addition, the energy
The vector
The
The vector
The
The
The
The
As described above, the
FIG. 2 is a diagram schematically illustrating an encoding process of a signal encoding apparatus in a communication system according to an embodiment of the present invention.
Referring to FIG. 2, in
In
Next, in
Then, in
In
Next, in
Thus, in the communication system according to the embodiment of the present invention, in a voice/audio codec using multiple bit rates, the spectral distribution of the residual signal that is structurally generated by the CELP coding scheme is analyzed through the energy concentration ratio of the residual signal, and the dimension of the target vector to be coded is determined from that analysis. When the frequency distribution shows that the residual signal has a few large frequency components, the order of the target vector to be coded is reduced so that those components are quantized more precisely with the limited bits, which improves the quality of voice and audio services. When the frequency components of the residual signal are evenly distributed, the order of the target vector is made large and vector quantization is performed. As described above, the frequency distribution is analyzed by calculating the energy concentration ratio of the residual signal: when the energy concentration ratio is high, the target vector order is made small, and when it is low, the target vector order is made large. The more important frequency coefficients are thereby quantized more precisely, which improves the quality of voice and audio services.
While the invention has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the appended claims and their equivalents.
Claims (20)
An encoder which encodes a voice and an audio signal by a code excited linear prediction (CELP) coding method;
A residual signal calculator configured to calculate a residual signal for the voice and audio signals;
A frequency converter converting the residual signal into a frequency domain;
An energy calculator configured to calculate frequency energy of the residual signal using frequency coefficients of the residual signal;
An energy concentration ratio calculator for calculating an energy concentration ratio for each vector order of the residual signal from the frequency energy of the residual signal; and
A vector order determiner for determining a target vector order of the residual signal by comparing the energy concentration ratios for the respective vector orders.
The vector order determiner determines, as the target vector order, the vector order whose energy concentration ratio is the maximum among the vector orders.
The residual signal calculator calculates a difference between the speech and audio signals and a signal resynthesized by a code excited linear prediction codec using the code excited linear prediction coding scheme.
The frequency converter transforms the residual signal from the time domain into the frequency domain through a Modified Discrete Cosine Transform (MDCT) or a Discrete Fourier Transform (DFT).
The signal encoding apparatus further comprises a residual signal weighting unit for obtaining a weighted signal of the residual signal by applying a perceptual weighting filter to the frequency coefficients of the residual signal.
The energy calculator calculates energy in subbands of the residual signal by using the frequency coefficients of the residual signal.
The energy concentration ratio calculator sorts the frequency energy of the residual signal in order of magnitude, calculates the frequency energy for each vector order according to the sorted order, and calculates the energy concentration ratio for each vector order.
A position determiner that assigns the frequency coefficients to the target vector of the residual signal, in descending order of absolute value and up to the target vector order, and stores the positions of the target vector to which the frequency coefficients are assigned;
A position quantizer that calculates the positions of the frequency coefficients assigned to the target vector and quantizes the positions of the target vector.
A gain quantizer for quantizing the gain of the target vector;
A normalizer for normalizing the target vector by the quantized gain of the target vector;
A shape quantizer for quantizing the normalized target vector;
A code quantizer for quantizing the position code of the target vector.
The gain quantizer quantizes the gain of the target vector to the closest value in a codebook generated in advance using training data;
The shape quantizer quantizes the normalized target vector by applying algebraic vector quantization, or quantizes the normalized target vector to the closest value in the codebook.
Encoding a speech and an audio signal using a Code Excited Linear Prediction (CELP) coding scheme;
Calculating a residual signal for the speech and audio signal;
Converting the residual signal into a frequency domain;
Calculating frequency energy of the residual signal using the frequency coefficients of the residual signal;
Calculating an energy concentration ratio for each vector order of the residual signal from the frequency energy of the residual signal; and
Determining a target vector order of the residual signal by comparing the energy concentration ratios for the respective vector orders.
The determining of the target vector order includes determining, as the target vector order, the vector order whose energy concentration ratio is the maximum among the vector orders.
The calculating of the residual signal includes calculating a difference between the speech and audio signals and a signal resynthesized by a code excited linear prediction codec using the code excited linear prediction coding scheme.
The converting into the frequency domain includes transforming the residual signal from the time domain into the frequency domain through a Modified Discrete Cosine Transform (MDCT) or a Discrete Fourier Transform (DFT).
The signal encoding method further comprises applying a perceptual weighting filter to the frequency coefficients of the residual signal to obtain a weighted signal of the residual signal.
The calculating of the frequency energy includes calculating energy in subbands of the residual signal by using the frequency coefficients of the residual signal.
The calculating of the energy concentration ratio includes sorting the frequency energy of the residual signal in order of magnitude, calculating the frequency energy for each vector order according to the sorted order, and calculating the energy concentration ratio for each vector order.
Assigning the frequency coefficients to the target vector of the residual signal, in descending order of absolute value and up to the target vector order, and storing the positions of the target vector to which the frequency coefficients are assigned;
Calculating the positions of the frequency coefficients assigned to the target vector, and quantizing the positions of the target vector.
Quantizing the gain of the target vector;
Normalizing the target vector by the quantized gain of the target vector;
Shape-quantizing the normalized target vector;
Quantizing the position code of the target vector.
The quantizing of the gain includes quantizing the gain of the target vector to the closest value in a codebook generated in advance using training data;
The shape quantizing includes quantizing the normalized target vector by applying algebraic vector quantization, or quantizing the normalized target vector to the closest value in the codebook.
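The position, gain, and shape steps recited in the claims can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the gain is assumed to be the RMS value of the target vector, both codebooks are assumed to be small lists supplied by the caller (with strictly positive gain entries), the shape search is a plain nearest-neighbour search rather than algebraic vector quantization, and all function names are hypothetical.

```python
import math


def select_target(coeffs, order):
    """Keep the `order` largest-magnitude coefficients; return (positions, values)."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    positions = sorted(ranked[:order])  # stored positions, in frequency order
    return positions, [coeffs[i] for i in positions]


def nearest_scalar(value, codebook):
    """Index of the codebook entry closest to `value`."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


def nearest_vector(vec, codebook):
    """Index of the codebook vector with the smallest squared error to `vec`."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )


def gain_shape_quantize(target, gain_codebook, shape_codebook):
    """Quantize gain, normalize by the quantized gain, then quantize the shape."""
    if not target:
        raise ValueError("target vector must not be empty")
    gain = math.sqrt(sum(v * v for v in target) / len(target))  # assumed RMS gain
    gain_index = nearest_scalar(gain, gain_codebook)
    quantized_gain = gain_codebook[gain_index]  # assumed to be non-zero
    normalized = [v / quantized_gain for v in target]
    shape_index = nearest_vector(normalized, shape_codebook)
    return gain_index, shape_index
```

Normalizing by the quantized gain rather than the exact gain means the shape search sees the same scaling that a decoder would reconstruct, which is the order of operations the claims describe.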
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/661,945 US8924203B2 (en) | 2011-10-28 | 2012-10-26 | Apparatus and method for coding signal in a communication system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020110111464 | 2011-10-28 | ||
KR20110111464 | 2011-10-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20130047630A (en) | 2013-05-08 |
Family
ID=48659026
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020120119933A KR20130047630A (en) | 2011-10-28 | 2012-10-26 | Apparatus and method for coding signal in a communication system |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20130047630A (en) |
- 2012-10-26: KR application KR1020120119933A (published as KR20130047630A), status: Application Discontinuation
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI642053B (en) * | 2016-04-12 | 2018-11-21 | 弗勞恩霍夫爾協會 | Audio encoder for encoding an audio signal, method for encoding an audio signal and computer program under consideration of a detected peak spectral region in an upper frequency band |
US10825461B2 (en) | 2016-04-12 | 2020-11-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder for encoding an audio signal, method for encoding an audio signal and computer program under consideration of a detected peak spectral region in an upper frequency band |
US11682409B2 (en) | 2016-04-12 | 2023-06-20 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder for encoding an audio signal, method for encoding an audio signal and computer program under consideration of a detected peak spectral region in an upper frequency band |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6980871B2 (en) | Signal coding method and its device, and signal decoding method and its device | |
US8099275B2 (en) | Sound encoder and sound encoding method for generating a second layer decoded signal based on a degree of variation in a first layer decoded signal | |
TWI576832B (en) | Apparatus and method for generating bandwidth extended signal | |
KR20090122142A (en) | A method and apparatus for processing an audio signal | |
JP6600054B2 (en) | Method, encoder, decoder, and mobile device | |
EP2814028B1 (en) | Audio and speech coding device, audio and speech decoding device, method for coding audio and speech, and method for decoding audio and speech | |
KR102386738B1 (en) | Signal encoding method and apparatus, and signal decoding method and apparatus | |
US9424857B2 (en) | Encoding method and apparatus, and decoding method and apparatus | |
US9830919B2 (en) | Acoustic signal coding apparatus, acoustic signal decoding apparatus, terminal apparatus, base station apparatus, acoustic signal coding method, and acoustic signal decoding method | |
US8924203B2 (en) | Apparatus and method for coding signal in a communication system | |
CN108701462B (en) | Adaptive quantization of weighting matrix coefficients | |
WO2011045926A1 (en) | Encoding device, decoding device, and methods therefor | |
US8706509B2 (en) | Method and a decoder for attenuation of signal regions reconstructed with low accuracy | |
KR20130047630A (en) | Apparatus and method for coding signal in a communication system | |
KR20160098597A (en) | Apparatus and method for codec signal in a communication system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WITN | Withdrawal due to no request for examination |