KR100348137B1

KR100348137B1 - Speech Encoding and Decoding Method by Sampling Rate Conversion

Info

Publication number: KR100348137B1
Application number: KR1019950050669A
Authority: KR
Inventors: 김상룡; 김흥국
Original assignee: 삼성전자 주식회사
Priority date: 1995-12-15
Filing date: 1995-12-15
Publication date: 2002-11-30
Also published as: KR970055619A

Abstract

Disclosed are an encoding method that enables efficient encoding by reducing the dynamic range of the residual signal obtained by linear predictive analysis of a speech signal, and a corresponding decoding method.

The encoding method according to the present invention is a speech encoding method that obtains a residual signal by linear predictive analysis of a frame-based speech signal and encodes that residual signal, comprising: a first linear prediction analysis process of generating a first residual signal by linear predictive analysis of the frame-based speech signal; a process of interpolating the first residual signal generated in the first linear prediction analysis process; a second linear prediction analysis process of obtaining a second residual signal by linear predictive analysis of the interpolated first residual signal at an order lower than the order used in the first linear prediction analysis process; a process of decimating the second residual signal; and a process of encoding the decimated residual signal.

Because the speech encoding method according to the present invention reduces the dynamic range of the residual signal, the signal can be encoded with a small number of bits.

Description

Speech coding and decoding method by sampling rate conversion

The present invention relates to a method for encoding a speech signal, and more particularly to an encoding method that enables efficient encoding by reducing the dynamic range of the residual signal obtained by linear predictive analysis of a speech signal, and to a corresponding decoding method.

Speech encoding methods, which reduce the amount of information by removing redundancy from the speech signal, increase transmission efficiency when the speech signal is transmitted and reduce the storage capacity required when speech information is stored. They can be broadly classified into waveform coding methods, source coding methods, and hybrid coding methods that combine the two.

In hybrid coding methods, formant information is usually encoded by linear predictive coding, and several methods have been proposed that differ in how the remaining residual signal is encoded: residual-excited linear prediction (RELP), vector sum excited linear prediction (VSELP), multipulse-excited linear prediction (MPLP), and code-excited linear prediction (CELP).

In general, a speech signal is assumed to be quasi-periodic and is modeled as a linear prediction filter driven by a pulse train or white noise. That is, the speech sample s(n) is assumed to be predictable from the previous p samples {s(n-1), s(n-2), ..., s(n-p)}.

Expressed as a formula, the predicted signal $\hat{s}(n)$ of s(n) is

$$\hat{s}(n) = \sum_{i=1}^{p} \alpha_i \, s(n-i)$$

where {α_i} are the linear predictive coefficients. If the difference between the original signal and the predicted signal (the residual signal) is denoted e(n), it is given by

$$e(n) = s(n) - \hat{s}(n) = s(n) - \sum_{i=1}^{p} \alpha_i \, s(n-i)$$
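As an illustration of the prediction and residual definitions above, the following is a minimal sketch in plain Python. The patent does not say how the coefficients {α_i} are estimated; the autocorrelation method with the Levinson-Durbin recursion is assumed here, and the function names are hypothetical.

```python
def autocorrelation(s, p):
    """Return r[0..p], the autocorrelation of frame s at lags 0..p."""
    n = len(s)
    return [sum(s[k] * s[k - lag] for k in range(lag, n)) for lag in range(p + 1)]


def levinson_durbin(r, p):
    """Solve the normal equations for the predictor coefficients alpha_1..alpha_p."""
    a = [0.0] * (p + 1)      # a[i] holds alpha_i; a[0] is unused
    err = r[0]               # prediction error energy of the order-0 predictor
    for i in range(1, p + 1):
        if err <= 0.0:       # degenerate (e.g. all-zero) frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def residual(s, alpha):
    """e(n) = s(n) - sum_i alpha_i * s(n - i), taking s(n) = 0 for n < 0."""
    p = len(alpha)
    return [
        s[n] - sum(alpha[i - 1] * s[n - i] for i in range(1, p + 1) if n - i >= 0)
        for n in range(len(s))
    ]
```

For a frame that is well described by the predictor, the residual carries much less energy than the frame itself, which is the property the rest of the description relies on.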

The residual e(n) can be modeled in various ways, but it is mainly modeled as a pulse train when the speech is voiced and as white noise when it is unvoiced, and is used as the excitation signal of the filter.

Conventional speech analysis, and speech encoding apparatus based on it, model a given segment of speech with a 10th-order LPC filter.

However, this modeling cannot represent the speech accurately, so the dynamic range of the residual signal increases. Consequently, the residual signal e(n) must be represented with a large number of bits to ensure high-quality synthesized speech.

The present invention was devised to solve the above problem, and its object is to provide an improved encoding method that reduces the number of bits needed to represent the residual signal by reducing the dynamic range of the residual signal.

Another object of the present invention is to provide a decoding method corresponding to the above encoding method.

To achieve the above object, the encoding method according to the present invention is

a speech encoding method that obtains a residual signal by linear predictive analysis of a frame-based speech signal and encodes that residual signal, comprising:

a first linear prediction analysis process of generating a first residual signal by linear predictive analysis of the frame-based speech signal;

a process of interpolating the first residual signal generated in the first linear prediction analysis process;

a second linear prediction analysis process of obtaining a second residual signal by linear predictive analysis of the interpolated first residual signal at an order lower than the order used in the first linear prediction analysis process;

a process of decimating the second residual signal; and

a process of encoding the residual signal decimated in the decimation process.

The present invention reduces the dynamic range of the residual signal through repeated interpolation and decimation, which enables accurate modeling of speech and allows these parameters to be transmitted with a small number of bits. A low-bit-rate speech coder can therefore be implemented effectively.

To achieve the above other object, the decoding method according to the present invention is

a method of restoring a speech signal from parameters that were transmitted after interpolating a first residual signal obtained by linear predictive coding of the speech signal, performing linear predictive analysis on it, and decimating and encoding the resulting second residual signal, the method comprising:

a restoration process of restoring the second residual signal from the transmission parameters;

a process of interpolating the second residual signal restored in the restoration process;

a first synthesis process of synthesizing the interpolated second residual signal through a synthesis filter;

a decimation process of generating a first residual signal by decimating the residual signal synthesized in the first synthesis process; and

a second synthesis process of synthesizing the first residual signal obtained in the decimation process through a synthesis filter. The features and effects of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a flowchart showing the encoding method according to the present invention.

First, in the first linear prediction analysis process (S102), a first residual signal is generated by linear predictive analysis of the frame-based speech signal.

In the interpolation process (S104), the first residual signal generated in the first linear prediction analysis process (S102) is interpolated.

In the second linear prediction analysis process (S106), a second residual signal is obtained by linear predictive analysis of the first residual signal interpolated in the interpolation process (S104), at an order lower than the order used in the first linear prediction analysis process (S102).

In the decimation process (S108), the second residual signal obtained in the second linear prediction analysis process (S106) is decimated.

In the encoding process (S110), the residual signal decimated in the decimation process (S108) is encoded.
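The flow S102 to S108 can be sketched as a single pass in plain Python. This is only an illustration under stated assumptions: the LPC estimator (autocorrelation method with Levinson-Durbin), linear interpolation, and the default orders and factor (`p=10`, `p_low=4`, `factor=5`) are choices made for the example, and all function names are hypothetical. The encoding step S110 is left out.

```python
def lpc(x, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion (assumed estimator)."""
    r = [sum(x[k] * x[k - lag] for k in range(lag, len(x))) for lag in range(order + 1)]
    a, err = [0.0] * (order + 1), r[0]
    for i in range(1, order + 1):
        if err <= 0.0:                 # degenerate frame: keep remaining coefficients at 0
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, alpha):
    """Prediction error x(n) - sum_i alpha_i * x(n - i), with zero history."""
    return [x[n] - sum(c * x[n - i] for i, c in enumerate(alpha, 1) if n >= i)
            for n in range(len(x))]


def interpolate(x, factor):
    """Linear interpolation to len(x) * factor samples (assumed method)."""
    out = []
    for n, cur in enumerate(x):
        nxt = x[n + 1] if n + 1 < len(x) else cur   # hold the last sample
        out.extend(cur + (nxt - cur) * j / factor for j in range(factor))
    return out


def decimate(x, factor):
    """Keep every factor-th sample."""
    return x[::factor]


def encode_pass(frame, p=10, p_low=4, factor=5):
    """One pass through S102-S108; returns both coefficient sets and the decimated residual."""
    alpha1 = lpc(frame, p)              # S102: first LP analysis
    e1 = lp_residual(frame, alpha1)     #       first residual signal
    e1_up = interpolate(e1, factor)     # S104: interpolation
    alpha2 = lpc(e1_up, p_low)          # S106: lower-order LP analysis
    e2 = lp_residual(e1_up, alpha2)     #       second residual signal
    e2_down = decimate(e2, factor)      # S108: decimation back to the frame length
    return alpha1, alpha2, e2_down
```

The decimated residual has the same length as the input frame, so it can be handed to whatever encoder implements S110.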

FIG. 2 is a flowchart showing a preferred embodiment of the encoding process shown in FIG. 1. In the embodiment shown in FIG. 2, the first linear prediction analysis process (S102) of FIG. 1 corresponds to step 202 (S202), the interpolation process (S104) to step 212 (S212), the second linear prediction analysis process (S106) to step 214 (S214), the decimation process (S108) to step 216 (S216), and the encoding process (S110) to step 218 (S218).

In steps 200 (S200) to 202 (S202), linear prediction coefficients (hereinafter, LPC coefficients) are obtained from the input speech frame and used to obtain the residual signal, which is the original speech signal with the predicted signal removed.

The input speech signal s(n) is modeled by p-th order linear prediction analysis.

Then m is set to 0 and the following is performed.

The residual signal is obtained from the modeled speech signal as follows.

$$e_m(n) = s(n) - \sum_{i=1}^{p} \alpha_{mi} \, s(n-i) \tag{4}$$

Here, e_m(n) is the residual signal at the m-th iteration, α_mi are the linear prediction coefficients at the m-th iteration, and s(n-i) are the speech signals of the previous frame. The second term of Equation 4 is the predicted speech signal for the speech signal of the current frame.

In step 204 (S204), it is determined whether the iteration count m is less than the given maximum iteration count Mmax. If m is not less than Mmax, the process branches to step 218 (S218); if it is less, the process proceeds to step 206 (S206).

In step 206 (S206), it is determined whether the number of pulses is below a threshold. If it is, the process branches to step 218 (S218); otherwise it proceeds to step 207 (S207).

Here, if the frame length is N, the number of pulses can be defined by the following equation.

Here, δ₁ and δ₂ are predetermined thresholds.

In step 207 (S207), the power P_e of the residual signal is calculated. The power P_e is calculated by the following equation.

In step 208 (S208), it is determined whether P_e is smaller than a predetermined threshold P. If P_e is smaller than P, the process branches to step 218 (S218).

Here, P is a predetermined threshold.

If, on the other hand, P_e is greater than or equal to P in step 208 (S208), the process proceeds to step 210 (S210).
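The three checks of steps 204 to 208 can be summarized as one predicate. The defining equations for the pulse count and the power are not reproduced in this text, so the sketch below substitutes assumptions: the pulse count is taken as the number of samples whose magnitude exceeds a single hypothetical threshold `delta` (the original uses two thresholds, δ₁ and δ₂), and P_e is taken as the mean-square value of the residual. All names are illustrative.

```python
def residual_power(e):
    """Mean-square power of the residual (assumed definition of P_e)."""
    return sum(x * x for x in e) / len(e)


def count_pulses(e, delta):
    """Hypothetical pulse count: samples whose magnitude exceeds delta."""
    return sum(1 for x in e if abs(x) > delta)


def should_stop(e, m, m_max, pulse_limit, delta, power_limit):
    """Return True when the flow should branch to the modeling step S218."""
    if m >= m_max:                              # S204: iteration budget used up
        return True
    if count_pulses(e, delta) < pulse_limit:    # S206: too few pulses remain
        return True
    if residual_power(e) < power_limit:         # S207-S208: residual power is low
        return True
    return False                                # otherwise continue with S210
```

A caller would evaluate `should_stop` once per iteration and increment m (step 210) only when it returns False.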

In step 210 (S210), the iteration count m is incremented.

In step 212 (S212), the residual signal is interpolated. Interpolation is performed to add redundancy to the residual signal; experimentally, the interpolation factor I should be at least 5.

The interpolated residual signal {e^I_m(n), n = 0, ..., NI-1} is obtained by the following equation.

In step 214 (S214), linear prediction analysis is performed on the interpolated residual signal, and the residual signal is computed again using the resulting LPC coefficients. The order of this linear prediction analysis, p_l, is lower than the analysis order p used in step 202 (S202).

Here, the total number of samples in the residual signal is NI, the product of the interpolation factor I and the frame length N.
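The interpolation of step 212 turns N residual samples into NI samples. Since the interpolation equation is not reproduced in this text, the sketch below assumes plain linear interpolation between neighboring samples, holding the last sample; the function name is hypothetical.

```python
def interpolate(e, factor):
    """Linearly interpolate e to len(e) * factor samples (assumed method)."""
    out = []
    for n, cur in enumerate(e):
        nxt = e[n + 1] if n + 1 < len(e) else cur   # hold the final sample
        for j in range(factor):
            out.append(cur + (nxt - cur) * j / factor)
    return out
```

With this choice, every `factor`-th output sample equals an original sample, so taking every I-th sample afterwards recovers the input exactly when no filtering is applied in between.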

The residual signal e^I_m(n) is obtained as follows. Here, the index I denotes the number of repetitions.

In step 216 (S216), decimation, the inverse of step 212 (S212), is performed. The decimation factor is I, the same as the interpolation factor.

The residual signal after decimation can be expressed as {e_m(n), n = 0, ..., N-1}.
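The claims give the decimation as the index relation e_m(n) = e^I_{m-1}(nI) for n = 0, ..., N-1, that is, every I-th sample of the NI-sample residual is kept. A minimal sketch of that relation follows; the function name is hypothetical, and no anti-aliasing filter is applied because the text does not mention one.

```python
def decimate(e_interp, factor):
    """Return e(n) = e_interp(n * factor) for n = 0..N-1, with N = len(e_interp) // factor."""
    n_out = len(e_interp) // factor
    return [e_interp[n * factor] for n in range(n_out)]
```

For example, an interpolated residual of NI = 10 samples with I = 5 yields N = 2 samples, taken from positions 0 and 5.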

Meanwhile, in step 218 (S218), the m-th residual signal is modeled.

A codebook scheme is used for the modeling in order to minimize the number of bits. The transmission parameters are therefore the iteration count m, the codebook index of the m-th residual signal, and m sets of LPC coefficients.

In practice, however, nothing but the first set of LPC coefficients needs to be transmitted. The LPC coefficients obtained for a given speech segment are used in common by the speech encoding and decoding processes.
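The transmitted parameters can be pictured as a small record per frame. The sketch below is an assumption-laden illustration: the field names and the `FrameParameters` type are hypothetical, and the codebook search shown (exhaustive search for the minimum squared error) is one common choice, not something the text specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameParameters:
    """Per-frame transmission parameters (hypothetical layout)."""
    iterations: int                  # iteration count m
    codebook_index: int              # index of the codeword for the m-th residual
    lpc_coefficients: List[float]    # first set of LPC coefficients


def nearest_codeword(residual, codebook):
    """Index of the codebook vector closest to residual in squared error (assumed search)."""
    def dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(residual, codeword))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))
```

A frame would then be packaged as, for instance, `FrameParameters(iterations=m, codebook_index=nearest_codeword(e_m, codebook), lpc_coefficients=alpha)`, where `e_m`, `codebook`, and `alpha` come from the encoder.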

FIG. 3 is a flowchart showing a method of decoding speech using the parameters transmitted by the encoding method shown in FIG. 1.

In the decoding method according to the present invention shown in FIG. 3, the restoration process (S302) first restores the second residual signal from the transmission parameters.

In the interpolation process (S304), the second residual signal restored in the restoration process (S302) is interpolated.

In the first synthesis process (S306), the second residual signal interpolated in the interpolation process (S304) is synthesized through a synthesis filter.

In the decimation process (S308), the residual signal synthesized in the first synthesis process (S306) is decimated to generate a first residual signal.

In the second synthesis process (S310), the first residual signal obtained in the decimation process (S308) is synthesized through a synthesis filter.

FIG. 4 is a flowchart showing a preferred embodiment of the decoding method shown in FIG. 3. In the decoding method shown in FIG. 4, the restoration process (S302) of FIG. 3 corresponds to step 402 (S402), the interpolation process (S304) to step 406 (S406), the first synthesis process (S306) to step 408 (S408), the decimation process (S308) to step 410 (S410), and the second synthesis process (S310) to step 414 (S414).

In step 400 (S400), the transmitted parameters are received as input.

In step 402 (S402), the m-th residual signal is restored using the codebook index extracted from the transmission parameters.

In step 404 (S404), it is determined whether the value of m extracted from the transmission parameters is 0. If m is 0, the process branches to step 414 (S414); otherwise it proceeds to step 406 (S406).

In step 406 (S406), the restored residual signal is interpolated by the interpolation factor I used in the encoding process shown in FIG. 1.

In step 408 (S408), the interpolated residual signal is synthesized by a synthesis filter having the LPC coefficients {α_m1, ..., α_mp}.
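The synthesis filter of step 408 is the inverse of the prediction-error filter: it adds the prediction back to the excitation. A minimal sketch, assuming the standard all-pole form y(n) = x(n) + Σ α_i y(n-i) with zero initial state; the function name is hypothetical.

```python
def synthesize(excitation, alpha):
    """All-pole synthesis: y(n) = x(n) + sum_i alpha_i * y(n - i), zero initial state."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for i, coeff in enumerate(alpha, 1):
            if n >= i:                   # samples before the frame are taken as 0
                acc += coeff * y[n - i]
        y.append(acc)
    return y
```

Feeding a unit impulse through a first-order filter with α_1 = 0.5 produces the decaying sequence 1, 0.5, 0.25, which is the filter's impulse response.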

In step 410 (S410), the synthesized signal is decimated by the decimation factor I used in the encoding process shown in FIG. 1. The result is the (m-1)-th residual signal.

This process is repeated until m = 0.

In step 414 (S414), the final residual signal excites a synthesis filter having the LPC coefficients obtained in the encoding process shown in FIG. 1, and the speech is reproduced.
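The decoding loop of steps 404 to 414 can be sketched as follows. This is an illustration under assumptions, not the patented implementation: linear interpolation is assumed, the per-iteration coefficient sets are passed in as a hypothetical `stage_coeffs` list without modeling how the decoder obtains them, and all function names are made up for the example.

```python
def interpolate(x, factor):
    """Linear interpolation to len(x) * factor samples (assumed method)."""
    out = []
    for n, cur in enumerate(x):
        nxt = x[n + 1] if n + 1 < len(x) else cur
        out.extend(cur + (nxt - cur) * j / factor for j in range(factor))
    return out


def synthesize(excitation, alpha):
    """All-pole synthesis: y(n) = x(n) + sum_i alpha_i * y(n - i)."""
    y = []
    for n, x in enumerate(excitation):
        y.append(x + sum(c * y[n - i] for i, c in enumerate(alpha, 1) if n >= i))
    return y


def decode(residual_m, m, stage_coeffs, first_coeffs, factor):
    """Undo m encoder iterations, then run the final synthesis of S414."""
    e = list(residual_m)                               # S402: restored m-th residual
    for stage in range(m, 0, -1):                      # S404: loop is skipped when m == 0
        up = interpolate(e, factor)                    # S406: interpolate by I
        syn = synthesize(up, stage_coeffs[stage - 1])  # S408: synthesis filter of this stage
        e = syn[::factor]                              # S410: decimate to the (stage-1)-th residual
    return synthesize(e, first_coeffs)                 # S414: final synthesis to speech
```

When m is 0 the loop body never runs and the restored residual goes straight to the final synthesis filter, which matches the branch from step 404 to step 414.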

As described above, the speech encoding method according to the present invention reduces the dynamic range of the residual signal through repeated interpolation and decimation, which enables accurate modeling of speech and allows these parameters to be transmitted with a small number of bits. A low-bit-rate speech coder can therefore be implemented effectively.

Because the speech encoding method according to the present invention has very few transmission parameters, a speech coder of 4 kbps or less can be implemented, and the method can be used in communication codecs, digital recorders and speech storage media, and the voice message playback and recording devices of computer peripherals.

FIG. 2 is a flowchart showing a preferred embodiment of the encoding process shown in FIG. 1.

FIG. 4 is a flowchart showing a preferred embodiment of the decoding method shown in FIG. 3.

Claims

A speech encoding method that obtains a residual signal by linear predictive analysis of a frame-based speech signal and encodes the residual signal, the method comprising:

a first linear prediction analysis process of generating a first residual signal by linear predictive analysis of the frame-based speech signal;

a process of interpolating the first residual signal generated in the first linear prediction analysis process;

a second linear prediction analysis process of obtaining a second residual signal by linear predictive analysis of the interpolated first residual signal at an order lower than the order used in the first linear prediction analysis process;

a process of decimating the second residual signal; and

a process of encoding the residual signal decimated in the decimation process.

The speech encoding method of claim 1, wherein the interpolation process, the second linear prediction analysis process, and the decimation process are repeated until the result of the first linear prediction analysis process satisfies the condition that the number of pulses of the residual signal is less than a predetermined threshold.

The speech encoding method of claim 1, wherein the interpolation process, the second linear prediction analysis process, and the decimation process are repeated until the result of the first linear prediction analysis process satisfies the condition that the power of the residual signal is less than a predetermined threshold.

The speech encoding method of claim 1, wherein the interpolation factor of the interpolation process and the decimation factor of the decimation process are the same.

The speech encoding method of claim 1, wherein the interpolation factor of the interpolation process and the decimation factor of the decimation process are each at least five.

The speech encoding method of claim 1, wherein the analysis order of the second linear prediction analysis process is lower than the analysis order of the first linear prediction analysis process.

The speech encoding method of claim 1, wherein the encoding process is performed using a codebook.

The speech encoding method of claim 1, wherein the encoding process encodes the linear prediction coefficients resulting from the first linear prediction analysis process.

A method of restoring a speech signal from parameters that were transmitted after interpolating a first residual signal obtained by linear predictive coding of the speech signal, performing linear predictive analysis on it, and decimating and encoding the resulting second residual signal, the method comprising:

a restoration process of restoring the second residual signal from the transmission parameters;

a process of interpolating the second residual signal restored in the restoration process;

a first synthesis process of synthesizing the interpolated second residual signal through a synthesis filter;

a decimation process of generating a first residual signal by decimating the residual signal synthesized in the first synthesis process; and

a second synthesis process of synthesizing the first residual signal obtained in the decimation process through a synthesis filter.

The speech encoding method of claim 1, wherein the residual signal e_m(n) derived from the speech signal modeled in the first linear prediction analysis is

$$e_m(n) = s(n) - \sum_{i=1}^{p} \alpha_{mi} \, s(n-i)$$

where s(n) is the input speech signal modeled by p-th order linear prediction analysis.

The speech encoding method of claim 1, wherein the interpolated residual signal is

{e^I_m(n), n = 0, ..., NI-1}.

The speech encoding method of claim 1, wherein the residual signal e^I_m(n) in the second linear prediction analysis process is

(where I is the interpolation factor).

The speech encoding method of claim 1, wherein the residual signal after decimation in the decimation process is {e_m(n), n = 0, ..., N-1}, with

$$e_m(n) = e^{I}_{m-1}(nI), \quad n = 0, \ldots, N-1.$$