KR101591597B1

KR101591597B1 - Adaptive muting system and method using G.722 codec packet loss concealment and steepest descent criterion

Info

Publication number: KR101591597B1
Application number: KR1020140082514A
Authority: KR
Inventors: 이봉기; 장준혁
Original assignee: 한양대학교 산학협력단
Priority date: 2014-07-02
Filing date: 2014-07-02
Publication date: 2016-02-19
Also published as: KR20160004462A

Abstract

An adaptive muting system and method using G.722 codec packet loss concealment and the steepest descent method are disclosed. The adaptive muting system may include a decoded signal generation unit that generates a decoded signal by normally decoding a frame encoded by a G.722 encoder; a loss-hypothesis decoded signal generation unit that generates a loss-hypothesis decoded signal by performing linear prediction based pitch repetition on a previous frame, which was received before the encoded frame and in which no packet loss occurred; and a parameter determination unit that determines the parameters of a sigmoid function using the decoded signal and the loss-hypothesis decoded signal.

Description

Adaptive muting system and method using G.722 codec packet loss concealment and steepest descent criterion

Embodiments of the present invention relate to a technique for determining the parameters of a sigmoid function using the steepest descent method in the packet loss concealment algorithm of the G.722 codec.

In general, a voice communication system using Voice over Internet Protocol (VoIP) suffers packet loss, in which packets are lost in transit. Such packet loss degrades voice call quality. Conventional VoIP voice communication systems therefore compensate for packet loss using a packet loss concealment (PLC) algorithm.

The G.722 codec is a wideband codec standardized by the ITU-T and is widely used for VoIP voice communication. In the G.722 codec, the parameters of the sigmoid function are determined using a grid-search training technique.

When a grid-search training technique is used in this way, the parameters of the sigmoid function, once determined, are fixed values that do not change afterwards and depend on the training database.

Consequently, in G.722 Appendix IV, the ITU-T standard that defines the packet loss concealment algorithm, when the sigmoid function parameters are determined by grid-search training, the muting gain is computed by the adaptive muting technique from a discontinuous curve of predetermined values, which makes optimized muting difficult to achieve.

In other words, because the sigmoid function parameters depend on the training data and, once determined, are used as fixed values, it is difficult to respond to short-time variations in speech.

Accordingly, there is a need for a technique in which the sigmoid function parameters are not fixed but can be updated on a sample-by-sample basis according to the characteristics of each speech signal.

The present invention has been devised to solve the problems caused by using fixed sigmoid function parameters. It aims to determine and update the sigmoid function parameters on a sample-by-sample basis using the steepest descent method so that voice quality can be improved.

An adaptive muting system according to an embodiment of the present invention may include a decoded signal generation unit that generates a decoded signal by normal decoding (G.722 decoding) of a frame encoded by a G.722 encoder; a loss-hypothesis decoded signal generation unit that generates a loss-hypothesis decoded signal by performing linear prediction based pitch repetition on a previous frame, which was received before the encoded frame and in which no packet loss occurred; and a parameter determination unit that determines the parameters of a sigmoid function using the decoded signal and the loss-hypothesis decoded signal.

According to one aspect, the parameters of the sigmoid function may be determined and updated every frame using the steepest descent criterion.

According to another aspect, the system may further include a decision unit that determines whether the encoded frame is a lost frame; and a parameter update unit that, when the frame is determined not to be a lost frame, updates the sigmoid function parameters for the current frame to the sigmoid function parameters determined using the loss-hypothesis decoded signal, and, when the frame is determined to be a lost frame, updates the sigmoid function parameters for the current frame to the sigmoid function parameters of the previous frame.

According to yet another aspect, the encoded frame is a frame in which no packet loss has occurred, and the loss-hypothesis decoded signal generation unit may perform the linear prediction based pitch repetition on the assumption that packet loss has occurred in the encoded frame.

According to yet another aspect, the parameter determination unit may determine the parameters of the sigmoid function so that an error signal, representing the difference between a restored signal obtained by restoring the encoded frame and the decoded signal, is reduced or minimized.

According to yet another aspect, the system may further include a muting gain calculation unit that calculates a muting gain using the parameters of the sigmoid function; and a restored signal generation unit that restores the current frame using the muting gain and the loss-hypothesis decoded signal.

According to yet another aspect, the decoded signal generation unit may decode the signal corresponding to the lower band among the lower band and higher band signals included in the encoded frame. The loss-hypothesis decoded signal generation unit may then perform the linear prediction based pitch repetition on the signal corresponding to the lower band.

According to yet another aspect, the loss-hypothesis decoded signal generation unit may perform the linear prediction based pitch repetition on the signal corresponding to the lower band among the lower band and higher band signals included in the encoded frame.

An adaptive muting method according to an embodiment of the present invention may include generating a decoded signal by normal decoding (G.722 decoding) of a frame encoded by a G.722 encoder; generating a loss-hypothesis decoded signal by performing linear prediction based pitch repetition on a previous frame, which was received before the encoded frame and in which no packet loss occurred; and determining the parameters of a sigmoid function using the decoded signal and the loss-hypothesis decoded signal.

According to one aspect, the parameters of the sigmoid function may be determined and updated every frame using the steepest descent criterion.

According to another aspect, the method may further include determining whether the encoded frame is a lost frame; when the frame is determined not to be a lost frame, updating the sigmoid function parameters for the current frame to the sigmoid function parameters determined using the loss-hypothesis decoded signal; and, when the frame is determined to be a lost frame, updating the sigmoid function parameters for the current frame to the sigmoid function parameters of the previous frame.

According to yet another aspect, the encoded frame is a frame in which no packet loss has occurred, and the generating of the loss-hypothesis decoded signal may perform the linear prediction based pitch repetition on the assumption that packet loss has occurred in the encoded frame.

According to yet another aspect, the determining of the parameters of the sigmoid function may determine the parameters so that an error signal, representing the difference between a restored signal obtained by restoring the encoded frame and the decoded signal, is reduced or minimized.

According to yet another aspect, the method may further include calculating a muting gain using the parameters of the sigmoid function; and restoring the current frame using the muting gain and the loss-hypothesis decoded signal.

According to yet another aspect, the generating of the loss-hypothesis decoded signal may perform the linear prediction based pitch repetition on the signal corresponding to the lower band among the lower band and higher band signals included in the encoded frame.

According to yet another aspect, the generating of the decoded signal may decode the signal corresponding to the lower band among the lower band and higher band signals included in the encoded frame, and the generating of the loss-hypothesis decoded signal may perform the linear prediction based pitch repetition on the signal corresponding to the lower band.

According to the present invention, voice quality can be improved by determining and updating the sigmoid function parameters on a sample-by-sample basis using the steepest descent method.

FIG. 1 is a block diagram illustrating the overall configuration of a G.722 decoder according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the configuration of an adaptive muting system according to an embodiment of the present invention.
FIG. 3 is a block diagram illustrating the process of generating a restored signal in the LP-based pitch repetition unit of FIG. 1.
FIG. 4 is a flowchart illustrating the operation of generating a restored signal using packet loss concealment and the steepest descent method according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the waveform of a restored signal generated using the muting gain according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating the overall configuration of a G.722 decoder according to an embodiment of the present invention.

Referring to FIG. 1, the G.722 decoder (100) may include a lower band adaptive differential pulse code modulation (ADPCM) decoder (101), a lower band ADPCM state update unit (102), an LP-based pitch repetition unit (103), a crossfading unit (104), a higher band ADPCM decoder (105), a higher band ADPCM state update unit (106), a pitch repetition unit (107), A (108), Hpost (109), B (110), and a quadrature mirror filter (QMF) synthesis filter bank (111).

In FIG. 1, compared with the G.722 decoder defined in G.722 Appendix IV, the G.722 decoder (100) may further include, for packet loss concealment, the lower band ADPCM state update unit (102), the LP-based pitch repetition unit (103), the crossfading unit (104), the higher band ADPCM state update unit (106), the pitch repetition unit (107), Hpost (109), and B (110).

In FIG. 1, zl(n) may denote the n-th sample of the previous frame, in which no packet loss occurred. Here, the previous frame may be the frame received immediately before the current frame among the frames that are encoded by the G.722 encoder and received consecutively. For example, if frame 1, frame 2, and frame 3 encoded by the G.722 encoder are received consecutively and the current frame is frame 2, the previous frame may be frame 1.

As an example, in FIG. 1, when no packet loss occurred in the previous frame but packet loss occurs in the current frame, the G.722 decoder (100) may restore the current frame using the previous-frame signal zl(n) and thereby generate the restored signal yl(n). In other words, when a bad frame, in which packet loss occurred, is received as the current frame, the G.722 decoder (100) may restore the current frame using the previous-frame signal zl(n).

After the current frame has been restored in this way, if the next frame xl(n) is a frame in which no packet loss occurred, the crossfading unit (104) may cross-fade the restored signal yl(n) with the next-frame signal xl(n) in order to reduce or remove the discontinuity between them.
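The cross-fading step described above can be illustrated with a minimal sketch. It assumes a plain linear cross-fade window; the actual window used by the crossfading unit (104) is not given in this text, and the function name `crossfade` is hypothetical.

```python
def crossfade(concealed, decoded):
    """Blend the tail of the concealed signal into the first good frame.

    Assumption: a linear window. The weight of the decoded signal rises
    from just above 0 to just below 1 across the overlap region.
    """
    if len(concealed) != len(decoded):
        raise ValueError("both segments must have the same length")
    n = len(decoded)
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight applied to the decoded (good) frame
        out.append((1.0 - w) * concealed[i] + w * decoded[i])
    return out
```

With `concealed` standing in for yl(n) and `decoded` for xl(n), the output starts close to the concealed signal and ends close to the decoded one, which removes the step at the frame boundary.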

Here, the next frame is a frame received after the current frame among the frames that are encoded by the G.722 encoder and received consecutively, and may be a good frame, in which no packet loss occurred. For example, if frame 1, frame 2, and frame 3 encoded by the G.722 encoder are received consecutively and the current frame is frame 2, the next frame may be frame 3.

The G.722 codec may be a sub-band codec composed of a higher band and a lower band, and the speech signal may lie in the lower band. The LP-based pitch repetition unit (103) may therefore restore the speech signal in the lower band using the linear prediction based pitch repetition algorithm. The G.722 decoder (100) may restore the signal in the higher band using only the pitch period of the previous-frame signal zl(n), which reduces computational complexity.

FIG. 2 is a block diagram illustrating the configuration of an adaptive muting system according to an embodiment of the present invention.

Referring to FIG. 2, the adaptive muting system (200) may include a decision unit (201), a decoded signal generation unit (202), a loss-hypothesis decoded signal generation unit (203), a muting processing unit (204), a buffer (208), a parameter update unit (207), and a restored signal generation unit (209).

First, the decision unit (201) may receive a frame encoded by a G.722 encoder and determine whether packet loss occurred in the encoded frame. The G.722 encoder is the encoder standardized by the ITU-T and is unchanged from the standard, so a detailed description is omitted. G.722 is a sub-band codec, and most of the speech signal may lie in the lower band.

If the encoded frame is a frame in which no packet loss occurred, the decoded signal generation unit (202) may generate the decoded signal dl(n), where n is the sample index, by normally decoding the encoded frame. For example, the decoded signal generation unit (202) may perform G.722 decoding on an encoded frame in which no packet loss occurred. The decoded signal dl(n) is therefore a clean signal and may serve as the desired signal.

The loss-hypothesis decoded signal generation unit (203) may generate a loss-hypothesis decoded signal by performing LP-based pitch repetition on a previous frame, which was received before the encoded frame and in which no packet loss occurred.

For example, the loss-hypothesis decoded signal generation unit (203) may assume that the encoded frame has been lost. It may then generate the loss-hypothesis decoded signal ylpre(n) by performing LP-based pitch repetition on the signal zl(n) of the frame preceding the encoded frame.
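As a rough illustration of how a loss-hypothesis signal can be produced from the previous frame, the sketch below performs plain pitch repetition: it estimates a pitch period from the signal history and repeats the last pitch cycle. This is a simplification and not the LP-based procedure of G.722 Appendix IV, which also involves linear prediction analysis and synthesis. The function names, the lag range of 20 to 143 samples, and the tie-breaking margin are all assumptions made for the example.

```python
import math


def estimate_pitch_period(history, min_lag=20, max_lag=143):
    """Pick the lag with the highest normalized cross-correlation.

    A small margin keeps the shortest lag when multiples of the true
    period score (numerically) the same.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, min(max_lag, len(history) // 2) + 1):
        a = history[-lag:]            # most recent `lag` samples
        b = history[-2 * lag:-lag]    # the `lag` samples before those
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = num / den if den > 0 else 0.0
        if score > best_score + 1e-6:
            best_lag, best_score = lag, score
    return best_lag


def repeat_pitch(history, frame_len, period):
    """Extend the signal by repeating its last pitch cycle."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(frame_len)]
```

Here `history` plays the role of zl(n) and the returned frame plays the role of ylpre(n). Because the frame that is treated as lost was in fact received, the output can be compared directly with the normally decoded signal dl(n).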

The muting processing unit (204) may include a parameter determination unit (205) and a muting gain calculation unit (206) in order to determine the sigmoid function parameters using the steepest descent method.

The parameter determination unit (205) may determine the sigmoid function parameters by applying the steepest descent method to the decoded signal dl(n) and the loss-hypothesis decoded signal ylpre(n). The parameter determination unit (205) may determine the sigmoid function parameters every frame.

As an example, based on Equations 1 to 4 below, the parameter determination unit (205) may determine the sigmoid function parameters so that the error signal e(n), which represents the difference between the restored signal yl(n) obtained by restoring the encoded frame and the decoded signal dl(n), is reduced or minimized.

In Equation 1, yl(n) may be the restored signal at the n-th sample obtained by restoring the encoded frame, G(n) the muting gain at the n-th sample, and ylpre(n) the loss-hypothesis decoded signal at the n-th sample.

According to Equation 1, the restored signal yl(n) can be calculated as the product of the muting gain G(n) and the loss-hypothesis decoded signal ylpre(n). The loss-hypothesis decoded signal ylpre(n) is available from the loss-hypothesis decoded signal generation unit (203), but the restored signal yl(n) and the muting gain G(n) are not yet known. The parameter determination unit (205) may therefore determine the sigmoid function parameters based on Equation 2 below, which expresses the relationship between the muting gain and the sigmoid function parameters.
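The image for Equation 1 is not reproduced in this text. From the description above, it can be written as follows, where $y_l(n)$, $G(n)$, and $y_l^{\mathrm{pre}}(n)$ correspond to yl(n), G(n), and ylpre(n):

```latex
y_l(n) = G(n)\, y_l^{\mathrm{pre}}(n) \tag{1}
```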

In Equation 2, α and β are parameters that determine the shape of the sigmoid function, and n₀ may be an offset.

As an example, when packet loss occurs in four or more consecutive frames, the muting gain G(n) may be preset to 0. By using Equation 2, the muting curve is continuous across frames, and a flexible muting gain can be obtained.
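The following sketch shows what a sigmoid-shaped muting gain of this kind can look like. The exact form of Equation 2 is not reproduced in this text, so the expression used here, G(n) = β / (1 + exp(α(n − n₀))), is an assumption chosen only for illustration, as are the function and argument names. The offset of 150 and the rule of forcing the gain to 0 after four or more consecutive lost frames come from the description.

```python
import math

N0 = 150  # offset n0; the description states it may be preset to 150


def muting_gain(n, alpha, beta, n0=N0, lost_frames=1, max_lost_frames=4):
    """Sigmoid-shaped muting gain at sample index n.

    Assumed form (not the patent's Equation 2):
        G(n) = beta / (1 + exp(alpha * (n - n0)))
    After `max_lost_frames` or more consecutive lost frames the gain is 0.
    """
    if lost_frames >= max_lost_frames:
        return 0.0
    return beta / (1.0 + math.exp(alpha * (n - n0)))
```

Under this assumed form, a larger α makes the gain drop more steeply around n₀, while β scales the overall level, so the curve decays smoothly from about β towards 0 as the loss continues.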

Here, the steepest descent method is an algorithm that minimizes the difference between the desired signal and the restored signal yl(n). To use the steepest descent method, the adaptive muting system (200) may therefore generate a restored signal by performing LP-based pitch repetition on good frames, in which no packet loss occurred. In other words, based on Equation 3 below, the parameter determination unit (205) may apply the steepest descent method to the decoded signal dl(n), which is the result of normal decoding, and the loss-hypothesis decoded signal ylpre(n).

In Equation 3, e(n) may be the error signal at the n-th sample, dl(n) the decoded signal at the n-th sample, and yl(n) the restored signal at the n-th sample. According to Equation 1, the restored signal yl(n) can be expressed in terms of the muting gain G(n) and the loss-hypothesis decoded signal ylpre(n), so the parameter determination unit (205) may substitute Equation 1 into Equation 3. Further, according to Equation 2, the muting gain G(n) can be expressed in terms of the parameters α, β, and n₀ that determine the slope of the sigmoid function, so the parameter determination unit (205) may substitute Equation 2 into Equation 3. The error signal e(n) can then be expressed as the decoded signal dl(n) minus the product of the sigmoid function, parameterized by α, β, and n₀, and the loss-hypothesis decoded signal.
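The image for Equation 3 is not reproduced in this text. From the description above, the error signal and its expansion through Equation 1 can be sketched as follows. The notation $G(n;\alpha,\beta,n_0)$ only indicates that the gain depends on the sigmoid parameters; no particular sigmoid expression for Equation 2 is assumed here.

```latex
\begin{aligned}
e(n) &= d_l(n) - y_l(n) \\
     &= d_l(n) - G(n;\alpha,\beta,n_0)\, y_l^{\mathrm{pre}}(n)
\end{aligned}
```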

Based on Equation 3, the parameter determination unit (205) may determine the sigmoid function parameters α, β, and n₀ so that the error signal, which represents the difference between the decoded signal dl(n) and the muted loss-hypothesis decoded signal G(n)·ylpre(n), is minimized. The error signal may be expressed as a cost function, as in Equation 4 below.

In Equation 4, the cost function J(α, β) may be a function determined by α and β. The parameter determination unit (205) may then use the steepest descent method to determine α and β in the direction that minimizes the cost function J(α, β).
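A minimal sketch of one steepest descent step is given below. It rests on several assumptions that are not taken from this text: the cost is the instantaneous squared error J = e²(n), the gain has the illustrative form G(n) = β / (1 + exp(α(n − n₀))), and the step sizes `mu_alpha` and `mu_beta` are hypothetical values. Only the offset n₀ = 150 and the overall scheme (move α and β against the gradient of the cost) come from the description.

```python
import math

N0 = 150  # offset n0, preset to 150 in the description


def gain_and_gradients(n, alpha, beta, n0=N0):
    """Return G(n) and its partial derivatives for the assumed sigmoid."""
    s = 1.0 / (1.0 + math.exp(alpha * (n - n0)))  # logistic term in (0, 1)
    gain = beta * s
    d_gain_d_alpha = -beta * (n - n0) * s * (1.0 - s)
    d_gain_d_beta = s
    return gain, d_gain_d_alpha, d_gain_d_beta


def steepest_descent_step(n, d, y_pre, alpha, beta,
                          mu_alpha=1e-4, mu_beta=1e-2):
    """One per-sample update of (alpha, beta) that reduces J = e^2.

    d is the normally decoded sample, y_pre the loss-hypothesis sample.
    Since dJ/dtheta = -2 * e * y_pre * dG/dtheta, moving against the
    gradient adds 2 * mu * e * y_pre * dG/dtheta to each parameter.
    """
    gain, dg_da, dg_db = gain_and_gradients(n, alpha, beta)
    error = d - gain * y_pre
    alpha_next = alpha + 2.0 * mu_alpha * error * y_pre * dg_da
    beta_next = beta + 2.0 * mu_beta * error * y_pre * dg_db
    return alpha_next, beta_next, error
```

Calling `steepest_descent_step` for each sample of a good frame, with `d` taken from dl(n) and `y_pre` from ylpre(n), moves the parameters towards values for which the muted concealment output tracks the normally decoded signal.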

The buffer (208) may then temporarily store the sigmoid function parameters determined every frame by the parameter determination unit (205). For example, the buffer (208) may temporarily store the sigmoid function parameters for frame 1, for frame 2, and for frame 3.

Next, the parameter update unit (207) may update the sigmoid function parameters determined every frame.

As an example, when the decision unit (201) determines that the encoded frame is not a lost frame, the parameter update unit (207) may update the sigmoid function parameters for the current frame to the sigmoid function parameters determined using the loss-hypothesis decoded signal. In other words, the parameter update unit (207) may update the sigmoid function parameters for the current frame to the parameters determined based on Equation 4 above.

When the frame is determined to be a lost frame, the parameter update unit (207) may refer to the buffer (208) and update the sigmoid function parameters for the current frame to the sigmoid function parameters of the previous frame.
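The update rule for good and lost frames can be sketched as a small parameter store. The class and method names are hypothetical, and the buffer is modeled as a simple list of (α, β) pairs, one per frame, which is an assumption about how the buffer (208) might be organized.

```python
class SigmoidParameterStore:
    """Keeps one (alpha, beta) pair per frame, newest last."""

    def __init__(self, alpha, beta):
        self.history = [(alpha, beta)]  # stands in for the buffer (208)

    def update(self, frame_lost, trained=None):
        """Return the parameters to use for the current frame.

        Good frame: adopt the parameters trained on this frame.
        Lost frame: nothing can be trained, so reuse the previous frame's.
        """
        if frame_lost or trained is None:
            params = self.history[-1]
        else:
            params = trained
        self.history.append(params)
        return params
```

For example, after `store = SigmoidParameterStore(0.05, 1.0)`, a good frame with newly trained values is handled with `store.update(False, (0.06, 0.9))`, and a following lost frame with `store.update(True)`, which returns the same pair again.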

For example, the parameter update unit (207) may update the sigmoid function parameters based on Equations 5 to 8 below.

In Equation 5, α(n) is the parameter that determines the shape of the sigmoid function at the n-th sample, α(n+1) is the parameter that determines the shape of the sigmoid function at the (n+1)-th sample, and ∂G/∂α can be expressed as in Equation 6 below.

Equation 6 may represent the derivative of G with respect to α.

In this way, the parameter update unit (207) may update the sigmoid function parameter α every frame based on Equations 5 and 6.

The parameter update unit (207) may also update the sigmoid function parameter β every frame based on Equations 7 and 8 below.

In Equation 7, β(n) is the parameter that determines the shape of the sigmoid function at the n-th sample, β(n+1) is the parameter that determines the shape of the sigmoid function at the (n+1)-th sample, and ∂G/∂β can be expressed as in Equation 8 below.

Equation 8 may represent the derivative of G with respect to β.

As described with Equations (1) to (8) above, the parameters of the sigmoid function may be determined every frame so as to minimize the cost function using the steepest descent method. The parameter update unit 207 may then keep updating the sigmoid-function parameters with the parameters determined every frame to minimize the cost function. The sigmoid-function parameters obtained in frames without packet loss may then be used for adaptive muting in frames where packet loss occurs.
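As an illustration of the steepest descent update summarized above, the following sketch performs one per-sample update of α and β. It rests on stated assumptions: the equations are not reproduced in this text, so the gain form G(n) = β / (1 + exp(α(n − n₀))), the squared-error cost J = e(n)², and the step sizes `mu_alpha` and `mu_beta` are all hypothetical choices made here, not the patent's Equations (2) to (8). Only n₀ = 150 and the error definition (decoded signal minus restored signal) come from the description.

```python
import math

N0 = 150  # preset value of n0 stated in the description


def gain(n, alpha, beta, n0=N0):
    """ASSUMED sigmoid gain form; the patent's Equation (2) is not reproduced here."""
    return beta / (1.0 + math.exp(alpha * (n - n0)))


def steepest_descent_step(n, alpha, beta, d_n, y_pre_n,
                          mu_alpha=1e-6, mu_beta=1e-3, n0=N0):
    """One steepest descent update of (alpha, beta) at sample n.

    d_n     : decoded (desired) sample
    y_pre_n : loss hypothesis decoding sample
    Cost (assumed): J = e^2 with e = d_n - G(n) * y_pre_n.
    """
    s = 1.0 / (1.0 + math.exp(alpha * (n - n0)))
    e = d_n - beta * s * y_pre_n
    # Partial derivatives of the assumed gain G = beta * s.
    dg_dalpha = -beta * (n - n0) * s * (1.0 - s)
    dg_dbeta = s
    # dJ/dp = -2 * e * y_pre_n * dG/dp, so stepping against the gradient adds the term.
    alpha_next = alpha + 2.0 * mu_alpha * e * y_pre_n * dg_dalpha
    beta_next = beta + 2.0 * mu_beta * e * y_pre_n * dg_dbeta
    return alpha_next, beta_next, e
```

For a small enough step size, a single update moves the gain toward the value that makes the restored sample match the decoded sample, which is the behavior the description attributes to the per-frame parameter determination.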

Here, in view of the shape of the sigmoid function, n₀ may be preset to 150, and α and β may be determined within a given range.

Once the parameters of the sigmoid function have been determined and updated in this way, the muting gain calculation unit 206 may calculate the muting gain based on Equation (2) above. The restoration signal generation unit 209 may then generate the restored signal based on Equation (1) above. For example, the restoration signal generation unit 209 may generate the restored signal yl(n) at the n-th sample as the product of the muting gain G(n) at the n-th sample and the loss hypothesis decoding signal ylpre(n) at the n-th sample.

Because the parameters of the sigmoid function are continuously updated, the muting gain calculation unit 206 may calculate the muting gain G(n) at the n-th sample using the updated parameters. In other words, the muting gain G(n) at the n-th sample may take an adaptive value. The restoration signal generation unit 209 may therefore generate the restored signal by multiplying the loss hypothesis decoding signal ylpre(n) at the n-th sample by the adaptive muting gain G(n) at the n-th sample, which gradually attenuates the signal. As a result, when packet loss occurs, generating the restored signal with the adaptive muting gain can reduce or remove unwanted noise such as mechanical-sounding artifacts, compared with using the loss hypothesis decoding signal ylpre(n) as it is.
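The per-sample product yl(n) = G(n) · ylpre(n) described above can be sketched as below. The function name `restore_frame` is a hypothetical name introduced for illustration; the gains are passed in as a plain sequence so that no particular sigmoid form is assumed.

```python
from typing import List, Sequence


def restore_frame(gains: Sequence[float], y_pre: Sequence[float]) -> List[float]:
    """Multiply each loss hypothesis sample by its muting gain (sample-wise product)."""
    if len(gains) != len(y_pre):
        raise ValueError("gains and y_pre must have the same number of samples")
    return [g * y for g, y in zip(gains, y_pre)]


# A decaying gain attenuates a repeated waveform progressively.
restored = restore_frame([1.0, 0.5, 0.0], [2.0, 2.0, 2.0])
```

With gains that fall from 1 toward 0 over the lost interval, the repeated pitch waveform fades out instead of continuing at full level.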

FIG. 3 is a block diagram showing the process of generating a restored signal in the LP-based pitch repetition unit of FIG. 1.

Specifically, FIG. 3 shows, within the process of FIG. 1, the process of generating a restored signal by applying packet loss concealment and the steepest descent method according to an embodiment of the present invention.

Referring to FIG. 3, the restored signal yl(n) at the n-th sample of the current frame may be generated using the previous frame Zl(n) at the n-th sample, in which no packet loss occurred. The LP analysis unit (LP analysis, 301) may perform linear prediction analysis on the previous frame Zl(n) to extract the LPC coefficients. The LTP analysis unit (LTP analysis, 302) may extract the pitch interval T₀ of the previous frame Zl(n), and the signal classification unit (signal classification, 303) may extract the class of the previous frame Zl(n).

The modification and pitch repetition unit (Modification/pitch repetition, 304) may then generate the loss hypothesis decoding signal ylpre(n) at the n-th sample using the LPC coefficients, the class, and the pitch interval T₀. Here, the LP analysis unit 301, the LTP analysis unit 302, the signal classification unit 303, and the modification and pitch repetition unit 304 may be included in the loss hypothesis decoding signal generation unit of FIG. 2.

The muting factor computation unit (muting factor computation, 305) may determine and update the parameters of the sigmoid function for each frame and adaptively calculate the muting gain using the updated parameters. Here, the muting factor computation unit 305 may correspond to the muting processing unit 204 of FIG. 2.

Next, the restoration signal generation unit 306 may generate the restored signal yl(n) at the n-th sample by multiplying the loss hypothesis decoding signal ylpre(n) at the n-th sample by the adaptive muting gain. By using the adaptive muting gain, the restoration signal generation unit 306 may generate a restored signal in which the unwanted noise caused by periodic repetition of the pitch is reduced or removed.

FIG. 4 is a flowchart provided to explain the operation of generating a restored signal using packet loss concealment and the steepest descent method according to an embodiment of the present invention.

In FIG. 4, the operation of generating the restored signal may be performed by the adaptive muting system of FIG. 2.

First, in step 401, the adaptive muting system 200 may receive a frame encoded by a G.722 encoder.

Next, in step 402, the adaptive muting system 200 may determine whether the encoded frame is a lost frame, that is, a frame in which packet loss occurred.

If the frame is determined not to be a lost frame (402: No), in step 403 the adaptive muting system 200 may perform normal decoding on the encoded frame to generate the decoded signal dl(n) at the n-th sample. For example, the adaptive muting system 200 may perform G.722 decoding. The decoded signal dl(n) at the n-th sample is then a clean signal and may serve as the desired signal.

Next, in step 404, in order to apply the steepest descent method, the adaptive muting system 200 may assume that packet loss has occurred in a frame that in fact suffered no packet loss, and perform LP-based pitch repetition. For example, under this packet loss assumption, the adaptive muting system 200 may perform LP-based pitch repetition on the previous frame Zl(n) at the n-th sample to generate the loss hypothesis decoding signal ylpre(n) at the n-th sample.

Then, in step 405, the adaptive muting system 200 may determine the parameters of the sigmoid function using the loss hypothesis decoding signal ylpre(n) at the n-th sample and the decoded signal dl(n) at the n-th sample.

For example, based on Equations (1) to (5) above, the adaptive muting system 200 may determine the parameters of the sigmoid function such that the error signal, which represents the difference between the decoded signal and the restored signal, is minimized or reduced. The adaptive muting system 200 may determine the parameters of the sigmoid function for each frame.

Then, in step 406, the adaptive muting system 200 may update the sigmoid-function parameters with the determined values for each frame.

In step 407, the adaptive muting system 200 may calculate the muting gain using the updated sigmoid-function parameters. Because the parameters are updated every frame, the adaptive muting system 200 may adaptively calculate the muting gain G(n) at the n-th sample.

Then, in step 408, the adaptive muting system 200 may generate the restored signal using the adaptive muting gain G(n) at the n-th sample and the loss hypothesis decoding signal ylpre(n) at the n-th sample.

For example, the adaptive muting system 200 may generate the restored signal yl(n) at the n-th sample by multiplying the adaptive muting gain G(n) at the n-th sample by the loss hypothesis decoding signal ylpre(n) at the n-th sample.

In this way, even when no packet loss has occurred, the adaptive muting system 200 may assume that packet loss has occurred, keep updating the parameters of the sigmoid function, and thereby calculate the muting gain adaptively.

When packet loss does occur, that is, when the encoded frame is determined to be a lost frame (402: Yes), the adaptive muting system 200 may calculate the muting gain using the sigmoid-function parameters of the previous frame. Here, the previous frame may be a frame that was received before the lost frame and in which no packet loss occurred.

For example, the adaptive muting system 200 may refer to the buffer and update the parameters by setting the sigmoid-function parameters of the current frame, in which packet loss occurred, to those of the previous frame, in which no packet loss occurred. Then, in step 409, the adaptive muting system 200 may calculate the muting gain using the updated sigmoid-function parameters for the current frame.

In this way, when packet loss occurs, the adaptive muting system 200 may calculate the muting gain adaptively by using the sigmoid-function parameters of a previous frame in which no packet loss occurred. By generating the restored signal from the calculated adaptive muting gain and the loss hypothesis decoding signal, the adaptive muting system 200 may produce a restored signal from which unwanted noise such as mechanical-sounding artifacts has been removed.
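The flow of FIG. 4 (steps 401 to 409) can be sketched as a single per-frame function. This is only a structural outline under stated assumptions: the callables `decode`, `conceal`, `estimate`, and `gain` are hypothetical placeholders for G.722 normal decoding, LP-based pitch repetition, steepest descent parameter determination, and the sigmoid muting gain, none of which is implemented here. Passing `None` to model a lost frame and restarting the sample index n at each frame are simplifications of this sketch.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Params = Tuple[float, float]  # hypothetical layout: (alpha, beta)


def process_frame(
    encoded: Optional[bytes],                 # None models a lost frame (step 402)
    history: Sequence[float],                 # samples of the previous good frame
    params: Params,                           # buffered sigmoid parameters
    decode: Callable[[bytes], List[float]],
    conceal: Callable[[Sequence[float]], List[float]],
    estimate: Callable[[List[float], List[float], Params], Params],
    gain: Callable[[int, Params], float],
) -> Tuple[List[float], Params]:
    """Return (restored samples, parameters to buffer for the next frame)."""
    # Step 404 (or actual concealment on loss): LP-based pitch repetition.
    y_pre = conceal(history)
    if encoded is not None:
        d = decode(encoded)                   # step 403: normal decoding
        params = estimate(d, y_pre, params)   # steps 405-406: determine and update
    # On loss, params is left as buffered from the previous good frame.
    gains = [gain(n, params) for n in range(len(y_pre))]   # step 407 / 409
    restored = [g * y for g, y in zip(gains, y_pre)]       # step 408
    return restored, params
```

Structuring the flow this way makes the key point of the description visible: concealment runs on every frame, but the parameters change only when a real decoded signal is available to compare against.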

FIG. 5 shows waveforms of restored signals generated using the muting gain according to an embodiment of the present invention.

In FIG. 5, speech files from the NTT database were used to verify performance. Waveform 501 is the restored signal generated without muting, waveform 502 is the restored signal generated with the muting method defined in the standard, and waveform 503 is the restored signal generated with the adaptive muting gain proposed in the present invention.

FIG. 5 also includes the waveform obtained when muting is not applied (501), in order to confirm the importance of muting. In FIG. 5, the interval from 0 to 0.01 seconds has no packet loss, and the interval from 0.01 to 0.05 seconds is where packet loss occurred. Comparing the restored signal yl(n) at the n-th sample with the desired signal dl(n) at the n-th sample across 501 to 503 shows that 503, generated with the muting gain according to the present invention, is restored most closely to the desired signal. Here, the desired signal may be the decoded signal.

Referring to 501 of FIG. 5, when muting is not applied, the same waveform repeats at the same pitch period, producing unwanted noise and mechanical-sounding artifacts. Referring to 502, the muting method according to the standard attenuates the signal more than the adaptive muting method according to an embodiment of the present invention.

Table 1 below shows, for several packet loss rates, the results for the objective speech quality measures segmental SNR, PESQ, and C_ovl, and for the subjective speech quality measure MOS. For the MOS test, a total of 10 participants each subjectively rated the speech quality on a scale from 1 to 5.

Table 1
Packet loss rate   Quality metric   Muting method of existing standard   Adaptive muting method of present invention
3%                 Seg. SNR         24.37                                24.42
3%                 PESQ             3.40                                 3.42
3%                 C_ovl            3.69                                 3.80
3%                 MOS              3.39                                 3.45
6%                 Seg. SNR         21.08                                21.14
6%                 PESQ             3.01                                 3.13
6%                 C_ovl            3.34                                 3.57
6%                 MOS              3.13                                 3.19
10%                Seg. SNR         17.73                                17.83
10%                PESQ             2.76                                 2.92
10%                C_ovl            2.86                                 3.21
10%                MOS              2.64                                 2.80
15%                Seg. SNR         13.21                                13.36
15%                PESQ             2.32                                 2.55
15%                C_ovl            2.41                                 2.76
15%                MOS              2.16                                 2.36

Table 1 shows that the adaptive muting method according to an embodiment of the present invention scores higher than the muting method defined in the standard on every measure at every packet loss rate tested. This may indicate that the sigmoid-function parameters determined using the steepest descent method are well suited to muting. Generating the restored signal by muting with these parameters removes or reduces noise such as mechanical-sounding artifacts, so that speech quality can be improved.
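For reference, the segmental SNR measure reported in Table 1 can be computed along the following lines. This is one common definition, given as a sketch: the frame length of 160 samples and the clamping limits of −10 dB and 35 dB are assumptions made here, since the description does not state the settings used in the experiment.

```python
import math
from typing import Sequence


def segmental_snr(ref: Sequence[float], test: Sequence[float],
                  frame_len: int = 160, lo: float = -10.0, hi: float = 35.0) -> float:
    """Mean of per-frame SNR values in dB, each clamped to [lo, hi] (assumed limits)."""
    values = []
    for start in range(0, len(ref) - frame_len + 1, frame_len):
        r = ref[start:start + frame_len]
        t = test[start:start + frame_len]
        signal = sum(x * x for x in r)
        noise = sum((x - y) ** 2 for x, y in zip(r, t))
        if signal == 0.0:
            continue  # skip silent reference frames
        snr = hi if noise == 0.0 else 10.0 * math.log10(signal / noise)
        values.append(min(max(snr, lo), hi))
    return sum(values) / len(values) if values else float("nan")
```

Averaging per-frame values rather than computing one global ratio keeps short, badly concealed intervals from being masked by long stretches of clean speech.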

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

An adaptive muting system comprising:
a decoding signal generation unit that generates a decoded signal by normal decoding of frames encoded by a G.722 encoder;
a loss hypothesis decoding signal generation unit that generates a loss hypothesis decoding signal by performing linear prediction based pitch repetition on a previous frame, the previous frame being a frame among the encoded frames in which no packet loss has occurred;
a parameter determination unit that determines parameters of a sigmoid function using the decoded signal and the loss hypothesis decoding signal;
a muting gain calculation unit that calculates a muting gain using the parameters of the sigmoid function; and
a restoration signal generation unit that restores a current frame using the muting gain and the loss hypothesis decoding signal,
wherein the loss hypothesis decoding signal generation unit extracts, from the previous frame, LPC coefficients, a pitch interval (T₀) of the previous frame, and a class of the previous frame, and generates the loss hypothesis decoding signal based on the extracted LPC coefficients, pitch interval, and class, and
wherein the previous frame is a frame, among the encoded frames, received before the current frame.

The adaptive muting system according to claim 1,
wherein the parameters of the sigmoid function are determined and updated for each frame using the steepest descent criterion.

The adaptive muting system according to claim 1, further comprising:
a determination unit that determines whether the encoded frames are lost frames; and
a parameter update unit that, when a frame is determined not to be a lost frame, updates the sigmoid-function parameters for the current frame to the sigmoid-function parameters determined using the loss hypothesis decoding signal, and, when the frame is determined to be a lost frame, updates the sigmoid-function parameters for the current frame to the sigmoid-function parameters for the previous frame.

The adaptive muting system according to claim 1,
wherein the encoded frames are frames in which no packet loss has occurred, and
wherein the loss hypothesis decoding signal generation unit performs the linear prediction based pitch repetition on the assumption that packet loss has occurred in the encoded frames.

The adaptive muting system according to claim 1,
wherein the parameter determination unit determines the parameters of the sigmoid function such that an error signal representing the difference between the decoded signal and a restored signal restored from the encoded frames is reduced or minimized.

(Deleted)

The adaptive muting system according to claim 1,
wherein the loss hypothesis decoding signal generation unit performs the linear prediction based pitch repetition on the low-band signal among the low-band and high-band signals included in the encoded frames.

An adaptive muting method comprising:
generating a decoded signal by normal decoding of frames encoded by a G.722 encoder;
generating a loss hypothesis decoding signal by performing linear prediction based pitch repetition on a previous frame, the previous frame being a frame among the encoded frames in which no packet loss has occurred;
determining parameters of a sigmoid function using the decoded signal and the loss hypothesis decoding signal;
calculating a muting gain using the parameters of the sigmoid function; and
restoring a current frame using the muting gain and the loss hypothesis decoding signal,
wherein generating the loss hypothesis decoding signal comprises extracting, from the previous frame, LPC coefficients, a pitch interval (T₀) of the previous frame, and a class of the previous frame, and generating the loss hypothesis decoding signal based on the extracted LPC coefficients, pitch interval, and class, and
wherein the previous frame is a frame, among the encoded frames, received before the current frame.