CN106531175B - Method for generating comfort noise in a network telephone

Info

Publication number: CN106531175B
Application number: CN201610996520.1A
Authority: CN
Inventors: 丁海忠; 何延伟; 叶成竞
Original assignee: Nanjing Hanlong Technology Co Ltd
Current assignee: Nanjing Hanlong Technology Co Ltd
Priority date: 2016-11-13
Filing date: 2016-11-13
Publication date: 2019-09-03
Anticipated expiration: 2036-11-13
Also published as: CN106531175A

Abstract

The present invention discloses a method for generating comfort noise in a network telephone. Without changing the standard protocol, a random adaptive codebook and a random fixed codebook are added to white noise. The method detects whether the source of a payload packet is active speech or inactive speech, builds a speech signal model from noise decoding and a linear predictive coding calculation, and generates soft noise with a linear prediction filter. The resulting comfort noise better reflects the background noise of the actual environment and sounds stable to the listener.

Description

Method for generating comfort noise in a network telephone

Technical field

The present invention relates to a digital audio signal processing system in a network telephone, and in particular to a method for generating comfort noise in a network telephone. The invention belongs to the technical fields of embedded computer systems, network communication, and media information processing.

Background art

Appendix II of ITU-T G.711, the audio codec standard published by the Telecommunication Standardization Sector of the International Telecommunication Union, defines two methods of generating comfort noise. G.711 is a voice compression standard formulated by ITU-T and represents the logarithmic PCM (logarithmic pulse-code modulation) sampling standard. It is mainly used in telephony: audio is sampled with pulse-code modulation at 8,000 samples per second and carried over a 64 kbit/s channel, with 16-bit data compressed to 8 bits, a compression ratio of 1:2. G.711 is the mainstream waveform audio codec. The two comfort noise generation methods under the G.711 standard are described in turn below:

Fig. 1 is a schematic diagram of the traditional noise generation method that carries both noise gain and spectral parameters. The first method sends a payload of 11 bytes, in which the first byte is the gain parameter of the comfort noise and the following 10 bytes are the spectral parameters of the noise. At the receiving end, it is enough to decode the noise gain and the spectral parameters (linear predictive coding coefficients) from the payload packet and to use random white noise as the excitation source in order to obtain the comfort noise.

Fig. 2 is a schematic diagram of the traditional noise generation method that carries only noise gain. The second method sends a payload of only 1 byte. The payload packet contains only the gain parameter of the comfort noise and omits the spectral parameters used in the first method.
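
The two payload layouts described above can be told apart by length alone. The following is a minimal sketch of that split, not code from the patent or from G.711 Appendix II; the function name `parse_cn_payload` and its return convention are assumptions made for illustration, and the bytes are returned raw without interpreting their encoding.

```python
def parse_cn_payload(payload: bytes):
    """Split a comfort-noise payload into (gain_byte, spectral_bytes).

    Hypothetical helper: 11-byte payloads carry gain plus 10 spectral
    bytes (first method), 1-byte payloads carry gain only (second method).
    """
    if len(payload) == 11:
        return payload[0], payload[1:]
    if len(payload) == 1:
        return payload[0], b""
    raise ValueError("unexpected comfort-noise payload length: %d" % len(payload))
```

With the 1-byte form the receiver has a level but no spectral shape, which is the limitation the following paragraphs discuss.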

To reduce the size of the payload packet, the comfort noise in practical use today is always generated by the second method. Because comfort noise generated in 1-byte mode carries no spectral parameters, the "soft" noise that this method produces is not actually soft.

In addition, both methods use white noise as the excitation source of the comfort noise. Comfort noise driven by white noise is also of poor quality and sounds discontinuous.

Therefore, a new method for generating comfort noise in a network telephone is proposed to solve the stability problem of the noise effect.

Summary of the invention

To overcome the above drawbacks of the prior art, this technical solution, without changing the standard protocol, adds a random adaptive codebook and a random fixed codebook to white noise. It detects whether the source of a payload packet is active speech or inactive speech, builds a speech signal model from noise decoding and a linear predictive coding calculation, and uses a linear prediction filter to generate milder noise. The result better reflects the background noise of the actual environment and sounds stable to the listener.

The purpose of the present invention is achieved through the following technical solutions:

A method for generating comfort noise in a network telephone, characterized in that: during voice recognition processing in the network telephone, it is detected whether the source of a payload packet is active speech or inactive speech; the noise of inactive speech, after noise decoding, yields output noise with altered spectral characteristics, which then enters a linear prediction filter; an excitation signal source to which a random adaptive codebook and a random fixed codebook are added is provided, and the output noise with altered spectral characteristics and the excitation signal source pass through the linear prediction filter; meanwhile, the speech signal obtained by speech decoding of active speech serves as the speech output of the vocoder, or as the speech input to the linear predictive coding calculation, and a comfort noise signal is output after the linear prediction filter.

The present invention has the following advantages and beneficial effects:

The method for generating comfort noise in a network telephone proposed in this technical solution does not change the size of the payload packet. It uses as its excitation source a signal that better matches the generation model of speech signals, so that the output comfort noise is more comfortable and the user experience of voice communication is improved.

Brief description of the drawings

Fig. 1 is a schematic diagram of the traditional noise generation method with noise gain and spectral parameters;

Fig. 2 is a schematic diagram of the traditional noise generation method with noise gain only;

Fig. 3 is an architecture diagram of the method for generating comfort noise in a network telephone according to the present invention;

Fig. 4 is the calculation formula for the noise gain in the method for generating comfort noise in a network telephone according to the present invention.

Detailed description of embodiments

The technical solution of the present invention is described in detail below with reference to the accompanying drawings:

As shown in Fig. 3, a method for generating comfort noise in a network telephone is characterized in that: during voice recognition processing in the network telephone, it is detected whether the source of a payload packet is active speech or inactive speech; the noise of inactive speech, after noise decoding, yields output noise with altered spectral characteristics, which then enters a linear prediction filter; an excitation signal source to which a random adaptive codebook and a random fixed codebook are added is provided, and the output noise with altered spectral characteristics and the excitation signal source pass through the linear prediction filter; meanwhile, the speech signal obtained by speech decoding of active speech serves as the speech output of the vocoder, or as the speech input to the linear predictive coding calculation, and a comfort noise signal is output after the linear prediction filter.

Further, during voice recognition processing in the network telephone, the payload source speech signal is received, and the length of the payload packet is used to determine whether the source of the payload packet is an active speech signal. A payload packet of noise, that is, inactive speech, has a length of 1 byte or 0 bytes. When the length is 1 byte, the noise gain is obtained by decoding. When the length is 0 bytes, the noise gain of the current frame is unchanged and the noise gain of the previous frame is used instead. A payload packet of active speech has a length of 160 bytes. An active speech frame is passed to speech decoding; any other frame is passed to noise decoding.
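
The length-based dispatch described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the function name `classify_payload`, the `"speech"`/`"noise"` labels, and the choice to carry the previous frame's gain byte as an argument are all hypothetical.

```python
ACTIVE_SPEECH_LEN = 160  # bytes per active speech payload, as stated in the text


def classify_payload(payload: bytes, prev_gain_byte):
    """Return (kind, gain_byte) for one received payload.

    kind is "speech" for a 160-byte frame and "noise" otherwise.
    gain_byte is the noise gain byte to use from now on: a 1-byte
    payload updates it, a 0-byte payload keeps the previous frame's value.
    """
    n = len(payload)
    if n == ACTIVE_SPEECH_LEN:
        return "speech", prev_gain_byte   # goes to speech decoding
    if n == 1:
        return "noise", payload[0]        # new gain decoded from the payload
    if n == 0:
        return "noise", prev_gain_byte    # gain unchanged, reuse previous frame
    raise ValueError("unexpected payload length: %d" % n)
```

A receiver loop would call this once per frame and feed the returned gain byte back in as `prev_gain_byte` on the next call.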

Further, the active speech payload packet is passed to speech decoding, which uses the voice compression scheme of the G.711 standard. Speech decoding is the conversion from μ-law or A-law to linear pulse-code modulation. The resulting group of speech signals serves, on the one hand, as the output of the vocoder, after which decoding of the current frame ends and decoding moves on to the next frame; on the other hand, it serves as the speech input to the linear predictive coder calculation.
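
As an illustration of the μ-law branch of this conversion, the sketch below uses a widely used bit-manipulation form of G.711 μ-law expansion. The function names are chosen for the example, and the A-law branch is omitted for brevity.

```python
def ulaw_to_linear(u: int) -> int:
    """Expand one 8-bit mu-law code word to a linear PCM sample."""
    u = ~u & 0xFF                  # mu-law code words are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07     # 3-bit segment number
    mantissa = u & 0x0F            # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw_frame(payload: bytes):
    """Decode a whole payload (160 bytes per frame in the text above)."""
    return [ulaw_to_linear(b) for b in payload]
```

Each decoded frame can then be used twice, as the text describes: once as the audible output and once as input to the linear predictive coding calculation.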

Further, noise decoding of the inactive speech noise decodes the noise energy in the payload packet to obtain the noise gain G, following Appendix II of the audio codec standard ITU-T G.711. The calculation formula is given in Fig. 4, where E is the noise energy obtained by decoding the payload.

Next, the speech signal obtained by speech decoding of active speech serves as the speech input to the linear predictive coder calculation, and the linear predictive coding coefficients are determined using CELP (code-excited linear prediction) coding.
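
The text does not spell out how the coefficients are computed. One common approach in CELP-style coders is the autocorrelation method followed by the Levinson-Durbin recursion, sketched below under that assumption. The function names are hypothetical, and windowing, lag windowing, and bandwidth expansion, which practical coders usually add, are left out.

```python
def autocorrelation(x, order):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..order."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return a[0..order], a[0] = 1, for A(z) = 1 + sum_k a[k] z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:
        return a                     # silent frame: fall back to A(z) = 1
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err               # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k           # remaining prediction error energy
        if err <= 0.0:
            break
    return a
```

For a decoded frame `pcm`, `levinson_durbin(autocorrelation(pcm, 10), 10)` would give a 10th-order predictor; the order 10 is an assumption of this sketch, not a value stated in the text.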

Then, an excitation signal source is provided in which a random adaptive codebook and a random fixed codebook are added to white noise, using the calculation method of section B.4.4 of ITU-T G.729 Annex B, the voice compression algorithm of the International Telecommunication Union. Its length is determined by the frame length in the audio codec standard G.711, giving the sequence e[80] as the output speech signal.
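
The following sketch shows the general idea of such a mixed excitation. It is only loosely modelled on the structure described above and is not a bit-exact rendering of G.729 Annex B section B.4.4: the subframe length, lag range, gain range, pulse count, and the final normalisation to unit RMS are all assumptions made for illustration, as is the function name `make_excitation`.

```python
import math
import random

FRAME = 80                   # samples in e[80]
SUBFRAME = 40                # assumed subframe length
MIN_LAG, MAX_LAG = 40, 103   # assumed random pitch-lag range
PULSES = 4                   # assumed number of fixed-codebook pulses


def make_excitation(history, rng):
    """Return (e, new_history): 80 excitation samples and updated memory.

    history must hold at least MAX_LAG past excitation samples.
    """
    buf = list(history)
    frame = []
    for _ in range(FRAME // SUBFRAME):
        lag = rng.randint(MIN_LAG, MAX_LAG)   # random adaptive-codebook lag
        gp = rng.uniform(0.0, 0.5)            # random adaptive-codebook gain
        start = len(buf) - lag
        # white Gaussian noise plus the delayed past excitation
        sub = [rng.gauss(0.0, 1.0) + gp * buf[start + n]
               for n in range(SUBFRAME)]
        # random fixed codebook: a few signed unit pulses
        for pos in rng.sample(range(SUBFRAME), PULSES):
            sub[pos] += rng.choice((-1.0, 1.0))
        buf.extend(sub)
        frame.extend(sub)
    rms = math.sqrt(sum(s * s for s in frame) / FRAME)
    e = [s / rms for s in frame] if rms > 0.0 else frame
    return e, buf[-MAX_LAG:]
```

Normalising to unit RMS lets the decoded noise gain G set the final level on its own. Starting from `history = [0.0] * MAX_LAG` and a seeded `random.Random` makes the sequence reproducible for testing.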

Finally, with the computed e[80] as the excitation source, and using the noise gain G obtained by noise decoding together with the speech signal obtained from the linear predictive coding calculation, filtering is performed by the linear prediction filter to obtain the comfort noise output signal.
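
One way to read this final step is as an all-pole synthesis filter 1/A(z), with coefficients taken from the linear predictive coding calculation and driven by the excitation scaled by G. The sketch below assumes that reading; the function name `synthesize` and the explicit filter-state argument are hypothetical.

```python
def synthesize(e, a, gain, state=None):
    """Filter gain * e through 1/A(z), A(z) = 1 + sum_k a[k] z^-k.

    a[0] is expected to be 1. state holds the last len(a) - 1 outputs,
    most recent first, so consecutive frames join without a discontinuity.
    Returns (output_samples, new_state).
    """
    order = len(a) - 1
    mem = list(state) if state is not None else [0.0] * order
    out = []
    for x in e:
        y = gain * x - sum(a[k] * mem[k - 1] for k in range(1, order + 1))
        out.append(y)
        mem = ([y] + mem)[:order]   # shift the new output into the memory
    return out, mem
```

Passing the returned state into the next call keeps the filter memory across frames, which matters for the continuity that the text is concerned with.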

In conclusion the method that a kind of network phone comfort noise proposed by the present invention generates, joined in white noise Random adaptive codebook and random fixed codebook generate voice signal model, so that the comfort noise of output is softer.

The above technical solution is only a specific application example of the invention. In practical applications, alternative devices may be selected as the circumstances require, but this does not limit the protection scope of the invention in any way.

Claims

1. A method for generating comfort noise in a network telephone, characterized in that: during voice recognition processing in the network telephone, it is detected whether the source of a payload packet is active speech or inactive speech; the noise of inactive speech, after noise decoding, yields output noise with altered spectral characteristics, which then enters a linear prediction filter; an excitation signal source to which a random adaptive codebook and a random fixed codebook are added is provided, using the calculation method of section B.4.4 of ITU-T G.729 Annex B, the voice compression algorithm of the International Telecommunication Union, with its length determined by the frame length in the audio codec standard G.711, giving the sequence e[80] as the output speech signal; the output noise with altered spectral characteristics and the excitation signal source pass through the linear prediction filter; meanwhile, the speech signal obtained by speech decoding of active speech serves as the speech output of the vocoder, or as the speech input to the linear predictive coding calculation, and a comfort noise signal is output after the linear prediction filter.

2. The method for generating comfort noise in a network telephone according to claim 1, characterized in that: during voice recognition processing in the network telephone, the payload source speech signal is received, and the length of the payload packet is used to determine whether the source of the payload packet is an active speech signal; a payload packet of noise, that is, inactive speech, has a length of 1 byte or 0 bytes; when the length is 1 byte, the noise gain is obtained by decoding; when the length is 0 bytes, the noise gain of the current frame is unchanged and the noise gain of the previous frame is used instead; a payload packet of active speech has a length of 160 bytes; an active speech frame is passed to speech decoding, and any other frame is passed to noise decoding.

3. The method for generating comfort noise in a network telephone according to claim 1, characterized in that: when the source of the payload packet is detected as active speech, the packet is passed to speech decoding; speech decoding is the conversion from μ-law or A-law to linear pulse-code modulation; the resulting group of speech signals serves, on the one hand, as the output of the vocoder, after which decoding of the current frame ends and decoding moves on to the next frame; on the other hand, it serves as the speech input to the linear predictive coding calculation.

4. The method for generating comfort noise in a network telephone according to claim 1, characterized in that: noise decoding of the inactive speech noise decodes the noise energy in the payload packet to obtain the noise gain G, following the protocol of the audio codec standard.

5. The method for generating comfort noise in a network telephone according to claim 1, characterized in that: the speech signal obtained by speech decoding of active speech serves as the speech input to the linear predictive coder calculation, and the linear predictive coding coefficients are determined using CELP (code-excited linear prediction) coding.

6. The method for generating comfort noise in a network telephone according to claim 1, characterized in that: with the computed e[80] as the excitation source, and using the noise gain G obtained by noise decoding together with the speech signal obtained from the linear predictive coding calculation, filtering is performed by the linear prediction filter to obtain the comfort noise output signal.