CN101320563A

CN101320563A - Background noise encoding/decoding device, method and communication equipment

Info

Publication number: CN101320563A
Application number: CN 200710110573
Authority: CN
Inventors: 王喆; 张立斌
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2007-06-05
Filing date: 2007-06-05
Publication date: 2008-12-10
Anticipated expiration: 2027-06-05
Also published as: CN101320563B; WO2008148321A1

Abstract

The present invention provides a background noise encoding/decoding device, method and communication device that flexibly adjust the SID frame encoding mode and the discontinuous transmission mechanism according to the background noise in the communication environment. The technical solution can adopt different SID frame encoding modes and discontinuous transmission mechanisms for different types of background noise, adapting to the encoding precision and time resolution that each type requires while making reasonable use of the actual resources of the communication system. The decoding end can therefore reproduce the various types of background noise with high quality while the actual resources of the communication system are used reasonably.

Description

Background noise encoding/decoding device, method and communication equipment

Technical field

The present invention relates to voice communication technology, and in particular to a background noise encoding/decoding device, method and communication equipment.

Background technology

In voice communication, the transmitted signal can be divided into two classes, active signals and silent signals. An active signal is the signal sent when active speech is detected, and mainly contains the speaker's voice together with the background noise present at that moment. A silent signal is the background noise signal sent in the gaps where no active speech is detected, and mainly contains various kinds of background noise, such as white noise, background babble and silence. For both parties, the active signal is the useful signal and carries the main communication information, whereas the silent signal is a useless signal that carries none of it.

Because both parties mainly care about the useful active signal, and because bandwidth is limited, it is desirable not to transmit the useless silent signal, so as to reduce transmission bandwidth and increase system capacity. If only the active signal is transmitted, however, the background noise becomes discontinuous. The listener finds this very uncomfortable, especially when the background noise is strong, and it can even make the speech hard to understand. The transmission of the silent signal can therefore only be reduced as far as possible, cutting bandwidth while preserving the user's perception. CNG (Comfort Noise Generation) technology arose for this purpose. In CNG, the transmitting end encodes background noise frames with a silence compression coding technique during silent periods and transmits them by DTX (Discontinuous Transmission), and the receiving end uses interpolation to produce a continuous background noise signal that improves listener comfort.

Fig. 1 is a schematic diagram of a communication system that uses silence compression coding, in which the transmitting end comprises:

A VAD (Voice Activity Detector). The original signal is fed to the VAD after some preprocessing. The VAD usually detects whether the signal contains active speech from the feature values of each frame, for example the signal-to-noise ratio, and routes the signal to different encoders according to the detection result;

A speech encoder, configured to encode the active signal containing active speech into speech frames and output them; the speech frames are transmitted to the receiving end over the communication channel;

A non-speech encoder, which uses the silence compression technique to encode the silent signal that contains no active speech into background noise frames;

A background noise frame transmitter, which sends the background noise frames to the receiving end over the communication channel in the configured DTX manner, for example sending one background noise frame every several frames, or deciding whether to send a background noise frame according to how the characteristic parameters of the background noise have changed;

Correspondingly, the receiving end comprises:

A speech decoder, configured to decode the speech frames received from the communication channel, reconstruct the active signal and output it;

A non-speech decoder, configured to decode the background noise frames received from the communication channel;

A reconstructor, configured to reconstruct the decoded data into a continuous background noise signal by interpolation, to improve listener comfort.

The non-speech encoder and the background noise frame transmitter at the transmitting end form the DTX unit, and the non-speech decoder and the reconstructor at the receiving end form the CNG unit. The silence compression technique is realized by the VAD and DTX at the transmitting end together with the CNG at the receiving end.

Take the commonly used CELP (Code-Excited Linear Prediction) speech coding technique as an example. CELP is an efficient speech coding technique: by using perceptual weighting, analysis-by-synthesis, vector quantization and post-filtering, it provides high-quality synthesized speech at medium and low bit rates. In CELP coding, the transmitting end analyzes the speech signal to obtain its characteristic parameters and encodes them into the bitstream; the receiving end synthesizes speech from the characteristic parameters in the bitstream and reconstructs the speech signal. The speech signal s(n) is divided into short time segments called frames, and both the analysis at the transmitting end and the synthesis at the receiving end are carried out frame by frame.

At the transmitting end, the original signal s(n) can be regarded as the output produced when an excitation signal e(n) drives a filter v(n), that is, s(n) = e(n) * v(n), where * denotes convolution. The receiving end synthesizes the speech with the same filter, which is called the synthesis filter. The characteristic parameters carried in the encoded speech bitstream therefore comprise synthesis filter parameters and excitation parameters: the synthesis filter parameters are mainly LSF (Line Spectrum Frequency) quantization parameters, and the excitation parameters comprise adaptive codebook parameters (pitch delay and pitch gain) and fixed codebook parameters (the positions and signs of the non-zero pulses in the fixed codebook, and the fixed codebook gain).
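The relation s(n) = e(n) * v(n) can be illustrated with a minimal sketch. The function name `synthesize` and the representation of the filter as a finite impulse response are assumptions made for illustration only; a real CELP decoder derives an all-pole filter from the LSF parameters, which is not shown here.

```python
def synthesize(excitation, impulse_response):
    """Convolve an excitation e(n) with a filter impulse response v(n).

    Returns s(n) = sum over k of e(k) * v(n - k), truncated to the
    length of the excitation (one frame in, one frame out).
    """
    output = []
    for n in range(len(excitation)):
        acc = 0.0
        for k in range(n + 1):
            tap = n - k
            if tap < len(impulse_response):
                acc += excitation[k] * impulse_response[tap]
        output.append(acc)
    return output


if __name__ == "__main__":
    # A single unit pulse reproduces the impulse response itself.
    print(synthesize([1.0, 0.0, 0.0], [1.0, 0.5]))
```

Driving the filter with a unit pulse returns the filter's own impulse response, which is a quick way to check that the same filter at both ends produces the same synthesized signal.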

Unlike speech coding parameters, background noise coding parameters describe the characteristics of the background noise. The excitation of background noise can be treated as a simple random noise sequence, which both the encoder and the decoder can produce with a random noise generator; scaling the amplitude of that sequence with an energy parameter yields the final excitation signal. The excitation can therefore be represented by an energy parameter alone, without any further characteristic parameters. So, unlike a speech frame, the excitation parameter in the background noise bitstream is the energy parameter of the current background noise frame; like a speech frame, the synthesis filter parameter in the bitstream is an LSF quantization parameter, although the specific quantization method differs. From this analysis, background noise coding can be regarded as essentially a simplified form of speech coding.
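As a rough illustration of the idea that a noise excitation needs only an energy parameter, the sketch below generates a random sequence locally and scales it to a target frame energy. The function name `noise_excitation`, the Gaussian generator and the definition of energy as the sum of squared samples are assumptions for illustration, not details taken from any particular codec.

```python
import math
import random


def noise_excitation(frame_energy, frame_len, seed=0):
    """Generate a random excitation whose total energy equals frame_energy.

    frame_energy: target sum of squared samples (assumed energy definition).
    frame_len:    number of samples in the frame.
    seed:         seed for the local random generator (hypothetical).
    """
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    raw_energy = sum(x * x for x in raw)
    # Scale the amplitude so the sequence carries the requested energy.
    scale = math.sqrt(frame_energy / raw_energy) if raw_energy > 0 else 0.0
    return [scale * x for x in raw]


if __name__ == "__main__":
    frame = noise_excitation(frame_energy=4.0, frame_len=80)
    print(round(sum(x * x for x in frame), 6))
```

Only the energy value has to be transmitted; the random sequence itself is produced independently at the decoder, which is why the SID frame can be much shorter than a speech frame.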

A background noise frame that has been compressed and encoded by the DTX unit and is transmitted discontinuously is commonly called a SID (Silence Descriptor) frame. In each of the current speech coding systems, all background noise signals within that system use a single SID frame format and a single discontinuous transmission mechanism.

The G.729B silence compression scheme is a representative silence compression technique. G.729B is based on the CELP model, so the background noise parameters it transmits are also obtained from the CELP model and comprise the synthesis filter parameters and the excitation parameter that describe the background noise. At the encoding end, when for example the VAD decision is 0, indicating that the current signal is noise, the preprocessed original signal is fed to the background noise encoding device, which extracts and quantizes the background noise parameters to form a SID frame. The device then decides whether to send the SID frame according to how the parameters in it have changed, for example sending a SID frame when the filter parameters and the energy parameter change significantly. In G.729B, two adjacent SID frames are at least 2 frames apart. At the decoding end, once the background noise decoding and reconstruction stage is entered, the background noise decoding device synthesizes, from the received SID frames, comfort noise that reflects the background noise characteristics at the encoding end. The bits that G.729B allocates to each parameter in the SID frame are shown in Table 1 below:

Table 1.

Bit allocation (from most significant to least significant bit)	Parameter description
S1	Quantization predictor index
S2-S6	First-stage LSF quantization vector index
S7-S10	Second-stage LSF quantization vector index
S11-S15	Gain (energy)
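The bit layout of Table 1 can be sketched as a simple pack/unpack routine. The field names and the functions `pack_sid` and `unpack_sid` are hypothetical; only the field widths (1, 5, 4 and 5 bits, most significant first) are taken from the table.

```python
# Field widths follow Table 1; the field names are illustrative assumptions.
SID_FIELDS = [
    ("predictor_index", 1),   # S1
    ("lsf_stage1_index", 5),  # S2-S6
    ("lsf_stage2_index", 4),  # S7-S10
    ("gain_index", 5),        # S11-S15
]


def pack_sid(values):
    """Pack field values into a 15-bit integer, most significant field first."""
    word = 0
    for name, width in SID_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_sid(word):
    """Recover the field values from a 15-bit integer."""
    values = {}
    shift = sum(width for _, width in SID_FIELDS)
    for name, width in SID_FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


if __name__ == "__main__":
    fields = {"predictor_index": 1, "lsf_stage1_index": 21,
              "lsf_stage2_index": 3, "gain_index": 31}
    word = pack_sid(fields)
    print(format(word, "015b"), unpack_sid(word) == fields)
```

Keeping the layout in one table-like list means a different SID format, with more fields or wider indices, only changes the list and not the packing logic.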

The DTX in the silence compression schemes of AMR (Adaptive Multi-Rate speech codec) and AMR-WB (Adaptive Multi-Rate Wide Band speech codec) instead uses a fixed transmission mechanism: however the characteristics of the background noise change, one SID frame is sent every 8 frames. The bits allocated to each parameter in the AMR SID frame are shown in Table 2 below, and those in the AMR-WB SID frame are shown in Table 3 below.

Table 2.

Bit allocation (from most significant to least significant bit)	Parameter description
S1-S3	Quantization predictor index
S4-S11	First-stage LSF quantization vector index
S12-S20	Second-stage LSF quantization vector index
S21-S29	Third-stage LSF quantization vector index
S30-S35	Logarithmic frame energy index

Table 3.

Bit allocation (from most significant to least significant bit)	Parameter description
S1-S6	First-stage LSF quantization vector index
S7-S12	Second-stage LSF quantization vector index
S13-S18	Third-stage LSF quantization vector index
S19-S23	Fourth-stage LSF quantization vector index
S24-S28	Fifth-stage LSF quantization vector index
S29-S34	Logarithmic frame energy index
S35	Dithering flag

The existing background noise coding schemes encode background noise in a simple way and synthesize simple comfort noise at the decoding end, which satisfies call quality requirements to a certain extent. However, as communication technology develops, speech coding quality keeps rising, and with it the coding quality of the background noise during active periods. To keep the background noise continuous throughout a call, the coding quality of the background noise during silent periods must rise as well. The existing, rather uniform SID frame encoding modes and discontinuous transmission mechanisms clearly cannot meet the specific requirements of diverse communication environments.

Summary of the invention

Embodiments of the invention provide a background noise encoding/decoding device, method and communication equipment, so that the SID frame encoding mode and the discontinuous transmission mechanism can be adjusted flexibly according to the background noise of the communication environment.

A background noise encoding device comprises:

A background noise analysis unit, configured to obtain at least one signal characteristic parameter value of a background noise signal;

A background noise classification unit, configured to determine the background noise type according to the at least one signal characteristic parameter value;

A coding switch unit and at least two discontinuous transmission units, wherein the coding switch unit is configured to feed the background noise signal into the corresponding discontinuous transmission unit according to the background noise type, each discontinuous transmission unit is configured to encode the background noise signal into silence descriptor frames that contain discontinuous transmission unit indication information and to transmit them discontinuously, and the discontinuous transmission units differ from one another in silence descriptor frame encoding mode and/or in the discontinuous transmission mechanism for silence descriptor frames.

A background noise decoding device comprises:

A first recognition unit, configured to identify the discontinuous transmission unit indication information in a silence descriptor frame;

A decoding switch unit and at least two comfort noise generation units, wherein the decoding switch unit is configured to feed the silence descriptor frame into the corresponding comfort noise generation unit according to the recognition result of the recognition unit, and each comfort noise generation unit decodes the silence descriptor frames it receives in the manner corresponding to their encoding mode and reconstructs the background noise signal.

A communication device comprises:

A first speech detector, configured to detect whether the collected signal contains active speech and to output the signal by class;

A speech encoder, configured to encode the signal containing active speech into speech frames and output them continuously;

The communication device further comprises a background noise encoding device, which specifically comprises:

A background noise analysis unit, configured to obtain at least one signal characteristic parameter value of the background noise signal that does not contain active speech;

A communication device comprises a background noise decoding device, and the background noise decoding device comprises:

A background noise encoding method comprises:

Obtaining at least one signal characteristic parameter value of a background noise signal;

Determining the background noise type according to the at least one signal characteristic parameter value;

Selecting a discontinuous transmission mechanism according to the background noise type, encoding the background noise signal into silence descriptor frames that contain discontinuous transmission mechanism indication information, and transmitting them discontinuously, wherein the discontinuous transmission mechanisms differ from one another in silence descriptor frame encoding mode and/or in the manner of discontinuous sending.

A decoding method comprises:

Identifying the discontinuous transmission mechanism indication information in a silence descriptor frame, wherein the silence descriptor frame encoding mode differs between discontinuous transmission mechanisms;

Decoding the silence descriptor frame in the decoding manner that corresponds to the discontinuous transmission mechanism indication information, and reconstructing the background noise signal.
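The decoding method can be sketched as a dispatch on the indication information carried in each frame. Everything concrete here is an assumption made for illustration: the indication is taken to be the first byte of the frame, and the two placeholder decoders `cng_low_rate` and `cng_high_rate` stand in for real comfort noise generation units, whose internals the text does not specify at this point.

```python
def cng_low_rate(payload: bytes) -> str:
    # Placeholder for a decoder matching a simple, short SID format.
    return f"low-rate CNG decoded {len(payload)} payload bytes"


def cng_high_rate(payload: bytes) -> str:
    # Placeholder for a decoder matching a richer, longer SID format.
    return f"high-rate CNG decoded {len(payload)} payload bytes"


# Hypothetical mapping from indication value to comfort noise generation unit.
CNG_UNITS = {0: cng_low_rate, 1: cng_high_rate}


def decode_sid(frame: bytes) -> str:
    """Read the indication (assumed: first byte) and dispatch the payload."""
    if not frame:
        raise ValueError("empty SID frame")
    indicator, payload = frame[0], frame[1:]
    if indicator not in CNG_UNITS:
        raise ValueError(f"unknown DTX indication: {indicator}")
    return CNG_UNITS[indicator](payload)


if __name__ == "__main__":
    print(decode_sid(bytes([1, 0xAB, 0xCD])))
```

Because the indication travels inside the frame, the decoder needs no side channel to learn which encoding mode the encoder chose for a given frame.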

With the technical solution of the invention, different SID frame encoding methods and discontinuous transmission mechanisms can be adopted flexibly for different types of background noise. While making reasonable use of the actual resources of the communication system, the solution adapts to the encoding precision and time resolution that each type of background noise requires, so that the decoding end can provide every type of background noise with high quality while channel resources are used reasonably.

Description of drawings

Fig. 1 is a schematic diagram of an existing communication system that uses silence compression coding;

Fig. 2, Fig. 3, Fig. 5 and Fig. 6 are each a schematic diagram of the main structure of a background noise encoding device provided by an embodiment of the invention;

Fig. 4 is a schematic diagram of the switching modes provided by an embodiment of the invention;

Fig. 7, Fig. 8 and Fig. 9 are each a schematic diagram of the main structure of a communication device provided by an embodiment of the invention.

Embodiment

A closer look at existing background noise coding shows that every codec system uses a single background noise frame encoding mode, which cannot describe the specific background noise of communication environments with differing characteristics and therefore cannot meet the demands of diverse communication environments. A single discontinuous transmission mechanism likewise cannot transmit background noise frames of different characteristics in a targeted way, so the decoding end cannot generate well-matched comfort noise. Nor can a single discontinuous transmission mechanism adapt to the actual state of the communication system, such as the communication bandwidth, so the resources of the communication system are not used reasonably.

The embodiments of the invention therefore provide a codec solution that processes background noise according to its characteristics: the encoding end analyzes the background noise and, depending on its characteristics, encodes and sends SID frames with different DTX schemes, and the receiving end reconstructs the background noise signal with the corresponding CNG scheme.

As shown in Fig. 2, a background noise encoding device 20 provided by an embodiment of the invention is used in a silence compression encoder and mainly comprises:

At least two DTX units 21, namely DTX unit 211, DTX unit 212, ..., DTX unit 21n, each configured to encode the background noise signal into SID frames and output them discontinuously according to a different DTX mechanism, where DTX mechanisms differ in the SID frame encoding mode, in the discontinuous transmission mechanism for SID frames, or in both;

A background noise analysis unit 22, configured to compute at least one signal characteristic parameter value from the background noise signal;

A background noise classification unit 23, configured to determine the background noise type according to the at least one signal characteristic parameter value;

A coding switch unit 24, configured to feed the received background noise signal into the corresponding DTX unit 21 according to the background noise type.

The background noise analysis unit 22 analyzes the input background noise signal to obtain the characteristic parameters of the background. It does so by computing one or more signal characteristic parameter values that characterize the background noise signal. These parameters can be any one or any combination of the following: long-term SNR (Signal to Noise Ratio), estimated background noise level, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, background noise fluctuation frequency, and so on. Other parameters that characterize the level of the background, its kind (stationary or non-stationary, magnitude of change, speed of change) or the SNR (long-term, short-term, peak) can also be computed and used in the analysis. Besides the background noise signal itself, the indication by which the VAD judges the current signal to be background can also be fed into the background noise analysis unit 22, which is then triggered by this indication to compute the signal characteristic parameter values.
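One possible way to compute an energy fluctuation estimate is sketched below. The definitions used (frame level as mean power in dB, fluctuation as the standard deviation of the frame levels) and the function names are assumptions for illustration; the text does not prescribe how any of the listed parameters is calculated.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB (a small floor avoids log of zero)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def energy_fluctuation(frames):
    """Standard deviation of the frame levels in dB over a run of frames."""
    levels = [frame_energy_db(f) for f in frames]
    mean = sum(levels) / len(levels)
    variance = sum((lv - mean) ** 2 for lv in levels) / len(levels)
    return math.sqrt(variance)


if __name__ == "__main__":
    steady = [[0.1, -0.1, 0.1, -0.1]] * 5
    varying = [[0.1, -0.1, 0.1, -0.1], [1.0, -1.0, 1.0, -1.0]] * 3
    print(energy_fluctuation(steady) < energy_fluctuation(varying))
```

A value near zero suggests a stationary background, while a large value suggests a background whose level changes from frame to frame.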

The background noise classification unit 23 assigns the current background noise signal to a type according to the signal characteristic parameter values computed by the background noise analysis unit 22. At least two background noise types are defined. They can be defined by whether the background noise is stationary, for example dividing it into two classes, stationary and non-stationary background noise signals. They can also be defined by both stationarity and the size of the long-term SNR, dividing background noise into four classes: stationary high-SNR, stationary low-SNR, non-stationary high-SNR and non-stationary low-SNR background noise signals. Types can further be defined by how much informational meaning the background noise carries, giving three classes: melodious background noise signals (such as music, running water or birdsong), environmental background noise signals (such as the noise of a cocktail party, a roadside or an airport), and other background noise signals. In short, the division into types can be set as required and does not limit how the technical solution of the invention is implemented.
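The four-class division by stationarity and long-term SNR could look like the sketch below. The threshold values, the parameter names and the enum `NoiseType` are hypothetical; the text leaves both the features and the decision rule open.

```python
from enum import Enum


class NoiseType(Enum):
    STATIONARY_HIGH_SNR = 1
    STATIONARY_LOW_SNR = 2
    NONSTATIONARY_HIGH_SNR = 3
    NONSTATIONARY_LOW_SNR = 4


def classify_noise(energy_fluctuation_db, long_term_snr_db,
                   fluctuation_threshold_db=3.0, snr_threshold_db=15.0):
    """Classify background noise from two feature values.

    Both thresholds are illustrative assumptions, not values from the text.
    """
    stationary = energy_fluctuation_db < fluctuation_threshold_db
    high_snr = long_term_snr_db >= snr_threshold_db
    if stationary:
        return (NoiseType.STATIONARY_HIGH_SNR if high_snr
                else NoiseType.STATIONARY_LOW_SNR)
    return (NoiseType.NONSTATIONARY_HIGH_SNR if high_snr
            else NoiseType.NONSTATIONARY_LOW_SNR)


if __name__ == "__main__":
    print(classify_noise(energy_fluctuation_db=1.0, long_term_snr_db=20.0))
    print(classify_noise(energy_fluctuation_db=6.0, long_term_snr_db=5.0))
```

Other type definitions, such as the melodious/environmental/other division, would replace the two thresholds with whatever features distinguish those classes.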

The coding switch unit 24 stores in advance the DTX unit 21 that suits each background noise type. According to the type determined by the background noise classification unit 23, it switches among the at least two DTX units 21 and selects the DTX unit 21 that corresponds to the current background noise type for processing.

In general, each background noise type corresponds to a different DTX scheme. To account for the needs of a particular communication scenario or for changes in communication channel resources, however, different switching modes can also be defined, each corresponding to a different switching control scheme. The switching control scheme for each switching mode can be stored in advance in the coding switch unit 24, and a switching mode control instruction indicates to the coding switch unit 24 which mode is required. The switching mode control instructions can form a finite instruction set that reflects the current channel resource situation, with each specific instruction corresponding to one switching mode, that is, to one way of assigning DTX units. Alternatively, the switching control scheme of the required mode can be sent to the coding switch unit 24 together with each switching mode control instruction.

The coding switch unit 24 determines the switching mode from the specific indication information. Those skilled in the art will know many concrete ways to control the switching mode, which are not enumerated here.

Accordingly, as shown in Fig. 3, an instruction receiving unit 25 can be added to the background noise encoding device 20 of Fig. 2 to receive switching mode control instructions and forward them to the coding switch unit 24. A switching mode control instruction may be entered from outside by an operator, or generated by the transmitting end according to the detected condition of the communication channel resources, and so on.

The coding switch unit 24 selects the appropriate switching mode according to the switching mode control instruction; any two switching modes differ in the DTX unit 21 assigned to at least one background noise type. Taking a background noise encoding device 20 with four DTX units 21 as an example, the switching modes can be set as shown in Fig. 4:

When no instruction receiving unit 25 is provided, the coding switch unit 24 selects the DTX unit for the background noise from the background noise type alone, for example as in switching mode 1;

When an instruction receiving unit 25 is provided, the coding switch unit 24 determines how DTX units are selected according to the switching mode control instruction. For example:

On receiving switching mode control instruction 1, it switches according to switching mode 1;

On receiving switching mode control instruction 2, it switches according to switching mode 2;

On receiving switching mode control instruction 3, it switches according to switching mode 3.

Switching mode control instructions can be issued according to the current condition of the communication channel resources. For example, when channel resources are currently tight, the background noise of some types may share one lower-rate DTX unit, or all background noise types may use the lowest-rate DTX unit. As another example, only some of the DTX units may be usable when there is no switching mode control instruction, while the remaining DTX units come into use when an instruction indicates that channel resources are currently abundant or tight.

The example of Fig. 4 shows that:

Instruction 1 selects switching mode 1, in which each background noise type corresponds to a different DTX unit; instruction 1 can indicate that channel resources are currently normal;

Instruction 2 selects switching mode 2, in which some background noise types share a DTX unit: background noise types 1 and 2 both correspond to DTX unit 212, and background noise types 3 and 4 both correspond to DTX unit 214. Instruction 2 can indicate that channel resources are currently abundant (DTX unit 212 has a higher coding rate than DTX unit 211 and occupies more resources; DTX unit 214 has a higher coding rate than DTX unit 213 and occupies more resources), or that they are currently tight (DTX unit 212 has a lower coding rate than DTX unit 211 and occupies fewer resources; DTX unit 214 has a lower coding rate than DTX unit 213 and occupies fewer resources);

Instruction 3 selects switching mode 3, in which every background noise type corresponds to DTX unit 214 only. Instruction 3 can indicate that channel resources are currently very abundant (DTX unit 214 has the highest coding rate and occupies the most resources), or that they are currently very tight (DTX unit 214 has the lowest coding rate and occupies the fewest resources).
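The three switching modes of the Fig. 4 example can be written as a lookup table. The assignments for modes 2 and 3 follow the description; the one-to-one assignment in mode 1 (type i to DTX unit 21i) is an assumption, since the description only says that each type gets a different unit. The names `SWITCH_MODES` and `select_dtx_unit` are hypothetical.

```python
# Background noise type -> DTX unit, per switching mode.
SWITCH_MODES = {
    1: {1: 211, 2: 212, 3: 213, 4: 214},  # assumed one-to-one assignment
    2: {1: 212, 2: 212, 3: 214, 4: 214},  # types share units pairwise
    3: {1: 214, 2: 214, 3: 214, 4: 214},  # every type uses unit 214
}


def select_dtx_unit(noise_type, instruction=None):
    """Return the DTX unit for a noise type under the indicated mode.

    With no instruction (no instruction receiving unit), mode 1 is used,
    so the choice depends on the noise type alone.
    """
    mode = 1 if instruction is None else instruction
    if mode not in SWITCH_MODES:
        raise ValueError(f"unknown switching mode control instruction: {mode}")
    return SWITCH_MODES[mode][noise_type]


if __name__ == "__main__":
    for instruction in (None, 2, 3):
        print(instruction, [select_dtx_unit(t, instruction) for t in (1, 2, 3, 4)])
```

Storing the scheme as data matches the idea that a switching control scheme can either be kept in the coding switch unit in advance or delivered along with the instruction.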

As shown in Fig. 5, when channel resources are currently very abundant or the background noise is particularly melodious, a speech encoder can additionally be provided in the background noise encoding device 20 to encode the specific background noise at full rate. The signal characteristic parameter values of this specific background noise satisfy a set condition: the background noise classification unit 23 decides whether the noise is of the specific type by checking whether at least one signal characteristic parameter value satisfies the set condition, and instructs the coding switch unit 24 to switch to the speech encoder for the specific background noise. The speech encoder then encodes the specific background noise signal into speech frames and outputs them continuously. The existing speech encoder at the transmitting end can of course take on this function instead.

With reference to Fig. 1, each DTX unit 21 contains two parts: a non-speech encoder, which encodes the background noise signal into SID frames, and a background noise frame transmitter, which sends the background noise frames in the configured discontinuous manner. DTX units can differ in the SID frame encoding mode, in the discontinuous transmission mechanism, or in both. Differences in SID frame encoding mode can take the following forms:

1. Different coding parameters

Different SID frames can use parameters of different kinds and in different numbers;

2. Different quantization precision for the same parameter

Different SID frames can quantize the same parameter with different precision;

As a result, the SID frames produced by different encoding modes may differ in structure and length. For example, the coding parameters of a stationary background noise signal can be relatively simple, the quantization precision relatively low, and the SID frame relatively short. The coding parameters of a non-stationary background noise signal, by contrast, can be more numerous. One approach is to add excitation parameters from speech coding to the existing SID frame, such as the adaptive codebook parameters (pitch delay and gain) and the fixed codebook parameters (the positions and signs of the non-zero pulses, and the gain). The quantization precision can also be higher, for example by quantizing the filter and excitation parameters of the existing SID frame more finely, and the SID frame grows correspondingly longer. As a further example, the SID frame of a melodious background noise signal should use the most complex encoding mode (possibly full-rate coding), the SID frame of an environmental background noise signal a less complex mode, and the SID frame of other background noise signals a simpler mode.

Different ground unrest types can also be used different discontinuous transmission mechanisms, and the difference of the SID frame transmit mechanism of different background noise type is mainly reflected on the transmission intercal of adjacent two SID frames.For example for the stationary background noise signal, because the change of background is less, the human auditory system is not high to the quality decline susceptibility of such ground unrest after DTX/CNG, only needs to get final product with relatively long discontinuous transmission intercal.For the unstable state ambient noise signal, because background change in time is bigger, after DTX/CNG, can have a higher subjective quality in order to guarantee such ground unrest, need to use relatively short discontinuous transmission intercal or even continuous transmission.And for example, for melodious background, because the non-noise character of itself, the requirement of subjective quality is quite high, so the DTX system just need carry out continuous transmission with it.For environmental background noise, owing to itself carried certain environmental information, in order to ensure not losing these information, the DTX system also need upgrade the SID frame with higher speed.For other ground unrest, because the true appearance and the non-staple purpose of reduction background, so the renewal of carrying out the SID frame that the DTX system just can be relatively slow.

The transmission interval of SID frames is either a fixed transmission interval or an adaptive transmission interval. Under the fixed-interval mechanism, the interval between two adjacent SID frames is fixed, and its size depends only on the background noise type. Under the adaptive-interval mechanism, the transmission of SID frames adapts to the rate at which the noise changes, but is still subject to a minimum time interval. In the embodiment of the invention, different minimum intervals can be adopted for different background noise types.
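The two interval mechanisms can be sketched as a per-frame decision. This is only an illustrative sketch: the function name `should_send_sid`, the `change_metric` input and its threshold are hypothetical, because the text does not say how the rate of noise change is measured.

```python
def should_send_sid(frames_since_last, min_interval, adaptive=False,
                    change_metric=0.0, change_thr=0.0):
    """Decide whether to send a SID frame for the current noise frame.

    All names here are hypothetical. `min_interval` is the fixed interval
    (fixed mechanism) or the minimum interval (adaptive mechanism), and is
    assumed to be chosen per background noise type.
    """
    if frames_since_last < min_interval:
        # Neither mechanism sends before the interval has elapsed.
        return False
    if not adaptive:
        # Fixed interval: send as soon as the interval has elapsed.
        return True
    # Adaptive interval: send only when the noise has changed enough
    # (the change measure itself is an assumption of this sketch).
    return change_metric > change_thr
```

With a fixed interval of 8 frames the function returns True on every eighth noise frame; in adaptive mode the same counter only acts as a lower bound on the spacing.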

Figure 6 is a schematic diagram of the main structure of the first communication device provided by the present invention, which comprises: VAD 10, background noise encoding device 20, speech encoder 30 and communication interface 40, wherein:

The VAD 10 detects active speech. The speech signal is fed into the speech encoder 30 and encoded into speech frames, which the communication interface sends over the communication channel. The background noise signal is fed into the background noise encoding device 20, which outputs SID frames, and the communication interface sends the SID frames over the communication channel.

In the background noise encoding device 20, the encoding switching unit 24 switches to the applicable DTX unit 21 according to the type of the background noise signal.

Figure 7 is a schematic diagram of the main structure of the second communication device provided by the embodiment of the invention, in which the encoding switching unit 24 in the background noise encoding device 20 determines the switching mode according to the instruction from the instruction receiving unit 25, and switches to the applicable DTX unit 21 according to the type of the background noise signal and the switching mode currently in use.

Referring to Figure 5, a speech encoder may also be provided inside the background noise encoding device 20 of the communication devices shown in Figure 6 and Figure 7; this speech encoder performs the full-rate coding of the background noise.

As shown in Figure 8, if the existing speech encoder 30 is used to perform the full-rate coding of the background noise, the encoding switching unit 24 is connected to the speech encoder 30, and the background noise signal is fed into the speech encoder 30, encoded into full-rate speech frames and output continuously.

With the technical solution provided by the embodiment of the invention, DTX unit indication information corresponding to the coding and transmission mode of each SID frame needs to be set in the SID frame, so that the decoding end can determine the corresponding decoding and reconstruction scheme from it. Speech frames are decoded by the speech decoder and output continuously.

Because part of the computation that the background noise analysis unit 22 performs to obtain the background characteristic parameter values is the same as part of the computation that each DTX unit 21 performs when encoding a SID frame, the background noise analysis unit 22 can also be connected to each DTX unit 21 and output its analysis results to them, thereby reducing part of the computational load of the DTX units 21.

Figure 9 is a schematic structural diagram of a communication device provided by the embodiment of the invention. The units that perform the receiving-end function mainly comprise: background noise decoding device 50, recognition unit 60, decoding switching unit 70, speech decoder 80 and a communication interface, wherein:

The background noise decoding device 50 comprises at least two CNG units (51, 52 ... 5n); each CNG unit decodes SID frames according to one set decoding and reconstruction scheme and reconstructs them into background noise;

The recognition unit 60 identifies the speech frames and SID frames received through the communication interface, and further identifies the DTX unit indication information in each SID frame;

The decoding switching unit 70 switches according to the recognition result of the recognition unit 60, sending speech frames to the speech decoder 80 and SID frames to the corresponding CNG unit for decoding and reconstruction.

Referring to Figure 1, each CNG unit comprises a non-speech decoder, which decodes the background noise frames received from the communication channel, and a reconstructor, which reconstructs the decoded data into a continuous background noise signal by interpolation, so as to improve the listener's comfort.

The recognition unit 60 first identifies whether a received coded frame is a full-rate-coded speech frame or a SID frame; if it is a SID frame, the unit further identifies the DTX unit indication information in that SID frame. Here, the bitstream of the SID frame should contain one or several bits that carry the DTX unit indication information. These bits may be located at the beginning of the SID frame, at its end, or at another designated position in the SID frame, for example as shown in Table 3 or Table 4 below:

Table 3.

Bit allocation (most significant bit first)	Parameter description
S1	DTX unit indication information
S2	Quantization predictor index
S3-S7	First-stage LSF quantization vector index
S8-S11	Second-stage LSF quantization vector index
S12-S16	Gain (energy)

Table 4.

Bit allocation (most significant bit first)	Parameter description
S1	Quantization predictor index
S2-S6	First-stage LSF quantization vector index
S7-S10	Second-stage LSF quantization vector index
S11-S15	Gain (energy)
S16-S17	DTX unit indication information
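The two bit layouts can be illustrated with a small packing routine. The following is a minimal sketch, assuming the SID frame is handled as a single integer with S1 as the most significant bit; the field names (`dtx_id`, `pred_idx`, `lsf1_idx`, `lsf2_idx`, `gain_idx`) and the functions `pack` and `unpack` are hypothetical and chosen only for illustration.

```python
# Field layouts, most significant bit first (names are hypothetical).
TABLE3 = [("dtx_id", 1), ("pred_idx", 1), ("lsf1_idx", 5),
          ("lsf2_idx", 4), ("gain_idx", 5)]            # 16 bits in total
TABLE4 = [("pred_idx", 1), ("lsf1_idx", 5), ("lsf2_idx", 4),
          ("gain_idx", 5), ("dtx_id", 2)]              # 17 bits in total


def pack(layout, fields):
    """Pack the named fields into one integer, first field in the top bits."""
    word = 0
    for name, width in layout:
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(layout, word):
    """Inverse of pack(): split the integer back into named fields."""
    shift = sum(width for _, width in layout)
    fields = {}
    for name, width in layout:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

With the Table 3 layout the decoder can read the DTX unit indication from the top bit before parsing anything else; with the Table 4 layout it reads the two lowest bits instead.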

The recognition unit 60 may comprise two recognition subunits. One subunit identifies whether a received coded frame is a full-rate-coded speech frame or a SID frame. If it is a speech frame, the subunit notifies the decoding switching unit 70, which sends the speech frame to the speech decoder 80 for decoding and continuous output. If it is a SID frame, the SID frame is passed to the other recognition subunit, which further identifies the DTX unit indication information in the SID frame.

Because different background noise types have different SID frame formats or update mechanisms, a suitable DTX decoding scheme is selected according to the DTX unit indication information to decode the SID frame. The noise interpolation method differs between decoding schemes, and the comfort noise is finally restored.

A communication device may have both the transmitting-end function and the receiving-end function; the structure shown in Figure 9 is usually combined with the structure shown in Figure 7 or Figure 8 in one communication device.

Specific embodiments are described in detail below.

Embodiment one

At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22, which calculates the sub-band levels level[k, i] of each current signal frame, where level[k, i] denotes the level of the k-th sub-band of the i-th frame, and stores them in memory. The sub-band levels can be calculated with a filter bank or by a transform.

Once a sufficient number of signal frames have accumulated, the background noise analysis unit 22 begins to calculate the background noise spectral fluctuation parameter spec_var[i] from the sub-band levels stored in memory, and thereafter updates the result each time a background noise frame is received.

The background noise spectral fluctuation parameter spec_var[i] is calculated as:

$$\mathrm{spec\_var}[i] = \sum_{m=i-L+1}^{i} \; \sum_{\substack{n=i-L+1 \\ n \neq m}}^{i} \; \sum_{k=1}^{N} \left( \frac{\mathrm{level}[k,m] - \mathrm{level}[k,n]}{\frac{1}{L \cdot N} \sum_{m=i-L+1}^{i} \sum_{k=1}^{N} \mathrm{level}[k,i]} \right)^{2}$$

where N is the number of sub-bands and L is the number of background noise signal frames used for the long-term average. The background noise spectral fluctuation parameter spec_var[i] can also be calculated from the LSF coefficients, that is:

$$\mathrm{spec\_var}[i] = \sum_{m=i-L+1}^{i} \; \sum_{\substack{n=i-L+1 \\ n \neq m}}^{i} \; \sum_{k=1}^{N} \left( \frac{\mathrm{lsf}[k,m] - \mathrm{lsf}[k,n]}{\frac{1}{L \cdot N} \sum_{m=i-L+1}^{i} \sum_{k=1}^{N} \mathrm{lsf}[k,i]} \right)^{2}$$

where lsf[k, i] denotes the k-th LSF coefficient of the i-th frame, N is the number of LSF coefficients, and L is the number of background noise signal frames used for the long-term average.
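As an illustration, the spectral fluctuation measure can be computed directly from a stored history of per-frame values. The sketch below is not taken from the patent: the function name and the `history[frame][band]` layout are assumptions, and the normalizer follows the formula literally as printed, where the summand level[k, i] does not depend on m, so the denominator reduces to the mean value over the N bands of the current frame i.

```python
def spec_var(history, i, L):
    """Spectral fluctuation over the last L frames ending at frame i.

    `history[m][k]` holds the sub-band level (or LSF coefficient) k of
    frame m; this storage layout is an assumption of the sketch.
    """
    if i - L + 1 < 0:
        raise ValueError("not enough frames accumulated")
    N = len(history[i])
    # Denominator as printed: (1/(L*N)) * sum_m sum_k x[k, i], which equals
    # the mean over the N bands of the current frame i.
    norm = sum(history[i]) / N
    window = range(i - L + 1, i + 1)
    total = 0.0
    for m in window:
        for n in window:
            if n == m:
                continue  # the formula excludes n == m
            for k in range(N):
                total += ((history[m][k] - history[n][k]) / norm) ** 2
    return total
```

A window of identical frames gives 0, and the value grows as the frames in the window diverge from each other, which is what the threshold comparison in this embodiment relies on.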

The background noise analysis unit 22 outputs the calculated background noise spectral fluctuation parameter to the background noise classification unit 23.

The encoding switching unit 24 selects the corresponding DTX unit for the current background noise frame according to the background noise type information output by the background noise classification unit 23.

In this embodiment, two background types are defined: stationary background noise signals and non-stationary background noise signals. The background noise classification unit 23 assigns the current background noise to the stationary or the non-stationary class by comparing the background noise spectral fluctuation parameter with the threshold spec_var_thr. If spec_var[i] < spec_var_thr, the current background noise signal is classified as a stationary background noise signal; otherwise it is classified as a non-stationary background noise signal.

In this embodiment, the DTX system has two different working schemes:

Scheme one uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and an energy parameter;

Scheme two uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; in addition to the filter parameters and the energy parameter, the SID frame also contains fixed codebook parameters (which are excitation parameters).

When the current background noise belongs to the stationary class, the encoding switching unit 24 selects working scheme one. When the current background noise belongs to the non-stationary class, the encoding switching unit 24 selects working scheme two.

When each kind of SID frame is assembled, DTX unit indication information is set in the frame to state which DTX unit the SID frame belongs to. In this embodiment, the DTX unit indication information can be represented with 1 bit, with 0 and 1 denoting DTX unit one and DTX unit two respectively.

At the decoding end, the recognition unit 60 first determines which DTX unit the SID frame belongs to according to the DTX unit indication information in the SID frame, and the decoding switching unit 70 feeds the SID frame into the corresponding CNG unit of the background noise decoding device 50 for decoding and reconstruction, wherein:

If the DTX unit indication information is 0, the corresponding CNG unit decodes the SID frame according to the SID frame format of DTX working scheme one above to obtain the filter parameters and the energy parameter, and interpolates the following 8 background noise signal frames;

If the DTX unit indication information is 1, the corresponding CNG unit decodes the SID frame according to the SID frame format of DTX working scheme two above to obtain the filter parameters, the energy parameter and the fixed codebook parameters, and interpolates the following 3 background noise signal frames.
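The encoder-side selection and decoder-side lookup of Embodiment one can be summarized in a few lines. This is a minimal sketch under stated assumptions: the `DtxScheme` record, the parameter labels and the function names are hypothetical, and only the interval, parameter set and indication bit of each scheme are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DtxScheme:
    indication: int   # 1-bit DTX unit indication carried in the SID frame
    interval: int     # one SID frame is sent every `interval` frames
    params: tuple     # parameter groups carried in the SID frame


SCHEME_ONE = DtxScheme(0, 8, ("filter", "energy"))
SCHEME_TWO = DtxScheme(1, 3, ("filter", "energy", "fixed_codebook"))


def select_scheme(spec_var_value, spec_var_thr):
    """Encoder side: stationary noise uses scheme one, otherwise scheme two."""
    if spec_var_value < spec_var_thr:
        return SCHEME_ONE
    return SCHEME_TWO


def scheme_for_indication(bit):
    """Decoder side: map the received indication bit to its scheme."""
    return {0: SCHEME_ONE, 1: SCHEME_TWO}[bit]
```

Because the indication bit travels inside every SID frame, the decoder needs no other side information to know how many frames to interpolate and which parameters to expect.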

Embodiment two

At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22, which calculates the sub-band levels level[k, i], the frame energy pow[i] and the short-term signal-to-noise ratio snr[i] of the current background noise signal frame and stores them in memory, wherein:

The sub-band levels level[k, i] can be calculated with a filter bank or by a transform;

The frame energy is calculated as:

$$\mathrm{pow}[i] = \sum_{k=1}^{N} \mathrm{level}[k,i]^{2},$$

where N is the total number of sub-bands;

The short-term signal-to-noise ratio is calculated as:

$$\mathrm{snr}[i] = \frac{\mathrm{pow}[i]}{\mathrm{bckr\_noise\_pow}[i]},$$

where bckr_noise_pow[i] is the long-term background noise energy estimate.

Once a certain number of background noise signal frames have accumulated, the background noise analysis unit 22 begins to calculate, from the short-term characteristic parameters stored in memory (sub-band levels, frame energy and short-term SNR), the background noise energy fluctuation pow_var[i], the long-term SNR snr_long[i] and the long-term background noise energy estimate bckr_noise_pow[i], and updates the results each time a background noise signal frame is received.

Except for the long-term SNR, the parameters are updated only when the current signal frame is a background noise signal frame; the long-term SNR is updated only when the current signal frame is a non-background-noise signal frame.

The long-term SNR snr_long[i] is calculated as:

$$\mathrm{snr\_long}[i] = \frac{\sum_{m=i-L+1}^{i} \mathrm{snr}[m]}{L},$$

where L is the number of non-background-noise signal frames used for the long-term average.

The background noise energy fluctuation pow_var[i] is calculated as:

$$\mathrm{pow\_var}[i] = \frac{1}{L} \sum_{m=i-L+1}^{i} \left( \mathrm{pow}[m] - \frac{1}{L} \sum_{m=i-L+1}^{i} \mathrm{pow}[m] \right)^{2},$$

where L is the number of background noise signal frames used for the long-term average.

The long-term background noise energy estimate bckr_noise_pow[i] is calculated as:

$$\mathrm{bckr\_noise\_pow}[i] = (1-\alpha) \cdot \mathrm{bckr\_noise\_pow}[i-1] + \alpha \cdot \mathrm{pow}[i],$$

where α is a scale factor between 0 and 1, set to 5% in this embodiment.
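The feature calculations of this embodiment translate almost line for line into code. The following is a minimal sketch: the function names are hypothetical, and each function takes the already selected window of L values as a list, so the bookkeeping of which frames enter which window (background noise frames or non-background-noise frames) is left to the caller.

```python
def frame_energy(levels):
    """pow[i]: sum of squared sub-band levels of one frame."""
    return sum(x * x for x in levels)


def short_term_snr(pow_i, bckr_noise_pow_i):
    """snr[i]: frame energy over the long-term noise energy estimate."""
    return pow_i / bckr_noise_pow_i


def long_term_snr(snr_window):
    """snr_long[i]: mean of the last L short-term SNR values."""
    return sum(snr_window) / len(snr_window)


def energy_fluctuation(pow_window):
    """pow_var[i]: variance of the frame energy over the last L frames."""
    mean = sum(pow_window) / len(pow_window)
    return sum((p - mean) ** 2 for p in pow_window) / len(pow_window)


def update_noise_energy(prev_estimate, pow_i, alpha=0.05):
    """bckr_noise_pow[i]: first-order recursive average with factor alpha."""
    return (1.0 - alpha) * prev_estimate + alpha * pow_i
```

With α = 0.05 each new frame contributes 5% to the long-term noise energy estimate, so the estimate follows slow level changes while smoothing frame-to-frame variation.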

The background noise analysis unit 22 outputs the calculated background noise energy fluctuation pow_var[i] and long-term SNR snr_long[i] to the background noise classification unit 23.

In this embodiment, four background types are defined: stationary high-SNR background noise signals, stationary low-SNR background noise signals, non-stationary high-SNR background noise signals and non-stationary low-SNR background noise signals.

The background noise classification unit 23 compares pow_var[i] and snr_long[i] with an energy fluctuation threshold pow_var_thr and an SNR threshold snr_thr respectively, and assigns the current background noise to one of the four background noise types, wherein:

When pow_var[i] > pow_var_thr and snr_long[i] > snr_thr, the current background is a non-stationary high-SNR background noise signal;

When pow_var[i] > pow_var_thr and snr_long[i] < snr_thr, the current background is a non-stationary low-SNR background noise signal;

When pow_var[i] < pow_var_thr and snr_long[i] > snr_thr, the current background is a stationary high-SNR background noise signal;

When pow_var[i] < pow_var_thr and snr_long[i] < snr_thr, the current background is a stationary low-SNR background noise signal.
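The four-way classification is two independent threshold tests. A minimal sketch follows, in which the function name and the class labels are hypothetical; the text does not say what happens when a value exactly equals its threshold, so the sketch assumes that equality falls into the stationary and low-SNR classes.

```python
def classify_background(pow_var, snr_long, pow_var_thr, snr_thr):
    """Return one of four background noise type labels (labels are hypothetical).

    Values equal to a threshold are treated as 'stationary' / 'low_snr';
    that tie-breaking rule is an assumption, not stated in the text.
    """
    stationarity = "nonstationary" if pow_var > pow_var_thr else "stationary"
    snr_class = "high_snr" if snr_long > snr_thr else "low_snr"
    return f"{stationarity}_{snr_class}"
```

For instance, with thresholds pow_var_thr = 1.0 and snr_thr = 3.0, a frame with pow_var = 2.0 and snr_long = 5.0 is labelled `nonstationary_high_snr`.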

In this embodiment there are four different DTX units in total, wherein:

DTX unit one uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and an energy parameter;

DTX unit two uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains fixed codebook parameters in addition to the filter parameters and the energy parameter;

DTX unit three uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; the SID frame contains fixed codebook parameters in addition to the filter parameters and the energy parameter;

DTX unit four uses a continuous transmission mechanism with full-rate coding.

Besides receiving the background noise type information from the background noise classification unit 23, the encoding switching unit 24 is also connected to a switching mode control instruction interface. The switching mode control instruction has five indications, representing respectively: bandwidth normal, bandwidth relatively ample, bandwidth relatively limited, bandwidth very ample, and bandwidth very limited.

The relation between the switching mode control instruction and the switching mode of the DTX system is as follows:

When the switching mode control instruction indicates that the bandwidth is normal, the switching mode of the DTX system is: DTX unit one for stationary high-SNR backgrounds; DTX unit two for stationary low-SNR backgrounds; DTX unit three for non-stationary high-SNR backgrounds; DTX unit four for non-stationary low-SNR backgrounds.

When the switching mode control instruction indicates that the bandwidth is relatively ample, the switching mode of the DTX system is: DTX unit two for both stationary high-SNR and stationary low-SNR backgrounds; DTX unit four for both non-stationary high-SNR and non-stationary low-SNR backgrounds;

When the switching mode control instruction indicates that the bandwidth is relatively limited, the switching mode of the DTX system is: DTX unit one for both stationary high-SNR and stationary low-SNR backgrounds; DTX unit three for both non-stationary high-SNR and non-stationary low-SNR backgrounds;

When the switching mode control instruction indicates that the bandwidth is very ample, the switching mode of the DTX system is: DTX unit four for all four background noise types;

When the switching mode control instruction indicates that the bandwidth is very limited, the switching mode of the DTX system is: DTX unit one for all four background noise types.
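The five switching modes amount to a lookup table from (bandwidth indication, background noise type) to DTX unit number. The sketch below encodes exactly the assignments listed above; the mode keys, type labels and the function name `select_dtx_unit` are hypothetical names chosen for illustration.

```python
STAT_HIGH = "stationary_high_snr"
STAT_LOW = "stationary_low_snr"
NONSTAT_HIGH = "nonstationary_high_snr"
NONSTAT_LOW = "nonstationary_low_snr"

# Switching mode -> {background noise type -> DTX unit number}.
SWITCH_MODES = {
    "normal":       {STAT_HIGH: 1, STAT_LOW: 2, NONSTAT_HIGH: 3, NONSTAT_LOW: 4},
    "more":         {STAT_HIGH: 2, STAT_LOW: 2, NONSTAT_HIGH: 4, NONSTAT_LOW: 4},
    "less":         {STAT_HIGH: 1, STAT_LOW: 1, NONSTAT_HIGH: 3, NONSTAT_LOW: 3},
    "very_ample":   {STAT_HIGH: 4, STAT_LOW: 4, NONSTAT_HIGH: 4, NONSTAT_LOW: 4},
    "very_limited": {STAT_HIGH: 1, STAT_LOW: 1, NONSTAT_HIGH: 1, NONSTAT_LOW: 1},
}


def select_dtx_unit(mode, noise_type):
    """Return the DTX unit number (1 to 4) for the given mode and noise type."""
    return SWITCH_MODES[mode][noise_type]
```

Any two rows of the table differ for at least one background noise type, which matches the requirement that any two switching modes assign a different DTX unit to at least one type.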

When each kind of SID frame is assembled, DTX unit indication information is set in the frame to state which DTX unit the SID frame belongs to. In this embodiment the DTX unit indication information can be represented with 2 bits, where 00, 01 and 10 denote DTX unit one, DTX unit two and DTX unit three respectively; DTX unit four needs no identifier because it uses full-rate coding.

At the decoding end, the recognition unit 60 first determines which DTX unit the SID frame belongs to according to the DTX unit indication information in the SID frame, and instructs the corresponding CNG unit to perform decoding and reconstruction, wherein:

If the DTX unit indication information is 00, the SID frame is decoded according to the SID frame format of DTX unit one above to obtain the filter parameters and the energy parameter, and the following 8 background noise signal frames are interpolated;

If the DTX unit indication information is 01, the SID frame is decoded according to the SID frame format of DTX unit two above to obtain the excitation parameters, the filter parameters and the energy parameter, and the following 8 background noise signal frames are interpolated;

If the DTX unit indication information is 10, the SID frame is decoded according to the SID frame format of DTX unit three above to obtain the filter parameters, the energy parameter and the fixed codebook parameters, and the following 3 background noise signal frames are interpolated.
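The decoder-side dispatch on the 2-bit indication can be sketched as follows. The text only says that the following frames are interpolated; the linear interpolation of a single gain value shown here, together with the names `CNG_TABLE`, `interpolate_gain` and `reconstruct`, is an assumption made purely for illustration.

```python
# 2-bit DTX unit indication -> (frames to interpolate, decoded parameter groups).
CNG_TABLE = {
    0b00: (8, ("filter", "energy")),
    0b01: (8, ("filter", "energy", "fixed_codebook")),
    0b10: (3, ("filter", "energy", "fixed_codebook")),
}


def interpolate_gain(prev_gain, new_gain, n_frames):
    """Linearly move from prev_gain to new_gain over n_frames frames.

    Linear interpolation is an assumption; the text does not name a method.
    """
    step = new_gain - prev_gain
    return [prev_gain + step * (j + 1) / n_frames for j in range(n_frames)]


def reconstruct(indication, prev_gain, new_gain):
    """Pick the CNG scheme for a SID frame and interpolate its gain."""
    if indication not in CNG_TABLE:
        # 0b11 is not assigned: DTX unit four sends full-rate speech frames.
        raise ValueError(f"unassigned DTX unit indication: {indication:02b}")
    n_frames, params = CNG_TABLE[indication]
    return params, interpolate_gain(prev_gain, new_gain, n_frames)
```

In a full decoder the same interpolation step would also apply to the filter parameters, with the fixed codebook parameters used to build the excitation for DTX units two and three.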

Those skilled in the art can flexibly divide the background noise types according to the signal characteristic parameters and set the switching modes accordingly; the possibilities are not enumerated one by one here.

In summary, the embodiment of the invention provides a coding method, comprising:

Obtaining at least one signal characteristic parameter value of the background noise signal to be sent;

Selecting a DTX mechanism according to the background noise type, encoding the background noise signal into a SID frame that contains DTX mechanism indication information, and transmitting it discontinuously, wherein the SID frame coding mode and/or the SID frame discontinuous transmission mode differs between the DTX mechanisms.

Further, the coding method also comprises: setting different DTX mechanism selection modes, that is, different switching modes, wherein between any two DTX mechanism selection modes at least one background noise type corresponds to a different DTX mechanism; and determining the corresponding DTX mechanism for each background noise type according to the selected DTX mechanism selection mode.

Still further, the method also comprises: when the at least one signal characteristic parameter value satisfies a set condition, encoding the background noise signal into full-rate speech frames and sending them continuously.

The signal characteristic parameter values comprise one or any combination of the following: long-term SNR, background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, and background noise fluctuation frequency.

The embodiment of the invention also provides a method for decoding the corresponding SID frames, comprising:

Identifying the DTX mechanism indication information in the SID frame, wherein the SID frame coding mode differs between the DTX mechanisms;

Decoding the SID frame with the corresponding decoding mode according to the DTX mechanism indication information, and reconstructing it into a background noise signal.

With the technical solution proposed by the embodiment of the invention, different SID frame coding methods and discontinuous transmission mechanisms can be adopted according to the type of the background noise and the actual state of the communication system. The solution adapts to the coding precision and time resolution required by different types of background noise while making reasonable use of the actual resources of the communication system, and thus provides high-quality background noise of every type at the decoding end while using channel resources reasonably.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. A background noise encoding device, characterized by comprising:

2. The background noise encoding device as claimed in claim 1, characterized by further comprising: an instruction receiving unit, configured to receive a switching mode control instruction and forward it to the encoding switching unit, wherein the switching mode control instruction instructs the encoding switching unit to use one switching mode, and between any two switching modes at least one background noise type corresponds to a different discontinuous transmission unit.

3. The background noise encoding device as claimed in claim 1, characterized by further comprising: a speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, the encoding switching unit feeds the background noise signal into the speech encoder, where it is encoded into speech frames and output continuously.

4. The background noise encoding device as claimed in claim 1, 2 or 3, characterized in that the discontinuous transmission unit specifically comprises:

a non-speech encoder, configured to encode the background noise signal into silence description frames;

a silence description frame transmitter, configured to transmit the silence description frames discontinuously.

5. A background noise decoding device for decoding the silence description frames output by the background noise encoding device of claim 1, characterized by comprising:

6. The background noise decoding device as claimed in claim 5, characterized by further comprising:

a second recognition unit, configured to identify speech frames and silence description frames, and to send the silence description frames to the first recognition unit, or to notify the decoding switching unit when a speech frame is identified;

a speech decoder, wherein the decoding switching unit sends the speech frames identified by the second recognition unit to the speech decoder for decoding and continuous output.

7. The background noise decoding device as claimed in claim 6, characterized in that the first recognition unit and the second recognition unit are provided as a combined unit.

8. A communication device, comprising:

a first speech encoder, configured to encode signals containing active speech into speech frames and output them continuously;

characterized by further comprising a background noise encoding device, the background noise encoding device specifically comprising:

9. The communication device as claimed in claim 8, characterized in that the background noise decoding device further comprises: an instruction receiving unit, configured to receive a switching mode control instruction and forward it to the encoding switching unit, wherein the switching mode control instruction instructs the encoding switching unit to use one switching mode, and between any two switching modes at least one background noise type corresponds to a different discontinuous transmission unit.

10. The communication device as claimed in claim 8, characterized in that the encoding switching unit is also connected to the first speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, the encoding switching unit feeds the background noise signal into the first speech encoder, where it is encoded into speech frames and output continuously.

11. The communication device as claimed in claim 8, characterized in that the background noise decoding device further comprises: a second speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, the encoding switching unit feeds the background noise signal into the second speech encoder, where it is encoded into speech frames and output continuously.

12. The communication device as claimed in claim 8, 9, 10 or 11, characterized in that the discontinuous transmission unit specifically comprises:

a non-speech encoder, configured to encode the background noise signal into silence description frames according to a set silence description frame coding mode;

a silence description frame transmitter, configured to send the silence description frames according to a set discontinuous transmission mechanism.

13. The communication device as claimed in claim 8, characterized by further comprising:

14. The communication device as claimed in claim 13, characterized by further comprising:

15. The communication device as claimed in claim 14, characterized in that the first recognition unit and the second recognition unit are provided as a combined unit.

16. A communication device comprising a background noise decoding device, characterized in that the background noise decoding device comprises:

17. The communication device as claimed in claim 16, characterized by further comprising:

a second recognition unit, configured to identify speech frames and silence description frames, and to send the silence description frames to the first recognition unit, or to notify the decoding switching unit when a silence description frame is identified;

a speech decoder, wherein the decoding switching unit sends the speech frames identified by the second recognition unit to the speech decoder for decoding and output.

18. The communication device as claimed in claim 17, characterized in that the first recognition unit and the second recognition unit are provided as a combined unit.

19. A background noise coding method, characterized by comprising:

20. The coding method as claimed in claim 19, characterized in that selecting a discontinuous transmission mechanism according to the background noise type is: selecting according to the correspondence between background noise types and discontinuous transmission mechanisms in a set discontinuous transmission mechanism selection mode, wherein between any two switching modes at least one background noise type corresponds to a different discontinuous transmission mechanism.

21. The coding method as claimed in claim 19, characterized by further comprising: when the at least one signal characteristic parameter value satisfies a set condition, encoding the background noise signal into full-rate speech frames and sending them continuously.

22. The coding method as claimed in claim 19, 20 or 21, characterized in that the signal characteristic parameters comprise one or any combination of the following: long-term SNR, background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, and background noise fluctuation frequency.

23. The coding method as claimed in claim 22, characterized in that, according to the size of each signal characteristic parameter value, the background noise types comprise stationary background noise signals and non-stationary background noise signals.

24. The coding method as claimed in claim 23, characterized in that, when the signal characteristic parameters comprise the long-term SNR:

the stationary background noise signals further comprise: stationary high-SNR background noise signals and stationary low-SNR background noise signals;

the non-stationary background noise signals further comprise: non-stationary high-SNR background noise signals and non-stationary low-SNR background noise signals.

25. The coding method as claimed in claim 19, characterized in that the difference between different silence description frame coding modes comprises: at least one silence description frame coding parameter is different, or the quantization precision of the same silence description frame coding parameter is different.

26. The coding method as claimed in claim 25, characterized in that the silence description frame coding parameters comprise synthesis filter parameters and excitation parameters.

27. The coding method as claimed in claim 26, characterized in that the excitation parameters comprise adaptive codebook parameters and/or fixed codebook parameters.

28. The coding method as claimed in claim 19, characterized in that the difference between different silence description frame discontinuous transmission modes comprises: the transmission interval between two adjacent transmitted silence description frames is different.

29. The coding method as claimed in claim 28, characterized in that the transmission interval is determined adaptively according to the variation of the background signal, and the minimum interval differs between different silence description frame discontinuous transmission modes.

30. The coding method as claimed in claim 28, characterized in that the transmission interval of each silence description frame discontinuous transmission mode is a fixed value, and the fixed values differ from one another.

31. A method for decoding the silence description frames generated by the method of claim 19, characterized by comprising: