CN101320563B - Background noise encoding/decoding device, method and communication equipment

Info

Publication number: CN101320563B
Application number: CN 200710110573
Authority: CN
Inventors: 王喆; 张立斌
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2007-06-05
Filing date: 2007-06-05
Publication date: 2012-06-27
Anticipated expiration: 2027-06-05
Also published as: CN101320563A; WO2008148321A1

Abstract

The present invention provides a background noise encoding/decoding device, a corresponding method, and communication equipment that flexibly adjust the SID frame encoding mode and the discontinuous transmission mechanism according to the background noise in the communication environment. The technical solution selects different SID frame encoding modes and discontinuous transmission mechanisms for different types of background noise, adapting to the encoding precision and time resolution that each type requires while the actual resources of the communication system are used reasonably. It can therefore provide high-quality background noise of every type at the decoding end while the actual resources of the communication system are used reasonably.

Description

Background noise encoding/decoding device, method, and communication equipment

Technical field

The present invention relates to voice communication technology, and in particular to a background noise encoding/decoding device, method, and communication equipment.

Background art

In voice communication, the transmitted signal can be divided into two types: active signal and silent signal. The active signal is the signal sent when valid speech is detected; it mainly contains the talker's speech together with the background noise present at that time. The silent signal is the background noise signal sent during gaps in which no valid speech is detected; it mainly contains background noise of various kinds, such as white noise, background babble, and silence. For the communicating parties, the active signal is the useful signal and the carrier of the main communication information, whereas the silent signal is a useless signal that carries none of the main communication information.

Because the communicating parties are mainly interested in the useful active signal, and bandwidth is limited, it is desirable not to transmit the useless silent signal, so as to reduce transmission bandwidth and increase system capacity. If only the active signal is transmitted, however, the background noise becomes discontinuous, which is very uncomfortable for the listener. The effect is more pronounced when the background noise is strong and can even make the speech hard to understand. The practical compromise is to transmit as little of the silent signal as possible, reducing bandwidth while preserving the quality perceived by the user. CNG (Comfort Noise Generation) technology arose for this purpose. With CNG, the transmitting end encodes background noise frames during silent periods using a silence compression coding technique and transmits them in DTX (Discontinuous Transmission) mode; the receiving end uses interpolation to produce a continuous background noise signal, improving listener comfort.

Figure 1 is a schematic diagram of a communication system that uses silence compression coding. In it, the transmitting end comprises:

a VAD (Voice Activity Detector): after some preprocessing, the original signal is fed to the VAD, which typically uses feature values of each signal frame, such as the signal-to-noise ratio, to detect whether the signal contains valid speech, and routes the signal to the appropriate encoder according to the detection result;

a speech encoder, which encodes the active signal containing valid speech into speech frames and outputs them; the speech frames are transmitted to the receiving end over the communication channel;

a non-speech encoder, which uses silence compression to encode the silent signal, which contains no valid speech, into background noise frames;

a background noise frame transmitter, which sends the background noise frames to the receiving end over the communication channel using a configured DTX mode, for example sending one background noise frame every several frames, or deciding whether to send a background noise frame according to how the characteristic parameters of the background noise change.

Correspondingly, the receiving end comprises:

a speech decoder, which decodes the speech frames received from the communication channel, reconstructs the active signal, and outputs it;

a non-speech decoder, which decodes the background noise frames received from the communication channel;

a reconstructor, which uses interpolation to reconstruct the decoded data into a continuous background noise signal, improving listener comfort.

The non-speech encoder and the background noise frame transmitter at the transmitting end form the DTX unit, and the non-speech decoder and the reconstructor at the receiving end form the CNG unit. Silence compression is thus implemented by the VAD and the DTX unit at the transmitting end together with the CNG unit at the receiving end.

Take the widely used CELP (Code-Excited Linear Prediction) speech coding technique as an example. CELP is a highly efficient speech coding technique: by using perceptual weighting, analysis-by-synthesis, vector quantization, and post-filtering, it provides high-quality synthesized speech at medium and low bit rates. In CELP coding, the transmitting end analyzes the speech signal to obtain its characteristic parameters and encodes them into the bitstream; the receiving end synthesizes and reconstructs the speech signal from the characteristic parameters in the bitstream. The speech signal s(n) is divided into short time segments called frames, and both the analysis at the transmitting end and the synthesis at the receiving end are performed frame by frame.

At the transmitting end, the original signal s(n) can be regarded as the output of a filter v(n) driven by an excitation signal e(n), that is, s(n) = e(n) * v(n), where * denotes convolution. The receiving end uses the same filter, called the synthesis filter, to synthesize the speech. The characteristic parameters carried in the encoded speech bitstream therefore comprise synthesis filter parameters and excitation parameters. The synthesis filter parameters are mainly LSF (Line Spectrum Frequency) quantization parameters. The excitation parameters comprise adaptive codebook parameters (the pitch delay and the pitch gain) and fixed codebook parameters (the positions and signs of the non-zero pulses in the fixed codebook, and the fixed codebook gain).
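The source-filter relation above can be illustrated with a minimal Python sketch. It assumes an all-pole synthesis filter described by linear prediction coefficients, which is the usual CELP form but is not spelled out in this text; the function name `synthesize` and the parameter `lpc_coeffs` are hypothetical.

```python
def synthesize(excitation, lpc_coeffs):
    """Drive an all-pole synthesis filter with an excitation sequence.

    Assumed filter form (not stated in the source text):
        s[n] = e[n] - sum_{k=1..p} a[k] * s[n - k]
    """
    output = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:  # samples before the start of the frame are taken as zero
                acc -= a * output[n - k]
        output.append(acc)
    return output


# A unit impulse through a one-tap filter decays geometrically.
impulse_response = synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])
```

In this sketch the decoder would run the same function with the filter coefficients and excitation recovered from the bitstream.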

Unlike speech coding parameters, background noise coding parameters describe the characteristics of the background noise. The excitation signal of background noise can be regarded as a simple random noise sequence, which both the encoder and the decoder can generate with a random noise generator. Scaling the amplitude of that sequence with an energy parameter yields the final excitation signal, so the excitation can be represented by the energy parameter alone and needs no further characteristic parameters. The background noise bitstream therefore differs from a speech frame in that its excitation parameter is the energy parameter of the current background noise frame. It resembles a speech frame in that its synthesis filter parameters are also LSF quantization parameters, although the specific quantization method differs. From this analysis, background noise coding can be regarded as essentially a simplified form of speech coding.
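As a minimal sketch of this idea, the Python fragment below generates a random sequence and scales it so that its mean energy matches a decoded energy parameter. The Gaussian generator, the seed handling, and the names `noise_excitation` and `target_energy` are assumptions made for illustration; the text does not specify how the random sequence is produced.

```python
import math
import random


def noise_excitation(frame_len, target_energy, seed=0):
    """Return a random excitation whose mean energy equals target_energy."""
    rng = random.Random(seed)  # hypothetical: encoder and decoder could share a seed
    raw = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    raw_energy = sum(x * x for x in raw) / frame_len
    gain = math.sqrt(target_energy / raw_energy)  # amplitude control by the energy parameter
    return [gain * x for x in raw]


frame = noise_excitation(frame_len=80, target_energy=4.0)
```

The scaled sequence would then be passed through the synthesis filter described by the LSF parameters to produce comfort noise.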

A background noise frame that is compression-encoded by DTX and transmitted discontinuously is generally called a SID (Silence Descriptor) frame. In current speech coding systems, the background noise signal within each system is handled with a SID frame format and a discontinuous transmission mechanism.

Among these, the G.729B silence compression scheme is a representative example. G.729B is based on the CELP algorithm model, so the background noise parameters it transmits are also derived from the CELP model and comprise the synthesis filter parameters and excitation parameters that describe the background noise. At the encoder, when the VAD decision is 0, for example, the current signal is noise, and the preprocessed original signal is fed to the background noise encoding device, which extracts and quantizes the background noise parameters to form a SID frame. The device then decides whether to send the SID frame according to how its parameters have changed, for example sending a SID frame when the filter parameters and the energy parameter have changed significantly. In G.729B, two adjacent SID frames are at least 2 frames apart. At the decoder, once the background noise decoding and reconstruction stage is entered, the background noise decoding device synthesizes, from the received SID frames, comfort noise that reflects the background noise characteristics at the encoder. The number of bits allocated to each parameter in the G.729B SID frame is shown in Table 1 below:

Table 1.

Bit allocation (MSB to LSB)	Parameter description
S1	Quantizer predictor index
S2-S6	First-stage LSF quantization vector index
S7-S10	Second-stage LSF quantization vector index
S11-S15	Gain (energy)
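The layout in Table 1 can be sketched as a simple bit-packing routine. This is an illustrative Python example only: the field names and the functions `pack_sid` and `unpack_sid` are hypothetical, and only the field widths (1, 5, 4, and 5 bits, most significant field first) come from the table.

```python
# Field widths taken from Table 1, listed from the most significant bit down.
SID_FIELDS = (
    ("predictor", 1),   # S1
    ("lsf_stage1", 5),  # S2-S6
    ("lsf_stage2", 4),  # S7-S10
    ("gain", 5),        # S11-S15
)


def pack_sid(values):
    """Pack the four indices into one 15-bit integer."""
    word = 0
    for name, width in SID_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_sid(word):
    """Recover the four indices from a 15-bit integer."""
    values = {}
    for name, width in reversed(SID_FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


indices = {"predictor": 1, "lsf_stage1": 17, "lsf_stage2": 9, "gain": 30}
packed = pack_sid(indices)
```

Unpacking the packed word returns the original indices, which is the property a decoder relies on.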

The DTX in the silence compression schemes of AMR (Adaptive Multi-Rate speech codec) and AMR-WB (Adaptive Multi-Rate Wideband speech codec), by contrast, uses a fixed transmission mechanism: regardless of how the characteristics of the background noise change, one SID frame is sent every 8 frames. The number of bits allocated to each parameter in the AMR SID frame is shown in Table 2 below, and the number of bits allocated to each parameter in the AMR-WB SID frame is shown in Table 3 below.

Table 2.

Bit allocation (MSB to LSB)	Parameter description
S1-S3	Quantizer predictor index
S4-S11	First-stage LSF quantization vector index
S12-S20	Second-stage LSF quantization vector index
S21-S29	Third-stage LSF quantization vector index
S30-S35	Logarithmic frame energy index

Table 3.

Bit allocation (MSB to LSB)	Parameter description
S1-S6	First-stage LSF quantization vector index
S7-S12	Second-stage LSF quantization vector index
S13-S18	Third-stage LSF quantization vector index
S19-S23	Fourth-stage LSF quantization vector index
S24-S28	Fifth-stage LSF quantization vector index
S29-S34	Logarithmic frame energy index
S35	Dithering flag
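To make the cost of the fixed mechanism concrete, the following Python sketch totals the bits in Tables 2 and 3 and derives the average SID bit rate when one frame is sent every 8 frames. The 20 ms frame duration is an assumption that is not stated in this text (it is the usual AMR and AMR-WB frame length), and the field names and the function `average_sid_bitrate` are hypothetical.

```python
# Field widths read from Table 2 (AMR) and Table 3 (AMR-WB).
AMR_SID_BITS = {"predictor": 3, "lsf1": 8, "lsf2": 9, "lsf3": 9, "energy": 6}
AMR_WB_SID_BITS = {
    "lsf1": 6, "lsf2": 6, "lsf3": 6, "lsf4": 5, "lsf5": 5,
    "energy": 6, "dither_flag": 1,
}


def average_sid_bitrate(field_bits, frames_per_sid, frame_ms):
    """Average bit rate in bit/s when one SID frame is sent every frames_per_sid frames."""
    total_bits = sum(field_bits.values())
    return total_bits * 1000.0 / (frames_per_sid * frame_ms)


# Assumption: 20 ms frames; the interval of 8 frames comes from the text above.
amr_rate = average_sid_bitrate(AMR_SID_BITS, frames_per_sid=8, frame_ms=20)
amr_wb_rate = average_sid_bitrate(AMR_WB_SID_BITS, frames_per_sid=8, frame_ms=20)
```

Both tables total 35 bits, so under these assumptions the two schemes spend the same average rate on background noise regardless of what kind of noise is present.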

Existing background noise coding schemes encode background noise in a simple way and synthesize simple comfort noise at the decoder, which satisfies call quality requirements to a certain extent. As communication technology develops, however, speech coding quality keeps rising, and with it the coding quality of the background noise during active speech periods. To keep the background noise continuous throughout a call, the coding quality of the background noise during silent periods must rise as well. The existing, rather uniform SID frame encoding modes and discontinuous transmission mechanisms clearly cannot meet the specific requirements of varied communication environments.

Summary of the invention

Embodiments of the invention provide a background noise encoding/decoding device, method, and communication equipment that flexibly adjust the SID frame encoding mode and the discontinuous transmission mechanism according to the background noise conditions of the communication environment.

A background noise encoding device comprises:

a background noise analysis unit, configured to obtain at least one signal feature parameter value of a background noise signal;

a background noise classification unit, configured to determine the background noise type according to the at least one signal feature parameter value;

an encoding switch unit and at least two discontinuous transmission units. The encoding switch unit is configured to route the background noise signal to the corresponding discontinuous transmission unit according to the background noise type. Each discontinuous transmission unit is configured to encode the background noise signal into silence descriptor frames that contain discontinuous transmission unit indication information and to transmit them discontinuously. The discontinuous transmission units differ from one another in silence descriptor frame encoding mode and/or in the discontinuous transmission mechanism for silence descriptor frames. A difference in encoding mode means that different silence descriptor frames use different coding parameters, or that the same coding parameter is quantized with different precision in different silence descriptor frames. A difference in discontinuous transmission mechanism means that the transmission interval between two adjacent silence descriptor frames differs.

A background noise decoding device comprises:

a first recognition unit, configured to recognize the discontinuous transmission unit indication information in a silence descriptor frame;

a decoding switch unit and at least two comfort noise generation units. The decoding switch unit is configured to route the silence descriptor frame to the corresponding comfort noise generation unit according to the recognition result of the recognition unit. Each comfort noise generation unit decodes the silence descriptor frame it receives in the manner corresponding to that frame's encoding mode and reconstructs the background noise signal.

Communication equipment comprises:

a first voice activity detector, configured to detect whether the captured signal contains valid speech and to output the signal according to that classification;

a speech encoder, configured to encode the signal containing valid speech into speech frames and output them continuously;

and further comprises a background noise encoding device, which specifically comprises:

a background noise analysis unit, configured to obtain at least one signal feature parameter value of the background noise signal that contains no valid speech;

Communication equipment comprises a background noise decoding device, and the background noise decoding device comprises:

A background noise encoding method comprises:

obtaining at least one signal feature parameter value of a background noise signal;

determining the background noise type according to the at least one signal feature parameter value;

selecting a discontinuous transmission mechanism according to the background noise type, encoding the background noise signal into silence descriptor frames that contain discontinuous transmission mechanism indication information, and transmitting them discontinuously. The discontinuous transmission mechanisms differ from one another in silence descriptor frame encoding mode and/or in the manner of discontinuous transmission. A difference in encoding mode means that different silence descriptor frames use different coding parameters, or that the same coding parameter is quantized with different precision in different silence descriptor frames. A difference in the manner of discontinuous transmission means that the transmission interval between two adjacent silence descriptor frames differs.

A decoding method comprises:

recognizing the discontinuous transmission mechanism indication information in a silence descriptor frame, where the silence descriptor frame encoding mode differs between discontinuous transmission mechanisms;

decoding the silence descriptor frame with the decoding mode that corresponds to the discontinuous transmission mechanism indication information, and reconstructing the background noise signal.
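The two decoding steps can be sketched in Python as a lookup on the indication information followed by a dispatch to the matching comfort noise generator. The frame layout is an assumption made for illustration: the text does not say where the indication information sits or how wide it is, so the 2-bit leading field, `INDICATOR_BITS`, and the function names here are all hypothetical.

```python
INDICATOR_BITS = 2  # hypothetical width of the indication field


def split_sid_frame(word, total_bits):
    """Split a SID frame into (indicator, payload), indicator in the top bits."""
    payload_bits = total_bits - INDICATOR_BITS
    indicator = word >> payload_bits
    payload = word & ((1 << payload_bits) - 1)
    return indicator, payload


def decode_sid_frame(word, total_bits, cng_units):
    """Route the payload to the comfort noise generator named by the indicator."""
    indicator, payload = split_sid_frame(word, total_bits)
    return cng_units[indicator](payload)


# Stand-in generators: each just reports which decoding path handled the payload.
cng_units = {
    0: lambda payload: ("simple_cng", payload),
    1: lambda payload: ("detailed_cng", payload),
}
result = decode_sid_frame((1 << 15) | 0x1234, total_bits=17, cng_units=cng_units)
```

A real decoder would replace the stand-in lambdas with generators that parse the payload according to the encoding mode of the corresponding discontinuous transmission mechanism.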

With the technical solution of the present invention, different SID frame encoding methods and discontinuous transmission mechanisms can be applied flexibly to different types of background noise. While making reasonable use of the actual resources of the communication system, the solution adapts to the encoding precision and time resolution that each type of background noise requires, and so provides high-quality background noise of every type at the decoding end while channel resources are used reasonably.

Description of drawings

Fig. 1 is a schematic diagram of an existing communication system that uses silence compression coding;

Fig. 2, Fig. 3, Fig. 5 and Fig. 6 are each a schematic diagram of the main structure of a background noise encoding device provided by an embodiment of the invention;

Fig. 4 is a schematic diagram of the switch modes provided by an embodiment of the invention;

Fig. 7, Fig. 8 and Fig. 9 are each a schematic diagram of the main structure of communication equipment provided by an embodiment of the invention.

Embodiment

A closer look at existing background noise coding shows three limitations. First, each codec system uses a single background noise frame encoding mode, so it cannot describe the actual background noise of communication environments with differing characteristics in a targeted way, and therefore cannot meet the demands of diverse communication environments. Second, a single discontinuous transmission mechanism cannot transmit background noise frames of different characteristics in a targeted way, so the decoder cannot generate well-matched comfort noise. Third, a single discontinuous transmission mechanism cannot adapt to the actual state of the communication system, such as the available bandwidth, so the system's resources cannot be used reasonably.

The encoding/decoding solution provided by the embodiments of the invention therefore processes background noise according to its characteristics. The encoder analyzes the background noise and, according to its characteristics, uses different DTX schemes to encode and send SID frames; the receiving end uses the corresponding CNG scheme to reconstruct the background noise signal.

Figure 2 shows a background noise encoding device 20 provided by an embodiment of the invention. The background noise encoding device 20 is used in a silence compression codec, and its main structure comprises:

at least two DTX units 21, namely DTX unit 211, DTX unit 212, ..., DTX unit 21n. Each DTX unit 21 encodes the background noise signal into SID frames and outputs them discontinuously according to a different DTX mechanism, where DTX mechanisms may differ in SID frame encoding mode, in the discontinuous transmission mechanism for SID frames, or in both;

a background noise analysis unit 22, configured to calculate at least one signal feature parameter value from the background noise signal;

a background noise classification unit 23, configured to determine the background noise type according to the at least one signal feature parameter value;

an encoding switch unit 24, configured to route the received background noise signal to the corresponding DTX unit 21 according to the background noise type.

The background noise analysis unit 22 analyzes the input background noise signal to obtain its characteristic parameters. It does so by calculating one or more signal feature parameter values that characterize the background noise signal. These may be any one or any combination of the following: long-term SNR (Signal-to-Noise Ratio), background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, background noise fluctuation rate, and so on. Other parameters that characterize the level or kind of the background (stationary or non-stationary, magnitude of change, rate of change) or the SNR (long-term, short-term, peak) may also be calculated and used in the analysis. Besides the background noise signal itself, the VAD's indication that the current signal is background noise can also be fed to the background noise analysis unit 22, which then uses that indication to trigger the calculation of the signal feature parameter values.
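Two of the listed feature values can be sketched in a few lines of Python. The definitions used here (mean frame energy in dB as the level estimate, and the standard deviation of frame energies as the energy fluctuation estimate) are assumptions chosen for illustration; the text names the parameters but does not define how they are computed.

```python
import math


def frame_energy_db(frame):
    """Mean energy of one frame of samples, in dB."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(energy + 1e-12)  # small floor avoids log10(0)


def noise_level_estimate(energies_db):
    """Assumed definition: mean frame energy over the analysis window."""
    return sum(energies_db) / len(energies_db)


def energy_fluctuation(energies_db):
    """Assumed definition: standard deviation of the frame energies."""
    mean = noise_level_estimate(energies_db)
    variance = sum((e - mean) ** 2 for e in energies_db) / len(energies_db)
    return math.sqrt(variance)


window = [frame_energy_db([0.1] * 160), frame_energy_db([0.2] * 160)]
level = noise_level_estimate(window)
fluctuation = energy_fluctuation(window)
```

A steady noise floor gives a fluctuation near zero, while noise whose level varies from frame to frame gives a larger value.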

The background noise classification unit 23 determines the type of the current background noise signal from the signal feature parameter values calculated by the background noise analysis unit 22. At least two background noise types are defined. They can be defined by whether the background noise is stationary, for example dividing background noise into stationary and non-stationary background noise signals. They can also be defined by stationarity together with the long-term SNR, dividing background noise into four types: stationary high-SNR, stationary low-SNR, non-stationary high-SNR, and non-stationary low-SNR background noise signals. The type can further be defined by how much meaningful information the background noise carries, giving three types: melodious background noise signals (such as music, flowing water, or birdsong), environmental background noise signals (such as a cocktail party, a roadside, or an airport), and other background noise signals. In short, the classification of background noise can be set as required and does not limit the implementation of the technical solution of the present invention.
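The four-type scheme based on stationarity and long-term SNR can be sketched in Python as two threshold tests. The threshold values, the use of energy fluctuation as the stationarity measure, and the function name `classify_background_noise` are hypothetical; the text leaves the decision rule open.

```python
def classify_background_noise(energy_fluctuation_db, long_term_snr_db,
                              fluctuation_threshold_db=3.0,
                              snr_threshold_db=15.0):
    """Return one of four type labels; both thresholds are illustrative only."""
    stationary = energy_fluctuation_db < fluctuation_threshold_db
    high_snr = long_term_snr_db >= snr_threshold_db
    stability = "stationary" if stationary else "non_stationary"
    snr_class = "high_snr" if high_snr else "low_snr"
    return f"{stability}_{snr_class}"


noise_type = classify_background_noise(energy_fluctuation_db=1.2,
                                        long_term_snr_db=22.0)
```

Other classification schemes described in the text, such as the one based on information content, would need different features and decision rules.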

The encoding switch unit 24 stores in advance the DTX unit 21 that suits each background noise type. According to the background noise type determined by the background noise classification unit 23, it switches among the at least two DTX units 21 and selects the DTX unit 21 that corresponds to the current background noise type for processing.

In general, each background noise type corresponds to a different DTX scheme. To account for the needs of a particular communication scenario or for changes in communication channel resources, however, different switch modes can also be defined, each with its own switching control scheme. The switching control scheme for each switch mode can be stored in the encoding switch unit 24 in advance, and a switch mode control instruction then tells the encoding switch unit 24 which switch mode to use. The switch mode control instructions can form a finite instruction set that reflects the current channel resource conditions, with each instruction corresponding to one switch mode, that is, one way of assigning DTX units. Alternatively, the switching control scheme for the required switch mode can be sent to the encoding switch unit 24 together with each switch mode control instruction.

The encoding switch unit 24 determines the corresponding switch mode according to the specific indication information. Those skilled in the art will know many ways of controlling switch mode changes, which are not enumerated here.

Accordingly, as shown in Figure 3, an instruction receiving unit 25 can be added to the background noise encoding device 20 of Figure 2. It receives switch mode control instructions and forwards them to the encoding switch unit 24. A switch mode control instruction may be entered externally by an operator, or may be generated by the transmitting end according to the detected communication channel resource conditions, among other possibilities.

The encoding switch unit 24 can select the appropriate switch mode according to the switch mode control instruction. Any two switch modes differ in the DTX unit 21 assigned to at least one background noise type. Taking a background noise encoding device 20 with four DTX units 21 as an example, the switch modes can be set as shown in Figure 4:

When no instruction receiving unit 25 is provided, the encoding switch unit 24 selects the appropriate DTX unit based on the background noise type alone, for example according to switch mode 1;

When an instruction receiving unit 25 is provided, the encoding switch unit 24 determines how to select the DTX unit according to the switch mode control instruction. For example:

on receiving switch mode control instruction 1, it switches according to switch mode 1;

on receiving switch mode control instruction 2, it switches according to switch mode 2;

on receiving switch mode control instruction 3, it switches according to switch mode 3.

Switch mode control instructions can be issued according to the current communication channel resource conditions. When channel resources are tight, for example, some background noise types may share the same lower-rate DTX unit, or all background noise types may use the lowest-rate DTX unit. As another example, only some of the DTX units may be available when there is no switch mode control instruction, and the remaining DTX units come into use when a switch mode control instruction indicates that channel resources are abundant or tight.

The example in Figure 4 shows the following:

Instruction 1 selects switch mode 1, in which every background noise type corresponds to a different DTX unit. Instruction 1 may indicate that channel resources are normal;

Instruction 2 selects switch mode 2, in which some background noise types share a DTX unit: background noise types 1 and 2 both correspond to DTX unit 212, and background noise types 3 and 4 both correspond to DTX unit 214. Instruction 2 may indicate that channel resources are abundant (in which case DTX unit 212 has a higher coding rate and uses more resources than DTX unit 211, and DTX unit 214 has a higher coding rate and uses more resources than DTX unit 213), or that channel resources are tight (in which case DTX unit 212 has a lower coding rate and uses fewer resources than DTX unit 211, and DTX unit 214 has a lower coding rate and uses fewer resources than DTX unit 213);

Instruction 3 selects switch mode 3, in which all background noise types correspond to DTX unit 214 only. Instruction 3 may indicate that channel resources are very abundant (in which case DTX unit 214 has the highest coding rate and uses the most resources), or that channel resources are very tight (in which case DTX unit 214 has the lowest coding rate and uses the fewest resources).
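The three switch modes can be captured as a lookup table, as in the Python sketch below. The mappings for modes 2 and 3 follow the description above. For mode 1 the text says only that each type uses a different DTX unit, so the pairing of type i with unit 21i is an assumption, as are the names `SWITCH_MODES` and `select_dtx_unit`.

```python
# Switch mode -> background noise type -> DTX unit reference numeral.
SWITCH_MODES = {
    1: {1: 211, 2: 212, 3: 213, 4: 214},  # assumed pairing: one distinct unit per type
    2: {1: 212, 2: 212, 3: 214, 4: 214},  # types share units in pairs
    3: {1: 214, 2: 214, 3: 214, 4: 214},  # every type uses unit 214
}


def select_dtx_unit(noise_type, instruction=None):
    """Pick a DTX unit; with no instruction, fall back to switch mode 1."""
    mode = 1 if instruction is None else instruction
    return SWITCH_MODES[mode][noise_type]


unit_default = select_dtx_unit(noise_type=3)
unit_mode_2 = select_dtx_unit(noise_type=1, instruction=2)
```

Keeping the scheme as data makes it straightforward to store several schemes in advance or to replace one when a new scheme arrives with an instruction.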

As shown in Figure 5, when channel resources are very abundant or the background noise is particularly melodious, a speech encoder can additionally be provided in the background noise encoding device 20 and used to encode a specific background noise at full rate. The specific background noise is one whose signal feature parameter values satisfy a set condition. The background noise classification unit 23 identifies the specific background noise type by checking whether the at least one signal feature parameter value satisfies the set condition, and instructs the encoding switch unit 24 to switch to the speech encoder when the specific background noise is present. The speech encoder then encodes the specific background noise signal into speech frames and outputs them continuously. The existing speech encoder at the transmitting end can of course perform this function instead.

Referring to Figure 1, each DTX unit 21 comprises two parts: a non-speech encoder, which encodes the background noise signal into SID frames, and a background noise frame transmitter, which sends the background noise frames in the configured discontinuous manner. DTX units may differ in SID frame encoding mode, in discontinuous transmission mechanism, or in both. Differences in SID frame encoding mode can take the following forms:

1. Different coding parameters

Different SID frames may use different kinds and different numbers of parameters;

2. Different quantization precision for the same parameter

Different SID frames may quantize the same parameter with different precision;

The SID frames produced by different encoding modes may therefore differ in structure and length. For example, the coding parameters for a stationary background noise signal can be relatively simple, their quantization precision relatively low, and the SID frame relatively short. Correspondingly, a non-stationary background noise signal can use relatively more coding parameters. One approach is to add excitation parameters from the speech coding parameters to the existing SID frame, such as the adaptive codebook parameters (pitch delay and gain) and the fixed codebook parameters (positions and signs of the non-zero pulses, and gain). The quantization precision can also be relatively higher, for example by increasing the precision of the filter parameters and excitation parameters in the existing SID frame, and the SID frame grows accordingly. As another example, the SID frame for a melodious background noise signal should use the most complex encoding mode (which may be full-rate coding), the SID frame for an environmental background noise signal a moderately complex one, and the SID frame for other background noise signals a simpler one.

Different background noise types can also use different discontinuous transmission mechanisms, which differ mainly in the transmission interval between two adjacent SID frames. For a stationary background noise signal, for example, the background changes little and the human ear is not very sensitive to the quality loss that DTX/CNG introduces for this kind of noise, so a relatively long discontinuous transmission interval suffices. For a non-stationary background noise signal, the background changes considerably over time, so to keep the subjective quality high after DTX/CNG a relatively short discontinuous transmission interval, or even continuous transmission, is needed. Likewise, a melodious background is not noise-like in nature and demands quite high subjective quality, so the DTX system needs to transmit it continuously. Environmental background noise carries some information about the environment, so to avoid losing that information the DTX system needs to update the SID frame at a relatively high rate. For other background noise, faithfully reproducing the background is not the main goal, so the DTX system can update the SID frame relatively slowly.

The SID frame transmission interval is either a fixed transmission interval or an adaptive transmission interval. Under the fixed-interval mechanism, the interval between two adjacent SID frames is fixed, and its size depends only on the background noise type. Under the adaptive-interval mechanism, SID frame transmission adapts to the rate at which the noise changes, but is still subject to a minimum time interval. In embodiments of the invention, different minimum intervals can be adopted for different background noise types.
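The two interval mechanisms can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the function names, the change measure `change` with its threshold `change_thr`, and the per-type minimum intervals in `MIN_INTERVAL` are all hypothetical values introduced here.

```python
# Assumed minimum intervals (in frames) per background noise type.
# These numbers are illustrative; the text only says the minimum may differ by type.
MIN_INTERVAL = {"stationary": 8, "non_stationary": 3}


def send_sid_fixed(frames_since_last_sid, interval):
    """Fixed interval: send a SID frame once `interval` frames have elapsed."""
    return frames_since_last_sid >= interval


def send_sid_adaptive(frames_since_last_sid, change, change_thr, noise_type):
    """Adaptive interval: send when the noise has changed enough,
    but never sooner than the minimum interval for this noise type."""
    if frames_since_last_sid < MIN_INTERVAL[noise_type]:
        return False
    return change > change_thr
```

With these assumed values, a large change in non-stationary noise triggers a new SID frame after 3 frames, while the same change in stationary noise has to wait until 8 frames have passed.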

As shown in Figure 6, which is a schematic diagram of the main structure of the first communication device provided by the invention, the device comprises a VAD 10, a background noise encoding device 20, a speech encoder 30 and a communication interface 40, wherein:

The VAD 10 detects active speech. The audible signal is sent to the speech encoder 30 and encoded into speech frames, which the communication interface sends over the communication channel. The background noise signal is sent to the background noise encoding device 20, which outputs SID frames that the communication interface sends over the communication channel.

In the background noise encoding device 20, the encoding switching unit 24 switches to the applicable DTX unit 21 according to the type of the background noise signal.

As shown in Figure 7, which is a schematic diagram of the main structure of the second communication device provided by an embodiment of the invention, the encoding switching unit 24 in the background noise encoding device 20 determines the switching mode according to the instruction from the instruction receiving unit 25, and switches to the applicable DTX unit 21 according to the type of the background noise signal and the switching mode currently in use.

Referring to Figure 5, a speech encoder can also be provided inside the background noise encoding device 20 of the communication devices shown in Figures 6 and 7; this speech encoder performs the full-rate coding of the background noise.

As shown in Figure 8, if the existing speech encoder 30 is used to perform the full-rate coding of the background noise, the encoding switching unit 24 is connected to the speech encoder 30, and the background noise signal is sent to the speech encoder 30, encoded into full-rate speech frames and output continuously.

With the technical solution provided by the embodiments of the invention, DTX unit indication information corresponding to the encoding and transmission mode of each SID frame needs to be set in the SID frame; the decoding end uses the DTX unit indication information to determine the corresponding decoding and reconstruction scheme. Speech frames are decoded by the speech decoder and output continuously.

Part of the computation that the background noise analysis unit 22 performs to obtain the background characteristic parameter values is the same as part of the computation that each DTX unit 21 performs when encoding SID frames. The background noise analysis unit 22 can therefore also be connected to each DTX unit 21 and output its analysis results to them, which reduces part of the computational load of the DTX units 21.

As shown in Figure 9, which is a schematic structural diagram of a communication device provided by an embodiment of the invention, the units implementing the receiving-end function mainly comprise a background noise decoding device 50, an identification unit 60, a decoding switching unit 70, a speech decoder 80 and a communication interface, wherein:

The background noise decoding device 50 comprises at least two CNG units (51, 52, ..., 5n); each CNG unit decodes SID frames according to one preset decoding and reconstruction scheme and reconstructs them into background noise;

The identification unit 60 identifies the speech frames and SID frames received through the communication interface, and further identifies the DTX unit indication information in the SID frames;

The decoding switching unit 70 switches according to the identification result of the identification unit 60, sending speech frames to the speech decoder 80 and sending SID frames to the corresponding CNG unit for decoding and reconstruction.

Referring to Figure 1, each CNG unit comprises a non-speech decoder, which decodes the background noise frames received from the communication channel, and a reconstructor, which uses interpolation to reconstruct the decoded data into a continuous background noise signal so as to increase listener comfort.

The identification unit 60 first identifies whether a received coded frame is a full-rate speech frame or a SID frame; if it is a SID frame, the unit further identifies the DTX unit indication information in that SID frame. The bitstream of the SID frame should therefore contain one or more bits that carry the DTX unit indication information. These bits can be at the beginning of the SID frame, at the end, or at another designated position in the SID frame, for example as shown in Table 3 or Table 4 below:

Table 3

Bit allocation (most significant bit first)	Parameter description
S1	DTX unit indication information
S2	Quantization predictor index
S3-S7	First-stage LSF quantization vector index
S8-S11	Second-stage LSF quantization vector index
S12-S16	Gain (energy)

Table 4

Bit allocation (most significant bit first)	Parameter description
S1	Quantization predictor index
S2-S6	First-stage LSF quantization vector index
S7-S10	Second-stage LSF quantization vector index
S11-S15	Gain (energy)
S16-S17	DTX unit indication information
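As an illustration of the Table 3 layout, the following sketch splits a 16-bit SID payload into its fields, with S1 as the most significant bit. The function name, the dictionary keys and the choice of passing the payload as a single integer are assumptions made for this example only.

```python
def unpack_sid_table3(word):
    """Split a 16-bit SID payload laid out as in Table 3 (S1 is the MSB).

    S1      -> DTX unit indication information (1 bit)
    S2      -> quantization predictor index (1 bit)
    S3-S7   -> first-stage LSF quantization vector index (5 bits)
    S8-S11  -> second-stage LSF quantization vector index (4 bits)
    S12-S16 -> gain / energy (5 bits)
    """
    if not 0 <= word < (1 << 16):
        raise ValueError("expected a 16-bit value")
    return {
        "dtx_unit": (word >> 15) & 0x1,
        "predictor_index": (word >> 14) & 0x1,
        "lsf_stage1": (word >> 9) & 0x1F,
        "lsf_stage2": (word >> 5) & 0xF,
        "gain": word & 0x1F,
    }
```

The Table 4 layout could be handled the same way with 17 bits, reading the two indication bits from the least significant end instead.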

The identification unit 60 may comprise two identification subunits. One subunit identifies whether the received coded frame is a full-rate speech frame or a SID frame. If a speech frame is identified, the subunit notifies the decoding switching unit 70, which sends the speech frame to the speech decoder 80 for decoding and continuous output. If a SID frame is identified, the SID frame is sent to the other identification subunit, which further identifies the DTX unit indication information in the SID frame.

Because different background noise types have different SID frame formats or update mechanisms, a suitable DTX decoding scheme is selected according to the DTX unit indication information to decode the SID frame. The noise interpolation methods of the different decoding schemes differ, and the comfort noise is finally recovered.

A communication device can have both the transmitting-end function and the receiving-end function; the structure shown in Figure 9 is usually combined with the structure shown in Figure 7 or Figure 8 in one communication device.

Specific embodiments are described in detail below.

Embodiment one

At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22. The background noise analysis unit 22 calculates the subband levels level[k, i] of each current signal frame, where level[k, i] denotes the level of the k-th subband of the i-th frame, and stores them in memory. The subband levels can be calculated with a filter bank or with a transform.

Once a sufficient number of signal frames has accumulated, the background noise analysis unit 22 starts calculating the background noise spectral fluctuation parameter spec_var[i] from the subband levels stored in memory, and thereafter updates the result each time a background noise frame is received.

The background noise spectral fluctuation parameter spec_var[i] is calculated as:

\mathrm{spec\_var}[i] = \sum_{m=i-L+1}^{i} \; \sum_{\substack{n=i-L+1 \\ n \neq m}}^{i} \; \sum_{k=1}^{N} \left( \frac{\mathrm{level}[k,m] - \mathrm{level}[k,n]}{\frac{1}{L \cdot N} \sum_{m=i-L+1}^{i} \sum_{k=1}^{N} \mathrm{level}[k,i]} \right)^{2}

where N is the number of subbands and L is the number of background noise signal frames used for the long-term average. The background noise spectral fluctuation parameter spec_var[i] can also be calculated from the LSF coefficients, that is:

\mathrm{spec\_var}[i] = \sum_{m=i-L+1}^{i} \; \sum_{\substack{n=i-L+1 \\ n \neq m}}^{i} \; \sum_{k=1}^{N} \left( \frac{\mathrm{lsf}[k,m] - \mathrm{lsf}[k,n]}{\frac{1}{L \cdot N} \sum_{m=i-L+1}^{i} \sum_{k=1}^{N} \mathrm{lsf}[k,i]} \right)^{2}

where lsf[k, i] denotes the k-th LSF coefficient of the i-th frame, N is the number of LSF coefficients, and L is the number of background noise signal frames used for the long-term average.
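The spectral fluctuation calculation can be sketched as below. This is an illustrative reading, not a reference implementation: the function name and the frame-major storage `values[m][k]` (frame m, coefficient k, zero-based) are assumptions. The normalizer is implemented as the formula is printed; since its summand depends only on frame i, it reduces to the mean value of frame i over the N coefficients. The same function applies to subband levels or to LSF coefficients.

```python
def spec_var(values, i, L):
    """Spectral fluctuation over the last L frames ending at frame i.

    values[m][k] holds the k-th subband level (or LSF coefficient) of frame m.
    This storage order is an assumption of the sketch.
    """
    if L < 1 or i - L + 1 < 0:
        raise ValueError("not enough frames accumulated")
    N = len(values[i])
    window = range(i - L + 1, i + 1)
    # Normalizer as printed: (1 / (L*N)) * sum over the window of sum_k values[i][k],
    # which equals the mean of frame i over its N coefficients.
    norm = sum(values[i]) / N
    total = 0.0
    for m in window:
        for n in window:
            if n == m:
                continue
            for k in range(N):
                total += ((values[m][k] - values[n][k]) / norm) ** 2
    return total
```

A window of identical frames yields 0, and the value grows as the frames in the window diverge from one another.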

The background noise analysis unit 22 outputs the calculated background noise spectral fluctuation parameter to the background noise classification unit 23.

The encoding switching unit 24 selects the corresponding DTX unit for the current background noise frame according to the background noise type information output by the background noise classification unit 23.

In this embodiment, two background types are defined: the stationary background noise signal and the non-stationary background noise signal. The background noise classification unit 23 classifies the current background noise as stationary or non-stationary by comparing the background noise spectral fluctuation parameter with a threshold spec_var_thr. If spec_var[i] < spec_var_thr, the current background noise signal is classified as a stationary background noise signal; otherwise it is classified as a non-stationary background noise signal.

In this embodiment, the DTX system has two different working schemes:

Scheme one uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and energy parameters;

Scheme two uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; in addition to the filter parameters and energy parameters, the SID frame also contains fixed codebook parameters (which are excitation parameters).

When the current background noise is stationary, the encoding switching unit 24 selects working scheme one. When the current background noise is non-stationary, the encoding switching unit 24 selects working scheme two.

When each kind of SID frame is assembled, DTX unit indication information is set in the frame to state which DTX unit the SID frame belongs to. In this embodiment, the DTX unit indication information can be represented with 1 bit, where 0 and 1 denote DTX unit one and DTX unit two, respectively.
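The encoder-side decision of Embodiment one can be summarized in a short sketch. The names `SCHEMES`, `classify_noise` and `select_scheme`, and the dictionary layout, are hypothetical; only the threshold comparison, the 8-frame and 3-frame intervals, the parameter sets and the 1-bit indication come from the text above.

```python
# Working schemes of Embodiment one, keyed by background noise type.
SCHEMES = {
    "stationary": {
        "dtx_indication": 0,          # DTX unit one
        "sid_interval_frames": 8,
        "sid_params": ("filter", "energy"),
    },
    "non_stationary": {
        "dtx_indication": 1,          # DTX unit two
        "sid_interval_frames": 3,
        "sid_params": ("filter", "energy", "fixed_codebook"),
    },
}


def classify_noise(spec_var_value, spec_var_thr):
    """Stationary if the spectral fluctuation is below the threshold."""
    return "stationary" if spec_var_value < spec_var_thr else "non_stationary"


def select_scheme(spec_var_value, spec_var_thr):
    """Return the working scheme the encoding switching unit would pick."""
    return SCHEMES[classify_noise(spec_var_value, spec_var_thr)]
```

For example, `select_scheme(0.2, 1.0)` picks the 8-frame scheme with indication bit 0, whereas a fluctuation at or above the threshold picks the 3-frame scheme with indication bit 1.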

At the decoding end, the identification unit 60 first determines which DTX unit the SID frame belongs to according to the DTX unit indication information in the SID frame, and the decoding switching unit 70 feeds the SID frame to the corresponding CNG unit in the background noise decoding device 50 for decoding and reconstruction, wherein:

If the DTX unit indication information is 0, the corresponding CNG unit decodes the filter parameters and energy parameters according to the SID frame format of DTX working scheme one above, and interpolates the following 8 background noise signal frames;

If the DTX unit indication information is 1, the corresponding CNG unit decodes the filter parameters, energy parameters and fixed codebook parameters according to the SID frame format of DTX working scheme two above, and interpolates the following 3 background noise signal frames.

Embodiment two

At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22. The background noise analysis unit 22 calculates the subband levels level[k, i], the frame energy pow[i] and the short-term signal-to-noise ratio snr[i] of the current background noise signal frame, and stores them in memory, wherein:

The subband levels level[k, i] can be calculated with a filter bank or with a transform;

The frame energy is calculated as:

\mathrm{pow}[i] = \sum_{k=1}^{N} \mathrm{level}[k,i]^{2},

where N denotes the total number of subbands;

The short-term signal-to-noise ratio is calculated as:

\mathrm{snr}[i] = \frac{\mathrm{pow}[i]}{\mathrm{bckr\_noise\_pow}[i]},

where bckr_noise_pow[i] is the long-term background noise energy estimate.

Once a certain number of background noise signal frames has accumulated, the background noise analysis unit 22 starts calculating the background noise energy fluctuation pow_var[i], the long-term signal-to-noise ratio snr_long[i] and the long-term background noise energy estimate bckr_noise_pow[i] from the short-term characteristic parameters stored in memory (subband levels, frame energy and short-term signal-to-noise ratio), and updates the results each time a background noise signal frame is received.

Except for the long-term signal-to-noise ratio, the parameters are updated only when the current signal frame is a background noise signal frame; the long-term signal-to-noise ratio is updated only when the current signal frame is a non-background-noise signal frame.

The long-term signal-to-noise ratio snr_long[i] is calculated as:

\mathrm{snr\_long}[i] = \frac{1}{L} \sum_{m=i-L+1}^{i} \mathrm{snr}[m],

where L is the number of non-background-noise signal frames used for the long-term average.

The background noise energy fluctuation pow_var[i] is calculated as:

\mathrm{pow\_var}[i] = \frac{1}{L} \sum_{m=i-L+1}^{i} \left( \mathrm{pow}[m] - \frac{1}{L} \sum_{j=i-L+1}^{i} \mathrm{pow}[j] \right)^{2},

where L is the number of background noise signal frames used for the long-term average.

The long-term background noise energy estimate bckr_noise_pow[i] is calculated as:

bckr_noise_pow[i] = (1 - α) * bckr_noise_pow[i-1] + α * pow[i], where α is a scale factor between 0 and 1, set to 5% in this embodiment.
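The feature calculations of this embodiment can be sketched as plain functions. The function names and the use of Python lists for the stored history are assumptions of this example; the formulas and the value α = 0.05 follow the text above. Deciding which frames enter each history (background noise frames or non-background-noise frames) is left to the caller.

```python
ALPHA = 0.05  # scale factor used in this embodiment (5%)


def frame_energy(levels):
    """pow[i]: sum of squared subband levels of one frame."""
    return sum(x * x for x in levels)


def update_noise_estimate(prev_estimate, frame_pow, alpha=ALPHA):
    """bckr_noise_pow[i]: recursive long-term background noise energy estimate."""
    return (1.0 - alpha) * prev_estimate + alpha * frame_pow


def short_term_snr(frame_pow, noise_estimate):
    """snr[i]: frame energy divided by the long-term noise energy estimate."""
    return frame_pow / noise_estimate


def long_term_snr(recent_snrs):
    """snr_long[i]: mean of the last L short-term SNR values."""
    return sum(recent_snrs) / len(recent_snrs)


def energy_fluctuation(recent_pows):
    """pow_var[i]: variance of the last L frame energies around their mean."""
    mean = sum(recent_pows) / len(recent_pows)
    return sum((p - mean) ** 2 for p in recent_pows) / len(recent_pows)
```

With α = 0.05 the noise estimate moves only 5% of the way toward each new frame energy, so a single loud frame barely shifts the long-term estimate.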

The background noise analysis unit 22 outputs the calculated background noise energy fluctuation pow_var[i] and long-term signal-to-noise ratio snr_long[i] to the background noise classification unit 23.

In this embodiment, four background types are defined: the stationary high-SNR background noise signal, the stationary low-SNR background noise signal, the non-stationary high-SNR background noise signal and the non-stationary low-SNR background noise signal.

The background noise classification unit 23 compares pow_var[i] and snr_long[i] with an energy fluctuation threshold pow_var_thr and a signal-to-noise ratio threshold snr_thr, respectively, and classifies the current background noise into one of the four background noise types, wherein:

When pow_var[i] > pow_var_thr and snr_long[i] > snr_thr, the current background is a non-stationary high-SNR background noise signal;

When pow_var[i] > pow_var_thr and snr_long[i] < snr_thr, the current background is a non-stationary low-SNR background noise signal;

When pow_var[i] < pow_var_thr and snr_long[i] > snr_thr, the current background is a stationary high-SNR background noise signal;

When pow_var[i] < pow_var_thr and snr_long[i] < snr_thr, the current background is a stationary low-SNR background noise signal.
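The four-way classification can be sketched as two independent threshold tests. The function name and the returned labels are hypothetical. The text does not say how values exactly equal to a threshold are handled; this sketch assumes they fall into the stationary and low-SNR classes.

```python
def classify_background(pow_var, snr_long, pow_var_thr, snr_thr):
    """Classify the current background into one of the four noise types.

    Assumption: a value equal to its threshold is treated as
    'stationary' (for pow_var) or 'low_snr' (for snr_long).
    """
    stationarity = "non_stationary" if pow_var > pow_var_thr else "stationary"
    snr_class = "high_snr" if snr_long > snr_thr else "low_snr"
    return f"{stationarity}_{snr_class}"
```

For instance, `classify_background(2.0, 30.0, 1.0, 20.0)` gives `"non_stationary_high_snr"`.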

In this embodiment, there are four different DTX units in total, wherein:

DTX unit one uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and energy parameters;

DTX unit two uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; in addition to the filter parameters and energy parameters, the SID frame also contains fixed codebook parameters;

DTX unit three uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; in addition to the filter parameters and energy parameters, the SID frame also contains fixed codebook parameters;

DTX unit four uses a continuous transmission mechanism with full-rate coding.

In addition to receiving the background noise type information from the background noise classification unit 23, the encoding switching unit 24 is also connected to a switching mode control instruction interface. The switching mode control instruction has five indications, which denote normal bandwidth, more bandwidth, less bandwidth, abundant bandwidth and very little bandwidth, respectively.

The relationship between the switching mode control instruction and the switching mode of the DTX system is as follows:

When the switching mode control instruction indicates normal bandwidth, the switching mode of the DTX system is: DTX unit one for the stationary high-SNR background; DTX unit two for the stationary low-SNR background; DTX unit three for the non-stationary high-SNR background; DTX unit four for the non-stationary low-SNR background;

When the switching mode control instruction indicates more bandwidth, the switching mode of the DTX system is: DTX unit two for both the stationary high-SNR and stationary low-SNR backgrounds; DTX unit four for both the non-stationary high-SNR and non-stationary low-SNR backgrounds;

When the switching mode control instruction indicates less bandwidth, the switching mode of the DTX system is: DTX unit one for both the stationary high-SNR and stationary low-SNR backgrounds; DTX unit three for both the non-stationary high-SNR and non-stationary low-SNR backgrounds;

When the switching mode control instruction indicates abundant bandwidth, the switching mode of the DTX system is: DTX unit four for all four background noise types;

When the switching mode control instruction indicates very little bandwidth, the switching mode of the DTX system is: DTX unit one for all four background noise types.
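The five switching modes amount to a lookup table from bandwidth indication and background noise type to DTX unit. The sketch below encodes that table; the string keys and the function name are hypothetical labels chosen for this example, while the unit numbers follow the mapping described above.

```python
S_HI = "stationary_high_snr"
S_LO = "stationary_low_snr"
NS_HI = "non_stationary_high_snr"
NS_LO = "non_stationary_low_snr"
ALL_TYPES = (S_HI, S_LO, NS_HI, NS_LO)

# Bandwidth indication -> {background noise type -> DTX unit number}.
SWITCHING_MODES = {
    "normal": {S_HI: 1, S_LO: 2, NS_HI: 3, NS_LO: 4},
    "more": {S_HI: 2, S_LO: 2, NS_HI: 4, NS_LO: 4},
    "less": {S_HI: 1, S_LO: 1, NS_HI: 3, NS_LO: 3},
    "abundant": dict.fromkeys(ALL_TYPES, 4),
    "very_little": dict.fromkeys(ALL_TYPES, 1),
}


def select_dtx_unit(bandwidth, noise_type):
    """Return the DTX unit number for a bandwidth indication and noise type."""
    return SWITCHING_MODES[bandwidth][noise_type]
```

Keeping the mapping as data rather than nested conditionals makes it easy to add or change switching modes, which fits the remark that these modes can be set flexibly.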

When each kind of SID frame is assembled, DTX unit indication information is set in the frame to state which DTX unit the SID frame belongs to. In this embodiment, the DTX unit indication information can be represented with 2 bits, where 00, 01 and 10 denote DTX unit one, DTX unit two and DTX unit three, respectively; DTX unit four uses full-rate coding and therefore needs no identifier.

At the decoding end, the identification unit 60 first determines which DTX unit the SID frame belongs to according to the DTX unit indication information in the SID frame, and instructs the corresponding CNG unit to perform decoding and reconstruction, wherein:

If the DTX unit indication information is 00, the filter parameters and energy parameters are decoded according to the SID frame format of DTX unit one above, and the following 8 background noise signal frames are interpolated;

If the DTX unit indication information is 01, the excitation parameters, filter parameters and energy parameters are decoded according to the SID frame format of DTX unit two above, and the following 8 background noise signal frames are interpolated;

If the DTX unit indication information is 10, the filter parameters, energy parameters and fixed codebook parameters are decoded according to the SID frame format of DTX unit three above, and the following 3 background noise signal frames are interpolated.
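The decoder-side dispatch on the 2-bit indication can be sketched as a small table lookup. The names `CNG_SCHEMES` and `select_cng_scheme` and the dictionary fields are hypothetical. The value 11 is not assigned in the text, so this sketch rejects it as an assumption rather than guessing a meaning.

```python
# 2-bit DTX unit indication -> parameters to decode and frames to interpolate.
CNG_SCHEMES = {
    0b00: {"params": ("filter", "energy"), "interpolate_frames": 8},
    0b01: {"params": ("excitation", "filter", "energy"), "interpolate_frames": 8},
    0b10: {"params": ("filter", "energy", "fixed_codebook"), "interpolate_frames": 3},
}


def select_cng_scheme(indication):
    """Return the decoding scheme for a DTX unit indication value."""
    try:
        return CNG_SCHEMES[indication]
    except KeyError:
        # Assumption: unassigned values (such as 0b11) are treated as errors.
        raise ValueError(f"unassigned DTX unit indication: {indication!r}") from None
```

Full-rate frames never reach this lookup, since they are identified as speech frames and routed to the speech decoder before the indication bits are read.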

Those skilled in the art can flexibly divide the background noise types according to the signal characteristic parameters and set the switching modes; these are not enumerated one by one here.

In summary, an embodiment of the invention provides an encoding method, comprising:

Obtaining at least one signal characteristic parameter value of the background noise signal to be sent;

Selecting a DTX mechanism according to said background noise type, encoding the background noise signal into SID frames containing DTX mechanism indication information, and transmitting them discontinuously, wherein the SID frame encoding mode and/or the discontinuous SID frame transmission mode differs between DTX mechanisms.

Further, said encoding method also comprises: setting different DTX mechanism selection modes, that is, different switching modes, wherein for any two DTX mechanism selection modes at least one background noise type corresponds to a different DTX mechanism; and determining the corresponding DTX mechanism for each background noise type according to the selected DTX mechanism selection mode.

Still further, said method also comprises: when said at least one signal characteristic parameter value is a set value, encoding the background noise signal into full-rate speech frames and sending them continuously.

The signal characteristic parameter values comprise one or any combination of the following: long-term SNR, background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, and background noise fluctuation rate.

An embodiment of the invention also provides a method for decoding the corresponding SID frames, comprising:

Identifying the DTX mechanism indication information in the SID frame, wherein the SID frame encoding mode differs between DTX mechanisms;

According to said DTX mechanism indication information, decoding the SID frame with the corresponding decoding mode and reconstructing it into a background noise signal.

With the technical solution proposed by the embodiments of the invention, different SID frame encoding methods and discontinuous transmission mechanisms can be adopted according to the type of background noise and the actual state of the communication system. The solution adapts to the coding precision and time resolution required by different types of background noise while making reasonable use of the actual resources of the communication system, so that high-quality background noise of every type is provided at the decoding end under reasonable use of channel resources.

Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.

Claims

1. A background noise encoding device, characterized by comprising:

2. The background noise encoding device as claimed in claim 1, characterized by further comprising: an instruction receiving unit, configured to receive a switching mode control instruction and forward it to the encoding switching unit, said switching mode control instruction being used to instruct the encoding switching unit to use one switching mode, wherein for any two switching modes at least one background noise type corresponds to a different discontinuous transmission unit.

3. The background noise encoding device as claimed in claim 1, characterized by further comprising: a speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, said encoding switching unit inputs the background noise signal to the speech encoder, which encodes it into speech frames and outputs them continuously.

4. The background noise encoding device as claimed in claim 1, 2 or 3, characterized in that said discontinuous transmission unit specifically comprises:

a non-speech encoder, configured to encode the background noise signal into silence description frames;

a silence description frame transmitter, configured to transmit said silence description frames discontinuously.

5. A background noise decoding device for decoding the silence description frames output by the background noise encoding device of claim 1, characterized by comprising:

6. The background noise decoding device as claimed in claim 5, characterized by further comprising:

a second identification unit, configured to identify speech frames and silence description frames, and to send said silence description frames to the first identification unit, or to notify said decoding switching unit when a speech frame is identified;

a speech decoder, wherein said decoding switching unit sends the speech frames identified by the second identification unit to the speech decoder for decoding and continuous output.

7. The background noise decoding device as claimed in claim 6, characterized in that said first identification unit and second identification unit are combined into one unit.

8. A communication device, comprising:

a first speech encoder, configured to encode signals containing active speech into speech frames and output them continuously;

characterized by further comprising a background noise encoding device, which specifically comprises:

9. The communication device as claimed in claim 8, characterized in that said background noise encoding device further comprises: an instruction receiving unit, configured to receive a switching mode control instruction and forward it to the encoding switching unit, said switching mode control instruction being used to instruct the encoding switching unit to use one switching mode, wherein for any two switching modes at least one background noise type corresponds to a different discontinuous transmission unit.

10. The communication device as claimed in claim 8, characterized in that said encoding switching unit is also connected to said first speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, said encoding switching unit inputs the background noise signal to the first speech encoder, which encodes it into speech frames and outputs them continuously.

11. The communication device as claimed in claim 8, characterized in that said background noise encoding device further comprises: a second speech encoder, wherein, for one specific background noise type identified by the background noise classification unit, said encoding switching unit inputs the background noise signal to the second speech encoder, which encodes it into speech frames and outputs them continuously.

12. The communication device as claimed in claim 8, 9, 10 or 11, characterized in that said discontinuous transmission unit specifically comprises:

a non-speech encoder, configured to encode the background noise signal into silence description frames according to a set silence description frame encoding mode;

a silence description frame transmitter, configured to send said silence description frames according to a set discontinuous transmission mechanism.

13. The communication device as claimed in claim 8, characterized by further comprising:

14. The communication device as claimed in claim 13, characterized by further comprising:

15. The communication device as claimed in claim 14, characterized in that said first identification unit and second identification unit are combined into one unit.

16. A communication device comprising a background noise decoding device, characterized in that said background noise decoding device comprises:

17. The communication device as claimed in claim 16, characterized by further comprising:

a second identification unit, configured to identify speech frames and silence description frames, and to send said silence description frames to the first identification unit, or to notify said decoding switching unit when a silence description frame is identified;

a speech decoder, wherein said decoding switching unit sends the speech frames identified by the second identification unit to the speech decoder for decoding and output.

18. The communication device as claimed in claim 17, characterized in that said first identification unit and second identification unit are combined into one unit.

19. A background noise encoding method, characterized by comprising:

20. The encoding method as claimed in claim 19, characterized in that said selecting a discontinuous transmission mechanism according to the background noise type is: selecting according to the correspondence between background noise types and discontinuous transmission mechanisms in a set discontinuous transmission mechanism selection mode, wherein for any two switching modes at least one background noise type corresponds to a different discontinuous transmission mechanism.

21. The encoding method as claimed in claim 19, characterized by further comprising: when said at least one signal characteristic parameter value satisfies a set condition, encoding the background noise signal into full-rate speech frames and sending them continuously.

22. The encoding method as claimed in claim 19, 20 or 21, characterized in that said signal characteristic parameter comprises one or any combination of the following: long-term SNR, background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, and background noise fluctuation rate.

23. The encoding method as claimed in claim 22, characterized in that, according to the magnitude of each signal characteristic parameter value, said background noise types comprise the stationary background noise signal and the non-stationary background noise signal.

24. The encoding method as claimed in claim 23, characterized in that, when said signal characteristic parameter comprises the long-term SNR:

said stationary background noise signal further comprises: the stationary high-SNR background noise signal and the stationary low-SNR background noise signal;

said non-stationary background noise signal further comprises: the non-stationary high-SNR background noise signal and the non-stationary low-SNR background noise signal.

25. The encoding method as claimed in claim 19, characterized in that said silence description frame encoding parameters comprise synthesis filter parameters and excitation parameters.

26. The encoding method as claimed in claim 25, characterized in that said excitation parameters comprise adaptive codebook parameters and/or fixed codebook parameters.

27. The encoding method as claimed in claim 19, characterized in that said transmission interval is determined adaptively according to changes in the background signal, and the minimum intervals of different discontinuous silence description frame transmission modes differ.

28. The encoding method as claimed in claim 27, characterized in that the transmission interval of each discontinuous silence description frame transmission mode is a different fixed value.

29. A method for decoding the silence description frames generated by the method of claim 19, characterized by comprising: