CN1672193A - Speech communication unit and method for error mitigation of speech frames - Google Patents

Publication number
CN1672193A
Authority
CN
China
Prior art keywords
speech
frame
transmission path
communication units
voice communication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA038182726A
Other languages
Chinese (zh)
Other versions
CN100349395C (en)
Inventor
乔纳森·阿拉斯泰尔·吉布斯
史蒂芬·阿夫泰拉克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Mobility LLC
Google Technology Holdings LLC
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Publication of CN1672193A
Application granted
Publication of CN100349395C
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0083 Formatting with frames or packets; Protocol or part of protocol for error control
    • H04L27/00 Modulated-carrier systems

Abstract

A speech communication unit (100) comprising a speech encoder (134) capable of representing an input speech signal, the speech encoder (134) comprising a transmission path (281) for transmitting a number of speech frames to a speech decoder, the speech encoder (134) characterised by a virtual transmission path (282) for transmitting one or more references for a number of speech frames transmitted on the transmission path (281), wherein the one or more references relate to an alternative speech frame within the number of speech frames transmitted on the transmission path (281) to be used as a replacement frame when a frame is received in error. The speech communication unit provides at least the advantage of a more accurate replacement frame mechanism, thereby reducing the risk of undesirable artefacts being audible in recovered speech frames.

Description

Speech communication unit and method for error mitigation of speech frames
Technical Field
The present invention relates to speech coding and to a method for improving the performance of a speech codec in a speech communication unit. The invention is applicable to, but not limited to, error mitigation in speech codecs.
Background
Many present-day voice communication systems, such as the Global System for Mobile communications (GSM) cellular telephony standard and the TErrestrial Trunked RAdio (TETRA) system for private mobile radio users, use speech processing units to encode and decode speech samples. In such systems, an encoder in the transmitting unit converts analogue speech samples into a suitable digital format for transmission. A speech decoder in the receiving unit converts the received digital speech signal back into analogue speech that the listener can hear.
Because spectrum in such wireless speech communication systems is a valuable resource, it is desirable to limit the channel bandwidth used by speech signals so that as many users as possible can be accommodated in each frequency band. The primary objective of speech coding techniques is therefore to reduce the capacity occupied by the speech samples as far as possible through compression, without losing fidelity.
In systems that carry both speech and data, a further approach is to give speech signals less protection than comparable data signals. This causes speech packets to suffer more errors than data packets and increases the risk of losing entire speech packets.
In speech decoders, error mitigation techniques are commonly used to improve the performance of the speech communication unit, for example when:
(i) a received speech frame contains too many bit errors; or
(ii) a packet (which may contain speech information) is lost in an Internet Protocol (IP) based network.
"Bad frame" mitigation techniques are needed to minimise the audible effect of frames received in error, that is, frames that contain errors or are lost. These techniques reproduce an estimate of the lost speech frame rather than inserting silence or noise into the decoded speech. They generally exploit the statistically stationary nature of speech. An isolated frame in error can usually be estimated adequately by substituting parameters such as energy, pitch, spectrum and voicing from the preceding speech frame. However, speech is not truly stationary: speech onsets and plosives, for example, are highly transient events. Such simple "substitution" techniques therefore sometimes produce unnatural and undesirable artefacts.
Ideally, one would interpolate from both sides of the transmission break, that is, take data from before and after the sequence of bad frames and interpolate between them. However, this is unacceptable in speech communication systems because of the additional delay it introduces.
If several bad frames are received, the speech signal energy is usually reduced to zero over a few frames. A "voicing" parameter is usually included, because it changes what is repeated depending on whether the speech is voiced or unvoiced. In principle, for voiced speech the preferred approach is to repeat the periodic component. For unvoiced speech, by contrast, the preferred approach is to generate a signal with a similar spectrum and similar energy that is not periodic.
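The conventional substitution-and-muting behaviour described above can be sketched as follows. This is a minimal illustration, not the scheme of any particular codec: the `FrameParams` fields, the `conceal` function and the attenuation factor of 0.5 per consecutive bad frame are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class FrameParams:
    """Hypothetical per-frame parameter set used only for this sketch."""
    energy: float    # frame energy (linear scale)
    pitch_lag: int   # pitch period in samples
    voiced: bool     # voicing decision


ATTENUATION = 0.5    # assumed energy decay per consecutive bad frame


def conceal(received: List[Optional[FrameParams]]) -> List[FrameParams]:
    """Replace each bad frame (None) with the last good parameters,
    attenuating the energy further for every consecutive bad frame."""
    out: List[FrameParams] = []
    last_good = FrameParams(energy=0.0, pitch_lag=0, voiced=False)
    bad_run = 0
    for frame in received:
        if frame is not None:
            last_good, bad_run = frame, 0
            out.append(frame)
        else:
            bad_run += 1
            muted = last_good.energy * ATTENUATION ** bad_run
            out.append(replace(last_good, energy=muted))
    return out
```

Because the substitute always comes from the immediately preceding good frame, this approach works for stationary speech but not for onsets or plosives, which is the limitation discussed in the text.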
The inventors of the present invention have recognised and appreciated the limitations of using a "replacement" frame mechanism as a bad-frame mitigation strategy. In particular, they have recognised that the replacement frame is truly a suitable frame only on rare occasions. Furthermore, if a large number of frames are received in error, as may frequently happen on a poor-quality wireless link, the replacement frame mechanism becomes increasingly unacceptable.
There is therefore a need for an improved error mitigation technique for such speech codecs that alleviates at least some of the above disadvantages.
Summary of the Invention
In a first aspect of the present invention, there is provided a speech communication unit as claimed in claim 1.
In a second aspect of the present invention, there is provided a speech communication unit as claimed in claim 13.
In a third aspect of the present invention, there is provided a method of mitigating bad-frame errors in a speech communication unit, as claimed in claim 15.
In a fourth aspect of the present invention, there is provided a speech communication unit as claimed in claim 16.
In a fifth aspect of the present invention, there is provided a wireless communication system as claimed in claim 17.
Further aspects of the invention are defined in the dependent claims.
In summary, the invention aims to provide a communication unit comprising a speech codec, and a method of mitigating bad-frame errors, that alleviate at least some of the disadvantages of the known bad-frame error mitigation techniques described above. This is achieved by transmitting speech frames on a transmission path and transmitting, on a virtual transmission path, references (pointers) that identify an alternative replacement speech frame for the speech decoder to use if a speech frame received on the transmission path is in error. By using a separate virtual transmission path that ideally has different error statistics (for example, a separate FEC scheme), the reference/pointer is less likely to suffer the same errors as the speech frame it refers to. In addition, buffering is used in the encoder so that the alternative speech frame can be selected from a number of previously transmitted speech frames, the selected frame being one that exhibits characteristics similar to those of the speech frame being referenced.
Brief Description of the Drawings
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which:
Fig. 1 shows a block diagram of a wireless communication unit containing a speech coder adapted to support the various inventive concepts of a preferred embodiment of the present invention;
Fig. 2 shows a block diagram of a code excited linear prediction speech coder adapted to support the various inventive concepts of a preferred embodiment of the present invention;
Fig. 3 shows the use of a reference mechanism, provided by an alternative virtual transmission path, by which a replacement frame is selected from a number of other frames, in accordance with a preferred embodiment of the present invention; and
Fig. 4 shows an enhanced use of the alternative virtual transmission path to handle multiple errors occurring on the main transmission path, in accordance with a preferred embodiment of the present invention.
Description of Embodiments
Referring now to Fig. 1, there is shown a block diagram of a wireless subscriber unit, hereinafter referred to as a mobile station (MS) 100, adapted to support the inventive concepts of the preferred embodiments of the present invention. The MS 100 contains an antenna 102, preferably coupled to a duplex filter, antenna switch or circulator 104 that provides isolation between the receive and transmit chains within the MS 100.
As known in the art, the receiver chain typically includes scanning receiver front-end circuitry 106 (effectively providing reception, filtering and intermediate or base-band frequency conversion). The front-end circuitry is coupled to a signal processing function 108. An output from the signal processing function is provided to a suitable output device 110, such as a speaker, via a speech processing unit 130.
The speech processing unit 130 includes a speech encoding function 134, which encodes the user's speech into a format suitable for transmission over the transmission medium. The speech processing unit 130 also includes a speech decoding function 132, which decodes received speech into a format suitable for output via the output device (speaker) 110. The speech processing unit 130 is coupled to a memory unit 116 and a timer 118 via a controller 114. In particular, the operation of the speech processing unit 130 has been adapted to support the inventive concepts of the preferred embodiments: it is adapted to select a replacement speech frame from a number of previously transmitted speech frames. The speech processing unit 130 or the signal processor 108 then initiates transmission, on an alternative virtual transmission path alongside the main transmission path, of a reference/pointer signal that identifies the selected replacement speech frame. The adaptation of the speech processing unit 130 is described further with reference to Fig. 2.
For completeness, the receiver chain also includes received signal strength indicator (RSSI) circuitry 112 (shown coupled to the scanning receiver front-end 106, although the RSSI circuitry 112 could be located elsewhere within the receiver chain). The RSSI circuitry is coupled to the controller 114, which maintains overall subscriber unit control. The controller 114 is also coupled to the scanning receiver front-end circuitry 106 and the signal processing function 108 (generally realised by a DSP). The controller 114 may therefore receive bit error rate (BER) and frame error rate (FER) data from recovered information. The controller 114 is coupled to the memory device 116, which stores operating regimes such as decoding/encoding functions. The timer 118 is typically coupled to the controller 114 to control the timing of operations (transmission and reception of time-dependent signals) within the MS 100.
In the context of the present invention, the timer 118 dictates the timing of speech signals in the transmit (encoding) path and/or the receive (decoding) path.
As regards the transmit chain, this essentially includes an input device 120, such as a microphone transducer, coupled in series via the speech encoder 134 to transmitter/modulation circuitry 122. Thereafter, any transmit signal is passed through a power amplifier 124 to be radiated from the antenna 102. The transmitter/modulation circuitry 122 and the power amplifier 124 are operationally responsive to the controller, with an output from the power amplifier coupled to the duplex filter or circulator 104. The transmitter/modulation circuitry 122 and the scanning receiver front-end circuitry 106 comprise frequency up-conversion and frequency down-conversion functions (not shown).
Of course, the various components within the MS 100 can be arranged in any suitable functional topology able to utilise the inventive concepts of the present invention. Furthermore, the various components within the MS 100 can be realised in discrete or integrated component form, with the ultimate structure being merely an arbitrary selection.
It is envisaged that the preferred buffering or processing of speech signals can be implemented in software, firmware or hardware, with the speech processing functions preferably performed by a software processor (or a digital signal processor (DSP)).
Referring now to Fig. 2, a block diagram of a code excited linear prediction (CELP) speech encoder 134 is shown in accordance with the preferred embodiment of the present invention. The acoustic input signal to be analysed is applied to the speech encoder 134 at a microphone 202. The input signal is then applied to a filter 204. The filter 204 generally exhibits band-pass characteristics. However, if the speech bandwidth is already adequate, the filter 204 may comprise a direct wire connection.
As known in the art, the analogue speech signal from the filter 204 is then converted into a sequence of N pulse samples, the amplitude of each pulse sample being represented by a digital code in an analogue-to-digital (A/D) converter 208. The sampling rate is determined by a sample clock (SC), which is generated together with a frame clock (FC).
The digital output of the A/D converter 208, which can be represented as an input speech vector s(n), is applied to a coefficient analyser 210. As known in the art, the input speech vector s(n) is obtained repeatedly in separate frames, that is, in blocks of time whose length is determined by the frame clock (FC).
In accordance with the preferred embodiment of the present invention, a set of linear predictive coding (LPC) parameters is produced by the coefficient analyser 210 for each block of speech. The generated speech coding parameters may include LPC parameters, long-term predictor (LTP) parameters and an excitation gain factor (G2), together with the best stochastic codebook excitation codeword I. These speech coding parameters are applied to a multiplexer 250 and sent over the channel for use by the speech synthesiser in the decoder. The input speech vector s(n) is also applied to a subtractor 230, whose function is described below.
In the conventional CELP encoder of Fig. 2, a codebook search controller 240 selects the best indices and gains from an adaptive codebook in block 216 and a stochastic codebook in block 214, in order to produce the minimum weighted error in the summed excitation vector chosen to represent the input speech samples. The outputs of the stochastic codebook 214 and the adaptive codebook 216 are input to gain functions 222 and 218 respectively. As known in the art, the gain-adjusted outputs are summed in a summer 220 and input to an LPC filter 224.
First, the adaptive codebook or long-term predictor component l(n) is computed. It is characterised by a delay and a gain factor "G1".
For each individual stochastic codebook excitation vector u_i(n), a reconstructed speech vector s'_i(n) is generated for comparison with the input speech vector s(n). The gain block 222 scales the excitation vector by the excitation gain factor "G2", and the summing block 220 adds the adaptive codebook component. Such gains may be pre-computed by the coefficient analyser 210 and used to analyse all excitation vectors, or may be optimised jointly with the search for the best excitation codeword I produced by the codebook search controller 240.
The scaled excitation signal G1 l(n) + G2 u_i(n) is then filtered by the linear predictive coding filter 224, which constitutes a short-term predictor (STP) filter, to generate the reconstructed speech vector s'_i(n). The reconstructed speech vector s'_i(n) for the i-th excitation code vector is compared with the same block of the input speech vector s(n) by subtracting the two signals in the subtractor 230.
The difference vector e_i(n) represents the difference between the original and the reconstructed blocks of speech. The difference vector is perceptually weighted by a weighting filter 232, using weighting filter parameters (WTP) generated by the coefficient analyser 210. Perceptual weighting accentuates the frequencies at which the error is perceptually more important to the human ear and attenuates the other frequencies.
An energy calculator function within the codebook search controller 240 computes the energy of the weighted difference vector e'_i(n). The codebook search controller compares the i-th error signal, for the present excitation vector u_i(n), with previous error signals to determine the excitation vector that produces the minimum error. The code of the i-th excitation vector having the minimum error is then output over the channel as the best excitation code I.
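The analysis-by-synthesis search described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the gains G1 and G2 are taken as already computed, the perceptual weighting filter 232 is omitted (plain squared error is used instead), and the function names `synthesize` and `search_codebook` are hypothetical.

```python
from typing import List, Sequence, Tuple


def synthesize(excitation: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """All-pole short-term synthesis: y[n] = x[n] + sum_k lpc[k] * y[n-k-1]."""
    out: List[float] = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - k - 1 >= 0:
                acc += a * out[n - k - 1]
        out.append(acc)
    return out


def search_codebook(
    target: Sequence[float],              # input speech vector s(n)
    adaptive: Sequence[float],            # adaptive codebook component l(n)
    codebook: Sequence[Sequence[float]],  # stochastic vectors u_i(n)
    lpc: Sequence[float],
    g1: float,
    g2: float,
) -> Tuple[int, float]:
    """Return (best index I, error energy) over all stochastic codebook entries."""
    best_index, best_energy = -1, float("inf")
    for i, u in enumerate(codebook):
        # Scaled excitation G1*l(n) + G2*u_i(n), then short-term synthesis.
        excitation = [g1 * l + g2 * ui for l, ui in zip(adaptive, u)]
        recon = synthesize(excitation, lpc)
        # Unweighted error energy (assumption: weighting filter omitted).
        energy = sum((s - r) ** 2 for s, r in zip(target, recon))
        if energy < best_energy:
            best_index, best_energy = i, energy
    return best_index, best_energy
```

A practical encoder would apply the perceptual weighting before computing the energy and would usually optimise the gain per candidate; both are left out here to keep the search loop visible.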
A copy of the scaled excitation G1 l(n) + G2 u_i(n) is stored in the long-term predictor memory 216 for future use.
Alternatively, the codebook search controller 240 may determine a particular codeword that provides an error signal meeting some predetermined criterion, such as a predefined error threshold.
A more detailed description of a typical speech coding unit can be found in A.M. Kondoz, "Digital speech coding for low-bit rate communications systems", John Wiley, 1994.
In the preferred embodiment of the invention, the error mitigation technique is applied to the speech frames after the multiplexer 250. The invention makes use of an alternative (preferably parallel) virtual transmission path 282, which is used to send pointers to previously encoded speech frames that the encoder has sent on the main transmission path 281.
In the context of the present invention, the term "virtual" denotes a transmission path, in addition to the main transmission path supporting the speech communication, that is assumed to run from the encoder to the decoder. The "virtual" transmission path may reside in the same bit stream, in the same time frame or multiframe of a time-division multiplexed scheme, or on a different communication route, for example in a VoIP system. By using an additional virtual transmission path that ideally has different error statistics (for example, a separate FEC scheme), the reference/pointer is less likely to suffer the same errors as the speech frame it refers to.
A notable difference from known encoder configurations is a second minimisation section following the multiplexing operation. This circuit evaluates the buffered speech parameter data and selects the buffered frame that is closest to the current speech frame.
In an enhanced embodiment, the parallel virtual transmission path uses forward error correction (FEC) protection different from that applied by the speech coder on the main transmission path. By using a separate FEC path, the references and the speech packets experience different error statistics. This diversity between the main transmission path and the parallel virtual transmission path helps to improve robustness to errors.
The multiplexer 250 outputs data packets/frames to a buffer 260 that holds previously multiplexed frames. A demultiplexer 270 accesses the buffered frames of the multiplexed signal in the buffer 260. Here, the demultiplexer 270 separates the excitation parameters 274 from the LPC parameters 272. Note that the memory of the long-term predictor used to generate the excitation must be identical to that of the long-term predictor 216 at the start of the frame.
Sets of linear predictive coding (LPC) parameters are thus available for the current frame and for previous frames of multiplexed speech. In the preferred embodiment of the invention, each set of quantised LPC parameters and excitation parameters forms a reconstructed speech vector s'_j(n) for the j-th previous frame of the buffered data. This is compared with the previously buffered input speech vector s(n) by subtracting the two signals in a subtractor 262.
The difference vector e_j(n) represents the difference between the original block of speech and the previously buffered block. The difference vector is perceptually weighted by an LPC weighting filter 264. As noted above, perceptual weighting accentuates the frequencies at which the error is perceptually more important to the human ear and attenuates the other frequencies.
An energy calculator function within a codebook search controller 266 computes the energy of the weighted difference vector e'_j(n). The codebook search controller 266 compares the j-th error signal, for the present excitation vector u_j(n), with previous error signals to determine the excitation vector that produces the minimum error. The codebook search controller 266 then selects the "best index of frame data" that provides the minimum weighted error. The encoder subsequently sends to the decoder a "pointer" to the previous frame determined to give the minimum weighted error between itself and the respective speech frame on the main transmission path.
In essence, the referenced speech frame (ideally different in time or frame number from the currently transmitted frame) is the frame within a particular moving window of speech that most closely resembles, in the perceptually weighted error sense, the frame being encoded. It therefore represents the best match (pointer) to the current frame for use in the error mitigation step, should the frame be received in error. This representation, or pointer, is described in more detail below with reference to Fig. 3.
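The encoder-side selection of a reference over a moving window can be sketched as below. This is an illustration only: the class name `ReferenceSelector` is hypothetical, the buffer is assumed to hold reconstructed speech vectors directly rather than multiplexed parameters, and a plain squared error stands in for the perceptually weighted error of blocks 262 to 266.

```python
from collections import deque
from typing import Deque, List, Optional


class ReferenceSelector:
    """Keeps a moving window of past frames and picks the closest one."""

    def __init__(self, window: int = 128) -> None:
        # Oldest frames drop out automatically once the window is full.
        self.history: Deque[List[float]] = deque(maxlen=window)

    def select(self, current: List[float]) -> Optional[int]:
        """Return how many frames back the best match lies (1 = previous
        frame), or None when no earlier frame is buffered yet."""
        best_offset, best_err = None, float("inf")
        for offset, past in enumerate(reversed(self.history), start=1):
            # Assumption: unweighted squared error replaces the weighted error.
            err = sum((c - p) ** 2 for c, p in zip(current, past))
            if err < best_err:
                best_offset, best_err = offset, err
        self.history.append(list(current))
        return best_offset
```

For example, after buffering four frames, a new frame that resembles the frame sent three frames earlier yields the offset 3, which is the value that would travel on the virtual transmission path.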
Referring now to Fig. 3, a buffering timing diagram 300 illustrates the preferred process of the present invention. The timing diagram shows frame-0 310 being received at the speech decoder and determined to be in error. The decoder then accesses the alternative virtual transmission path to determine the best frame with which to replace frame-0 310. As shown in Fig. 3, the alternative virtual transmission path contains a pointer to frame-4 320 as the preferred substitute for frame-0 310. Replacing frame-0 310 with frame-4 320 has only a minimal effect on speech quality in the decoding process.
The inventors have recognised and exploited the fact that the preceding several frames are (usually) spoken by the same talker, so that these speech frames exhibit similar pitch and formant positions. It is therefore likely that a previous speech frame similar to the current speech frame can be found.
In accordance with the preferred embodiment of the invention, the minimum perceptual error is found by evaluating a weighted segmental signal-to-noise ratio (SEGSNR), or an average weighted SNR, for each buffered frame, given the parameter set held in memory for each frame. Preferably, the segments are defined at the speech codec sub-frame level.
[Equation image A0381827200141 not reproduced]
This determination is made in the encoder. Where small pitch errors exist, significantly different SEGSNR values can be expected, because the source speech and the buffered signal may quickly move out of phase. In an enhanced embodiment of the invention, it is therefore proposed to search around the pitch period of the buffered frame, for example within +/-5%, using sub-sample resolution (typically 1/3 or 1/4 of a sample), and to select the maximum SEGSNR value.
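Since the patent's own expression is not available in this text, the sketch below uses the common textbook definition of segmental SNR (the mean over segments of 10 log10 of signal energy over error energy), which is an assumption and may differ from the weighted form the inventors intended. The alignment search is also simplified: it uses integer circular shifts within +/-5% of the pitch period rather than sub-sample resolution, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def segsnr_db(reference: List[float], candidate: List[float], seg_len: int) -> float:
    """Mean over segments of 10*log10(signal energy / error energy)."""
    ratios = []
    for start in range(0, len(reference) - seg_len + 1, seg_len):
        ref = reference[start:start + seg_len]
        cand = candidate[start:start + seg_len]
        sig = sum(r * r for r in ref)
        err = sum((r - c) ** 2 for r, c in zip(ref, cand))
        # A small floor avoids division by zero and log of zero.
        ratios.append(10.0 * math.log10((sig + 1e-12) / (err + 1e-12)))
    if not ratios:
        raise ValueError("signal shorter than one segment")
    return sum(ratios) / len(ratios)


def best_alignment(
    reference: List[float],
    candidate: List[float],
    pitch_period: int,
    seg_len: int,
    tolerance: float = 0.05,
) -> Tuple[int, float]:
    """Try integer shifts within +/- tolerance of the pitch period and
    return (shift, SEGSNR in dB) for the shift giving the highest SEGSNR."""
    max_shift = max(1, round(pitch_period * tolerance))
    best_shift, best_snr = 0, -math.inf
    for shift in range(-max_shift, max_shift + 1):
        # Circular shift (a simplification): shifted[n] = candidate[n - shift].
        shifted = candidate[-shift:] + candidate[:-shift] if shift else list(candidate)
        snr = segsnr_db(reference, shifted, seg_len)
        if snr > best_snr:
            best_shift, best_snr = shift, snr
    return best_shift, best_snr
```

The second function shows why the search matters: a buffered frame that is identical apart from a two-sample phase offset scores poorly at zero shift but very highly once realigned.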
In a further enhancement of the invention, if the referenced frame was itself received in error, the frame that would have been used to mitigate that bad frame is taken as the best source of speech information for the currently received erroneous frame, as shown in Fig. 4. Fig. 4 thus shows a timing diagram indicating how multiple errors are handled. The data from frame-0 410 is known to be in error. The proposed error mitigation process uses the alternative virtual transmission path, which designates frame-4 420 as a suitable substitute. However, frame-4 420 was also determined to be in error. In that case, a pointer designates the data from frame-6 430 as the frame most similar to the corrupted frame-4 420. Frame-6 430, which replaces frame-4 420, is therefore also used to replace frame-0 410. In this way, multiple frame errors can be handled, and the problem of out-of-memory references can be overcome.
This may produce a chain of references (pointers) that eventually leads out of the memory window. However, this ceases to be a problem if the erroneous frames in the window are updated, which removes the need for multiple references.
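The pointer-following behaviour of Fig. 4 can be sketched as follows. The data layout is an assumption made for illustration: frames are identified by integer indices, `pointers` maps a frame index to the earlier frame its reference designates, and `oldest_buffered` marks the edge of the decoder's memory window. The function name is hypothetical.

```python
from typing import Dict, Optional, Set


def resolve_replacement(
    bad_index: int,
    pointers: Dict[int, int],
    bad_frames: Set[int],
    oldest_buffered: int,
) -> Optional[int]:
    """Follow references from a bad frame until a good buffered frame is
    found. Returns None if the chain is broken or leaves the window."""
    visited: Set[int] = set()
    index = bad_index
    while index in bad_frames:
        if index in visited or index not in pointers:
            return None            # loop or missing reference
        visited.add(index)
        index = pointers[index]
        if index < oldest_buffered:
            return None            # reference leads out of the memory window
    return index
```

With the Fig. 4 scenario expressed as `pointers = {0: -4, -4: -6}` and `bad_frames = {0, -4}`, the chain resolves to frame -6 as long as the window reaches back that far. If the decoder overwrites each bad frame in its buffer with the substitute it used, later lookups stop after one step, which corresponds to the update described above.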
In summary, a reference or pointer is transmitted to the decoder in an alternative bit stream alongside the main bit stream. The reference or pointer indicates the previously transmitted frame that best matches the currently transmitted frame. The reference or pointer is preferably transmitted in a parallel bit stream. If a frame is received in error at the speech decoder, the reference or pointer is used in the frame substitution error mitigation process. Frame error mitigation is thus enhanced by extending the known previous- or subsequent-frame substitution mechanism to any one of a number of frames. The number of frames used in the process is limited by the buffer/memory and/or the processing power required to determine the minimum weighted error frame.
As noted, the buffering/storage of the speech coder's speech parameters is performed over a number of frames. For example, in the case of the GSM Enhanced Full Rate (EFR) codec (approximately 12 kbit/s), three seconds of speech require only about 5 Kbytes of memory. The most demanding task is therefore identifying the closest matching frame from among 150 possible frames. Accordingly, in one embodiment of the invention, the minimum weighted error selection technique described above may be applied to a subset of the parameters, or to parameters derived from the synthesised speech, rather than to all the parameters of the speech coder frame. In other words, the reference may point to the LPC filter parameters (LSFs) and the energy of the synthesised speech frame (speech parameters that are computed in both the encoder and the decoder), rather than to the exact encoder parameters, thereby saving storage and comparison processing.
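The buffer figures quoted above can be checked with a short calculation. The inputs are not taken from the patent text but from the published characteristics of GSM EFR, namely a 12.2 kbit/s speech rate and 20 ms frames; the function name is hypothetical.

```python
from typing import Tuple


def buffer_size(bit_rate: int, frame_ms: int, seconds: int) -> Tuple[int, int, float]:
    """Return (frames buffered, bits per frame, buffer size in bytes)."""
    frames = seconds * 1000 // frame_ms
    bits_per_frame = bit_rate * frame_ms // 1000
    return frames, bits_per_frame, frames * bits_per_frame / 8


# Assumed EFR figures: 12,200 bit/s, 20 ms frames, 3 s window.
frames, bits_per_frame, size_bytes = buffer_size(12_200, 20, 3)
print(frames, bits_per_frame, size_bytes)
```

Three seconds therefore correspond to 150 frames of 244 bits, or 4575 bytes, which is consistent with the figure of roughly 5 Kbytes and the 150 candidate frames mentioned in the text.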
Since a speech frame contains many parameters, the proposed technique can in principle be applied to any number of them. In a CELP coder, examples of these parameters include:
(i) line spectral pairs (LSPs), representing the LPC parameters;
(ii) long-term predictor (LTP) lag for sub-frame 1;
(iii) LTP gain for sub-frame 1;
(iv) codebook index for sub-frame 1;
(v) codebook gain for sub-frame 1;
(vi) LTP lag for sub-frame 2;
(vii) LTP gain for sub-frame 2;
(viii) codebook index for sub-frame 2;
(ix) codebook gain for sub-frame 2;
(x) LTP lag for sub-frame 3;
(xi) LTP gain for sub-frame 3;
(xii) codebook index for sub-frame 3;
(xiii) codebook gain for sub-frame 3;
(xiv) LTP lag for sub-frame 4;
(xv) LTP gain for sub-frame 4;
(xvi) codebook index for sub-frame 4; or
(xvii) codebook gain for sub-frame 4.
Below also within limit of consideration of the present invention, can send pointer with reference to LSP set from previous frame, with the LSP of coupling present frame, rather than the entire parameter collection.In addition, might make pointer be used for each of a plurality of above-mentioned parameters.
In wireless communication system, parallel virtual transmission path preferably includes: transmission block coded reference word in the not protected bit of data useful load (7 bits are enough to support 128 frames buffering herein, are equivalent to about 2.5 seconds).This can encode (having 75 bps equivalent rate) by the BCH block code of 15 bits, and the nearly error correction of 2 bits is provided.
In addition, it is envisaged that the alternative virtual transmission path may provide a combination of error correction and error detection. Error detection is useful because poor reception of the reference can lead to poor error mitigation. If the reference word is poorly received, the mechanism can default to using the previously received frame. A channel rate of 75 bit/s would reduce the gross bit rate of the GSM full-rate channel only from 22.8 kbit/s to 22.725 kbit/s, which would cause a negligible loss of sensitivity.
In a further embodiment, such as a Voice over Internet Protocol (VoIP) communication link, the alternative virtual transmission path may be obtained by sending multiple packet streams. It is desirable, however, that the total traffic does not increase substantially in this case, since this may increase the frame replacement rate.
The preferred mechanism is to send references to earlier frames, as described above, only when a transition occurs and the speech is non-stationary. When the speech is stationary, and the prior art techniques work comparatively well, no reference is sent. In this way the packet network is not excessively loaded, yet most of the performance gain is obtained. The degree to which the speech signal is judged stationary can be expressed as a variable, and this variable can be adjusted to improve reproduction quality under packet loss conditions.
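One way to picture this gating is a simple threshold on how much the frame parameters change from one frame to the next. The measure below (mean absolute change between consecutive parameter vectors), the default threshold, and the function names are hypothetical; the source only states that the degree of stationarity can be an adjustable variable.

```python
def parameter_change(prev, curr):
    """Mean absolute change between two equal-length parameter vectors."""
    return sum(abs(c - p) for p, c in zip(prev, curr)) / len(curr)


def should_send_reference(prev, curr, threshold=0.1):
    """Send a reference only when the speech looks non-stationary.

    threshold is the adjustable variable: lowering it sends references
    more often, which costs bandwidth but may help under heavier loss.
    """
    return parameter_change(prev, curr) > threshold
```

With these definitions, `should_send_reference([0.2, 0.4], [0.2, 0.41])` is false (near-stationary speech), while `should_send_reference([0.2, 0.4], [0.6, 0.9])` is true (a transition).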
The decoder function is essentially the reverse of the encoder function (without the additional circuitry that follows the multiplexer), and is therefore not described further here. A description of the function of a typical speech decoding unit can be found in: A.M. Kondoz, "Digital speech coding for low-bit rate communications systems", John Wiley, 1994. The decoder follows the standard decoding process until it determines that a bad frame has been received. When a bad frame is detected, the decoder evaluates the alternative virtual transmission path to determine the alternative frame indicated by the respective reference/pointer. The decoder then retrieves the "similar" frame, as indicated in the reference/pointer transmission. The previously indicated frame is then used in place of the received frame to synthesize the speech.
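The decoder-side substitution can be sketched as below. The data layout (parallel lists of frames and bad-frame flags, and a dictionary of references) and the function name are assumptions for illustration. The sketch also follows a reference onward when the referenced frame was itself received in error, and falls back to the most recent good frame when no usable reference exists.

```python
def select_frame(index, frames, bad, references):
    """Choose the frame to decode at position index.

    frames     -- list of received frames
    bad        -- list of bools, True where a frame was received in error
    references -- dict mapping a frame index to the index of its
                  alternative (earlier) frame; a missing entry or None
                  means no usable reference was received
    """
    if not bad[index]:
        return frames[index]          # normal decoding path

    # Follow the reference; if the alternative is itself bad, follow its
    # own reference in turn (guarding against loops).
    target = references.get(index)
    seen = set()
    while target is not None and bad[target] and target not in seen:
        seen.add(target)
        target = references.get(target)
    if target is not None and not bad[target]:
        return frames[target]

    # No usable reference: default to the most recent good frame.
    for i in range(index - 1, -1, -1):
        if not bad[i]:
            return frames[i]
    return None
```

For instance, if frame 4 is bad and references frame 2, which is also bad and references frame 0, the function returns frame 0; with no references at all it returns the last good frame before frame 4.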
Advantageously, the inventive concepts described herein can be retrofitted to existing codecs by stealing bits from the FEC scheme that has already been designed.
It will be appreciated that the bad frame error mitigation mechanism described above provides at least the following advantages:
(i) It provides a more accurate replacement frame mechanism, thereby reducing the risk of undesirable audible artefacts in the recovered speech frames.
(ii) The alternative virtual transmission path can be retrofitted to existing codecs, for example by stealing bits from the FEC scheme that has already been designed.
(iii) By sending a reference to a previous frame only when a transition occurs and the speech is non-stationary, any additional data that the invention requires beyond existing bad frame error mitigation techniques is minimized.
(iv) By cross-referencing the data received for a particular frame with the frame referenced by this mechanism, erroneously received parameters can be detected.
Although the preferred embodiment discusses the application of the invention to a CELP coder, the inventors envisage that the inventive concepts described herein may be used in other speech processing units of wireless communication units, such as Universal Mobile Telecommunications System (UMTS) units, Global System for Mobile communications (GSM) units, TErrestrial Trunked RAdio (TETRA) communication units, Digital Interchange of Information and Signalling standard (DIIS) units, or Voice over Internet Protocol (VoIP) units.
Apparatus of the invention
A speech communication unit comprises a speech coder capable of representing an input speech signal. The speech coder comprises a transmission path for transmitting a plurality of speech frames to a speech decoder. The speech coder further comprises a virtual transmission path for transmitting one or more references to the plurality of speech frames transmitted on the transmission path. The one or more references relate to an alternative speech frame within the plurality of speech frames transmitted on the transmission path, to be used as a replacement frame when a bad frame occurs.
A speech communication unit, for example the above speech communication unit having a speech coder, comprises a speech decoder adapted to receive a plurality of speech frames on a transmission path and to receive one or more references to alternative speech frames on a virtual transmission path. The one or more references relate to an alternative speech frame within the plurality of speech frames received on the transmission path, to be used as a replacement frame when a bad frame occurs.
Method of the invention
A method of mitigating bad frame errors in a speech communication unit comprises the step of transmitting, by a speech coder in the speech communication unit, a plurality of speech frames to a speech decoder on a transmission path. The speech coder transmits, on a virtual transmission path, one or more references to the plurality of speech frames transmitted on the transmission path, wherein the one or more references relate to an alternative speech frame within the plurality of speech frames transmitted on the transmission path, to be used as a replacement frame when a bad frame occurs.
In this way, when a speech frame is received in error, an improved replacement frame can be selected from the plurality of speech frames.
Thus, a bad frame error mitigation technique, and an associated speech communication unit and circuit, have been described that substantially alleviate at least some of the aforementioned disadvantages of known error mitigation techniques.

Claims (15)

1. A speech communication unit (100) comprising a speech coder (134) capable of representing an input speech signal, the speech coder (134) comprising a transmission path (281) for transmitting a plurality of speech frames to a speech decoder, the speech coder (134) being characterised by a virtual transmission path (282) for transmitting one or more references to the plurality of speech frames transmitted on the transmission path (281), wherein the one or more references relate to an alternative speech frame within the plurality of speech frames transmitted on the transmission path (281), to be used as a replacement frame when a frame is received in error.
2. The speech communication unit (100) according to claim 1, wherein the speech coder (134) is further characterised by:
a multiplexer (250) for multiplexing the plurality of speech frames;
a buffer (260), operably coupled to the multiplexer (250), for storing the multiplexed speech data; and
a processor (130, 270), operably coupled to the buffer (260), for characterising a current speech frame in the buffer (260) and selecting an alternative speech frame that exhibits characteristics similar to the current speech frame, wherein a reference to the alternative speech frame is transmitted to the decoder on the virtual transmission path (282).
3. The speech communication unit (100) according to claim 2, wherein the processor comprises a demultiplexer function (270) for accessing one or more speech frames in the buffer (260), the processor further separating excitation parameters (274) from LPC parameters (272) of a buffered speech frame in order to select a speech frame exhibiting similar characteristics.
4. The speech communication unit (100) according to any preceding claim, wherein the virtual transmission path (282) is included in the same bit stream as the transmission path (281).
5. The speech communication unit (100) according to any preceding claim, wherein the transmission path (281) uses a first forward error correction protection scheme, and the virtual transmission path (282) uses a second forward error correction protection scheme different from that used on the transmission path (281).
6. The speech communication unit (100) according to any of preceding claims 2 to 5, wherein the processor (130, 266, 270) selects the alternative replacement frame so as to provide a minimum weighted error.
7. The speech communication unit (100) according to claim 6, wherein the processor (130, 266, 270) determines the minimum weighted error by evaluating a weighted segmental signal-to-noise ratio (SEGSNR) or an average weighted SNR for each buffered frame.
8. The speech communication unit (100) according to claim 6 or claim 7, wherein the processor (130, 266, 270) determines the minimum weighted error over a subset of the speech coding parameters.
9. The speech communication unit (100) according to claim 6, claim 7 or claim 8, wherein the processor (130, 266) searches substantially in the vicinity of the pitch period of the buffered speech frames and selects the frame exhibiting the highest SEGSNR value.
10. The speech communication unit (100) according to any preceding claim, wherein the alternative speech frame (320) is used as a reference for the current speech frame only when a transition occurs and the speech is non-stationary.
11. The speech communication unit (100) according to any preceding claim, characterised by a speech decoder (132) adapted to receive a plurality of speech frames on the transmission path (281) and to receive one or more references to alternative speech frames (320) on the virtual transmission path (282), wherein the one or more references relate to an alternative speech frame (320) within the plurality of speech frames received on the transmission path (281), to be used as a replacement frame when a frame is received in error.
12. The speech communication unit (100) according to claim 11, wherein, if the alternative speech frame (420) is itself a frame received in error, an alternative frame (430) of the erroneously received alternative frame (420) is selected to replace both the current erroneously received speech frame (410) and the erroneously received alternative speech frame (420).
13. A method of mitigating bad frame errors in a speech communication unit (100), the method comprising the step of:
transmitting, by a speech coder (134) in the speech communication unit (100), a plurality of speech frames to a speech decoder on a transmission path (281);
the method being characterised by the step of:
transmitting, on a virtual transmission path (282), one or more references to the plurality of speech frames transmitted on the transmission path (281), wherein the one or more references relate to an alternative speech frame within the plurality of speech frames transmitted on the transmission path (281), to be used as a replacement frame when a frame is received in error.
14. A speech communication unit (100) adapted to perform the method steps according to claim 13.
15. A wireless communication system adapted to support use of the transmission path (281) and the virtual transmission path (282) according to any preceding claim.
CNB038182726A 2002-07-31 2003-05-12 Speech communication unit and method for error mitigation of speech frames Expired - Lifetime CN100349395C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0217729.3 2002-07-31
GB0217729A GB2391440B (en) 2002-07-31 2002-07-31 Speech communication unit and method for error mitigation of speech frames

Publications (2)

Publication Number Publication Date
CN1672193A true CN1672193A (en) 2005-09-21
CN100349395C CN100349395C (en) 2007-11-14

Family

ID=9941443

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB038182726A Expired - Lifetime CN100349395C (en) 2002-07-31 2003-05-12 Speech communication unit and method for error mitigation of speech frames

Country Status (7)

Country Link
EP (1) EP1527440A1 (en)
JP (1) JP2005534984A (en)
KR (1) KR20050027272A (en)
CN (1) CN100349395C (en)
AU (1) AU2003240644A1 (en)
GB (1) GB2391440B (en)
WO (1) WO2004015690A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105374362A (en) * 2010-01-08 2016-03-02 日本电信电话株式会社 Encoding method, decoding method, encoder apparatus, decoder apparatus and program

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102007018484B4 (en) 2007-03-20 2009-06-25 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for transmitting a sequence of data packets and decoder and apparatus for decoding a sequence of data packets
US20150326884A1 (en) * 2014-05-12 2015-11-12 Silicon Image, Inc. Error Detection and Mitigation in Video Channels

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI98164C (en) * 1994-01-24 1997-04-25 Nokia Mobile Phones Ltd Processing of speech coder parameters in a telecommunication system receiver
FI950917A (en) * 1995-02-28 1996-08-29 Nokia Telecommunications Oy Processing of speech coding parameters in a telecommunication system
US5917835A (en) * 1996-04-12 1999-06-29 Progressive Networks, Inc. Error mitigation and correction in the delivery of on demand audio
US6636829B1 (en) * 1999-09-22 2003-10-21 Mindspeed Technologies, Inc. Speech communication system and method for handling lost frames

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105374362A (en) * 2010-01-08 2016-03-02 日本电信电话株式会社 Encoding method, decoding method, encoder apparatus, decoder apparatus and program
CN105374362B (en) * 2010-01-08 2019-05-10 日本电信电话株式会社 Coding method, coding/decoding method, code device, decoding apparatus and recording medium

Also Published As

Publication number Publication date
KR20050027272A (en) 2005-03-18
GB2391440B (en) 2005-02-16
JP2005534984A (en) 2005-11-17
GB0217729D0 (en) 2002-09-11
CN100349395C (en) 2007-11-14
EP1527440A1 (en) 2005-05-04
AU2003240644A1 (en) 2004-02-25
GB2391440A (en) 2004-02-04
WO2004015690A1 (en) 2004-02-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: MOTOROLA MOBILE CO., LTD.

Free format text: FORMER OWNER: MOTOROLA INC.

Effective date: 20110107

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20110107

Address after: Illinois State

Patentee after: MOTOROLA MOBILITY, Inc.

Address before: Illinois, USA

Patentee before: Motorola, Inc.

C41 Transfer of patent application or patent right or utility model
C56 Change in the name or address of the patentee
CP01 Change in the name or title of a patent holder

Address after: Illinois State

Patentee after: MOTOROLA MOBILITY LLC

Address before: Illinois State

Patentee before: MOTOROLA MOBILITY, Inc.

TR01 Transfer of patent right

Effective date of registration: 20160310

Address after: California, USA

Patentee after: Google Technology Holdings LLC

Address before: Illinois State

Patentee before: MOTOROLA MOBILITY LLC

CX01 Expiry of patent term

Granted publication date: 20071114