CN101937679B - Error concealment method for audio data frame, and audio decoding device - Google Patents

Error concealment method for audio data frame, and audio decoding device

Info

Publication number
CN101937679B
Authority
CN
China
Prior art keywords
audio data
data frame
mdct coefficient
frame
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2010102190873A
Other languages
Chinese (zh)
Other versions
CN101937679A (en)
Inventor
徐晶明
林福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CN2010102190873A
Publication of CN101937679A
Application granted
Publication of CN101937679B

Abstract

The invention relates to the field of communications and discloses an error concealment method for audio data frames and an audio decoding device. The method comprises: determining the window sequence type of the erroneous frame; and reconstructing the MDCT coefficients of the erroneous frame, wherein the MDCT coefficients of the erroneous frame are divided into a tonal class and a noise class, the tonal-class coefficients are estimated by interpolating the corresponding MDCT coefficients of the previous frame and the next frame, and the noise-class coefficients are generated as shaped noise. Compared with the shaped-noise interpolation algorithm specified by 3GPP, the method recovers the tonal information in the erroneous frame, to which human hearing is sensitive, more accurately at the cost of a small increase in implementation complexity, and thus achieves a better balance between concealment quality and implementation complexity.

Description

Error concealment method for audio data frames, and audio decoding device
Technical field
The present invention relates to the field of communications, and in particular to audio decoding technology.
Background
China Mobile Multimedia Broadcasting (CMMB) is a mobile television and multimedia standard based on the Satellite-Terrestrial Interactive Multi-service Infrastructure (STiMi). Over the past two years, CMMB network deployment and terminal adoption have grown rapidly, and CMMB is increasingly becoming the most widely adopted technical and industry standard for mobile multimedia broadcasting in China.
The CMMB standard adopts two digital audio coding standards: MPEG (Moving Picture Experts Group)-4 HE-AAC (High-Efficiency Advanced Audio Coding) and DRA (the multichannel digital audio coding and decoding technology standard). Both use the modified discrete cosine transform (MDCT) for time-frequency transformation.
In CMMB communication, bit errors are inevitable because of wireless channel instability, environmental interference, and signal attenuation. At the decoder, such errors usually appear as one frame, or even several consecutive frames, of corrupted audio data that cannot be decoded normally. A digital audio error concealment algorithm is then needed to repair the erroneous audio data frames so that they are psychoacoustically as close as possible to the real data, thereby preserving the listening quality delivered to the user.
As MPEG-4 AAC has gradually replaced MP3 as the most popular audio coding standard, the industry has proposed many error concealment algorithms that operate in the MDCT domain. The most widely used, and the closest to the present invention, is the shaped-noise interpolation algorithm proposed by 3GPP.
Briefly, error concealment of an audio data frame is performed before the final inverse MDCT step of Advanced Audio Coding (AAC) decoding, and introduces a delay of one frame. Suppose frames (n-2) and (n) decode correctly, while frame (n-1) is erroneous and must be concealed. The algorithm first computes the MDCT coefficient energy of each scale factor band of frames (n-2) and (n). It then derives a shaped-noise interpolation factor for each band from the difference between the corresponding band energies. Finally, it multiplies the MDCT coefficients of frame (n-2) by the interpolation factor of the corresponding band and randomly changes their signs. The result is the shaped noise, that is, the recovered MDCT coefficients of frame (n-1); applying the inverse MDCT then yields the concealed audio data of frame (n-1). For the detailed scheme, see 3GPP TS 26.402 V9.0.0, "Enhanced aacPlus general audio codec - Additional decoder tools," 2009. The computation of the band interpolation factor, and the mapping of band energies between long and short blocks when the block type switches, are described in 3GPP TS 26.411 V9.0.0, "Enhanced aacPlus general audio codec - Fixed-point ANSI-C code," 2009.
However, although the shaped-noise interpolation algorithm proposed by 3GPP has low implementation complexity, it cannot adequately recover the tonal information in the erroneous frame, to which human hearing is sensitive. The inventors found that the digital audio error concealment algorithms proposed so far do not balance concealment quality against implementation complexity well. Simple muting or repetition of the previous frame adds little complexity, but the listening quality after concealment is poor. Methods based on signal models or time-domain prediction usually give better concealment quality, but their implementation complexity (computation, storage, and delay) is high, which makes real-time operation on a mobile terminal processor difficult.
Summary of the invention
The object of the present invention is to provide an error concealment method for audio data frames and an audio decoding device that improve concealment quality and achieve a better balance between concealment quality and implementation complexity.
To solve the above technical problem, an embodiment of the present invention provides an error concealment method for audio data frames, comprising the following steps:
Determining the window type of the erroneous audio data frame from the previous audio data frame and the next audio data frame, where the window type includes the window sequence type;
If the previous or next audio data frame has a window sequence different from that of the erroneous audio data frame, mapping the modified discrete cosine transform (MDCT) coefficients of that frame onto a window sequence consistent with the erroneous audio data frame;
Dividing the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class. MDCT coefficients of the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous and next audio data frames; MDCT coefficients of the noise class are generated as shaped noise.
An embodiment of the present invention also provides an audio decoding device, comprising:
A window type determination module, configured to determine the window type of the erroneous audio data frame from the previous audio data frame and the next audio data frame, where the window type includes the window sequence type;
A judgment module, configured to judge whether the previous or next audio data frame has a window sequence different from that of the erroneous audio data frame;
A mapping module, configured to map, when the judgment module finds a frame whose window sequence differs from that of the erroneous audio data frame, the MDCT coefficients of that frame onto a window sequence consistent with the erroneous audio data frame;
A classification module, configured to divide the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class;
An MDCT coefficient acquisition module, configured to obtain, according to the result of the classification module, the tonal-class MDCT coefficients by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and to generate the noise-class MDCT coefficients as shaped noise.
Compared with the prior art, the main differences and effects of the embodiments of the present invention are as follows:
First, the MDCT window type of the erroneous frame, including its window sequence type, is determined from the previous frame and the next frame. If the window sequence of the previous or next frame differs from that of the erroneous frame, the MDCT coefficients of that frame are mapped onto a window sequence consistent with the erroneous frame. Then the MDCT coefficients of the erroneous frame are reconstructed: they are divided into a tonal class and a noise class, the tonal-class coefficients are estimated by interpolating the corresponding MDCT coefficients of the previous and next frames, and the noise-class coefficients are generated as shaped noise. Because the tonal-class coefficients are estimated by interpolation from the neighboring frames, the tonal information in the erroneous frame, to which human hearing is sensitive, is recovered better at the cost of a small increase in implementation complexity. This further improves concealment quality and gives a better balance between concealment quality and implementation complexity. Moreover, because error concealment is realized in two steps, determining the MDCT window type of the erroneous frame and reconstructing its MDCT coefficients, the scheme applies not only to MPEG-4 AAC audio coding but also to other audio codecs based on the MDCT time-frequency transform, such as DRA. It therefore solves the error concealment problem for both audio coding standards used in CMMB.
Further, the type of a scale factor band is determined by whether the number of MDCT coefficients in the band whose energy exceeds a predetermined energy threshold is greater than a preset count threshold; the type of every MDCT coefficient in the band then follows from the type of the band. Because the energy of each MDCT coefficient of the erroneous frame can be estimated from the corresponding MDCT coefficients of the previous and next frames, peak detection on the estimated energy spectrum ensures accurate classification into the tonal and noise classes.
Further, correcting the type of each scale factor band by psychoacoustic analysis can further improve the accuracy of the tonal/noise classification.
Further, the MDCT coefficients of the erroneous frame are obtained by averaging the corresponding MDCT coefficients of the previous and next frames. Because the interpolation is a simple mean of the corresponding coefficients of the two neighboring frames, it is easy to implement and still ensures concealment quality.
Further, after the averaging interpolation, an energy smoothing correction is applied to the interpolated values, and the corrected values are used as the MDCT coefficients of the erroneous frame, which can further improve concealment quality.
Description of drawings
Fig. 1 is a flow chart of the error concealment method for audio data frames according to the first embodiment of the invention;
Fig. 2 is a schematic structural diagram of the audio decoding device according to the fourth embodiment of the invention.
Detailed description
In the following description, many technical details are given to help the reader understand the application. However, a person of ordinary skill in the art will appreciate that the technical solutions claimed in this application can be realized even without these details, and with various changes and modifications based on the following embodiments.
To make the objects, technical solutions, and advantages of the invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
The first embodiment of the invention relates to an error concealment method for audio data frames. The audio data frames in this embodiment are coded with Advanced Audio Coding (AAC), and the method is an error concealment algorithm in the MDCT domain: for an erroneous audio data frame, the MDCT coefficients are reconstructed before the final inverse MDCT step of AAC decoding. The reconstruction is based on the MDCT coefficients of the previous and next frames before the inverse MDCT (both frames are assumed to decode correctly), so a delay of one frame is introduced.
The specific flow is shown in Fig. 1. In step 110, the window type of the erroneous audio data frame is determined from the previous and next audio data frames; the window type includes the window sequence type and the window shape type.
Specifically, because AAC uses MDCT window switching to adapt to steady-state and transient signals in the audio, the MDCT window type of the erroneous frame must be determined before its MDCT coefficients are reconstructed. For convenience, suppose frame (n-1) is the erroneous audio data frame; that is, frames (n-2) and (n) decode correctly, while frame (n-1) is erroneous and must be concealed. AAC defines four MDCT window sequences, namely the long window sequences ONLY_LONG_SEQUENCE, LONG_START_SEQUENCE, and LONG_STOP_SEQUENCE and the short window sequence EIGHT_SHORT_SEQUENCE, and two MDCT window shapes, namely the KBD window and the sine window. In this embodiment, the window shape of frame (n-1) is the same as that of the previous audio data frame. The MDCT window sequence of frame (n-1) is determined from the corresponding information of frames (n-2) and (n), so as to achieve a smooth window sequence transition and satisfy the requirements of MDCT reconstruction, as shown in Table 1:
Table 1 (image not reproduced): MDCT window sequence of frame (n-1) as a function of the window sequences of frames (n-2) and (n)
In Table 1, if the window sequences of frames (n-2) and (n) are each any of the long window sequences, the window sequence of frame (n-1) is ONLY_LONG_SEQUENCE. If the window sequence of frame (n-2) is any of the long window sequences and that of frame (n) is the short window sequence, the window sequence of frame (n-1) is LONG_START_SEQUENCE. If the window sequences of frames (n-2) and (n) are both the short window sequence, the window sequence of frame (n-1) is the short window sequence. If the window sequence of frame (n-2) is the short window sequence and that of frame (n) is any of the long window sequences, the window sequence of frame (n-1) is LONG_STOP_SEQUENCE. Note that the rule in Table 1 for determining the MDCT window sequence of frame (n-1) from frames (n-2) and (n) is the specific implementation of this embodiment; in practice, a person skilled in the art may change it as required.
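The rule of Table 1 can be sketched as a small lookup function. This is a minimal illustration rather than decoder code: the function names and the string representation of the window sequences are assumptions made for the example; only the four sequence names and the decision rule come from the description.

```python
# Window sequence names as defined by AAC and used in the description.
ONLY_LONG = "ONLY_LONG_SEQUENCE"
LONG_START = "LONG_START_SEQUENCE"
LONG_STOP = "LONG_STOP_SEQUENCE"
EIGHT_SHORT = "EIGHT_SHORT_SEQUENCE"

LONG_SEQUENCES = {ONLY_LONG, LONG_START, LONG_STOP}
ALL_SEQUENCES = LONG_SEQUENCES | {EIGHT_SHORT}


def conceal_window_sequence(prev_seq, next_seq):
    """Hypothetical helper: window sequence of frame (n-1) from frames (n-2) and (n)."""
    for seq in (prev_seq, next_seq):
        if seq not in ALL_SEQUENCES:
            raise ValueError(f"unknown window sequence: {seq}")
    prev_long = prev_seq in LONG_SEQUENCES
    next_long = next_seq in LONG_SEQUENCES
    if prev_long and next_long:
        return ONLY_LONG      # long -> long
    if prev_long:
        return LONG_START     # long -> short
    if next_long:
        return LONG_STOP      # short -> long
    return EIGHT_SHORT        # short -> short


def conceal_window_shape(prev_shape):
    """The window shape of frame (n-1) is copied from the previous frame."""
    return prev_shape
```

For example, `conceal_window_sequence(LONG_STOP, EIGHT_SHORT)` selects `LONG_START_SEQUENCE`, which lets the concealed frame lead smoothly into the short windows of frame (n).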
Then, in step 120, it is determined whether frame (n-2) or frame (n) has a window sequence different from that of frame (n-1). If so, the flow proceeds to step 130, in which the MDCT coefficients of the frame with the different window sequence are mapped onto a window sequence consistent with frame (n-1). If the window sequences of frames (n-2) and (n) are both the same as that of frame (n-1), the flow proceeds directly to step 140.
In other words, if the window sequence of frame (n-2) or frame (n) differs from that of frame (n-1), the MDCT coefficients of that frame must be mapped onto a window sequence consistent with frame (n-1).
In step 140, the MDCT coefficients of frame (n-1) are classified into a tonal class and a noise class. The classification proceeds as follows:
Denote the k-th MDCT coefficient of frames (n-2), (n-1), and (n) by $C_{n-2}(k)$, $C_{n-1}(k)$, and $C_n(k)$ respectively. $C_{n-1}(k)$ is unknown; its energy $P_{n-1}(k)$ is estimated from the energies of the corresponding MDCT coefficients of frames (n-2) and (n), namely $P_{n-1}(k) = C_{n-2}^2(k) + C_n^2(k)$. Peak detection is then performed on the estimated energy spectrum: local maxima that exceed a certain threshold are defined as "tones". A scale factor band that contains one or more tones is assigned to the tonal class; otherwise it is assigned to the noise class. In other words, if the number of MDCT coefficients in a scale factor band whose estimated energy exceeds the predetermined energy threshold is greater than the preset count threshold, the band is judged to be of the tonal class; if that number is less than or equal to the preset count threshold, the band is judged to be of the noise class. All MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
Because the energy of each MDCT coefficient of the erroneous frame can be estimated from the corresponding MDCT coefficients of the previous and next frames, peak detection on the estimated energy spectrum ensures accurate classification into the tonal and noise classes. In addition, when the MDCT coefficients of the erroneous frame are reconstructed, the grouping of MDCT coefficients need not be limited to scale factor bands; any set of coefficients corresponding to a frequency band that is meaningful to human hearing may be used.
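The classification of step 140 might be sketched as follows. The description does not give threshold values, the exact definition of a local maximum, or a data layout, so `peak_threshold`, `count_threshold`, the neighbor comparison, and the `band_offsets` list (band i covers indices `band_offsets[i]` to `band_offsets[i+1] - 1`) are all assumptions made for illustration.

```python
def estimate_energy(c_prev, c_next):
    """Estimated energy P_{n-1}(k) = C_{n-2}(k)^2 + C_n(k)^2 for every k."""
    if len(c_prev) != len(c_next):
        raise ValueError("frames must have the same number of MDCT coefficients")
    return [a * a + b * b for a, b in zip(c_prev, c_next)]


def classify_bands(c_prev, c_next, band_offsets, peak_threshold, count_threshold=0):
    """Label each scale factor band "tonal" or "noise" (hypothetical helper).

    A coefficient counts as a tone when its estimated energy exceeds
    peak_threshold and is not smaller than either neighbor (assumed
    definition of a local maximum). A band is tonal when it holds more
    than count_threshold tones; the default 0 means "one or more".
    """
    energy = estimate_energy(c_prev, c_next)
    labels = []
    for start, end in zip(band_offsets[:-1], band_offsets[1:]):
        tones = 0
        for k in range(start, end):
            left = energy[k - 1] if k > 0 else 0.0
            right = energy[k + 1] if k + 1 < len(energy) else 0.0
            if energy[k] > peak_threshold and energy[k] >= left and energy[k] >= right:
                tones += 1
        labels.append("tonal" if tones > count_threshold else "noise")
    return labels
```

Every MDCT coefficient then inherits the label of the band that contains it.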
Then, in step 150, the MDCT coefficients of the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and the MDCT coefficients of the noise class are generated as shaped noise.
Specifically, for a scale factor band of the tonal class, the MDCT coefficients in the band are obtained by averaging the corresponding MDCT coefficients of the previous and next frames:
$C_{n-1}(k) = \frac{1}{2}\left[C_{n-2}(k) + C_n(k)\right]$,
and the resulting $C_{n-1}(k)$ is used as the k-th MDCT coefficient of the reconstructed frame (n-1). Obtaining the MDCT coefficients of the erroneous frame by averaging the corresponding coefficients of the previous and next frames is simple to implement and still ensures concealment quality.
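The averaging interpolation is a one-line computation per coefficient. The sketch below assumes the coefficients of a tonal band are held in plain Python lists; the function name is hypothetical.

```python
def interpolate_tonal(c_prev_band, c_next_band):
    """C_{n-1}(k) = 0.5 * (C_{n-2}(k) + C_n(k)) for each k in a tonal band."""
    if len(c_prev_band) != len(c_next_band):
        raise ValueError("bands must have the same number of MDCT coefficients")
    return [0.5 * (a + b) for a, b in zip(c_prev_band, c_next_band)]
```

For instance, `interpolate_tonal([2.0, -4.0], [4.0, 0.0])` gives `[3.0, -2.0]`. Note that the signs of the neighboring coefficients are kept, so coefficients of opposite sign partially cancel.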
For a scale factor band of the noise class, the MDCT coefficients in the band are generated as shaped noise. Shaped-noise generation is common knowledge in the art (for example, the shaped-noise generation algorithm that ensures smooth energy evolution in 3GPP TS 26.402 V9.0.0, "Enhanced aacPlus general audio codec - Additional decoder tools," 2009) and is not described further here. The shaped-noise implementation for noise-class MDCT coefficients is also not limited to that algorithm; other shaped-noise generation algorithms may be used.
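As a rough illustration of shaped noise for one noise-class band, the sketch below scales the coefficients of frame (n-2) by a per-band factor and flips their signs at random, following the outline of the 3GPP approach given in the background. The factor used here, chosen so that the band energy equals the mean of the neighboring band energies, is an assumption made for this example; it is not the interpolation factor defined in 3GPP TS 26.411. The function name and the zero-energy fallback are likewise hypothetical.

```python
import math
import random


def shaped_noise_band(c_prev_band, c_next_band, rng=None):
    """Generate noise-class coefficients for one band of frame (n-1)."""
    rng = rng or random.Random()
    e_prev = sum(x * x for x in c_prev_band)  # band energy of frame (n-2)
    e_next = sum(x * x for x in c_next_band)  # band energy of frame (n)
    if e_prev == 0.0:
        # Simplification: nothing to shape when frame (n-2) is silent here.
        return [0.0] * len(c_prev_band)
    # Assumed factor: target band energy is the mean of the two neighbors.
    factor = math.sqrt(0.5 * (e_prev + e_next) / e_prev)
    # Scale the previous frame's coefficients and randomize their signs.
    return [factor * x * rng.choice((-1.0, 1.0)) for x in c_prev_band]
```

Passing a seeded `random.Random` instance makes the sign pattern reproducible, which is convenient for testing.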
After the MDCT coefficients of frame (n-1) have been obtained, the inverse MDCT yields the concealed audio data of frame (n-1).
It can be seen that in this embodiment the MDCT coefficients of the erroneous frame are classified into a tonal class and a noise class, and the tonal-class coefficients are estimated by interpolating the MDCT coefficients of the previous and next frames. The tonal information in the erroneous frame, to which human hearing is sensitive, is therefore recovered better at the cost of a small increase in implementation complexity, which further improves concealment quality and gives a better balance between concealment quality and implementation complexity.
The second embodiment of the invention relates to an error concealment method for audio data frames. It improves on the first embodiment, mainly as follows: in step 140, after the type of each scale factor band has been determined by peak detection on the estimated energy spectrum, the type may additionally be corrected by psychoacoustic analysis to further improve the accuracy of the tonal/noise classification. That is, the classification of the erroneous frame's MDCT coefficients is not limited to peak detection on the energy spectrum estimated from the previous and next frames; it may also include a correction based on psychoacoustic analysis of the MDCT spectrum or its energy spectrum.
In addition, for MDCT coefficients of the tonal class, after the mean of the corresponding MDCT coefficients of the previous and next frames has been taken, the interpolated values are corrected, and the corrected values are used as the MDCT coefficients of the erroneous audio data frame. That is, after $C_{n-1}(k)$ has been obtained from $C_{n-1}(k) = \frac{1}{2}\left[C_{n-2}(k) + C_n(k)\right]$, it is corrected by the following formula:
$\bar{C}_{n-1}(k) = a \cdot C_{n-1}(k)$,
where $a$ is a coefficient that satisfies the smooth energy evolution requirement for the scale factor band. That is, the sum of the MDCT coefficient energies over the scale factor band, $\bar{P}_{n-1}(a)$, must satisfy $\bar{P}_{n-1}(a) = \frac{1}{2}\left[P_{n-2} + P_n\right]$, where $P_{n-2}$ and $P_n$ are the corresponding band energy sums of frames (n-2) and (n). Since $\bar{P}_{n-1}(a)$ is the sum of the energies of the coefficients $\bar{C}_{n-1}(k)$ over the band, the coefficient $a$ that satisfies the requirement can be calculated from this condition.
The corrected $\bar{C}_{n-1}(k)$ is used as the k-th MDCT coefficient of the reconstructed frame (n-1). Correcting the interpolated values after the averaging interpolation, and using the corrected values as the MDCT coefficients of the erroneous frame, can further improve concealment quality.
In addition, the energy smoothing correction in the interpolation estimate of tonal-class MDCT coefficients need not be based on the energy of all coefficients in the scale factor band or set; it may instead consider only the energy of the MDCT coefficients that have tonal character, in which case $\bar{P}_{n-1}(a)$ is the energy sum of the MDCT coefficients that have tonal character.
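Since $\bar{P}_{n-1}(a) = a^2 \sum_k C_{n-1}^2(k)$, the smoothing condition gives $a = \sqrt{\frac{1}{2}(P_{n-2} + P_n) / \sum_k C_{n-1}^2(k)}$. The sketch below applies this to one band, using the energy of all coefficients in the band. Taking the positive root (so that the signs of the interpolated coefficients are kept), leaving an all-zero band unchanged, and the function name are assumptions made for the example.

```python
import math


def energy_smooth_correction(c_interp_band, c_prev_band, c_next_band):
    """Scale interpolated tonal coefficients so that the band energy equals
    0.5 * (P_{n-2} + P_n)."""
    p_prev = sum(x * x for x in c_prev_band)      # P_{n-2}
    p_next = sum(x * x for x in c_next_band)      # P_n
    p_interp = sum(x * x for x in c_interp_band)  # energy before correction
    if p_interp == 0.0:
        # Assumed fallback: scaling cannot restore energy to an all-zero band.
        return list(c_interp_band)
    a = math.sqrt(0.5 * (p_prev + p_next) / p_interp)
    return [a * x for x in c_interp_band]
```

With previous band `[2.0, 0.0]` and next band `[0.0, 2.0]`, plain averaging gives `[1.0, 1.0]` with energy 2, whereas the target energy is 4; the correction scales both coefficients by the square root of 2.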
The third embodiment of the invention relates to an error concealment method for audio data frames. It is essentially the same as the first embodiment; the main difference is that in the first embodiment the audio data frames are coded with Advanced Audio Coding (AAC), whereas in the third embodiment they are coded with the multichannel digital audio coding and decoding technology standard DRA.
DRA defines nine long MDCT windows: WIN_LONG_LONG2LONG, WIN_LONG_LONG2SHORT, WIN_LONG_SHORT2LONG, WIN_LONG_SHORT2SHORT, WIN_LONG_LONG2BRIEF, WIN_LONG_BRIEF2LONG, WIN_LONG_BRIEF2BRIEF, WIN_LONG_SHORT2BRIEF, and WIN_LONG_BRIEF2SHORT; and four short MDCT windows: WIN_SHORT_SHORT2SHORT, WIN_SHORT_SHORT2BRIEF, WIN_SHORT_BRIEF2SHORT, and WIN_SHORT_BRIEF2BRIEF. A single long MDCT window constitutes a long window sequence, and eight short MDCT windows constitute a short window sequence.
Because the window sequence types in DRA differ from those in AAC, in this embodiment the MDCT window sequence of frame (n-1) is determined from the corresponding information of frames (n-2) and (n) as follows, so as to achieve a smooth window sequence transition and satisfy the requirements of MDCT reconstruction:
When frames (n-2) and (n) both use a long MDCT window sequence, frame (n-1) also uses a long window sequence, as shown in Table 2:
Table 2 (images not reproduced): long MDCT window of frame (n-1) when frames (n-2) and (n) both use a long window sequence
When frame (n-2) or frame (n) uses a short MDCT window sequence, frame (n-1) uses a short window sequence, which is determined by the routine given in section 6.8, on reconstructing the short/brief window function sequence, of the "Multichannel digital audio coding and decoding technology specification". Specifically, all short windows of frame (n-1) are first set to WIN_SHORT_SHORT2SHORT and nNumCluster is set to 1. The shape of each short window is then adjusted, according to the shapes of the previous and next frames and the principle of smooth window transitions, to WIN_SHORT_SHORT2SHORT, WIN_SHORT_SHORT2BRIEF, or WIN_SHORT_BRIEF2SHORT.
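The first part of this decision (long versus short, and the initial state of the short windows) might be sketched as follows. The boolean inputs, the dictionary layout, and the function name are assumptions made for illustration. The choice of the specific long window from Table 2 and the final per-window shape adjustment are deliberately left out, because the description does not give enough detail to reproduce them.

```python
SHORT_WINDOWS_PER_FRAME = 8  # eight short MDCT windows form a short window sequence


def dra_initial_window_plan(prev_is_short, next_is_short):
    """Hypothetical helper: initial window plan for the erroneous DRA frame (n-1)."""
    if not prev_is_short and not next_is_short:
        # Both neighbors long: frame (n-1) is long. The specific long window
        # comes from Table 2, which is not reproduced here.
        return {"sequence": "long", "windows": None, "nNumCluster": None}
    # Either neighbor short: frame (n-1) is short. Start with all windows set
    # to WIN_SHORT_SHORT2SHORT and a single cluster; a later step (not shown)
    # adjusts individual window shapes for smooth transitions.
    return {
        "sequence": "short",
        "windows": ["WIN_SHORT_SHORT2SHORT"] * SHORT_WINDOWS_PER_FRAME,
        "nNumCluster": 1,
    }
```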
After the MDCT window sequence of frame (n-1) has been obtained, the subsequent steps are the same as in the first embodiment and are not repeated here.
It can be seen that the flow of this embodiment is the same as that of the first embodiment; the only difference is the way in which, in step 110, the window type of frame (n-1) is determined from the window types of frames (n-2) and (n). That is, because error concealment is realized in two steps, determining the MDCT window type of the erroneous frame and reconstructing its MDCT coefficients, the scheme applies not only to MPEG-4 AAC audio coding but also to other audio codecs based on the MDCT time-frequency transform, such as DRA, and thus solves the error concealment problem for both audio coding standards used in CMMB.
Each method embodiment of the present invention can be implemented in software, hardware, firmware, or the like. Regardless of whether the invention is implemented in software, hardware, or firmware, the instruction code can be stored in any type of computer-accessible memory (for example, permanent or modifiable, volatile or non-volatile, solid-state or non-solid-state, fixed or removable media, and so on). The memory may be, for example, Programmable Array Logic (PAL), random access memory (RAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a magnetic disk, an optical disc, a digital versatile disc (DVD), and so on.
The fourth embodiment of the invention relates to an audio decoding device. As shown in Fig. 2, the audio decoding device comprises:
A window type determination module, configured to determine the window type of the erroneous audio data frame from the previous audio data frame and the next audio data frame, where the window type includes the window sequence type.
A judgment module, configured to judge whether the previous or next audio data frame has a window sequence different from that of the erroneous audio data frame.
A mapping module, configured to map, when the judgment module finds a frame whose window sequence differs from that of the erroneous audio data frame, the MDCT coefficients of that frame onto a window sequence consistent with the erroneous audio data frame.
A classification module, configured to divide the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class.
An MDCT coefficient acquisition module, configured to obtain, according to the result of the classification module, the tonal-class MDCT coefficients by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and to generate the noise-class MDCT coefficients as shaped noise.
The classification module divides the MDCT coefficients of the erroneous audio data frame into the tonal class and the noise class as follows:
The energy of each MDCT coefficient of the erroneous audio data frame is estimated from the previous audio data frame and the next audio data frame.
If the number of MDCT coefficients in a scale factor band whose estimated energy exceeds the predetermined energy threshold is greater than the preset count threshold, the band is judged to be of the tonal class. If that number is less than or equal to the preset count threshold, the band is judged to be of the noise class. All MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
In this embodiment; MDCT coefficient acquisition module is when estimating to obtain the MDCT coefficient with the corresponding MDCT coefficient interpolation of back one audio data frame through last audio data frame; Corresponding MDCT coefficient to last audio data frame and back one audio data frame averages interpolation, with the MDCT coefficient of the numerical value that obtains after the average interpolation as wrong audio data frame.
Audio data frame in this embodiment adopts high-order audio coding AAC; Window type also comprises the type of window shape; The window type determination module is confirmed as the window shape identical with last audio data frame with the window shape of wrong audio data frame when confirming the window shape of wrong audio data frame.
It is readily seen that the first embodiment is the method embodiment corresponding to this embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the first embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the first embodiment.
The fifth embodiment of the invention relates to an audio decoding end. The fifth embodiment improves on the fourth embodiment. The main improvement is that, after judging the type of each scale factor band, the classification module also corrects the type of the scale factor band according to psychoacoustic analysis, to further improve the accuracy of classification into the tone class and the noise class.
In addition, in this embodiment, after performing the average interpolation, the MDCT coefficient acquisition module also corrects the values obtained from the average interpolation and uses the corrected values as the MDCT coefficients of the erroneous audio data frame, which further improves the concealment quality.
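The claims describe this correction as an energy smoothing correction, but this section gives no formula for it. One plausible reading, shown purely as an assumption, is to rescale the interpolated coefficients of a band so that the band energy equals the mean of the band energies of the two neighbouring frames; plain averaging can lose energy when the neighbours' coefficients differ in sign or position. The function name and the `eps` guard are hypothetical.

```python
import math
from typing import List, Sequence


def energy_correct(interp: Sequence[float], prev: Sequence[float],
                   nxt: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Rescale interpolated coefficients of one band to a target energy.

    Assumption: target energy = mean of the previous and next band energies.
    """
    target = (sum(p * p for p in prev) + sum(n * n for n in nxt)) / 2.0
    actual = sum(c * c for c in interp)
    if actual < eps:
        # Nothing to scale; leave the (near-)silent band unchanged.
        return list(interp)
    gain = math.sqrt(target / actual)
    return [c * gain for c in interp]


prev_band = [2.0, 0.0]
next_band = [0.0, 2.0]
interp_band = [(p + n) / 2.0 for p, n in zip(prev_band, next_band)]  # [1.0, 1.0]
corrected = energy_correct(interp_band, prev_band, next_band)
```

In this example the averaged band has energy 2 while each neighbour has energy 4, so the sketch applies a gain of sqrt(2) to restore the band energy.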
It is readily seen that the second embodiment is the method embodiment corresponding to this embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the second embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the second embodiment.
The sixth embodiment of the invention relates to an audio decoding end. The sixth embodiment is substantially the same as the fourth embodiment; the main difference is as follows:
In the fourth embodiment, the audio data frames use Advanced Audio Coding (AAC). In the sixth embodiment, the audio data frames use the multi-channel digital audio coding and decoding technical standard. The only difference between this embodiment and the fourth embodiment is the specific manner in which the window type determination module determines the window type of the erroneous audio data frame.
In this embodiment, the specific manner in which the window type determination module determines the window type of the erroneous audio data frame, according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, is similar to that of the third embodiment and is not described again here.
It is readily seen that the third embodiment is the method embodiment corresponding to this embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the third embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the third embodiment.
It should be noted that each unit mentioned in the device embodiments of the invention is a logical unit. Physically, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. The physical implementation of these logical units is not in itself the most important aspect; the combination of the functions they implement is the key to solving the technical problem addressed by the invention. In addition, to highlight the innovative part of the invention, the device embodiments above do not introduce units that are not closely related to solving the technical problem addressed by the invention; this does not mean that no other units exist in those embodiments.
Although the invention has been illustrated and described with reference to certain preferred embodiments, those of ordinary skill in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims (12)

1. An error concealment method for an audio data frame, characterized by comprising the following steps:
determining the window type of an erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, said window type including the type of the window sequence;
if, among said previous audio data frame and said next audio data frame, there is an audio data frame whose window sequence differs from that of said erroneous audio data frame, mapping the modified discrete cosine transform (MDCT) coefficients of the audio data frame with the different window sequence onto a window sequence consistent with that of said erroneous audio data frame;
dividing the MDCT coefficients of said erroneous audio data frame into a tone class and a noise class, wherein the MDCT coefficients belonging to said tone class are obtained by interpolation estimation from the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame, and the MDCT coefficients belonging to said noise class are generated from shaped noise;
wherein the MDCT coefficients of said erroneous audio data frame are divided into the tone class and the noise class in the following manner:
estimating the energy of each MDCT coefficient of said erroneous audio data frame according to said previous audio data frame and said next audio data frame;
if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds a predetermined threshold is greater than a preset threshold, judging the type of that scale factor band to be the tone class; if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds the predetermined threshold is less than or equal to said preset threshold, judging the type of that scale factor band to be the noise class;
all MDCT coefficients in a tone-class scale factor band belong to the tone class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
2. The error concealment method for an audio data frame according to claim 1, characterized in that, after said judging of the type of the scale factor band, the method further comprises the following step:
correcting the type of the scale factor band according to psychoacoustic analysis.
3. The error concealment method for an audio data frame according to claim 1, characterized in that, in the step of obtaining MDCT coefficients by interpolation estimation from the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame, average interpolation is performed on the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame, and the values obtained from said average interpolation are used as the MDCT coefficients of said erroneous audio data frame.
4. The error concealment method for an audio data frame according to claim 3, characterized in that, after said average interpolation is performed, the method further comprises the following step:
performing an energy smoothing correction on the values obtained from said average interpolation, and using the corrected values as the MDCT coefficients of said erroneous audio data frame.
5. The error concealment method for an audio data frame according to any one of claims 1 to 4, characterized in that the audio data frames use Advanced Audio Coding (AAC);
said window type also includes the type of the window shape, and the type of the window shape of said erroneous audio data frame is the same as the type of the window shape of said previous audio data frame.
6. The error concealment method for an audio data frame according to any one of claims 1 to 4, characterized in that the audio data frames use the multi-channel digital audio coding and decoding technical standard.
7. An audio decoding device, characterized by comprising:
a window type determination module, configured to determine the window type of an erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, said window type including the type of the window sequence;
a judging module, configured to judge whether, among said previous audio data frame and said next audio data frame, there is an audio data frame whose window sequence differs from that of said erroneous audio data frame;
a mapping module, configured to map, when said judging module judges that there is an audio data frame whose window sequence differs from that of said erroneous audio data frame, the modified discrete cosine transform (MDCT) coefficients of the audio data frame with the different window sequence onto a window sequence consistent with that of said erroneous audio data frame;
a classification module, configured to divide the MDCT coefficients of said erroneous audio data frame into a tone class and a noise class;
an MDCT coefficient acquisition module, configured, according to the classification result of said classification module, to obtain the MDCT coefficients belonging to said tone class by interpolation estimation from the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame, and to generate the MDCT coefficients belonging to said noise class from shaped noise;
wherein said classification module divides the MDCT coefficients of said erroneous audio data frame into the tone class and the noise class in the following manner:
estimating the energy of each MDCT coefficient of said erroneous audio data frame according to said previous audio data frame and said next audio data frame;
if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds a predetermined threshold is greater than a preset threshold, judging the type of that scale factor band to be the tone class; if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds the predetermined threshold is less than or equal to said preset threshold, judging the type of that scale factor band to be the noise class;
all MDCT coefficients in a tone-class scale factor band belong to the tone class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
8. The audio decoding device according to claim 7, characterized in that said classification module is further configured to correct the type of the scale factor band according to psychoacoustic analysis after said judging of the type of the scale factor band.
9. The audio decoding device according to claim 7, characterized in that, when obtaining MDCT coefficients by interpolation estimation from the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame, the MDCT coefficient acquisition module performs average interpolation on the corresponding MDCT coefficients of said previous audio data frame and said next audio data frame and uses the values obtained from said average interpolation as the MDCT coefficients of said erroneous audio data frame.
10. The audio decoding device according to claim 9, characterized in that said MDCT coefficient acquisition module is further configured, after performing said average interpolation, to perform an energy smoothing correction on the values obtained from said average interpolation and to use the corrected values as the MDCT coefficients of said erroneous audio data frame.
11. The audio decoding device according to any one of claims 7 to 10, characterized in that the audio data frames use Advanced Audio Coding (AAC);
said window type also includes the type of the window shape; when determining the window shape of said erroneous audio data frame, said window type determination module sets it to the same window shape as that of said previous audio data frame.
12. The audio decoding device according to any one of claims 7 to 10, characterized in that the audio data frames use the multi-channel digital audio coding and decoding technical standard.
CN2010102190873A 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device Active CN101937679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102190873A CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010102190873A CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Publications (2)

Publication Number Publication Date
CN101937679A CN101937679A (en) 2011-01-05
CN101937679B true CN101937679B (en) 2012-01-11

Family

ID=43390978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102190873A Active CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Country Status (1)

Country Link
CN (1) CN101937679B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2635027T3 (en) 2013-06-21 2017-10-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for improved signal fading for audio coding systems changed during error concealment
CN103646647B (en) * 2013-12-13 2016-03-16 武汉大学 In mixed audio demoder, the spectrum parameter of frame error concealment replaces method and system
US10424305B2 (en) 2014-12-09 2019-09-24 Dolby International Ab MDCT-domain error concealment
CN107863109B (en) * 2017-11-03 2020-07-03 深圳大希创新科技有限公司 Mute control method and system for suppressing noise
CN110782906B (en) * 2018-07-30 2022-08-05 南京中感微电子有限公司 Audio data recovery method and device and Bluetooth equipment
CN111383643B (en) * 2018-12-28 2023-07-04 南京中感微电子有限公司 Audio packet loss hiding method and device and Bluetooth receiver
CN111402904B (en) * 2018-12-28 2023-12-01 南京中感微电子有限公司 Audio data recovery method and device and Bluetooth device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101046964A (en) * 2007-04-13 2007-10-03 清华大学 Error hidden frame reconstruction method based on overlap change compression code
CN101231849A (en) * 2007-09-15 2008-07-30 华为技术有限公司 Method and apparatus for concealing frame error of high belt signal
CN101325631A (en) * 2007-06-14 2008-12-17 华为技术有限公司 Method and apparatus for implementing bag-losing hide
CN101471073A (en) * 2007-12-27 2009-07-01 华为技术有限公司 Package loss compensation method, apparatus and system based on frequency domain

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6985856B2 (en) * 2002-12-31 2006-01-10 Nokia Corporation Method and device for compressed-domain packet loss concealment
KR101292771B1 (en) * 2006-11-24 2013-08-16 삼성전자주식회사 Method and Apparatus for error concealment of Audio signal

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101046964A (en) * 2007-04-13 2007-10-03 清华大学 Error hidden frame reconstruction method based on overlap change compression code
CN101325631A (en) * 2007-06-14 2008-12-17 华为技术有限公司 Method and apparatus for implementing bag-losing hide
CN101231849A (en) * 2007-09-15 2008-07-30 华为技术有限公司 Method and apparatus for concealing frame error of high belt signal
CN101471073A (en) * 2007-12-27 2009-07-01 华为技术有限公司 Package loss compensation method, apparatus and system based on frequency domain

Also Published As

Publication number Publication date
CN101937679A (en) 2011-01-05

Similar Documents

Publication Publication Date Title
CN101937679B (en) Error concealment method for audio data frame, and audio decoding device
CN101346760B (en) Encoder-assisted frame loss concealment techniques for audio coding
CN101887728B (en) Method for multi-sensory speech enhancement
CA2790973C (en) Watermark signal provider and method for providing a watermark signal
EP2681896B1 (en) Method and apparatus for identifying mobile devices in similar sound environment
WO2016192410A1 (en) Method and apparatus for audio signal enhancement
US20200365173A1 (en) Method for constructing voice detection model and voice endpoint detection system
JP6616470B2 (en) Encoding method, decoding method, encoding device, and decoding device
EP3866164B1 (en) Audio frame loss concealment
CN107103909B (en) Frame error concealment
US10984812B2 (en) Audio signal discriminator and coder
ES2440339T3 (en) Digital watermark generator, digital watermark decoder, procedure for providing a digital watermark signal based on binary message data, procedure for providing binary message data based on a digital watermark signal and computer program that uses a two-dimensional bit scatter
CN104269180A (en) Quasi-clean voice construction method for voice quality objective evaluation
CN101308660B (en) Decoding terminal error recovery method of audio compression stream
WO2009109120A1 (en) Method and device for audio signal encoding and decoding
US10262671B2 (en) Audio coding method and related apparatus
Górriz et al. An effective cluster-based model for robust speech detection and speech recognition in noisy environments
CN103456307A (en) Spectrum replacement method and system for frame error hiding in audio decoder
JP2002140093A (en) Noise reducing method using sectioning, correction, and scaling vector of acoustic space in domain of noisy speech
CN111816197A (en) Audio encoding method, audio encoding device, electronic equipment and storage medium
CN113409792B (en) Voice recognition method and related equipment thereof
CN104715761B (en) A kind of audio valid data detection method and system
Samaali et al. Watermark-aided pre-echo reduction in low bit-rate audio coding
CN101976567A (en) Voice signal error concealing method
Farsi et al. Improving voice activity detection used in ITU-T G. 729. B

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180412

Address after: 300456 1-1-1802-7, North Area of Financial and Trade Center, No. 6865 Asia Road, Tianjin Pilot Free Trade Zone (Dongjiang Bonded Port Area)

Patentee after: Xinji Lease (Tianjin) Co.,Ltd.

Address before: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee before: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20110105

Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Assignor: Xinji Lease (Tianjin) Co.,Ltd.

Contract record no.: 2018990000196

Denomination of invention: Error concealment method for audio data frame, and audio decoding device

Granted publication date: 20120111

License type: Exclusive License

Record date: 20180801

TR01 Transfer of patent right

Effective date of registration: 20221020

Address after: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Address before: 300456 1-1-1802-7, north area of financial and Trade Center, No. 6865, Asia Road, Tianjin pilot free trade zone (Dongjiang Bonded Port Area)

Patentee before: Xinji Lease (Tianjin) Co.,Ltd.