CN101937679A - Error concealment method for audio data frame, and audio decoding end - Google Patents

Info

Publication number
CN101937679A
Authority
CN
China
Prior art keywords
audio data
data frame
mdct coefficient
frame
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102190873A
Other languages
Chinese (zh)
Other versions
CN101937679B (en
Inventor
徐晶明
林福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd filed Critical Spreadtrum Communications Shanghai Co Ltd
Priority to CN2010102190873A priority Critical patent/CN101937679B/en
Publication of CN101937679A publication Critical patent/CN101937679A/en
Application granted granted Critical
Publication of CN101937679B publication Critical patent/CN101937679B/en

Abstract

The invention relates to the field of communications and discloses an error concealment method for an audio data frame and an audio decoding end. The method comprises: determining the window sequence type of the erroneous frame; and reconstructing the MDCT coefficients of the erroneous frame, wherein the MDCT coefficients of the erroneous frame are divided into a tonal class and a noise class, the coefficients of the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous frame and the next frame, and the coefficients of the noise class are generated as shaped noise. Compared with the shaped-noise interpolation algorithm specified by 3GPP, the method adds only a small amount of implementation complexity while better recovering the tonal information in the erroneous frame, to which human hearing is sensitive, and thus achieves a better balance between concealment quality and implementation complexity.

Description

Error concealment method for audio data frames, and audio decoding end
Technical field
The present invention relates to the field of communications, and in particular to audio decoding technology in that field.
Background
China Mobile Multimedia Broadcasting (CMMB) is a mobile television and multimedia standard based on the Satellite-Terrestrial Interactive Multi-service Infrastructure (STiMi). CMMB networks and terminals have spread rapidly over the past two years, and CMMB is steadily becoming the most widely adopted technical and industry standard for mobile multimedia broadcasting in China.
The CMMB standard adopts two digital audio coding standards: MPEG (Moving Picture Experts Group)-4 HE-AAC (High-Efficiency Advanced Audio Coding) and DRA (the multichannel digital audio coding and decoding standard). Both use the modified discrete cosine transform (MDCT) for time-frequency conversion.
In CMMB communication, bit errors are inevitable because of wireless channel instability, environmental interference, and signal attenuation. At the decoder, such errors usually show up as one frame, or several consecutive frames, of audio data that is corrupted and cannot be decoded normally. A digital audio error concealment algorithm is then needed to repair the erroneous audio data frames so that they are psychoacoustically as close as possible to the real data, thereby preserving the listening quality delivered to the user.
As MPEG-4 AAC has gradually replaced MP3 as the most popular audio coding standard, the industry has proposed many error concealment algorithms that operate in the MDCT domain. The most widely used of these, and the closest to the present invention, is the shaped-noise interpolation algorithm proposed by 3GPP.
Briefly, error concealment of an audio data frame is carried out before the final inverse MDCT step of Advanced Audio Coding (AAC) decoding and introduces a delay of one frame. Suppose frames (n-2) and (n) are decoded correctly while frame (n-1) is erroneous and must be concealed. First, the MDCT coefficient energy of each scale factor band is computed for frames (n-2) and (n). Next, a shaped-noise interpolation factor is derived for each band from the difference between the two band energies. Finally, the MDCT coefficients of frame (n-2) are multiplied by the interpolation factor of the corresponding band and their signs are changed at random, which yields the shaped noise, that is, the recovered MDCT coefficients of frame (n-1). Applying the inverse MDCT then gives the concealed audio data of frame (n-1). The detailed scheme is given in 3GPP TS 26.402 V9.0.0, "Enhanced aacPlus general audio codec - Additional decoder tools," 2009. The computation of the per-band interpolation factor, and the mapping of band energies between long and short blocks when the block type switches, are given in 3GPP TS 26.411 V9.0.0, "Enhanced aacPlus general audio codec - Fixed-point ANSI-C code," 2009.
However, although the shaped-noise interpolation algorithm proposed by 3GPP has low implementation complexity, it does not recover well the tonal information in the erroneous frame, to which human hearing is sensitive. The inventors found that the digital audio error concealment algorithms proposed so far do not strike a good balance between concealment quality and implementation complexity. Simple muting or repetition of the previous frame adds little complexity but gives poor listening quality after concealment. Methods based on signal models or time-domain prediction usually give better concealment quality but add considerable complexity, including higher computational load, storage requirements, and delay, which makes real-time operation on a mobile terminal processor difficult.
Summary of the invention
The object of the present invention is to provide an error concealment method for audio data frames and an audio decoding end that improve concealment quality and achieve a better balance between concealment quality and implementation complexity.
To solve the above technical problem, embodiments of the present invention provide an error concealment method for an audio data frame, comprising the following steps:
determining the window type of the erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, where the window type includes the window sequence type;
if the previous or the next audio data frame has a window sequence different from that of the erroneous audio data frame, mapping the modified discrete cosine transform (MDCT) coefficients of that frame onto a window sequence consistent with the erroneous audio data frame;
dividing the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class, where coefficients of the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and coefficients of the noise class are generated as shaped noise.
Embodiments of the present invention also provide an audio decoding end, comprising:
a window type determination module, configured to determine the window type of the erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, where the window type includes the window sequence type;
a judging module, configured to judge whether the previous or the next audio data frame has a window sequence different from that of the erroneous audio data frame;
a mapping module, configured to map, when the judging module finds an audio data frame whose window sequence differs from that of the erroneous audio data frame, the MDCT coefficients of that frame onto a window sequence consistent with the erroneous audio data frame;
a classification module, configured to divide the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class;
an MDCT coefficient acquisition module, configured to obtain, according to the result of the classification module, the coefficients of the tonal class by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and to generate the coefficients of the noise class as shaped noise.
Compared with the prior art, the main differences and effects of the embodiments of the present invention are as follows:
First, the MDCT window type of the erroneous frame, including its window sequence type, is determined according to the previous frame and the next frame. If the window sequence of the previous or next frame differs from that of the erroneous frame, the MDCT coefficients of that frame are mapped onto a window sequence consistent with the erroneous frame. Then the MDCT coefficients of the erroneous frame are reconstructed: they are divided into a tonal class and a noise class; coefficients of the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous and next frames, and coefficients of the noise class are generated as shaped noise. Because the MDCT coefficient groups of the erroneous frame are classified as tonal or noise, and the tonal groups are estimated by interpolating the MDCT coefficients of the neighboring frames, the tonal information in the erroneous frame, to which human hearing is sensitive, is recovered better at the cost of only a small increase in implementation complexity. Concealment quality improves, and a better balance between concealment quality and implementation complexity is obtained. Moreover, because error concealment is realized in two steps, determining the MDCT window type of the erroneous frame and then reconstructing its MDCT coefficients, the scheme applies not only to MPEG-4 AAC audio coding but also to other audio codecs based on the MDCT time-frequency transform, such as DRA, and so solves the error concealment problem for both audio coding standards in CMMB.
Further, the type of a scale factor band is decided according to whether the number of MDCT coefficients in the band whose energy exceeds a predetermined energy threshold is greater than a preset count threshold, and the type of the band then determines the type of every MDCT coefficient in it. Because the energy of each MDCT coefficient of the erroneous frame can be estimated from the corresponding MDCT coefficients of the previous and next frames, performing peak detection on the estimated energy spectrum ensures accurate classification into the tonal and noise classes.
Further, the type of a scale factor band is corrected by psychoacoustic analysis, which further improves the accuracy of classification into the tonal and noise classes.
Further, the MDCT coefficients of the erroneous frame are obtained by averaging the corresponding MDCT coefficients of the previous and next frames. Because the interpolation simply takes the mean of the corresponding coefficients of the neighboring frames, it is simple to implement and still ensures concealment quality.
Further, after the average interpolation, an energy-smoothing correction is applied to the interpolated values, and the corrected values are used as the MDCT coefficients of the erroneous frame, which further improves concealment quality.
Description of the drawings
Fig. 1 is a flowchart of the error concealment method for an audio data frame according to the first embodiment of the invention;
Fig. 2 is a schematic structural diagram of the audio decoding end according to the fourth embodiment of the invention.
Detailed description of the embodiments
In the following description, many technical details are given to help the reader understand the application. However, those of ordinary skill in the art will appreciate that the technical solutions claimed in each claim of the application can be realized even without these technical details, and with various changes and modifications based on the following embodiments.
To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the accompanying drawings.
The first embodiment of the invention relates to an error concealment method for an audio data frame. The audio data frames in this embodiment use Advanced Audio Coding (AAC), and the method is an error concealment algorithm in the MDCT domain: for an erroneous audio data frame, the MDCT coefficients are reconstructed before the final inverse MDCT step of AAC decoding. The reconstruction is based on the MDCT coefficients, taken before the inverse MDCT, of the previous and next frames (both assumed to be correctly decoded), so a delay of one frame is introduced.
The specific flow is shown in Fig. 1. In step 110, the window type of the erroneous audio data frame is determined according to the previous audio data frame and the next audio data frame; the window type includes the window sequence type and the window shape type.
Specifically, because AAC switches MDCT windows to adapt to stationary and transient signals in the audio, the MDCT window type of the erroneous frame must be determined before its MDCT coefficients are reconstructed. For convenience, suppose frame (n-1) is the erroneous audio data frame: frames (n-2) and (n) are decoded correctly, while frame (n-1) is erroneous and must be concealed. AAC defines four MDCT window sequences, namely the long window sequences ONLY_LONG_SEQUENCE, LONG_START_SEQUENCE, and LONG_STOP_SEQUENCE and the short window sequence EIGHT_SHORT_SEQUENCE, and two MDCT window shapes, namely the KBD window and the sine window. In this embodiment, the window shape of frame (n-1) is the same as that of the previous audio data frame. The MDCT window sequence of frame (n-1) is determined from the corresponding information of frames (n-2) and (n) so that the window sequences transition smoothly and the requirements of MDCT reconstruction are met, as shown in Table 1:
[Table 1: rendered as image BSA00000174314300061 in the original publication; not reproduced here]
In Table 1, if the window sequences of frame (n-2) and frame (n) are both any of the long window sequences, the window sequence of frame (n-1) is ONLY_LONG_SEQUENCE. If the window sequence of frame (n-2) is any of the long window sequences and that of frame (n) is the short window sequence, the window sequence of frame (n-1) is LONG_START_SEQUENCE. If the window sequences of frame (n-2) and frame (n) are both the short window sequence, the window sequence of frame (n-1) is the short window sequence. If the window sequence of frame (n-2) is the short window sequence and that of frame (n) is any of the long window sequences, the window sequence of frame (n-1) is LONG_STOP_SEQUENCE. Note that determining the MDCT window sequence of frame (n-1) from the corresponding information of frames (n-2) and (n) as in Table 1 is only the specific implementation of this embodiment; in practice, those skilled in the art may change it as needed.
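The decision rule described for Table 1 can be sketched as a small lookup. This is a minimal illustration rather than the patented implementation; the function name `conceal_window_sequence` and the helper `is_short` are hypothetical, and only the four window sequence names come from the text.

```python
from enum import Enum


class WindowSequence(Enum):
    # The four AAC window sequences named in the text.
    ONLY_LONG_SEQUENCE = 0
    LONG_START_SEQUENCE = 1
    EIGHT_SHORT_SEQUENCE = 2
    LONG_STOP_SEQUENCE = 3


def is_short(seq):
    """True for the short window sequence, False for the three long ones."""
    return seq is WindowSequence.EIGHT_SHORT_SEQUENCE


def conceal_window_sequence(prev_seq, next_seq):
    """Choose the window sequence of the lost frame (n-1).

    prev_seq is the window sequence of frame (n-2), next_seq that of frame (n).
    The four branches follow the four cases described for Table 1.
    """
    if is_short(prev_seq) and is_short(next_seq):
        return WindowSequence.EIGHT_SHORT_SEQUENCE
    if is_short(prev_seq):
        # short -> long: the lost frame closes the short-window region
        return WindowSequence.LONG_STOP_SEQUENCE
    if is_short(next_seq):
        # long -> short: the lost frame opens the short-window region
        return WindowSequence.LONG_START_SEQUENCE
    return WindowSequence.ONLY_LONG_SEQUENCE
```

For example, a lost frame between a LONG_STOP_SEQUENCE frame and an EIGHT_SHORT_SEQUENCE frame would be assigned LONG_START_SEQUENCE under this rule.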
Then, in step 120, it is judged whether frame (n-2) or frame (n) has a window sequence different from that of frame (n-1). If so, the flow enters step 130, in which the MDCT coefficients of the frame with the different window sequence are mapped onto a window sequence consistent with frame (n-1). If the window sequences of frames (n-2) and (n) are both the same as that of frame (n-1), the flow goes directly to step 140.
In other words, whenever the window sequence of frame (n-2) or frame (n) differs from that of frame (n-1), the MDCT coefficients of that frame must first be mapped onto a window sequence consistent with frame (n-1).
In step 140, the MDCT coefficients of frame (n-1) are classified into a tonal class and a noise class. The classification proceeds as follows:
Denote the k-th MDCT coefficients of frames (n-2), (n-1), and (n) by $C_{n-2}(k)$, $C_{n-1}(k)$, and $C_n(k)$ respectively. $C_{n-1}(k)$ is unknown; its energy $P_{n-1}(k)$ is estimated from the energies of the corresponding MDCT coefficients of frames (n-2) and (n), that is, $P_{n-1}(k) = C_{n-2}^2(k) + C_n^2(k)$. Peak detection is then performed on the estimated energy spectrum: local maxima that exceed a certain threshold are defined as "tones". A scale factor band that contains one or more "tones" is defined as belonging to the tonal class; otherwise it belongs to the noise class. More generally, if the number of MDCT coefficients in a scale factor band whose estimated energy exceeds the predetermined energy threshold is greater than a preset count threshold, the band is judged to be of the tonal class; if that number is less than or equal to the preset count threshold, the band is judged to be of the noise class. All MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
Because the energy of each MDCT coefficient of the erroneous frame can be estimated from the corresponding MDCT coefficients of the previous and next frames, performing peak detection on the estimated energy spectrum ensures accurate classification into the tonal and noise classes. In addition, when the MDCT coefficients of the erroneous frame are reconstructed, the grouping of MDCT coefficients need not be limited to scale factor bands; it can be any set of frequency bands that is meaningful for human hearing.
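The classification in step 140 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `estimate_energy` and `classify_bands`, the `band_offsets` layout (bin index where each band starts, plus the end of the last band), and the exact local-maximum test are hypothetical. The text specifies neither the threshold values nor how ties between neighboring bins are handled.

```python
def estimate_energy(prev_coefs, next_coefs):
    """Estimated energy P_{n-1}(k) = C_{n-2}(k)^2 + C_n(k)^2 for every bin k."""
    return [a * a + b * b for a, b in zip(prev_coefs, next_coefs)]


def classify_bands(prev_coefs, next_coefs, band_offsets,
                   energy_threshold, count_threshold=0):
    """Label each scale factor band "tonal" or "noise".

    A bin counts as a tone when it is a local maximum of the estimated
    energy spectrum and exceeds energy_threshold. A band is tonal when
    it holds more than count_threshold tones; the default of 0 matches
    the "one or more tones" wording.
    """
    energy = estimate_energy(prev_coefs, next_coefs)
    labels = []
    for start, stop in zip(band_offsets[:-1], band_offsets[1:]):
        peaks = 0
        for k in range(start, stop):
            left = energy[k - 1] if k > 0 else float("-inf")
            right = energy[k + 1] if k + 1 < len(energy) else float("-inf")
            # Assumed tie rule: strictly above the left neighbor,
            # at least as large as the right neighbor.
            is_local_max = energy[k] > left and energy[k] >= right
            if is_local_max and energy[k] > energy_threshold:
                peaks += 1
        labels.append("tonal" if peaks > count_threshold else "noise")
    return labels
```

Every MDCT coefficient then inherits the label of the band it falls in, as the text describes.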
Then, in step 150, the MDCT coefficients belonging to the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and the MDCT coefficients belonging to the noise class are generated as shaped noise.
Specifically, for a scale factor band of the tonal class, each MDCT coefficient in the band is obtained by average interpolation of the corresponding MDCT coefficients of the previous and next frames, that is, by taking their mean:
$C_{n-1}(k) = \frac{1}{2}\,[C_{n-2}(k) + C_n(k)]$,
The resulting $C_{n-1}(k)$ is used as the k-th MDCT coefficient of the reconstructed frame (n-1). Obtaining the MDCT coefficients of the erroneous frame by averaging the corresponding coefficients of the previous and next frames is simple to implement and still ensures concealment quality.
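The average interpolation for tonal bands can be sketched as below. The function name `interpolate_tonal_bands`, the `band_offsets` layout, and the use of `None` as a placeholder for coefficients still to be filled by the noise branch are assumptions made for illustration.

```python
def interpolate_tonal_bands(prev_coefs, next_coefs, band_offsets, labels):
    """Fill the tonal bands of the lost frame by averaging its neighbors.

    Computes C_{n-1}(k) = 0.5 * (C_{n-2}(k) + C_n(k)) for every bin k in
    a band labeled "tonal". Bins in other bands are left as None so that
    a separate shaped-noise step can fill them in.
    """
    out = [None] * len(prev_coefs)
    bands = zip(band_offsets[:-1], band_offsets[1:])
    for (start, stop), label in zip(bands, labels):
        if label != "tonal":
            continue
        for k in range(start, stop):
            out[k] = 0.5 * (prev_coefs[k] + next_coefs[k])
    return out
```

Note that two neighbors of equal magnitude and opposite sign average to zero, so the plain mean can lose energy relative to the neighboring frames.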
For a scale factor band of the noise class, the MDCT coefficients in the band are produced as shaped noise. The specific shaped-noise generation algorithm is common knowledge in the art (for example, the shaped-noise generation algorithm that ensures smooth energy evolution in 3GPP TS 26.402 V9.0.0, "Enhanced aacPlus general audio codec - Additional decoder tools," 2009) and is not described further here. It will also be appreciated that the shaped-noise implementation for noise-class MDCT coefficients is not limited to the algorithm in 3GPP TS 26.402 V9.0.0; other shaped-noise generation algorithms can be used.
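As a rough illustration of the noise-class branch, the sketch below follows the outline given in the background section: the coefficients of frame (n-2) in a band are scaled by a per-band factor and given random signs. The scaling rule is an assumption chosen for illustration (it makes the band energy the geometric mean of the neighboring band energies); the normative factor is defined in the 3GPP specifications cited in the text, not here. The function name `shaped_noise_band` is hypothetical.

```python
import random


def shaped_noise_band(prev_band, next_band, rng):
    """Generate shaped noise for one noise-class band of the lost frame.

    prev_band / next_band are the band's MDCT coefficients in frames
    (n-2) and (n). rng is a random.Random instance used for the signs.
    """
    e_prev = sum(c * c for c in prev_band)
    e_next = sum(c * c for c in next_band)
    if e_prev == 0.0:
        # Nothing to shape from; fall back to silence in this band.
        return [0.0] * len(prev_band)
    # ASSUMED rule: output energy = sqrt(e_prev * e_next).
    factor = (e_next / e_prev) ** 0.25
    return [rng.choice((-1.0, 1.0)) * factor * c for c in prev_band]
```

Passing the generator in explicitly, for instance `random.Random(0)`, keeps the sketch reproducible when it is tested.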
Once the MDCT coefficients of frame (n-1) have been obtained, the concealed audio data of frame (n-1) is obtained by the inverse MDCT.
It is easy to see that in this embodiment, because the MDCT coefficient groups of the erroneous frame are classified as tonal or noise, and the tonal groups are estimated by interpolating the MDCT coefficients of the neighboring frames, the tonal information in the erroneous frame, to which human hearing is sensitive, is recovered better at the cost of only a small increase in implementation complexity. Concealment quality improves, and a better balance between concealment quality and implementation complexity is obtained.
The second embodiment of the invention relates to an error concealment method for an audio data frame. It improves on the first embodiment, mainly as follows: in step 140, after the type of a scale factor band has been judged by peak detection on the estimated energy spectrum, the type can additionally be corrected according to psychoacoustic analysis, to further improve the accuracy of classification into the tonal and noise classes. In other words, the classification of the erroneous frame's MDCT coefficients is not limited to peak detection on the energy spectrum derived from the neighboring frames; it can also include a psychoacoustic correction based on the MDCT spectrum or its energy spectrum.
In addition, for MDCT coefficients of the tonal class, after the mean of the corresponding coefficients of the previous and next frames has been taken, the interpolated value is corrected, and the corrected value is used as the MDCT coefficient of the erroneous audio data frame. That is, after $C_{n-1}(k)$ has been obtained from $C_{n-1}(k) = \frac{1}{2}\,[C_{n-2}(k) + C_n(k)]$, it is corrected by the following formula:
$\bar{C}_{n-1}(k) = a \cdot C_{n-1}(k)$,
where $a$ is a coefficient chosen so that the energy of the scale factor band evolves smoothly. That is, the sum of MDCT coefficient energies over the scale factor band, $\bar{P}_{n-1}(a)$, must satisfy $\bar{P}_{n-1}(a) = \frac{1}{2}\,[P_{n-2} + P_n]$, where $P_{n-2}$ and $P_n$ are the corresponding band energy sums of frames (n-2) and (n). Since $\bar{P}_{n-1}(a)$ is the sum of the energies of the coefficients $\bar{C}_{n-1}(k)$ over the band, the coefficient $a$ that satisfies the smooth energy evolution requirement of the band can be calculated.
The corrected $\bar{C}_{n-1}(k)$ is used as the k-th MDCT coefficient of the reconstructed frame (n-1). Correcting the interpolated values after the average interpolation, and using the corrected values as the MDCT coefficients of the erroneous frame, further improves concealment quality.
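The correction factor follows directly from the energy condition: the corrected band energy is $a^2 \sum_k C_{n-1}^2(k)$, so setting it equal to $\frac{1}{2}[P_{n-2} + P_n]$ gives $a = \sqrt{(P_{n-2} + P_n) / (2 \sum_k C_{n-1}^2(k))}$. A minimal sketch for one band is given below; the function name `energy_corrected_band` and the handling of an all-zero interpolated band are assumptions, since the text does not address that case.

```python
import math


def energy_corrected_band(prev_band, next_band):
    """Average-interpolate one tonal band, then rescale its energy.

    The scale factor a is chosen so that the band energy of the result
    equals the mean of the band energies of frames (n-2) and (n).
    """
    averaged = [0.5 * (p + q) for p, q in zip(prev_band, next_band)]
    p_prev = sum(c * c for c in prev_band)
    p_next = sum(c * c for c in next_band)
    p_avg = sum(c * c for c in averaged)
    if p_avg == 0.0:
        # ASSUMPTION: no finite a can restore energy to an all-zero band,
        # so the uncorrected values are returned unchanged.
        return averaged
    a = math.sqrt(0.5 * (p_prev + p_next) / p_avg)
    return [a * c for c in averaged]
```

For instance, with `prev_band = [3.0, 0.0]` and `next_band = [1.0, 0.0]` the plain mean has energy 4, while the corrected band has energy 5, the mean of 9 and 1.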
In addition, it will be appreciated that the energy-smoothing correction in the interpolation of tonal-class MDCT coefficients is not limited to using the energy of all coefficients in the whole scale factor band or set. It can instead consider only the energy of the MDCT coefficients that have tonal character, in which case $\bar{P}_{n-1}(a)$ is the sum of the energies of the MDCT coefficients that have tonal character.
The third embodiment of the invention relates to an error concealment method for an audio data frame. It is essentially the same as the first embodiment; the main difference is that in the first embodiment the audio data frames use Advanced Audio Coding (AAC), whereas in the third embodiment they use the multichannel digital audio coding and decoding standard DRA.
DRA defines nine long MDCT windows: WIN_LONG_LONG2LONG, WIN_LONG_LONG2SHORT, WIN_LONG_SHORT2LONG, WIN_LONG_SHORT2SHORT, WIN_LONG_LONG2BRIEF, WIN_LONG_BRIEF2LONG, WIN_LONG_BRIEF2BRIEF, WIN_LONG_SHORT2BRIEF, and WIN_LONG_BRIEF2SHORT; and four short MDCT windows: WIN_SHORT_SHORT2SHORT, WIN_SHORT_SHORT2BRIEF, WIN_SHORT_BRIEF2SHORT, and WIN_SHORT_BRIEF2BRIEF. A single long MDCT window constitutes a long window sequence, and eight short MDCT windows constitute a short window sequence.
Because the window sequence types in DRA differ from those in AAC, in this embodiment the MDCT window sequence of frame (n-1) is determined from the corresponding information of frames (n-2) and (n) as follows, so that the window sequences transition smoothly and the requirements of MDCT reconstruction are met:
When frames (n-2) and (n) both use long MDCT window sequences, frame (n-1) also uses a long window sequence, as shown in Table 2:
[Table 2: rendered as images BSA00000174314300101 and BSA00000174314300111 in the original publication; not reproduced here]
When frame (n-2) or frame (n) uses a short MDCT window sequence, frame (n-1) uses a short window sequence, determined by the routine given in Section 6.8, "Reconstruction of the short/brief window function sequence," of the "Multichannel digital audio coding and decoding standard". Specifically, all short windows of frame (n-1) are first set to WIN_SHORT_SHORT2SHORT and nNumCluster is set to 1; the shape of each short window is then adjusted, according to the window shapes of the previous and next frames and the principle of smooth window transitions, to WIN_SHORT_SHORT2SHORT, WIN_SHORT_SHORT2BRIEF, or WIN_SHORT_BRIEF2SHORT.
After the MDCT window sequence of frame (n-1) has been obtained, the subsequent steps are identical to those of the first embodiment and are not repeated here.
It is easy to see that the flow of this embodiment is the same as that of the first embodiment; the only difference is how, in step 110, the window type of frame (n-1) is determined from the window types of frames (n-2) and (n). In other words, because error concealment is realized in two steps, determining the MDCT window type of the erroneous frame and then reconstructing its MDCT coefficients, the scheme applies not only to MPEG-4 AAC audio coding but also to other audio codecs based on the MDCT time-frequency transform, such as DRA, and so solves the error concealment problem for both audio coding standards in CMMB.
Each method embodiment of the present invention can be implemented in software, hardware, firmware, or the like. Whether the invention is implemented in software, hardware, or firmware, the instruction code can be stored in any type of computer-accessible memory (for example, permanent or modifiable, volatile or non-volatile, solid-state or non-solid-state, fixed or removable media, and so on). Likewise, the memory can be, for example, programmable array logic (PAL), random access memory (RAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a magnetic disk, an optical disc, a digital versatile disc (DVD), and so on.
The fourth embodiment of the invention relates to an audio decoding end. As shown in Fig. 2, the audio decoding end comprises:
A window type determination module, configured to determine the window type of the erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, where the window type includes the window sequence type.
A judging module, configured to judge whether the previous or the next audio data frame has a window sequence different from that of the erroneous audio data frame.
A mapping module, configured to map, when the judging module finds an audio data frame whose window sequence differs from that of the erroneous audio data frame, the MDCT coefficients of that frame onto a window sequence consistent with the erroneous audio data frame.
A classification module, configured to divide the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class.
An MDCT coefficient acquisition module, configured to obtain, according to the result of the classification module, the MDCT coefficients of the tonal class by interpolating the corresponding MDCT coefficients of the previous and next audio data frames, and to generate the MDCT coefficients of the noise class as shaped noise.
The classification module divides the MDCT coefficients of the erroneous audio data frame into the tonal class and the noise class in the following manner:
The energy of each MDCT coefficient of the erroneous audio data frame is estimated from the previous audio data frame and the next audio data frame.
If the number of MDCT coefficients in a scale factor band whose estimated energy exceeds the predetermined energy threshold is greater than a preset count threshold, the band is judged to be of the tonal class. If that number is less than or equal to the preset count threshold, the band is judged to be of the noise class. All MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
In this embodiment, when the MDCT coefficient acquisition module estimates MDCT coefficients by interpolating the corresponding coefficients of the previous and next audio data frames, it averages those corresponding coefficients and uses the resulting values as the MDCT coefficients of the erroneous audio data frame.
The audio data frames in this embodiment use Advanced Audio Coding (AAC), and the window type also includes the window shape type. When determining the window shape of the erroneous audio data frame, the window type determination module sets it to the same window shape as that of the previous audio data frame.
It is easy to see that the first embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the first embodiment remain valid in the present embodiment and, to avoid repetition, are not described again here. Correspondingly, the relevant technical details mentioned in the present embodiment also apply to the first embodiment.
The fifth embodiment of the invention relates to an audio decoding end. The fifth embodiment improves on the fourth embodiment. The main improvement is that, after judging the type of each scale factor band, the classification module also corrects the type of the scale factor band according to psychoacoustic analysis, which further improves the accuracy of the tonal/noise classification.
In addition, in the present embodiment, the MDCT coefficient acquisition module also corrects the values obtained by the average interpolation and uses the corrected values as the MDCT coefficients of the erroneous audio data frame, which further improves the concealment quality.
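The claims call this correction an energy smoothing correction, but the text here does not give its formula. The following Python sketch therefore shows only one plausible form, stated as an assumption: the averaged coefficients of a band are rescaled so that the band energy equals the mean of the band energies of the two neighbouring frames. The function names and the choice of target energy are hypothetical.

```python
import math
from typing import List, Sequence


def band_energy(coeffs: Sequence[float]) -> float:
    """Sum of squared coefficients in one band."""
    return sum(c * c for c in coeffs)


def energy_smooth(
    averaged: Sequence[float],
    prev: Sequence[float],
    nxt: Sequence[float],
) -> List[float]:
    """Rescale averaged coefficients to an assumed target band energy.

    Assumption: the target is the mean band energy of the two neighbours.
    Plain averaging can lose energy when the neighbours differ in sign or
    position, and the gain below compensates for that loss.
    """
    target = (band_energy(prev) + band_energy(nxt)) / 2.0
    current = band_energy(averaged)
    if current == 0.0:
        # Nothing to scale; return the band unchanged.
        return list(averaged)
    gain = math.sqrt(target / current)
    return [gain * c for c in averaged]


prev_band = [2.0, 0.0]
next_band = [0.0, 2.0]
averaged_band = [1.0, 1.0]  # element-wise mean of the two bands
corrected = energy_smooth(averaged_band, prev_band, next_band)
```

Here the averaged band has energy 2 while each neighbour has energy 4, so the sketch applies a gain of the square root of 2 to restore the band energy to 4.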
It is easy to see that the second embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the second embodiment remain valid in the present embodiment and, to avoid repetition, are not described again here. Correspondingly, the relevant technical details mentioned in the present embodiment also apply to the second embodiment.
The sixth embodiment of the invention relates to an audio decoding end. The sixth embodiment is essentially the same as the fourth embodiment; the main difference is as follows:
In the fourth embodiment, the audio data frames are coded with Advanced Audio Coding (AAC), whereas in the sixth embodiment the audio data frames are coded according to the multichannel digital audio coding and decoding technology specification. The only difference between the present embodiment and the fourth embodiment is the specific way in which the window type determination module determines the window type of the erroneous audio data frame.
In the present embodiment, the specific way in which the window type determination module determines the window type of the erroneous audio data frame from its previous audio data frame and next audio data frame is similar to that of the third embodiment and is not described again here.
It is easy to see that the third embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the third embodiment remain valid in the present embodiment and, to avoid repetition, are not described again here. Correspondingly, the relevant technical details mentioned in the present embodiment also apply to the third embodiment.
It should be noted that every unit mentioned in the device embodiments of the invention is a logical unit. Physically, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. The physical implementation of these logical units is not what matters most; the combination of the functions they implement is the key to solving the technical problem addressed by the invention. In addition, to highlight the innovative part of the invention, the device embodiments above do not introduce units that are not closely related to solving that technical problem, but this does not mean that no other units exist in those embodiments.
Although the invention has been illustrated and described with reference to certain preferred embodiments, those of ordinary skill in the art should understand that various changes in form and detail may be made to it without departing from the spirit and scope of the invention.

Claims (14)

1. An error concealment method for an audio data frame, characterized in that it comprises the following steps:
determining the window type of an erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, the window type including the type of window sequence;
if, among the previous audio data frame and the next audio data frame, there is an audio data frame whose window sequence differs from that of the erroneous audio data frame, mapping the modified discrete cosine transform (MDCT) coefficients of the audio data frame with the different window sequence onto a window sequence consistent with that of the erroneous audio data frame;
dividing the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class, wherein MDCT coefficients belonging to the tonal class are estimated by interpolating the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame, and MDCT coefficients belonging to the noise class are generated from shaped noise.
2. The error concealment method for an audio data frame according to claim 1, characterized in that the MDCT coefficients of the erroneous audio data frame are divided into the tonal class and the noise class in the following manner:
estimating the energy of each MDCT coefficient of the erroneous audio data frame according to the previous audio data frame and the next audio data frame;
if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds a predetermined threshold is greater than a preset limit, judging the scale factor band to be of the tonal class; if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds the predetermined threshold is less than or equal to the preset limit, judging the scale factor band to be of the noise class;
all MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
3. The error concealment method for an audio data frame according to claim 2, characterized in that, after the type of the scale factor band is judged, the method further comprises the following step:
correcting the type of the scale factor band according to psychoacoustic analysis.
4. The error concealment method for an audio data frame according to claim 1, characterized in that, in the step of estimating MDCT coefficients by interpolating the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame, the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame are averaged, and the values obtained by the average interpolation are used as the MDCT coefficients of the erroneous audio data frame.
5. The error concealment method for an audio data frame according to claim 4, characterized in that, after the average interpolation is performed, the method further comprises the following step:
applying an energy smoothing correction to the values obtained by the average interpolation, and using the corrected values as the MDCT coefficients of the erroneous audio data frame.
6. The error concealment method for an audio data frame according to any one of claims 1 to 5, characterized in that the audio data frames are coded with Advanced Audio Coding (AAC);
the window type also includes the type of window shape, and the type of window shape of the erroneous audio data frame is the same as the type of window shape of the previous audio data frame.
7. The error concealment method for an audio data frame according to any one of claims 1 to 5, characterized in that the audio data frames are coded according to the multichannel digital audio coding and decoding technology specification.
8. An audio decoding end, characterized in that it comprises:
a window type determination module, configured to determine the window type of an erroneous audio data frame according to the previous audio data frame and the next audio data frame of the erroneous audio data frame, the window type including the type of window sequence;
a judgment module, configured to judge whether, among the previous audio data frame and the next audio data frame, there is an audio data frame whose window sequence differs from that of the erroneous audio data frame;
a mapping module, configured to, when the judgment module judges that there is an audio data frame whose window sequence differs from that of the erroneous audio data frame, map the modified discrete cosine transform (MDCT) coefficients of the audio data frame with the different window sequence onto a window sequence consistent with that of the erroneous audio data frame;
a classification module, configured to divide the MDCT coefficients of the erroneous audio data frame into a tonal class and a noise class;
an MDCT coefficient acquisition module, configured to, according to the classification result of the classification module, estimate MDCT coefficients belonging to the tonal class by interpolating the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame, and generate MDCT coefficients belonging to the noise class from shaped noise.
9. The audio decoding end according to claim 8, characterized in that the classification module divides the MDCT coefficients of the erroneous audio data frame into the tonal class and the noise class in the following manner:
estimating the energy of each MDCT coefficient of the erroneous audio data frame according to the previous audio data frame and the next audio data frame;
if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds a predetermined threshold is greater than a preset limit, judging the scale factor band to be of the tonal class; if, within a scale factor band, the number of MDCT coefficients whose estimated energy exceeds the predetermined threshold is less than or equal to the preset limit, judging the scale factor band to be of the noise class;
all MDCT coefficients in a tonal-class scale factor band belong to the tonal class, and all MDCT coefficients in a noise-class scale factor band belong to the noise class.
10. The audio decoding end according to claim 9, characterized in that the classification module is further configured to correct the type of the scale factor band according to psychoacoustic analysis after the type of the scale factor band is judged.
11. The audio decoding end according to claim 8, characterized in that, when estimating MDCT coefficients by interpolating the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame, the MDCT coefficient acquisition module averages the corresponding MDCT coefficients of the previous audio data frame and the next audio data frame and uses the values obtained by the average interpolation as the MDCT coefficients of the erroneous audio data frame.
12. The audio decoding end according to claim 11, characterized in that the MDCT coefficient acquisition module is further configured to, after the average interpolation is performed, apply an energy smoothing correction to the values obtained by the average interpolation and use the corrected values as the MDCT coefficients of the erroneous audio data frame.
13. The audio decoding end according to any one of claims 8 to 11, characterized in that the audio data frames are coded with Advanced Audio Coding (AAC);
the window type also includes the type of window shape, and when determining the window shape of the erroneous audio data frame, the window type determination module sets it to the same window shape as that of the previous audio data frame.
14. The audio decoding end according to any one of claims 8 to 11, characterized in that the audio data frames are coded according to the multichannel digital audio coding and decoding technology specification.
CN2010102190873A 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device Active CN101937679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102190873A CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010102190873A CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Publications (2)

Publication Number Publication Date
CN101937679A true CN101937679A (en) 2011-01-05
CN101937679B CN101937679B (en) 2012-01-11

Family

ID=43390978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102190873A Active CN101937679B (en) 2010-07-05 2010-07-05 Error concealment method for audio data frame, and audio decoding device

Country Status (1)

Country Link
CN (1) CN101937679B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004059894A2 (en) * 2002-12-31 2004-07-15 Nokia Corporation Method and device for compressed-domain packet loss concealment
US20080126096A1 (en) * 2006-11-24 2008-05-29 Samsung Electronics Co., Ltd. Error concealment method and apparatus for audio signal and decoding method and apparatus for audio signal using the same
CN101046964A (en) * 2007-04-13 2007-10-03 清华大学 Error hidden frame reconstruction method based on overlap change compression code
CN101325631A (en) * 2007-06-14 2008-12-17 华为技术有限公司 Method and apparatus for implementing bag-losing hide
CN101231849A (en) * 2007-09-15 2008-07-30 华为技术有限公司 Method and apparatus for concealing frame error of high belt signal
CN101471073A (en) * 2007-12-27 2009-07-01 华为技术有限公司 Package loss compensation method, apparatus and system based on frequency domain

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110289005B (en) * 2013-06-21 2024-02-09 弗朗霍夫应用科学研究促进协会 Apparatus and method for generating adaptive spectral shape of comfort noise
US11869514B2 (en) 2013-06-21 2024-01-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out for switched audio coding systems during error concealment
US11776551B2 (en) 2013-06-21 2023-10-03 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment
CN110289005A (en) * 2013-06-21 2019-09-27 弗朗霍夫应用科学研究促进协会 For generating the device and method of the adaptive spectrum shape for noise of releiving
CN103646647A (en) * 2013-12-13 2014-03-19 武汉大学 Spectrum parameter substituting method and system for hiding frame error in mixed audio decoder
CN103646647B (en) * 2013-12-13 2016-03-16 武汉大学 In mixed audio demoder, the spectrum parameter of frame error concealment replaces method and system
CN112967727A (en) * 2014-12-09 2021-06-15 杜比国际公司 MDCT domain error concealment
US10923131B2 (en) 2014-12-09 2021-02-16 Dolby International Ab MDCT-domain error concealment
CN107004417B (en) * 2014-12-09 2021-05-07 杜比国际公司 MDCT domain error concealment
US10424305B2 (en) 2014-12-09 2019-09-24 Dolby International Ab MDCT-domain error concealment
CN107004417A (en) * 2014-12-09 2017-08-01 杜比国际公司 MDCT domains error concealment
CN107863109B (en) * 2017-11-03 2020-07-03 深圳大希创新科技有限公司 Mute control method and system for suppressing noise
CN107863109A (en) * 2017-11-03 2018-03-30 深圳大希创新科技有限公司 A kind of mute control method and system for suppressing noise
WO2020135609A1 (en) * 2018-07-30 2020-07-02 南京中感微电子有限公司 Audio data recovery method, device and bluetooth apparatus
CN111383643A (en) * 2018-12-28 2020-07-07 南京中感微电子有限公司 Audio packet loss hiding method and device and Bluetooth receiver
CN111402904A (en) * 2018-12-28 2020-07-10 南京中感微电子有限公司 Audio data recovery method and device and Bluetooth equipment
CN111402904B (en) * 2018-12-28 2023-12-01 南京中感微电子有限公司 Audio data recovery method and device and Bluetooth device
US11900951B2 (en) 2018-12-28 2024-02-13 Nanjing Zgmicro Company Limited Audio packet loss concealment method, device and bluetooth receiver

Also Published As

Publication number Publication date
CN101937679B (en) 2012-01-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180412

Address after: 300456 1-1-1802-7, North Area of Financial and Trade Center, No. 6865 Asia Road, Tianjin Pilot Free Trade Zone (Dongjiang Bonded Port Area)

Patentee after: Xinji Lease (Tianjin) Co.,Ltd.

Address before: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee before: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

TR01 Transfer of patent right
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20110105

Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Assignor: Xinji Lease (Tianjin) Co.,Ltd.

Contract record no.: 2018990000196

Denomination of invention: Error concealment method for audio data frame, and audio decoding device

Granted publication date: 20120111

License type: Exclusive License

Record date: 20180801

EE01 Entry into force of recordation of patent licensing contract
TR01 Transfer of patent right

Effective date of registration: 20221020

Address after: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Address before: 300456 1-1-1802-7, north area of financial and Trade Center, No. 6865, Asia Road, Tianjin pilot free trade zone (Dongjiang Bonded Port Area)

Patentee before: Xinji Lease (Tianjin) Co.,Ltd.

TR01 Transfer of patent right