CN102117613B - Method and equipment for processing digital audio in variable speed - Google Patents

Publication number
CN102117613B
Authority
CN
China
Prior art keywords
length
audio
buffer zone
audio signal
window function
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 200910202164
Other languages
Chinese (zh)
Other versions
CN102117613A (en)
Inventor
吴晟
林福辉
张本好
董树景
李昙
徐晶明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CN 200910202164
Publication of CN102117613A
Application granted
Publication of CN102117613B

Abstract

The invention relates to audio signal processing and discloses a method and equipment for variable-speed processing of digital audio. A pair of perfect-reconstruction window functions W_L and W_R, with amplitude-decaying and amplitude-increasing characteristics respectively, is applied to the original digital audio at different delays to obtain a pair of windowed data segments, and the audio waveform is reconstructed from the windowed data to obtain the speed-changed audio. Because neither detection of the pitch period and correlation of the audio nor time-frequency conversion is required, the amount of computation is extremely small. In addition, the playing time of the content is lengthened or shortened by compressing or introducing portions of the audio signal's own waveform, without altering the waveform itself, so the original sound quality is better preserved.

Description

Method and equipment for variable-speed processing of digital audio
Technical field
The present invention relates to audio signal processing technology, and in particular to variable-speed audio processing.
Background technology
In multimedia applications there is broad demand for adjusting the playback speed of recorded digital audio. Slowing down speech playback can help listeners with hearing or comprehension difficulties, as well as foreign-language beginners, to understand what they hear, while speeding it up saves the time needed to obtain information from a recording. Adjusting the playback speed of music can also change its rhythm and produce distinctive effects. For the soundtrack of a video, adjusting the audio playback speed lets viewers hear a synchronized, undistorted soundtrack when the video is sped up or slowed down.
However, changing the playback speed directly, with no other processing, shifts the frequency components of the audio linearly and therefore changes pitch and timbre. When the speed is reduced, the sound becomes lower and duller, and speech sounds like the nasal murmur of someone asleep; when the speed is raised, the sound becomes sharp, and speech sounds like a child talking fast. To ensure that only the speed changes, while pitch and timbre remain unchanged and free of obvious distortion, the digital audio must be processed. At present, variable-speed audio processing mostly uses algorithms based on overlap-add or algorithms based on time-frequency conversion and spectrum processing. For variable-speed audio processing techniques, see also United States Patent No. 5952596.
However, the inventors found that algorithms based on overlap-add must determine the delay of the overlapping window by detecting waveform similarity. Such methods can only handle speech with an obvious pitch period; they use time-domain or frequency-domain cross-correlation to find the delay of a similar waveform and use that delay for the overlapping window, so the computational cost is high and the resulting sound quality is mediocre. Algorithms based on time-frequency conversion and spectrum processing can handle general audio, including speech and music. They resample the original digital audio to change its sampling rate, transform the resampled audio into the frequency domain to obtain its spectrum, apply a frequency shift to the spectrum, and transform the processed spectrum back to the time domain. These algorithms are generally implemented with a perfect-reconstruction short-time Fourier transform, which must process a long stretch of audio at a time to achieve high quality. Although such methods achieve fairly good quality, their computation and memory requirements are large, and on handheld and mobile devices, where computing power and power consumption are constrained, they are almost impossible to implement.
Summary of the invention
The object of the present invention is to provide a method and equipment for variable-speed processing of digital audio that can process general digital audio with a low amount of computation while achieving high sound quality.
To solve the above technical problem, an embodiment of the present invention provides a method for variable-speed processing of digital audio, comprising the following steps:
A. Fill the audio signal data to be speed-changed into a buffer until the filled length of the buffer reaches the data processing length L_p.
B. Window the pending audio signal data in the buffer as follows to obtain the output signal x_out:
If the processing speeds playback up, left-align the L_p samples of audio signal data in the buffer with the window function W_L of length L_W and multiply pointwise by W_L to obtain x_L; right-align the same L_p samples with the window function W_R of length L_W and multiply pointwise by W_R to obtain x_R; add x_L and x_R to obtain L_W samples of the output signal x_out.
If the processing slows playback down, right-align the L_p samples of audio signal data in the buffer with the window function W_L of length L_W and multiply pointwise by W_L to obtain x_L; left-align the same L_p samples with the window function W_R of length L_W and multiply pointwise by W_R to obtain x_R; add x_L and x_R to obtain L_W samples of the output signal x_out.
C. Shift the L_D samples whose windowing is complete out of the buffer, and continue filling pending audio signal data at the tail of the buffer until the filled length again reaches the data processing length L_p.
Repeat steps B and C until all audio signal data have been processed.
Here W_L is a window function with an amplitude-decaying characteristic and W_R is a window function with an amplitude-increasing characteristic; W_L and W_R each have L_W points, and the sum of corresponding points equals or approximately equals 1.
An embodiment of the present invention also provides equipment for variable-speed processing of digital audio, comprising:
A filling module, for filling the audio signal data to be speed-changed into a buffer until the filled length of the buffer reaches the data processing length L_p;
A windowing module, for windowing the pending audio signal data in the buffer to obtain the output signal x_out. When the processing speeds playback up, the windowing module left-aligns the L_p samples of audio signal data in the buffer with the window function W_L of length L_W and multiplies pointwise by W_L to obtain x_L, right-aligns the same L_p samples with the window function W_R of length L_W and multiplies pointwise by W_R to obtain x_R, and adds x_L and x_R to obtain L_W samples of the output signal x_out. When the processing slows playback down, the windowing module right-aligns the L_p samples with W_L and multiplies pointwise by W_L to obtain x_L, left-aligns the same L_p samples with W_R and multiplies pointwise by W_R to obtain x_R, and adds x_L and x_R to obtain L_W samples of the output signal x_out;
A shift module, for shifting the L_D samples whose windowing is complete out of the buffer and instructing the filling module to continue filling pending audio signal data at the tail of the buffer until the filled length again reaches the data processing length L_p.
When the filled length of the buffer reaches the data processing length L_p, the windowing module is triggered; when the windowing module has obtained L_W samples of the output signal x_out, the shift module is triggered; this continues until all audio signal data have been processed.
Here W_L is a window function with an amplitude-decaying characteristic and W_R is a window function with an amplitude-increasing characteristic; W_L and W_R each have L_W points, and the sum of corresponding points equals or approximately equals 1.
Compared with the prior art, the main differences and effects of the embodiments of the invention are as follows:
A pair of perfect-reconstruction window functions W_L and W_R, with amplitude-decaying and amplitude-increasing characteristics, is applied to the original digital audio at different delays to obtain a pair of windowed data segments, from which the audio waveform is reconstructed to obtain the speed-changed audio. Because neither the pitch period nor the correlation of the audio needs to be detected, and no time-frequency conversion is needed, the amount of computation is extremely low. Moreover, because the playing time is increased or decreased by compressing or introducing portions of the audio signal's own waveform, without altering the waveform itself, the original sound quality is better preserved.
Further, when the pending audio signal data in the buffer is windowed, W_L and W_R are either the initial reconstruction window functions whose corresponding points sum to 1, or reconstruction window functions with different weight distributions selected according to the echo type of the audio signal data. The reconstruction windows with different weight distributions are either generated separately or obtained by transforming the initial reconstruction window. Audio compression (speeding up) smoothly spreads the audio information of the compressed time span over the shortened audio data after processing, while audio expansion (slowing down) obtains longer audio data by smoothly overlapping audio information introduced from the past and the future (relative to the reference data in time). Either overlap can introduce or spread a high-energy signal into a portion that originally had very little energy, causing post-echo (an echo after the signal occurs) and pre-echo (an echo before the signal occurs). Selecting a suitable reconstruction window function according to the echo type during windowing therefore further safeguards the audio quality after the speed change.
Further, the echo type of the audio signal data is determined by comparing the block energy, or the block absolute-value sum, of the audio signal data against a preset threshold. If the past signal is larger than the present signal, post-echo tends to occur; if the past signal is smaller than the present signal, pre-echo tends to occur. Using the block energy (or block absolute-value sum) of the audio signal as the basis for deciding the echo type therefore keeps the decision accurate.
Further, the initial reconstruction windows W_L and W_R are as follows:
W_L(k) = 1 - W_R(k), k = 1, 2, …, L_W
Experiments have shown this initial reconstruction window pair to be practical. With W_L and W_R designed this way, decimation transforms yield four pairs of reconstruction windows with different weight distributions, so that audio can be processed more flexibly.
Further, the value of L_W is set in advance, and L_D and L_p are derived from L_W and the playback rate r. Because the relations among the three lengths L_W, L_D and L_p are fixed, determining one of them yields the other two. Since the design of the reconstruction window is directly tied to the value of L_W, using a fixed L_W and letting L_D and L_p vary with r simplifies the subsequent windowing.
Description of drawings
Fig. 1 is a flowchart of the digital audio variable-speed processing method according to the first embodiment of the invention;
Fig. 2 is a schematic diagram of buffer filling in the first embodiment of the invention;
Fig. 3 is a schematic diagram of the window shapes of the initial reconstruction windows W_L and W_R in the first embodiment of the invention;
Fig. 4 is a schematic diagram of windowing and combining the output signal when the playback rate r > 1 in the first embodiment of the invention;
Fig. 5 is a schematic diagram of windowing and combining the output signal when the playback rate r < 1 in the first embodiment of the invention;
Fig. 6 is a schematic diagram of buffer shifting in the first embodiment of the invention;
Fig. 7 is a schematic diagram of the waveform results of the speech test of the first embodiment of the invention;
Fig. 8 is a schematic diagram of the waveform results of the music test of the first embodiment of the invention;
Fig. 9 is a spectrogram of the original speech signal in the speech test of the first embodiment of the invention;
Fig. 10 is a spectrogram of the speech test of the first embodiment of the invention with time compressed to 0.5 times;
Fig. 11 is a spectrogram of the speech test of the first embodiment of the invention with time expanded to 2 times;
Fig. 12 is a spectrogram of the original music signal in the music test of the first embodiment of the invention;
Fig. 13 is a spectrogram of the music test of the first embodiment of the invention with time compressed to 0.5 times;
Fig. 14 is a spectrogram of the music test of the first embodiment of the invention with time expanded to 2 times;
Fig. 15 is a schematic diagram of reconstruction windows with different weight distributions obtained by decimation transform in the second embodiment of the invention;
Fig. 16 is a schematic diagram of the reconstruction windows W_L1 and W_R1 in the second embodiment of the invention;
Fig. 17 is a schematic diagram of the reconstruction windows W_L2 and W_R2 in the second embodiment of the invention;
Fig. 18 is a schematic diagram of the reconstruction windows W_L3 and W_R3 in the second embodiment of the invention;
Fig. 19 is a schematic diagram of the reconstruction windows W_L4 and W_R4 in the second embodiment of the invention;
Fig. 20 is a flowchart of the digital audio variable-speed processing method according to the second embodiment of the invention;
Fig. 21 is a schematic diagram of the blocks corresponding to the echo decision parameters in the second embodiment of the invention;
Fig. 22 is a schematic diagram of replacing W_L and W_R with W_L1 and W_R1 to prevent pre-echo during audio expansion in the second embodiment of the invention;
Fig. 23 is a schematic diagram of replacing W_L and W_R with W_L2 and W_R2 to prevent post-echo during audio expansion in the second embodiment of the invention;
Fig. 24 is a schematic diagram of replacing W_L and W_R with W_L3 and W_R3 to prevent pre-echo during audio expansion in the second embodiment of the invention;
Fig. 25 is a schematic diagram of replacing W_L and W_R with W_L4 and W_R4 to prevent post-echo during audio expansion in the second embodiment of the invention;
Fig. 26 is a schematic diagram of replacing W_L and W_R with W_L1 and W_R1 to prevent post-echo during audio compression in the second embodiment of the invention;
Fig. 27 is a schematic diagram of replacing W_L and W_R with W_L2 and W_R2 to prevent pre-echo during audio compression in the second embodiment of the invention;
Fig. 28 is a schematic diagram of replacing W_L and W_R with W_L3 and W_R3 to prevent post-echo during audio compression in the second embodiment of the invention;
Fig. 29 is a schematic diagram of replacing W_L and W_R with W_L4 and W_R4 to prevent pre-echo during audio compression in the second embodiment of the invention;
Fig. 30 is a schematic structural diagram of the digital audio variable-speed processing equipment in the third embodiment of the invention.
Embodiment
In the following description, many technical details are set forth to help the reader better understand this application. However, those of ordinary skill in the art will understand that the technical solutions claimed in this application can be realized even without these details, and with variations and modifications based on the following embodiments.
To make the objects, technical solutions and advantages of the invention clearer, embodiments of the invention are described in further detail below with reference to the drawings.
The first embodiment of the invention relates to a method for variable-speed processing of digital audio; the flow is shown in Fig. 1.
In step 110, the audio signal data to be speed-changed is filled into the buffer until the filled length of the buffer reaches the data processing length L_p.
Specifically, the values of the reconstruction window length L_W, the update data length L_D and the data processing length L_p are set in advance. In this embodiment, the playback rate r is a linear playback-rate adjustment ratio: if r < 1 the sound is expanded (slowed down), and if r > 1 the sound is compressed (sped up). For example, if r = 0.5, the playback time is twice the original.
When the sampling frequency of the digital audio is unchanged, a sound segment of length L (in samples) that is adjusted to play at rate r has length L/r after processing. Conversely, with a fixed output length L_W (the reconstruction window length defined in this embodiment), the input length (the update data length defined in this embodiment) is L_D = r × L_W. For audio compression, processing operates only on the newly input update data, so the data processing length is L_p = L_D. For audio expansion, the output is longer than the input by L_W - L_D samples; this part comes from 2(L_W - L_D) samples of information before and after the input data, supplied through overlap, so the data processing length is L_p = 2(L_W - L_D) + L_D = 2L_W - L_D. That is, given the playback rate r, the relations among L_W, L_D and L_p are:
L_D = r × L_W
L_p = L_D (r > 1, compression); L_p = 2L_W - L_D (r < 1, expansion)
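The length relations above can be sketched in C as follows. This is a minimal illustration, not the patent's implementation: the struct and function names are hypothetical, and rounding L_D to the nearest integer is an assumption, since the text does not say how a fractional r × L_W is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three frame lengths used in the text. */
typedef struct {
    size_t lw; /* L_W: reconstruction window (output frame) length */
    size_t ld; /* L_D: new input samples consumed per frame        */
    size_t lp; /* L_p: samples the buffer must hold before windowing */
} FrameLengths;

/* Derive L_D and L_p from a fixed L_W and playback rate r.
 * Assumption: L_D is rounded to the nearest integer. */
FrameLengths frame_lengths(size_t lw, double r)
{
    FrameLengths f;
    assert(lw > 0 && r > 0.0);
    f.lw = lw;
    f.ld = (size_t)(r * (double)lw + 0.5);   /* L_D = r * L_W */
    assert(f.ld > 0);
    if (f.ld >= lw)
        f.lp = f.ld;                         /* compression: L_p = L_D */
    else
        f.lp = 2 * lw - f.ld;                /* expansion: L_p = 2*L_W - L_D */
    return f;
}
```

For instance, with L_W = 1024, r = 2 gives L_D = L_p = 2048, while r = 0.5 gives L_D = 512 and L_p = 1536.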
In this step, audio signal data is filled into the end of the buffer sample by sample or block by block until the filled length of the buffer reaches L_p, as shown in Fig. 2. For convenience, the buffer is denoted as the array x(k) in this embodiment.
Then, in step 120, the pending audio signal data in the buffer is windowed to obtain the output signal x_out.
Specifically, the reconstruction window functions W_L and W_R are determined in advance. Each has L_W points; W_L has an amplitude-decaying characteristic and W_R has an amplitude-increasing characteristic. W_L and W_R satisfy the perfect reconstruction condition, i.e. corresponding points sum to 1: W_L(k) + W_R(k) = 1, k = 1, 2, …, L_W. This embodiment uses a practical pair W_L and W_R that has been verified:
W_L(k) = 1 - W_R(k), k = 1, 2, …, L_W
For L_W = N points, the window shapes of W_L and W_R are shown in Fig. 3. In practice W_L and W_R may also take other forms, provided they satisfy, or approximately satisfy, the perfect reconstruction condition.
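As an illustration of the perfect reconstruction condition, the sketch below builds one complementary window pair in C. The linear ramp used for W_R is an assumption chosen only because it is simple; it is not the window shape of Fig. 3, whose defining expression is not given in this text. The function name is hypothetical, and indices are 0-based here whereas the text uses k = 1, …, L_W.

```c
#include <assert.h>
#include <stddef.h>

/* Fill wl[] (decaying) and wr[] (rising), each of length lw, so that
 * wl[k] + wr[k] == 1 for every k (perfect reconstruction condition).
 * Assumption: a linear ramp stands in for the source's window shape. */
void make_window_pair(float *wl, float *wr, size_t lw)
{
    assert(wl != NULL && wr != NULL && lw > 0);
    for (size_t k = 0; k < lw; ++k) {
        wr[k] = ((float)k + 0.5f) / (float)lw; /* rises from near 0 to near 1 */
        wl[k] = 1.0f - wr[k];                  /* W_L(k) = 1 - W_R(k)         */
    }
}
```

Any other rising shape could be substituted for the ramp as long as W_L is derived as 1 - W_R, which keeps the pair complementary by construction.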
The windowing in this step is performed as follows:
If the playback rate r > 1, i.e. the sound is compressed (sped up), the windowed data are
x_L(k) = x(k) W_L(k), k = 1, 2, …, L_W
x_R(k) = x(k + L_D - L_W) W_R(k), k = 1, 2, …, L_W
The output data is emitted as one frame of L_W points:
x_out(k) = x_L(k) + x_R(k), k = 1, 2, …, L_W
As shown in Fig. 4, x_L is the pending data of length L_p = L_D = r × L_W (r > 1) left-aligned with the window function W_L and multiplied pointwise by W_L, and x_R is the same pending data right-aligned with the window function W_R and multiplied pointwise by W_R. After x_L and x_R are added, the audio data length is reduced from L_p to L_W, so the audio is compressed. Under the smoothing effect of the perfect-reconstruction windows W_L and W_R, the L_p - L_W compressed points are merged into the processed audio without reducing the amount of information, and the perfect reconstruction property of the windows also ensures the smoothness of the processed audio.
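The compression case can be written as a short C routine. This is a sketch under the assumption that the caller supplies a complementary window pair; the function name is hypothetical, and indices are 0-based, so x(k + L_D - L_W) in the text becomes x[k + ld - lw].

```c
#include <assert.h>
#include <stddef.h>

/* One compression frame (r > 1): x holds L_p = L_D samples, out receives
 * L_W samples. W_L is left-aligned with the buffer, W_R is right-aligned. */
void compress_frame(const float *x, size_t ld,
                    const float *wl, const float *wr, size_t lw,
                    float *out)
{
    assert(x != NULL && wl != NULL && wr != NULL && out != NULL);
    assert(ld >= lw);
    const size_t shift = ld - lw;         /* start of the right-aligned window */
    for (size_t k = 0; k < lw; ++k) {
        float xl = x[k] * wl[k];          /* x_L(k) = x(k) W_L(k)             */
        float xr = x[k + shift] * wr[k];  /* x_R(k) = x(k + L_D - L_W) W_R(k) */
        out[k] = xl + xr;                 /* x_out(k) = x_L(k) + x_R(k)       */
    }
}
```

The frame starts close to x[0] (where W_L is near 1) and ends close to x[ld - 1] (where W_R is near 1), so consecutive frames join without a jump once the buffer is advanced by L_D.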
If the playback rate r < 1, i.e. the sound is expanded (slowed down), the windowed data are
x_L(k) = x(k + L_W - L_D) W_L(k), k = 1, 2, …, L_W
x_R(k) = x(k) W_R(k), k = 1, 2, …, L_W
The output data is again emitted as one frame of L_W points:
x_out(k) = x_L(k) + x_R(k), k = 1, 2, …, L_W
As shown in Fig. 5, x_L is the pending data of length L_p = (2 - r) L_W (r < 1) right-aligned with the window function W_L and multiplied pointwise by W_L, and x_R is the same pending data left-aligned with the window function W_R and multiplied pointwise by W_R. After x_L and x_R are added, the audio data length increases from L_D to L_W, so the audio is lengthened. The added information comes from the first (1 - r) L_W points and the last (1 - r) L_W points of the pending signal of length L_p = (2 - r) L_W; relative to the middle L_D points, these are past information and future information respectively. Through windowing and superposition, these leading and trailing points are introduced into the L_W output points. Under the smoothing effect of the perfect-reconstruction windows W_L and W_R, the introduced audio information is merged into the processed audio, and the perfect reconstruction property of the windows also ensures the smoothness of the processed audio.
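The expansion case mirrors the compression case with the alignments swapped. The C sketch below assumes a complementary window pair supplied by the caller and a buffer of L_p = 2L_W - L_D samples; the function name is hypothetical and indices are 0-based.

```c
#include <assert.h>
#include <stddef.h>

/* One expansion frame (r < 1): x holds L_p = 2*L_W - L_D samples, out
 * receives L_W samples. W_L is right-aligned with the buffer and W_R is
 * left-aligned, as described for Fig. 5. */
void expand_frame(const float *x, size_t ld,
                  const float *wl, const float *wr, size_t lw,
                  float *out)
{
    assert(x != NULL && wl != NULL && wr != NULL && out != NULL);
    assert(ld > 0 && ld <= lw);
    const size_t shift = lw - ld;  /* L_p - L_W: start of the right-aligned window */
    for (size_t k = 0; k < lw; ++k) {
        float a = x[k + shift] * wl[k]; /* decaying window on the later samples   */
        float b = x[k] * wr[k];         /* rising window on the earlier samples   */
        out[k] = a + b;
    }
}
```

With L_W = 4 and L_D = 2, six buffered samples produce four output samples, i.e. the frame is stretched by the factor 1/r = 2.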
It is worth mentioning that, because the relations among the three lengths L_W, L_D and L_p are fixed, determining one of them yields the other two. Since the design of the reconstruction window is directly tied to the value of L_W, this embodiment uses a fixed L_W and lets L_D and L_p vary with r, which simplifies the subsequent windowing. In practice, one may instead fix L_D (or L_p) first and derive L_W and L_p (or L_D) from it and the playback rate r.
Then, in step 130, the L_D samples whose windowing is complete are shifted out of the buffer, and pending audio signal data continues to be filled at the tail of the buffer until the filled length of the buffer reaches the data processing length L_p, as shown in Fig. 6. That is, all data in x(k) from index L_D + 1 onward is moved to the start of the buffer:
x(k) = x(k + L_D), k = 1, 2, …
After step 130 is completed, the flow returns to step 120, and this repeats until all audio signal data have been processed.
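Steps 110 to 130 can be combined into one loop. The C sketch below is an illustration under simplifying assumptions: the whole input is already in memory, so "shifting the buffer by L_D" is modeled by advancing a read position instead of moving data; trailing samples that do not fill a complete frame are dropped; and the function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Time-scale in[0..n_in) into out[0..out_cap); returns samples written.
 * wl/wr: complementary windows of length lw. ld: samples consumed per
 * frame (L_D). ld >= lw compresses, ld < lw expands. */
size_t time_scale(const float *in, size_t n_in, float *out, size_t out_cap,
                  const float *wl, const float *wr, size_t lw, size_t ld)
{
    assert(in != NULL && out != NULL && wl != NULL && wr != NULL);
    assert(lw > 0 && ld > 0);
    const int compress = (ld >= lw);
    const size_t lp  = compress ? ld : 2 * lw - ld;   /* L_p */
    const size_t off = compress ? ld - lw : lw - ld;  /* right-aligned offset */
    size_t pos = 0, n_out = 0;
    /* Step 110: a frame is ready once L_p samples are available. */
    while (pos + lp <= n_in && n_out + lw <= out_cap) {
        const float *x = in + pos;
        /* Step 120: window and add to produce L_W output samples. */
        for (size_t k = 0; k < lw; ++k) {
            if (compress)
                out[n_out + k] = x[k] * wl[k] + x[k + off] * wr[k];
            else
                out[n_out + k] = x[k + off] * wl[k] + x[k] * wr[k];
        }
        n_out += lw;
        pos += ld;  /* Step 130: equivalent to x(k) = x(k + L_D) */
    }
    return n_out;
}
```

A streaming implementation would keep a buffer of L_p samples and move its contents as the text describes; the read-position form is used here only to keep the sketch short.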
This embodiment applies a pair of perfect-reconstruction window functions W_L and W_R, with amplitude-decaying and amplitude-increasing characteristics, to the original digital audio at different delays to obtain a pair of windowed data segments, and reconstructs the audio waveform from the windowed data to obtain the speed-changed audio. Audio compression (speeding up) overlaps a pair of windowed audio signals so that the audio information of the compressed time span is spread smoothly over the shortened audio data after processing. Audio expansion (slowing down) obtains longer audio data by smoothly overlapping audio information introduced from the past and the future (relative to the reference data in time). Because neither the pitch period nor the correlation of the audio needs to be detected, and no time-frequency conversion is needed, the processing flow uses only shifting, windowing and overlapping. The theoretical computational complexity is only O(N), lower than the O(N^2) of algorithms that use waveform-similarity detection and the O(N log_2 N) of algorithms based on time-frequency conversion and spectrum processing, so the amount of computation is extremely low. Moreover, because the playing time is increased or decreased by compressing or introducing portions of the audio signal's own waveform, without altering the waveform itself, the original sound quality is better preserved. Practical processing shows that this embodiment supports a wide range of speed changes, controls time accurately, achieves high sound quality, and can handle general audio (including speech and music).
For example, tests applied variable-speed processing to a passage of English speech and to the violin concerto "Liang Shanbo and Zhu Yingtai". The playback rate r was set to 2 and 0.5, corresponding to speeding up to 2 times and slowing down to 0.5 times the original speed. At these rates the audio quality of traditional algorithms degrades severely, whereas after processing with the method of this embodiment the audio quality remains high, a considerable improvement. As shown in Fig. 7 and Fig. 8, the waveform envelope of the audio keeps the same shape, and the spectrum of the audio keeps the same voiceprint (Figs. 9-11 and Figs. 12-14). Listening tests show that, apart from speed, the timbre and pitch of the audio are unchanged by the processing.
The second embodiment of the invention relates to a method for variable-speed processing of digital audio. It improves on the first embodiment, and the main improvement is as follows. In the first embodiment, the W_L and W_R used to window the pending audio signal data in the buffer are the fixed initial reconstruction window functions whose corresponding points sum to 1. In this embodiment, a reconstruction window function is first selected according to the echo type of the audio signal data, and the selected reconstruction window function is then used for windowing, as shown in Fig. 20.
Specifically, audio processing may use reconstruction windows with different weight distributions depending on the waveform. There is more than one way to obtain the reconstruction windows, but they must satisfy, or approximately satisfy, the perfect reconstruction condition, i.e. corresponding points of W_L and W_R sum to approximately 1:
W_L(k) + W_R(k) ≈ 1, k = 1, 2, …, L_W
Reconstruction windows with different weight distributions can each be generated and stored separately. Alternatively, only one initial reconstruction window is generated and stored, and the windows with other weight distributions are obtained from it by transformation. The transformation decimates the initial reconstruction window by an integer ratio (keeping 1 sample in 2, 1 in 3, 1 in 4, and so on) to obtain the gradually varying part of the new window; the constant parts at the two ends are then filled with 0 or 1 respectively to extend the window to its original length, as shown in Fig. 15.
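The decimation transform can be sketched in C as follows. This is an illustration only: the function name is hypothetical, the choice of which sample in each group of m is kept is an assumption, and the placement of the gradual part (at the start or at the end of the window) is exposed as a parameter because the actual layouts are given only in Fig. 15. The sketch transforms the rising window W_R; the matching W_L can then be derived as 1 - W_R so that the pair stays complementary.

```c
#include <assert.h>
#include <stddef.h>

/* Build a steeper rising window dst[] of length lw from the initial rising
 * window src[] by keeping one sample in every m, then padding with the
 * constants 0 (before the ramp) and 1 (after the ramp).
 * ramp_at_end != 0 places the ramp in the last lw/m points (assumption). */
void decimate_window(const float *src, size_t lw, size_t m,
                     int ramp_at_end, float *dst)
{
    assert(src != NULL && dst != NULL);
    assert(m >= 1 && lw >= m && lw % m == 0);
    const size_t n = lw / m;                        /* length of gradual part */
    const size_t start = ramp_at_end ? lw - n : 0;  /* where the ramp begins  */
    for (size_t k = 0; k < lw; ++k) {
        if (k < start)
            dst[k] = 0.0f;                  /* constant part before the ramp */
        else if (k < start + n)
            dst[k] = src[(k - start) * m];  /* keep 1 sample in every m      */
        else
            dst[k] = 1.0f;                  /* constant part after the ramp  */
    }
}
```

With an initial window {0.125, 0.375, 0.625, 0.875} and m = 2, placing the ramp first gives {0.125, 0.625, 1, 1} and placing it last gives {0, 0, 0.125, 0.625}, which shifts the weight toward one end of the frame.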
In this embodiment, W_L and W_R of the first embodiment serve as the initial reconstruction window functions. Decimation transforms of the initial reconstruction window yield four pairs of reconstruction window functions with different weight distributions, which allow audio to be processed more flexibly. The defining formulas are image-only in the source:
W_L1, W_R1 (window shapes shown in Fig. 16): [equation image G2009102021641D00121]
W_L2, W_R2 (window shapes shown in Fig. 17): [equation images G2009102021641D00123, G2009102021641D00124]
W_L3, W_R3 (window shapes shown in Fig. 18): [equation images G2009102021641D00131, G2009102021641D00132]
W_L4, W_R4 (window shapes shown in Fig. 19): [equation image G2009102021641D00134]
In this embodiment, the echo type of audio signal data is obtained according to the block energy of audio signal data or the judged result of piece absolute value and preset thresholding.
Specifically, design judgment parameter engLa, engLb, engRa; EngRb, their pairing blocks (line fill area) are shown in figure 21, visible engLa, engLb; EngRa, engRb arranges by the time from the old to the new, when adopting energy to calculate, judges parameter engLa; EngLb, engRa, engRb is obtained by following formula:
Figure G2009102021641D00135
Figure G2009102021641D00136
When absolute values are used, the decision parameters engLa, engLb, engRa and engRb are given by:
Figure G2009102021641D00137
Figure G2009102021641D00138
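As a rough illustration of the two kinds of decision parameter, the sketch below computes four block measures from one frame, either as sums of squares (block energy) or as sums of absolute values. The split into four equal consecutive blocks is an assumption: the actual block boundaries are defined in Figure 21 and the formulas above, which are not reproduced here. The function name `block_measures` is hypothetical.

```python
def block_measures(frame, use_energy=True):
    """Return (engLa, engLb, engRa, engRb) for one frame, oldest block first.

    Assumption: the frame is split into four equal-length consecutive
    blocks; the patent defines the real block boundaries in Figure 21.
    """
    n = len(frame) // 4
    if n == 0:
        raise ValueError("frame must contain at least 4 samples")
    # Block energy sums squared samples; the alternative sums magnitudes.
    measure = (lambda v: v * v) if use_energy else abs
    blocks = [frame[i * n:(i + 1) * n] for i in range(4)]
    return tuple(sum(measure(v) for v in block) for block in blocks)
```

The absolute-value variant avoids multiplications, which may matter on low-power hardware; both variants produce values that can be compared in the same way.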
An echo decision threshold echoRate is set for the decision parameters engLa, engLb, engRa and engRb (it must be greater than 1 and is generally greater than 2). The echo type is determined from the decision parameters and the echo decision threshold, and the reconstruction window to use is selected accordingly. This can be implemented with the following code:
echoControl = 0;                          /* initial value: keep W_L and W_R unchanged */
if (outLen > frmLen)                      /* decision flow for audio expansion */
{
    if (engRa > (engLa + engLb) * echoRate)
        echoControl = 3;                  /* replace W_L, W_R with W_L3, W_R3: prevent pre-echo */
    else if (engRb > (engLa + engLb) * echoRate)
        echoControl = 1;                  /* replace W_L, W_R with W_L1, W_R1: prevent pre-echo */
    if (engLb > (engRa + engRb) * echoRate)
        echoControl = 4;                  /* replace W_L, W_R with W_L4, W_R4: prevent post-echo */
    else if (engLa > (engRa + engRb) * echoRate)
        echoControl = 2;                  /* replace W_L, W_R with W_L2, W_R2: prevent post-echo */
}
else                                      /* decision flow for audio compression */
{
    if (engRb > (engLa + engLb))
    {
        if (engRa > (engLa + engLb))
            echoControl = 4;              /* replace W_L, W_R with W_L4, W_R4: prevent pre-echo */
        else
            echoControl = 2;              /* replace W_L, W_R with W_L2, W_R2: prevent pre-echo */
    }
    if (engLa > (engRa + engRb))
    {
        if (engLb > (engRa + engRb))
            echoControl = 3;              /* replace W_L, W_R with W_L3, W_R3: prevent post-echo */
        else
            echoControl = 1;              /* replace W_L, W_R with W_L1, W_R1: prevent post-echo */
    }
}
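For readers who want to experiment with the decision flow, the following is a direct Python transliteration of the code above. The function name, the boolean `expanding` argument (standing in for the `outLen > frmLen` test) and the default `echo_rate` of 2.0 are illustrative choices, not part of the patent. As in the code above, the compression branch compares block measures without applying echoRate, and the post-echo checks run after the pre-echo checks and so take precedence.

```python
def select_echo_control(eng_la, eng_lb, eng_ra, eng_rb, expanding, echo_rate=2.0):
    """Return 0 to keep W_L/W_R, or n in 1..4 to switch to W_Ln/W_Rn."""
    left = eng_la + eng_lb     # measure of the two older blocks
    right = eng_ra + eng_rb    # measure of the two newer blocks
    echo_control = 0
    if expanding:
        # Pre-echo checks: a newer block dominates the older half.
        if eng_ra > left * echo_rate:
            echo_control = 3
        elif eng_rb > left * echo_rate:
            echo_control = 1
        # Post-echo checks: an older block dominates the newer half.
        if eng_lb > right * echo_rate:
            echo_control = 4
        elif eng_la > right * echo_rate:
            echo_control = 2
    else:
        # Compression: no echo_rate factor, as in the code above.
        if eng_rb > left:
            echo_control = 4 if eng_ra > left else 2
        if eng_la > right:
            echo_control = 3 if eng_lb > right else 1
    return echo_control
```

A frame with evenly distributed energy returns 0 in both modes, so the initial window pair is kept.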
Experiments confirm that selecting the window function according to the echo type further improves speech clarity, as shown in Figures 22 to 29.
Those skilled in the art will understand that audio compression (speeding up) smoothly spreads the audio information of the compressed time span over the shortened output data, whereas audio expansion (slowing down) obtains longer audio data by smoothly overlapping audio information drawn from the past and the future (that is, data newer in time relative to the reference data). In both cases the overlapping can introduce higher-energy signal into, or spread it onto, portions whose original energy was very small, causing post-echo (echo that occurs after the signal) and pre-echo (echo that occurs before the signal). Echo decision is therefore performed: when the energy distribution of the audio signal is extremely uneven, switching the window function reduces the spread of signal energy and mitigates pre-echo and post-echo.
In this embodiment, the echo decision is based on the block energy (or block absolute value) of the audio signal. If the past signal is larger than the present signal, post-echo is likely. During audio compression, the transition band of the window function then needs to shift left, i.e. window functions such as W_L1, W_R1 or W_L3, W_R3 are used to suppress the larger signal at the left end (the past) and reduce its spread (examples in Figures 26 and 28 respectively). During audio expansion, the transition band needs to shift right, i.e. window functions such as W_L2, W_R2 or W_L4, W_R4 are used to suppress the larger signal at the left end (the past) and reduce its spread (examples in Figures 23 and 25 respectively). If the past signal is smaller than the present signal, pre-echo is likely. During audio compression, the transition band then needs to shift right, i.e. window functions such as W_L2, W_R2 or W_L4, W_R4 are used to suppress the larger signal at the right end (the future) and reduce its spread (examples in Figures 27 and 29 respectively). During audio expansion, the transition band needs to shift left, i.e. window functions such as W_L1, W_R1 or W_L3, W_R3 are used to suppress the larger signal at the right end (the future) and reduce its spread (examples in Figures 22 and 24 respectively). The transition bands of W_L1, W_R1 and W_L3, W_R3 shift left by different amounts, and those of W_L2, W_R2 and W_L4, W_R4 shift right by different amounts; which pair to use is decided from the segmented block energies (or block absolute values).
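The four cases in the preceding paragraph can be condensed into a small lookup table. This is only a restatement of the text for quick reference; the dictionary and function names are hypothetical.

```python
# Window pair index n (for W_Ln, W_Rn) -> direction in which its
# transition band is shifted, as described in the text.
SHIFT_OF_PAIR = {1: "left", 3: "left", 2: "right", 4: "right"}

# (operation, echo type) -> candidate window pair indices.
CANDIDATE_PAIRS = {
    ("compress", "post"): (1, 3),
    ("expand", "post"): (2, 4),
    ("compress", "pre"): (2, 4),
    ("expand", "pre"): (1, 3),
}


def transition_shift(operation, echo_type):
    """Return the shift direction of the transition band for one case."""
    first_pair = CANDIDATE_PAIRS[(operation, echo_type)][0]
    return SHIFT_OF_PAIR[first_pair]
```

Note the symmetry: compression with post-echo and expansion with pre-echo both shift the transition band left, while the other two cases shift it right.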
It is easy to see that, by selecting a suitable reconstruction window function according to the echo type, this embodiment further safeguards audio quality after the speed change. Moreover, post-echo tends to occur when the past signal is larger than the present signal, and pre-echo tends to occur when the past signal is smaller than the present signal. Using the block energy (or block absolute value) of the audio signal as the basis for determining the echo type therefore effectively ensures the accuracy of the decision.
Each method embodiment of the present invention may be implemented in software, hardware, firmware, and so on. Whether the invention is implemented in software, hardware, or firmware, the instruction code may be stored in any type of computer-accessible memory (for example permanent or modifiable, volatile or non-volatile, solid-state or non-solid-state, fixed or removable media, and so on). Likewise, the memory may be, for example, programmable array logic (PAL), random access memory (RAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable ROM (EEPROM), a magnetic disk, an optical disc, a digital versatile disc (DVD), and so on.
The third embodiment of the present invention relates to digital audio variable-speed processing equipment. As shown in Figure 30, the equipment comprises:
A filling module, configured to fill the audio signal data to be speed-changed into a buffer until the filled length of the buffer reaches the data processing length L_p.
A windowing module, configured to window the pending audio signal data in the buffer to obtain the output signal x_out. When the speed change is a speed-up, the windowing module left-aligns the audio signal data of length L_p in the buffer with the window function W_L of length L_W and multiplies it point by point by W_L to obtain x_L, right-aligns the audio signal data of length L_p in the buffer with the window function W_R of length L_W and multiplies it point by point by W_R to obtain x_R, and adds x_L and x_R to obtain L_W output signals x_out. When the speed change is a slow-down, the windowing module right-aligns the audio signal data of length L_p in the buffer with the window function W_L of length L_W and multiplies it point by point by W_L to obtain x_L, left-aligns the audio signal data of length L_p in the buffer with the window function W_R of length L_W and multiplies it point by point by W_R to obtain x_R, and adds x_L and x_R to obtain L_W output signals x_out.
A shift module, configured to shift the L_D signals that have completed windowing out of the buffer, and to instruct the filling module to continue filling pending audio signal data at the tail of the buffer until the filled length of the buffer reaches the data processing length L_p.
When the filled length of the buffer reaches the data processing length L_p, the windowing module is triggered. When the windowing module has obtained L_W output signals x_out, the shift module is triggered. This continues until speed-change processing of all audio signal data is complete.
Here, W_L is a window function with a decaying amplitude characteristic and W_R is a window function with an increasing amplitude characteristic; W_L and W_R each have L_W data points, and corresponding points sum to 1 or approximately 1. L_W is a preset value, and the values of L_D and L_p are obtained from L_W and the playback rate r.
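The interplay of the filling, windowing and shift modules can be sketched as the loop below. This is an illustrative sketch under stated assumptions rather than the patented implementation: the function names are hypothetical, and `l_p` and `l_d` are simply passed in by the caller, because the rule that derives L_D and L_p from L_W and r is not given in this section.

```python
def window_frame(buf, w_l, w_r, speed_up):
    """Produce L_W output samples from a buffer holding L_p samples."""
    l_p, l_w = len(buf), len(w_l)
    if len(w_r) != l_w or l_p < l_w:
        raise ValueError("need len(w_l) == len(w_r) <= len(buf)")
    head = buf[:l_w]            # data under a window left-aligned with the buffer
    tail = buf[l_p - l_w:]      # data under a window right-aligned with the buffer
    if speed_up:
        src_l, src_r = head, tail   # W_L left-aligned, W_R right-aligned
    else:
        src_l, src_r = tail, head   # W_L right-aligned, W_R left-aligned
    # x_out = x_L + x_R, computed point by point
    return [a * wl + b * wr for a, wl, b, wr in zip(src_l, w_l, src_r, w_r)]


def change_speed(signal, w_l, w_r, l_p, l_d, speed_up):
    """Fill / window / shift loop over a whole signal (a list of samples)."""
    if not 1 <= l_d <= l_p:
        raise ValueError("l_d must satisfy 1 <= l_d <= l_p")
    out, buf, pos = [], [], 0
    while True:
        need = l_p - len(buf)
        buf.extend(signal[pos:pos + need])       # filling step
        pos += need
        if len(buf) < l_p:                       # input exhausted
            break
        out.extend(window_frame(buf, w_l, w_r, speed_up))  # windowing step
        del buf[:l_d]                            # shift step
    return out
```

Because corresponding points of W_L and W_R sum to 1, a constant input produces the same constant output in both modes, which is a quick sanity check on any window pair.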
In this embodiment, the window functions W_L and W_R used for windowing are initial reconstruction window functions whose corresponding points sum to exactly 1. The initial reconstruction windows W_L and W_R are as follows:
Figure G2009102021641D00171
$W_L(k) = 1 - W_R(k), \quad k = 1, 2, \ldots, L_W$
It is easy to see that the first embodiment is the method embodiment corresponding to this embodiment, and the two can be implemented in cooperation. The technical details described in the first embodiment remain valid in this embodiment and, to avoid repetition, are not repeated here. Correspondingly, the technical details described in this embodiment also apply to the first embodiment.
The fourth embodiment of the present invention relates to digital audio variable-speed processing equipment. The fourth embodiment improves on the third, and the main improvement is as follows. In the third embodiment, the window functions W_L and W_R used for windowing are initial reconstruction window functions whose corresponding points sum to 1, and W_L and W_R are fixed. In this embodiment, the window functions W_L and W_R used for windowing are reconstruction window functions with different weight distributions, selected according to the echo type of the audio signal data. That is, the equipment of this embodiment further comprises a window function selection module, configured to determine the echo type of the audio signal data by comparing the block energy or block absolute value of the audio signal data against a preset threshold, and to output the determined echo type to the windowing module so that the reconstruction window function with the corresponding weight distribution can be selected.
The reconstruction window functions with different weight distributions are either generated individually or obtained by transforming the initial reconstruction window. They are obtained by transformation as follows:
The initial reconstruction window is decimated by an integer ratio to obtain the gradual part of the transformed window, and the constant parts at the two ends are padded with 0 or 1 respectively until the original length of the initial reconstruction window is reached.
It is easy to see that the second embodiment is the method embodiment corresponding to this embodiment, and the two can be implemented in cooperation. The technical details described in the second embodiment remain valid in this embodiment and, to avoid repetition, are not repeated here. Correspondingly, the technical details described in this embodiment also apply to the second embodiment.
It should be noted that each unit mentioned in the equipment embodiments of the present invention is a logical unit. Physically, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units; the physical realization of these logical units is not what matters most. It is the combination of the functions they implement that is key to solving the technical problem addressed by the invention. In addition, to highlight the innovative part of the invention, the above equipment embodiments do not introduce units that are not closely related to solving that technical problem; this does not mean that no other units exist in those embodiments.
Although the present invention has been illustrated and described with reference to certain preferred embodiments, those of ordinary skill in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A digital audio variable-speed processing method, characterized by comprising the following steps:
A. filling the audio signal data to be speed-changed into a buffer until the filled length of said buffer reaches the data processing length L_p;
B. windowing the pending audio signal data in said buffer in the following manner to obtain the output signal x_out:
if said speed change is a speed-up, left-aligning the audio signal data of length L_p in said buffer with the window function W_L of length L_W and multiplying it point by point by W_L to obtain x_L, right-aligning the audio signal data of length L_p in said buffer with the window function W_R of length L_W and multiplying it point by point by W_R to obtain x_R, and adding the resulting x_L and x_R to obtain L_W said output signals x_out;
if said speed change is a slow-down, right-aligning the audio signal data of length L_p in said buffer with the window function W_L of length L_W and multiplying it point by point by W_L to obtain x_L, left-aligning the audio signal data of length L_p in said buffer with the window function W_R of length L_W and multiplying it point by point by W_R to obtain x_R, and adding the resulting x_L and x_R to obtain L_W said output signals x_out;
C. shifting the L_D signals that have completed windowing out of the buffer, and continuing to fill pending audio signal data at the tail of the buffer until the filled length of said buffer reaches the data processing length L_p;
repeating said step B and step C until speed-change processing of all audio signal data is complete;
wherein said W_L is a window function with a decaying amplitude characteristic, said W_R is a window function with an increasing amplitude characteristic, W_L and W_R each have L_W data points, and corresponding points sum to 1 or approximately 1; said L_W is a preset value, and the values of said L_D and L_p are obtained from said L_W and the playback rate r.
2. The digital audio variable-speed processing method according to claim 1, characterized in that, when the pending audio signal data in said buffer is windowed, said W_L and W_R are initial reconstruction window functions whose corresponding points sum to 1; or said W_L and W_R are reconstruction window functions with different weight distributions selected according to the echo type of the audio signal data; said reconstruction window functions with different weight distributions are either generated individually or obtained by transforming said initial reconstruction window.
3. The digital audio variable-speed processing method according to claim 2, characterized in that the echo type of said audio signal data is determined by comparing the block energy or block absolute value of said audio signal data against a preset threshold.
4. The digital audio variable-speed processing method according to claim 2, characterized in that said reconstruction window functions with different weight distributions are obtained by transforming the initial reconstruction window as follows:
said initial reconstruction window is decimated by an integer ratio to obtain the gradual part of the transformed window, and the constant parts at the two ends are padded with 0 or 1 respectively until the original length of said initial reconstruction window is reached.
5. The digital audio variable-speed processing method according to claim 2, characterized in that said initial reconstruction windows W_L and W_R are as follows:
Figure 2009102021641100001DEST_PATH_IMAGE002
$W_L(k) = 1 - W_R(k), \quad k = 1, 2, \ldots, L_W$
6. Digital audio variable-speed processing equipment, characterized by comprising:
a filling module, configured to fill the audio signal data to be speed-changed into a buffer until the filled length of said buffer reaches the data processing length L_p;
a windowing module, configured to window the pending audio signal data in said buffer to obtain the output signal x_out; when said speed change is a speed-up, said windowing module left-aligns the audio signal data of length L_p in said buffer with the window function W_L of length L_W and multiplies it point by point by W_L to obtain x_L, right-aligns the audio signal data of length L_p in said buffer with the window function W_R of length L_W and multiplies it point by point by W_R to obtain x_R, and adds the resulting x_L and x_R to obtain L_W said output signals x_out; when said speed change is a slow-down, said windowing module right-aligns the audio signal data of length L_p in said buffer with the window function W_L of length L_W and multiplies it point by point by W_L to obtain x_L, left-aligns the audio signal data of length L_p in said buffer with the window function W_R of length L_W and multiplies it point by point by W_R to obtain x_R, and adds the resulting x_L and x_R to obtain L_W said output signals x_out;
a shift module, configured to shift the L_D signals that have completed windowing out of the buffer, and to instruct said filling module to continue filling pending audio signal data at the tail of the buffer until the filled length of said buffer reaches the data processing length L_p;
when the filled length of said buffer reaches the data processing length L_p, said windowing module is triggered; when said windowing module has obtained L_W said output signals x_out, said shift module is triggered, until speed-change processing of all audio signal data is complete;
wherein said W_L is a window function with a decaying amplitude characteristic, said W_R is a window function with an increasing amplitude characteristic, W_L and W_R each have L_W data points, and corresponding points sum to 1 or approximately 1; said L_W is a preset value, and the values of said L_D and L_p are obtained from said L_W and the playback rate r.
7. The digital audio variable-speed processing equipment according to claim 6, characterized in that said window functions W_L and W_R used for windowing are initial reconstruction window functions whose corresponding points sum to 1; or
said window functions W_L and W_R used for windowing are reconstruction window functions with different weight distributions selected according to the echo type of the audio signal data; said reconstruction window functions with different weight distributions are either generated individually or obtained by transforming said initial reconstruction window.
8. The digital audio variable-speed processing equipment according to claim 7, characterized by further comprising a window function selection module, configured to determine the echo type of said audio signal data by comparing the block energy or block absolute value of said audio signal data against a preset threshold, and to output the determined echo type to said windowing module.
9. The digital audio variable-speed processing equipment according to claim 7, characterized in that said reconstruction window functions with different weight distributions are obtained by transforming the initial reconstruction window as follows:
said initial reconstruction window is decimated by an integer ratio to obtain the gradual part of the transformed window, and the constant parts at the two ends are padded with 0 or 1 respectively until the original length of said initial reconstruction window is reached.
10. The digital audio variable-speed processing equipment according to claim 7, characterized in that said initial reconstruction windows W_L and W_R are as follows:
Figure 135250DEST_PATH_IMAGE002
$W_L(k) = 1 - W_R(k), \quad k = 1, 2, \ldots, L_W$
CN 200910202164 2009-12-31 2009-12-31 Method and equipment for processing digital audio in variable speed Active CN102117613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910202164 CN102117613B (en) 2009-12-31 2009-12-31 Method and equipment for processing digital audio in variable speed

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200910202164 CN102117613B (en) 2009-12-31 2009-12-31 Method and equipment for processing digital audio in variable speed

Publications (2)

Publication Number Publication Date
CN102117613A CN102117613A (en) 2011-07-06
CN102117613B true CN102117613B (en) 2012-12-12

Family

ID=44216345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910202164 Active CN102117613B (en) 2009-12-31 2009-12-31 Method and equipment for processing digital audio in variable speed

Country Status (1)

Country Link
CN (1) CN102117613B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102419981B (en) * 2011-11-02 2013-04-03 展讯通信(上海)有限公司 Zooming method and device for time scale and frequency scale of audio signal
CN106469559B (en) * 2015-08-19 2020-10-16 中兴通讯股份有限公司 Voice data adjusting method and device
CN105208426B (en) * 2015-09-24 2018-07-06 福州瑞芯微电子股份有限公司 A kind of method and system of audio-visual synchronization speed change
CN110333722A (en) * 2019-07-11 2019-10-15 北京电影学院 A kind of robot trajectory generates and control method, apparatus and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5175769A (en) * 1991-07-23 1992-12-29 Rolm Systems Method for time-scale modification of signals
EP0608833A2 (en) * 1993-01-25 1994-08-03 Matsushita Electric Industrial Co., Ltd. Method of and apparatus for performing time-scale modification of speech signals
US5781885A (en) * 1993-09-09 1998-07-14 Sanyo Electric Co., Ltd. Compression/expansion method of time-scale of sound signal
CN1208490A (en) * 1996-11-11 1999-02-17 松下电器产业株式会社 Sound reproducing speed converter
CN1440549A (en) * 2000-07-26 2003-09-03 Ssi株式会社 Continuously variable time scale modification of digital audio signals

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180416

Address after: The 300456 Tianjin FTA test area (Dongjiang Bonded Port) No. 6865 North Road, 1-1-1802-7 financial and trade center of Asia

Patentee after: Xinji Lease (Tianjin) Co.,Ltd.

Address before: 201203 Shanghai city Zuchongzhi road Pudong Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee before: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20110706

Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Assignor: Xinji Lease (Tianjin) Co.,Ltd.

Contract record no.: 2018990000196

Denomination of invention: Method and equipment for processing digital audio in variable speed

Granted publication date: 20121212

License type: Exclusive License

Record date: 20180801

TR01 Transfer of patent right

Effective date of registration: 20221027

Address after: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Address before: 300456 1-1-1802-7, north area of financial and Trade Center, No. 6865, Asia Road, Tianjin pilot free trade zone (Dongjiang Bonded Port Area)

Patentee before: Xinji Lease (Tianjin) Co.,Ltd.