CN103258552B - Method for adjusting playback speed - Google Patents

Info

Publication number
CN103258552B
Authority
CN
China
Prior art keywords
audio frequency
frequency frame
sum total
window type
energy sum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210038338.7A
Other languages
Chinese (zh)
Other versions
CN103258552A (en
Inventor
陈亘志
陈昭宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ali Corp
Original Assignee
Ali Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ali Corp filed Critical Ali Corp
Priority to CN201210038338.7A priority Critical patent/CN103258552B/en
Publication of CN103258552A publication Critical patent/CN103258552A/en
Application granted granted Critical
Publication of CN103258552B publication Critical patent/CN103258552B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a method for adjusting playback speed. During perceptual audio decoding, it uses frequency-related data of the audio data to decide whether to discard or duplicate part of the audio data, so that the playback speed is changed within the decoding process itself. The invention therefore does not need a large number of buffer registers to hold audio data.

Description

Method for adjusting playback speed
Technical field
The present invention relates to a media processing method and a device thereof, and in particular to a method and device for adjusting media playback speed.
Background art
When a user listens to compressed audio files such as MP3, WMA, or AAC (MPEG-1 Audio Layer 3 / Windows Media Audio / Advanced Audio Coding) on a multimedia platform, the user may speed up playback to search for a desired passage, or slow down playback to listen closely to the details of a passage (expansion). To keep playback quality from being noticeably distorted by the speed change, time-scale modification (TSM) is widely used in the industry. Conventional time-domain TSM methods, such as overlap-add (OLA) and synchronized overlap-add (SOLA), divide the input audio signal into many segments, overlap segments that are adjacent in time, and apply a weighted cross-fade to the overlapping region. However, such TSM methods need a large number of buffer registers to hold the segments.
Existing TSM methods also use the short-time discrete Fourier transform (ST-DFT) to move the input audio signal from the time domain to the frequency domain for analysis, but phase distortion arises when the signal is converted back to the time domain after analysis.
U.S. Patent Publication No. 20050010397 discloses a TSM method based on the short-time Fourier transform. According to the variation of the human auditory frequency response, it selects particular spectral bands of the audio data; these bands follow the Bark scale of the human auditory perception model and are used for phase locking. Each spectral band is marked with a spectral peak. The spectral peak and the spectral lines near to or far from it receive different phase processing, so when the audio data is later converted back to the time domain for signal window reconstruction, phase distortion easily occurs and degrades playback quality.
Summary of the invention
Therefore, the present invention mainly provides a method and device for adjusting playback speed that do not need a large number of buffer registers.
The present invention discloses a method for adjusting playback speed, comprising: receiving audio data at a perceptual audio decoding device; performing, by the perceptual audio decoding device, frequency analysis of a first audio frame of the audio data; obtaining first frequency-domain analysis data from the frequency analysis; receiving a speed adjustment signal; when the speed adjustment signal indicates that playback of the audio data is to be sped up, determining from the first frequency-domain analysis data whether to discard the first audio frame; when the speed adjustment signal indicates that playback of the audio data is to be slowed down, determining from the first frequency-domain analysis data whether to duplicate the first audio frame; when the first audio frame is determined to be discardable, discarding, by the perceptual audio decoding device, at least part of the data of the first audio frame; and when the first audio frame is determined to be duplicable, duplicating, by the perceptual audio decoding device, at least part of the data of the first audio frame.
The present invention further discloses a method for adjusting playback speed, comprising: receiving audio data at a perceptual audio decoding device, the audio data comprising a plurality of audio frames; performing, by the perceptual audio decoding device, frequency analysis of the plurality of audio frames; receiving a speed adjustment signal; when the speed adjustment signal indicates that playback of the audio data is to be sped up to N/(N-M) times, performing, on each of N consecutive audio frames among the plurality of audio frames, an adjustment decision procedure that determines whether the frame being processed can be discarded, where N and M are positive integers; when the adjustment decision procedure determines that M of the N consecutive audio frames can be discarded, discarding, by the perceptual audio decoding device, at least part of the data of those M audio frames; when the speed adjustment signal indicates that playback of the audio data is to be slowed down to N/(N+M) times, performing, on each of N consecutive audio frames among the plurality of audio frames, an adjustment decision procedure that determines whether the frame being processed can be duplicated; and when the adjustment decision procedure determines that M of the N consecutive audio frames can be duplicated, duplicating, by the perceptual audio decoding device, at least part of the data of those M audio frames. The adjustment decision procedure comprises: obtaining first frequency-domain analysis data, from the frequency analysis, corresponding to a first audio frame being processed; when the speed adjustment signal indicates that playback of the audio data is to be sped up, determining from the first frequency-domain analysis data whether to discard at least part of the data of the first audio frame; and when the speed adjustment signal indicates that playback of the audio data is to be slowed down, determining from the first frequency-domain analysis data whether to duplicate at least part of the data of the first audio frame.
The present invention further discloses a method for speeding up playback, comprising: receiving audio data at a perceptual audio decoding device; performing, by the perceptual audio decoding device, frequency analysis of a first audio frame of the audio data; obtaining first frequency-domain analysis data from the frequency analysis; receiving a speed-up adjustment signal; determining from the first frequency-domain analysis data whether to discard the first audio frame; and when the first audio frame is determined to be discardable, discarding, by the perceptual audio decoding device, at least part of the data of the first audio frame according to the playback speed indicated by the speed-up adjustment signal.
The present invention further discloses a method for slowing down playback, comprising: receiving audio data at a perceptual audio decoding device; performing, by the perceptual audio decoding device, frequency analysis of a first audio frame of the audio data; obtaining first frequency-domain analysis data from the frequency analysis; receiving a slow-down adjustment signal; determining from the first frequency-domain analysis data whether to duplicate the first audio frame; and when the first audio frame is determined to be duplicable, duplicating, by the perceptual audio decoding device, at least part of the data of the first audio frame according to the playback speed indicated by the slow-down adjustment signal.
The method and device for adjusting playback speed provided by the present invention do not need a large number of buffer registers.
Brief description of the drawings
Fig. 1 is a flowchart of a process according to an embodiment of the present invention.
Fig. 2 is a flowchart of a process according to an embodiment of the present invention.
Fig. 3 is a flowchart of a process according to an embodiment of the present invention.
Fig. 4A and Fig. 4B are a flowchart of a process according to an embodiment of the present invention.
Fig. 5 is a flowchart of a process according to an embodiment of the present invention.
Fig. 6 is a flowchart of a forced duplication/discarding process according to an embodiment of the present invention.
Fig. 7 is a block diagram of a speed adjustment device according to an embodiment of the present invention.
Reference numerals:
10, 20, 30, 40, 60 processes
50 speed adjustment device
500 audio reading device
510 processor unit
520 storage unit
530 input unit
540 output unit
522 program code
100,102,104,106,108,110,112,114,116,118,200,202,204,206,208,210,212,300,302,304,306,308,310,312,314,316,318,320,322,324,400,402,404,406,408,410,412,414,416,418, S910, S920, S930, S940, S950,602,604,606,608,610,612,614,616,618,620,622,624,626 steps
Detailed description of embodiments
Fig. 1 is a flowchart of speed adjustment according to an embodiment of the invention, used to change the playback speed of sound without changing its pitch. Referring to Fig. 1, this embodiment applies to multimedia playback devices such as televisions, set-top boxes, DVD players, and MP3 players. According to the buffer capacity available in the playback device, it determines the number of audio frames of audio data read during playback, and according to the distribution of the difference sums of these audio frames, it determines which audio content is played, thereby providing good playback quality.
First, audio data segmented into multiple audio frames is obtained (step S910). The audio data includes audio data of television programs or multimedia files, and each audio frame in the audio data contains multiple frequency components.
After the audio frames are obtained, frequency-domain analysis of the audio frames is performed (step S920). In this analysis the frequency components of an audio frame are calculated. One way is to use the fast Fourier transform (FFT) to obtain the complex frequency-domain value of each frequency, which divides the audio frame into multiple FFT bins; the energy of each FFT bin is then calculated and taken as the energy of the corresponding frequency component. Another way is to use a filter bank to divide the audio frame into multiple subbands and to take the energy of each subband as the energy of the corresponding frequency component.
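The per-bin energy calculation of step S920 can be sketched as follows. This is only an illustration under assumptions: it uses a naive discrete Fourier transform from the Python standard library in place of an FFT (the values are the same, only slower to compute), it takes the energy of a bin to be the squared magnitude of its complex value, and the function names and the 16-sample test frame are hypothetical, not taken from the patent.

```python
import cmath
import math

def dft(samples):
    """Naive O(n^2) discrete Fourier transform; an FFT yields the same values faster."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def bin_energies(samples):
    """Energy |X[k]|^2 of each non-redundant frequency bin of a real-valued frame."""
    spectrum = dft(samples)
    return [abs(x) ** 2 for x in spectrum[: len(samples) // 2 + 1]]

# Hypothetical 16-sample frame holding a pure tone centred on bin 2.
frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
energies = bin_energies(frame)
strongest_bin = max(range(len(energies)), key=energies.__getitem__)
```

For the test frame above, almost all of the energy lands in bin 2, which is the kind of per-component energy the later decision steps compare between frames.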
When the playback device receives a change-playback-speed instruction entered by the user, it determines whether the user wants fast or slow playback (step S930). It then plays the audio data while dynamically adjusting the proportion of audio frames played, according to the playback speed multiple A and the audio frames described above, where A is a positive number.
When the user wants fast playback, the playback device deletes the audio frames that satisfy the discard condition, according to an adjustment decision process (whose principle is detailed below), to achieve fast playback (step S950). For example, if frame 2 of audio frames 1, 2, 3 is deleted, frames 1, 3 are played. Conversely, when the user wants slow playback, the playback device duplicates the audio frames that satisfy the duplication condition, according to the adjustment decision process, to achieve slow playback (step S940). For example, if frame 2 of audio frames 1, 2, 3 is repeated, frames 1, 2, 2, 3 are played. In practice the playback speed multiple A may be a decimal, such as 1.75 or 0.75.
For example, when the playback device performs fast playback at 2x speed, it can discard, out of B audio frames, the B/2 audio frames that the decision process finds to satisfy the discard condition. The frames with larger changes are thus played, so the user still hears the important information in the audio data during fast playback. On the other hand, when the playback device performs slow playback at 0.66x speed, it can repeat once each of the B/2 audio frames, out of the B audio frames, that the decision process finds to satisfy the duplication condition. The frames with smaller changes are thus replayed, so the user hears audio that is stretched in time but unchanged in pitch. With this method the playback device can duplicate and discard audio frame data within its existing buffer capacity without affecting normal playback of the audio data. In other words, the playback device saves buffer registers while preserving most of the detail of the sound, and gives the user a listening experience suited to fast browsing and focused playback.
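The frame examples above (1, 2, 3 becoming 1, 3 or 1, 2, 2, 3) and the 2x and 0.66x figures can be checked with a small sketch. The function names, the mode strings, and the choice to pass the qualifying frames as a set of indices are assumptions made for illustration; how frames qualify is decided by the adjustment decision process described later.

```python
def play_order(frames, selected, mode):
    """Return the frames actually played.

    selected: indices of frames that met the discard or duplication condition.
    mode: "fast" drops the selected frames, "slow" plays each of them twice.
    """
    played = []
    for index, frame in enumerate(frames):
        if index in selected and mode == "fast":
            continue                      # discarded frame is never played
        played.append(frame)
        if index in selected and mode == "slow":
            played.append(frame)          # duplicated frame is played a second time
    return played

def effective_speed(frames_in, frames_played):
    """Playback speed multiple: input duration divided by played duration."""
    return frames_in / frames_played

fast = play_order([1, 2, 3], {1}, "fast")      # frame 2 deleted
slow = play_order([1, 2, 3], {1}, "slow")      # frame 2 repeated

# B = 8 frames with B/2 = 4 qualifying frames, as in the 2x and 0.66x example.
b_frames = list(range(8))
half = {0, 2, 4, 6}
speed_fast = effective_speed(8, len(play_order(b_frames, half, "fast")))
speed_slow = effective_speed(8, len(play_order(b_frames, half, "slow")))
```

Discarding B/2 of B frames leaves B/2 frames, giving B / (B/2) = 2x; repeating B/2 frames once plays 3B/2 frames, giving B / (3B/2) = 2/3, which is the roughly 0.66x quoted in the text.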
It is worth noting that the above way of dividing frequency components and calculating their energy is only one embodiment of the invention. Those skilled in the art may, as actually needed, change the FFT length or the number of filter bank subbands, or use the wavelet transform, the discrete cosine transform (DCT), or other techniques to divide the frequency components and calculate their energy; this embodiment does not limit the scope.
Compressed audio data, such as MPEG, AC3, DTS, WMA, or AAC, is already segmented into audio frames at compression time, and each frame is compressed only after all of its frequency components have been calculated. Therefore, when playing compressed audio data in these formats, the playback device only needs to decompress the received data to obtain the audio data segmented into audio frames together with all frequency components of each frame, and it can calculate the energy of these frequency components directly.
Please refer to Fig. 2, Fig. 3, Fig. 4A, and Fig. 4B. Fig. 2 is a speed adjustment flowchart of an embodiment of the invention, and Fig. 3, Fig. 4A, and Fig. 4B are adjustment decision flowcharts of other embodiments. Speed adjustment process 10 can be implemented in a perceptual audio decoding device and works with adjustment decision process 20 to adjust audio playback speed within the perceptual audio decoding procedure. Speed adjustment process 10 comprises the following steps:
Step 100: Start.
Step 102: Receive an audio frame of audio data.
Step 104: Perform entropy decoding of the audio frame.
Step 106: Perform inverse quantization of the audio frame.
Step 108: Perform frequency analysis of the audio frame according to an auditory perception model, and execute adjustment decision process 20.
Step 110: According to the decision result output by adjustment decision process 20, determine whether to discard the audio frame. If so, go to step 118; if not, go to step 112.
Step 112: Perform the inverse modified discrete cosine transform (IMDCT) of the audio frame according to its window type.
Step 114: According to the decision result of adjustment decision process 20, determine whether to duplicate the audio frame. If so, go to step 116; if not, go to step 118.
Step 116: Duplicate the audio frame, preset the next decision result to "do not duplicate", and go to step 112.
Step 118: If a next audio frame exists, receive it and go to step 104.
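The control flow of steps 102 to 118 can be sketched as a loop. This is a structural illustration only: the decoding stages are passed in as placeholder callables (a real entropy decoder, inverse quantizer, and IMDCT are far more involved and carry state between frames), and the function name, parameter names, and the three verdict strings are assumptions, not part of the patent.

```python
def run_speed_adjustment(coded_frames, decide, entropy_decode, inverse_quantize, imdct):
    """Decode frames one by one, discarding or duplicating them per the decision result.

    decide(freq_data) must return "discard", "duplicate", or "none" (hypothetical labels).
    """
    output = []
    for coded in coded_frames:                 # steps 102 and 118: next frame, if any
        freq_data = inverse_quantize(entropy_decode(coded))   # steps 104 and 106
        verdict = decide(freq_data)            # step 108: adjustment decision process
        if verdict == "discard":               # step 110: skip straight to the next frame
            continue
        output.append(imdct(freq_data))        # step 112: back to the time domain
        if verdict == "duplicate":             # step 114
            # Step 116: transform the copy as well; the next check is preset to
            # "do not duplicate", so the frame is played exactly twice.
            output.append(imdct(freq_data))
    return output

# Toy run with identity stages and a fixed verdict per frame.
identity = lambda data: data
verdicts = {"a": "none", "b": "discard", "c": "duplicate"}
decoded = run_speed_adjustment(["a", "b", "c"], verdicts.get, identity, identity, identity)
```

Note that a discarded frame never reaches the IMDCT, so the decision costs no time-domain work and needs no storage of neighbouring frames.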
As described above, speed adjustment process 10 performs perceptual audio decoding on each audio frame of the audio data in turn; the audio data may be in a compressed format such as MP3, WMA, or AAC. First, each audio frame undergoes entropy decoding, for example Huffman decoding. Next, the audio frame undergoes inverse quantization, which may include decoding the scale factors used for quantization at the encoder. After inverse quantization, adjustment decision process 20 uses the frequency-domain analysis data and the speed adjustment signal to determine whether the audio frame should be duplicated, discarded, or neither, and produces a corresponding decision result. When the speed adjustment signal indicates that playback is to be sped up, adjustment decision process 20 determines from the frequency-domain analysis data whether to discard the audio frame; when the signal indicates that playback is to be slowed down, it determines from the frequency-domain analysis data whether to duplicate the audio frame. The speed adjustment signal can be generated by the audio playback system when the user changes the playback speed. The detailed operation of adjustment decision process 20 is explained below.
According to the decision result, the perceptual audio decoding device first determines whether the audio frame should be discarded. If so, it proceeds as in step 118 and decodes the next audio frame. The data of the discarded frame is therefore not played, which speeds up playback. Conversely, if the audio frame does not need to be discarded, the perceptual audio decoding device performs the IMDCT and synthesis of the frame according to its window type. The IMDCT is an inverse time-frequency transform that operates on long or short windows and converts the frequency-domain data of the audio frame (which may be included in the frequency-domain analysis data) into time-domain data. After one IMDCT is completed, the perceptual audio decoding device determines whether the audio frame should be duplicated. If so, it presets the next decision result to "no", so that the next check finds that no duplication is needed, and performs the IMDCT on the duplicated frame. The data of this frame is therefore played twice, which slows down playback. Because the decision result is preset so that no duplication is needed at the next check of this frame, speed adjustment process 10 then moves on to decode the next audio frame. In this way, according to the decision results of adjustment decision process 20, speed adjustment process 10 can discard or duplicate each audio frame of the audio data to speed up or slow down playback.
Short-window data that is not discarded still goes through the IMDCT and windowing of step 112 and is played after perceptual audio decoding is complete.
Note that in the present invention the audio frame is the smallest unit of data discarding and duplication, and it may contain long and short windows in different proportions depending on the audio format. For example, in a format A, one long-window length is treated as one audio frame, and the length of one long window may equal the combined length of 4 short windows or of several short windows, so that 4 short windows or several short windows are treated as one audio frame. As another example, in a format B, an audio frame depends on how its long and short windows are combined. In perceptual audio coding, an audio frame composed of a long window represents a relatively stationary stretch of signal, whereas an audio frame composed of short windows represents a stretch of signal that changes more sharply. Therefore, when adjusting playback speed, duplicating or discarding only long-window data is less likely to affect playback quality.
Therefore, the frequency-domain analysis data may include a window type indicator, which indicates whether the window type used by the audio frame for the IMDCT is a long window or a short window. In this case, adjustment decision process 20 of Fig. 3 comprises the following steps:
Step 200: Receive a speed adjustment signal.
Step 202: Obtain the frequency-domain analysis data, which includes the window type indicator of the audio frame.
Step 204: Determine whether the window type indicator shows that the audio frame is of the long window type. If so, go to step 208; if not, go to step 206.
Step 206: Produce a decision result of "do not discard / do not duplicate".
Step 208: Determine whether the speed adjustment signal indicates that playback of the audio data is to be sped up. If so, go to step 210; if not, go to step 212.
Step 210: Produce a decision result of "discard".
Step 212: Produce a decision result of "duplicate".
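Steps 200 to 212 reduce to a two-level branch, sketched below. The function name, the string labels for window types and verdicts, and the boolean speed-up flag are assumptions chosen for illustration; the patent only specifies the branching itself.

```python
def decide_by_window_type(window_type, speed_up):
    """Window-type-only decision (steps 204 to 212).

    window_type: "long" for a long window; any other label counts as not long.
    speed_up: True if the speed adjustment signal asks for faster playback.
    """
    if window_type != "long":                      # step 204 fails
        return "none"                              # step 206: neither discard nor duplicate
    return "discard" if speed_up else "duplicate"  # step 208, then step 210 or 212

results = [
    decide_by_window_type("long", speed_up=True),
    decide_by_window_type("long", speed_up=False),
    decide_by_window_type("short", speed_up=True),
]
```

Only long-window frames are ever touched, which matches the earlier observation that long windows mark the more stationary parts of the signal.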
Adjustment decision process 20 of Fig. 3 mainly uses the window type of the audio frame as the criterion for whether the frame should be discarded or duplicated. As described above, when the speed adjustment signal indicates that playback is to be sped up and the window type indicator shows that the audio frame is a long window, adjustment decision process 20 decides that the frame can be discarded. When the speed adjustment signal indicates that playback is to be slowed down and the window type indicator shows a long window, adjustment decision process 20 decides that the frame can be duplicated. When the window type indicator shows any other window type (such as a short window or a long-to-short transition window), adjustment decision process 20 tells speed adjustment process 10 that the audio frame need be neither discarded nor duplicated.
Besides the window type indicator, the frequency-domain analysis data may also include spectral line data of the audio frame. Adjustment decision process 20 of Fig. 4A and Fig. 4B uses both the window type and the spectral line data of the audio frame as criteria for whether the frame should be discarded or duplicated, and comprises the following steps:
Step 300: Receive a speed adjustment signal.
Step 302: Obtain the frequency-domain analysis data of the audio frame, which includes a window type indicator and spectral line data.
Step 304: Determine whether the window type indicator shows that the audio frame is of the long window type. If so, go to step 308; if not, go to step 306.
Step 306: Produce a decision result of "do not discard / do not duplicate".
Step 308: Divide the spectral line data into multiple band units and calculate the energy sum Pcurr of these band units.
Step 310: Obtain the energy sum Pprev of the previous audio frame over the same band units.
Step 312: Calculate the energy sum difference Pdiff = Pprev - Pcurr.
Step 314: Determine whether |Pdiff| < THa. If so, go to step 316; if not, go to step 306.
Step 316: Determine whether Pdiff > THb. If so, go to step 318; if not, go to step 306.
Step 318: Determine whether Pprev < THc and Pcurr < THc. If so, go to step 320; if not, go to step 306.
Step 320: Determine whether the speed adjustment signal indicates that playback of the audio data is to be sped up. If so, go to step 322; if not, go to step 324.
Step 322: Produce a decision result of "discard".
Step 324: Produce a decision result of "duplicate".
As described above, for audio frame window types other than the long window, adjustment decision process 20 of Fig. 4A and Fig. 4B likewise tells speed adjustment process 10 that the audio frame need be neither discarded nor duplicated. In adjustment decision process 20 of Fig. 4A and Fig. 4B, the division of the spectral line data into band units can vary with system requirements. For example, the spectral line data can be divided into contiguous band units that together cover the whole frequency range of the audio frame, in which case the energy sums Pcurr and Pprev are the total energies of the current and previous audio frames, respectively. Alternatively, the spectral line data can be divided, according to signal flatness, into band units classified as tone-like or noise-like. The division into band units and the energy calculation can follow the approach described for frequency components and are not repeated here. In addition, adjustment decision process 20 of Fig. 4A and Fig. 4B defines thresholds THa, THb, and THc, which the system sets according to the characteristics of tone-like signals, of the post-masking effect in auditory perception, and of silence signals, respectively; these characteristics should be known to those skilled in the art and are not repeated here. Accordingly, adjustment decision process 20 of Fig. 4A and Fig. 4B indicates that an audio frame should be discarded or duplicated only when all of the following conditions hold: (i) the absolute value of the energy sum difference Pdiff is less than threshold THa, which relates to the energy sum difference of tone-like signals; (ii) Pdiff is greater than threshold THb, which relates to the post-masking effect; and (iii) the energy sum Pprev is less than threshold THc, which relates to silent audio signals, and the energy sum Pcurr is also less than THc. When any of these conditions fails, adjustment decision process 20 of Fig. 4A and Fig. 4B tells speed adjustment process 10 that the audio frame need be neither discarded nor duplicated. When conditions (i), (ii), and (iii) all hold, adjustment decision process 20 tells speed adjustment process 10 to discard or duplicate the audio frame, according to whether the speed adjustment signal indicates speeding up or slowing down.
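Conditions (i) to (iii) together with the long-window check can be sketched as follows. Several things here are assumptions for illustration: the energy of a spectral line is taken as its squared value, band units are given as (start, stop) index pairs, and the threshold values in the example are arbitrary, since the patent leaves THa, THb, and THc to the system designer.

```python
def band_energy_sum(spectral_lines, bands):
    """Sum of squared spectral-line values over the given (start, stop) band units."""
    return sum(
        value * value
        for start, stop in bands
        for value in spectral_lines[start:stop]
    )

def decide_by_energy(window_type, p_prev, p_curr, speed_up, th_a, th_b, th_c):
    """Decision of steps 304 to 324; returns "discard", "duplicate", or "none"."""
    if window_type != "long":                      # step 304
        return "none"
    p_diff = p_prev - p_curr                       # step 312
    if not abs(p_diff) < th_a:                     # step 314, condition (i)
        return "none"
    if not p_diff > th_b:                          # step 316, condition (ii)
        return "none"
    if not (p_prev < th_c and p_curr < th_c):      # step 318, condition (iii)
        return "none"
    return "discard" if speed_up else "duplicate"  # steps 320 to 324

# Hypothetical thresholds and frames, chosen only to exercise each branch.
TH_A, TH_B, TH_C = 5.0, 0.0, 100.0
bands = [(0, 2), (2, 4)]
p_prev = band_energy_sum([1, 2, 2, 1], bands)      # 1 + 4 + 4 + 1 = 10
p_curr = band_energy_sum([1, 1, 2, 1], bands)      # 1 + 1 + 4 + 1 = 7
quiet_and_steady = decide_by_energy("long", p_prev, p_curr, True, TH_A, TH_B, TH_C)
energy_rising = decide_by_energy("long", p_curr, p_prev, True, TH_A, TH_B, TH_C)
too_loud = decide_by_energy("long", 200.0, 199.0, False, TH_A, TH_B, TH_C)
```

With these example thresholds, a frame qualifies only when its band energy is low, close to that of the previous frame, and not rising, so the frames that get dropped or repeated are the ones a listener is least likely to miss or notice.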
In adjustment decision process 20 of Fig. 4A and Fig. 4B, those skilled in the art may use any one of, or any combination of, the four criteria (the long-window check and conditions (i) to (iii)) to decide whether to discard or duplicate; it is not necessary to satisfy all four. For example, an audio frame may be discarded or duplicated as soon as it is determined to be a long window. The perceptual audio decoding stages described above, such as entropy decoding, inverse quantization, frequency analysis of audio frames, and the IMDCT, should be known to those skilled in the art. The present invention mainly uses the frequency analysis information already available in perceptual audio decoding as the basis for duplicating or discarding audio frames, so the phase distortion problem does not arise during the subsequent signal reconstruction of perceptual audio decoding. In addition, the invention can decide immediately which frames to duplicate or discard, so it does not need a large number of buffer registers to store preceding and following audio frame data while adjusting playback speed, which lowers production cost.
Please refer to Fig. 5, which is a flowchart of speed adjustment process 40 according to an embodiment of the invention. Speed adjustment process 40 can be implemented in a perceptual audio decoding device. It uses speed adjustment process 10 to discard, duplicate, or normally process each audio frame, and thereby adjusts the playback speed of the audio data to the speed the user wants. It comprises the following steps:
Step 400: Receive audio data, which is input as consecutive audio frames.
Step 402: Receive and evaluate a speed adjustment signal. When the speed adjustment signal indicates that playback of the audio data is to be sped up to N/(N-M) times, go to step 404; when it indicates that playback is to be slowed down to N/(N+M) times, go to step 412.
Step 404: Using speed adjustment process 10, determine whether each of N consecutive audio frames among the plurality of audio frames can be discarded.
Step 406: Determine whether M of the N consecutive audio frames can be discarded. If so, go to step 408; if not, go to step 420.
Step 408: The perceptual audio decoding device discards at least part of the data of the M audio frames.
Step 410: Obtain the next group of N consecutive audio frames of the audio data, and go to step 404.
Step 412: Using speed adjustment process 10, determine whether each of N consecutive audio frames among the plurality of audio frames can be duplicated.
Step 414: Determine whether M of the N consecutive audio frames can be duplicated. If so, go to step 416; if not, go to step 424.
Step 416: The perceptual audio decoding device duplicates at least part of the data of the M audio frames.
Step 418: Obtain the next group of N consecutive audio frames of the audio data, and go to step 412.
Step 420: Determine whether K groups of N consecutive audio frames have been processed and whether fewer than K × M audio frames in total could be discarded in those K groups. If so, go to step 422; if not, go to step 410.
Step 422: Discard all or part of the data of subsequent audio frames.
Step 424: Determine whether K groups of N consecutive audio frames have been processed and whether fewer than K × M audio frames in total could be duplicated in those K groups. If so, go to step 426; if not, go to step 418.
Step 426: Duplicate all or part of the data of subsequent audio frames.
According to speed adjustment flow process 40, when speed adjustment signal designation accelerates broadcasting speed to (N/ (N-M)) times, each audio frequency frame of the N number of continuous audio frequency frame of each group can adjust flow process 10 by speed and judge whether to give up.When having M audio frequency frame to be judged in N number of continuous audio frequency frame can to give up, Auditory Perception decoding device can give up the data at least partially of M audio frequency frame, such as, give up whole or its long window categorical data of M audio frequency frame.Similarly, when speed adjustment signal designation slows down broadcasting speed to (N/ (N+M)) times, each audio frequency frame execution can adjust flow process 10 by speed and judge whether to copy.When having M audio frequency frame to be judged in N number of continuous audio frequency frame can to copy, the data at least partially of the reproducible M of an Auditory Perception decoding device audio frequency frame.Speed adjustment flow process 40, under N number of audio frequency frame, copies or gives up M audio frequency frame data (or long window data), to obtain broadcasting speed desired by user for N/ (N ± M) times.
In addition, in the speed adjustment flow 40 it may happen that several consecutive groups of N audio frames each contain fewer than M audio frames that can be discarded or copied. For this case, as shown in steps 420 to 422 and 424 to 426, this embodiment can set a maximum group count K: when fewer than K × M audio frames can be discarded or copied in K consecutive groups of N audio frames, the auditory perception decoding device starts to forcibly discard or copy all or part of the data of subsequent audio frames, so that the playback speed can reach the speed expected by the user. For example, when only (K × M - L) audio frames in K consecutive groups of N audio frames meet the conditions for being discarded or copied, the auditory perception decoding device discards or copies all or part of the data of the L audio frames received afterwards, so as to maintain the playback speed at (N/(N-M)) times or (N/(N+M)) times.
In the embodiments of the present invention, judging whether K × M audio frames can be discarded or copied in K consecutive groups of N audio frames does not require every group of N audio frames to contain M such frames; it suffices that the K groups contain K × M such frames in total. For example, when K is set to 2, if (M-1) audio frames in the first group of N audio frames can be discarded or copied, then the second group of N audio frames must supply (M+1) audio frames for the playback speed to reach the speed expected by the user.
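The speed-up branch of the flow 40, including the K-group bookkeeping, might be sketched as below. This is a minimal sketch under several assumptions that the text does not fix: the function name `plan_discards` is hypothetical, the per-frame verdicts are assumed to be given as booleans, the quota is tracked cumulatively inside each window of K groups (so a later group may make up for an earlier one), and forced discards are taken from the first frames of the following window without counting against that window's own quota. The slow-down branch would be symmetric, with copying in place of discarding.

```python
from typing import Iterable, List


def plan_discards(discardable: Iterable[bool], n: int, m: int, k: int) -> List[int]:
    """Return the indices of frames to drop for an N/(N-M) speed-up."""
    dropped: List[int] = []
    window_drops = 0  # frames dropped by judgment in the current window of K groups
    forced = 0        # shortfall L carried over from the previous window
    for idx, ok in enumerate(discardable):
        pos = idx % (k * n)    # position inside the current window
        group = pos // n + 1   # 1-based group number inside the window
        if forced > 0:
            dropped.append(idx)            # forced discard, regardless of the verdict
            forced -= 1
        elif ok and window_drops < group * m:
            dropped.append(idx)            # judged discardable and quota not yet used up
            window_drops += 1
        if pos == k * n - 1:               # K groups have been processed
            forced = k * m - window_drops  # L frames still missing from K x M
            window_drops = 0
    return dropped
```

With N = 4, M = 1, and K = 2, a first group without any discardable frame followed by a group with three discardable frames yields two judged discards in the second group, mirroring the (M-1)/(M+1) example above; if neither group offers a candidate, the first two frames of the next window are dropped by force.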
Please refer to Fig. 6, which is a schematic diagram of a flow 60 according to an embodiment of the present invention. The flow 60 implements the above concept of forcibly discarding or copying audio frames, and comprises the following steps:
Step 602: Receive and evaluate a speed adjustment signal. When the speed adjustment signal indicates speeding up the playback of the audio data to (N/(N-M)) times, perform step 604; when the speed adjustment signal indicates slowing down the playback of the audio data to (N/(N+M)) times, perform step 616.
Step 604: Set a parameter i = 1.
Step 606: Determine the number of audio frames that can be discarded among the newly received N audio frames.
Step 608: Does the total number of discardable audio frames, Ndiscard, reach i × M? If no, go to step 610; if yes, go to step 614.
Step 610: i = i + 1.
Step 612: Is i greater than a threshold value K? If no, go to step 606; if yes, go to step 614.
Step 614: Perform a reset procedure, which comprises setting i = 1, discarding the Ndiscard audio frames, and forcibly discarding (K × M - Ndiscard) audio frames among the subsequently received audio frames; or, alternatively, discarding only the Ndiscard audio frames without forcibly discarding (K × M - Ndiscard) audio frames among the subsequently received audio frames.
Step 616: Set i = 1.
Step 618: Determine the number of audio frames that can be copied among the newly received N audio frames.
Step 620: Does the total number of copyable audio frames, Ncopy, reach i × M? If no, go to step 622; if yes, go to step 626.
Step 622: i = i + 1.
Step 624: Is i greater than the threshold value K? If no, go to step 618; if yes, go to step 626.
Step 626: Perform a reset procedure, which comprises setting i = 1, copying the Ncopy audio frames, and forcibly copying (K × M - Ncopy) audio frames among the subsequently received audio frames; or, alternatively, copying only the Ncopy audio frames without forcibly copying (K × M - Ncopy) audio frames among the subsequently received audio frames.
According to the flow 60, in order not to degrade the audio playback quality excessively, an embodiment of the present invention may choose not to forcibly copy or discard, among the subsequently received audio frames, the frames by which the total falls short of K × M, that is, the (K × M - Ndiscard) or (K × M - Ncopy) audio frames. The other detailed operating principles of the flow 60 have been disclosed above and are not repeated here.
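The counter logic of the discard branch (steps 604 to 614) can be sketched as a small generator. This is an interpretation rather than the patent's own code: `flow60_resets` is a hypothetical name, the input is assumed to be the number of discardable frames found in each newly received group of N frames, and the forced amount is computed as the shortfall against the groups actually examined, which equals (K × M - Ndiscard) when the reset is reached through step 612 and zero when the target was met in step 608. The `force` flag selects between the two alternatives of step 614.

```python
from typing import Iterable, Iterator, Tuple


def flow60_resets(per_group_counts: Iterable[int], m: int, k: int,
                  force: bool = True) -> Iterator[Tuple[int, int, int]]:
    """Yield (groups examined, Ndiscard, frames to force) at every reset (step 614)."""
    i = 1          # step 604
    n_discard = 0  # running total of discardable frames (Ndiscard)
    for count in per_group_counts:  # step 606: one newly received group of N frames
        n_discard += count
        groups_seen = i
        if n_discard < i * m:       # step 608 answers "no"
            i += 1                  # step 610
            if i <= k:              # step 612 answers "no": examine another group
                continue
        # Step 614: reset, optionally forcing the missing frames afterwards.
        shortfall = max(0, groups_seen * m - n_discard)
        yield groups_seen, n_discard, (shortfall if force else 0)
        i, n_discard = 1, 0
```

For M = 2 and K = 2, the group counts 0 and 1 lead to one reset after two groups with Ndiscard = 1 and three frames to force; with `force=False` the same input reports zero forced frames, which corresponds to the quality-preserving alternative described above. The copy branch (steps 616 to 626) has the same structure.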
Please refer to Fig. 7, which is a block diagram of a speed adjustment device 50 according to an embodiment of the present invention. The speed adjustment device 50 comprises an audio reading device 500, a processor unit 510, a storage unit 520, an input unit 530, and an output unit 540. The audio reading device 500 can be a CD/DVD player or a network card device; it obtains audio data AU_DATA and passes it via the storage unit 520 to the processor unit 510 for processing. The input unit 530 can be a keyboard, a mouse, a voice input, or any other device through which the user can interact with the speed adjustment device 50; it generates a speed adjustment signal PLR_ADJ for the processor unit 510 according to the user's input. The storage unit 520 can be a non-volatile memory that stores program code 522; when executed by the processor unit 510, the program code realizes any of the aforementioned flows (such as the adjustment judgment flow 20 or the speed adjustment flow 40) or a combination of them. The output unit 540 can be a loudspeaker that plays the audio data processed by the processor unit 510. For example, when the user speeds up playback through the input unit 530, the processor unit 510 uses the speed adjustment flow 40 to discard audio frames of the audio data being played according to the corresponding speed adjustment signal PLR_ADJ, and sends the audio data from which audio frames have been discarded to the output unit 540 for playback, so that the user perceives the audio as sped up. Because the speed adjustment device 50 mainly serves to realize any of the aforementioned flows (such as the adjustment judgment flow 20 or the speed adjustment flow 40) or a combination of them, please refer to the foregoing for its main operating principles.
In summary, the embodiments of the present invention use the frequency-related data of the audio data (such as window type and spectral line data) available in an auditory perception decoding procedure to judge whether to discard or copy part of the audio data, so that the playback speed is changed during the decoding itself. Consequently, the present invention does not need a large number of registers to store audio data.
The foregoing are only preferred embodiments of the present invention; all equivalent changes and modifications made according to the claims of the present invention shall fall within the scope of the present invention.

Claims (17)

1. A method of adjusting playback speed, characterized in that the method comprises:
An auditory perception decoding device receiving audio data;
The auditory perception decoding device performing frequency analysis on a first audio frame of the audio data;
Obtaining first frequency-domain analysis data from the frequency analysis;
Receiving a speed adjustment signal, and judging whether the playback speed is to be adjusted by speeding up or by slowing down;
The method being characterized by comprising:
When the speed adjustment signal indicates speeding up the playback of the audio data, judging, according to the first frequency-domain analysis data, whether to discard the first audio frame;
When the first audio frame is judged discardable, the auditory perception decoding device discarding the first audio frame; and
When the speed adjustment signal indicates slowing down the playback of the audio data, judging, according to the first frequency-domain analysis data, whether to copy the first audio frame;
When the first audio frame is judged copyable, the auditory perception decoding device copying the first audio frame;
Wherein the first frequency-domain analysis data comprise a window type index indicating the window type used for the first audio frame in the frequency-domain-to-time-domain conversion of the auditory perception decoding device;
Wherein judging, according to the first frequency-domain analysis data, whether to discard the first audio frame when the speed adjustment signal indicates speeding up the playback of the audio data comprises: when the speed adjustment signal indicates speeding up the playback of the audio data and the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame can be discarded; and judging, according to the first frequency-domain analysis data, whether to copy the first audio frame when the speed adjustment signal indicates slowing down the playback of the audio data comprises: when the speed adjustment signal indicates slowing down the playback of the audio data and the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame can be copied.
2. the method for claim 1, is characterized in that, described first frequency-domain analysis data separately comprise the spectrum line data of described first audio frequency frame.
3. method as claimed in claim 2, is characterized in that, when described speed adjustment signal designation accelerates the broadcasting speed of described voice data, judge whether that giving up described first audio frequency frame comprises according to described first frequency-domain analysis data:
When the described first audio frequency frame of described window type index instruction belongs to long window type, described spectrum line Data Placement is gone out multiple band unit;
Calculate one first energy sum total of described multiple band unit;
The one second audio frequency frame obtaining described voice data corresponds to one second energy sum total of described multiple band unit, and described second audio frequency frame is the previous audio frequency frame by the process of described Auditory Perception decoding device of described first audio frequency frame;
Calculate an energy sum total poor, poor=described second energy sum total of described energy sum total-described first energy sum total;
Absolute value in described energy sum total difference is less than one first threshold value being relevant to class simple signal energy sum total difference, described energy sum total difference be greater than be relevant to Auditory Perception after one second threshold value, the described second energy sum total of covering be less than one the 3rd threshold value that is relevant to quiet audio number and at least one condition that described first energy sum total is less than above three conditions of described 3rd threshold value meets time, judge to give up the data belonging to long window type in described first audio frequency frame or described first audio frequency frame; And
When described speed adjustment signal designation slows down the broadcasting speed of described voice data, judge whether that copying described first audio frequency frame comprises according to described first frequency-domain analysis data:
When the described first audio frequency frame of described window type index instruction belongs to long window type, described spectrum line Data Placement is gone out described multiple band unit;
Calculate described first energy sum total;
Obtain described second energy sum total;
Calculate described energy sum total poor;
Absolute value in described energy sum total difference is less than one first threshold value being relevant to class simple signal energy sum total difference, described energy sum total difference be greater than be relevant to Auditory Perception after one second threshold value, the described second energy sum total of covering be less than one the 3rd threshold value that is relevant to quiet audio number and at least one condition that described first energy sum total is less than above three conditions of described 3rd threshold value meets time, judge to copy the data belonging to long window type in described first audio frequency frame or described first audio frequency frame.
4. The method of claim 3, characterized in that dividing the spectral line data into the plurality of band units when the window type index indicates that the first audio frame belongs to a long window type comprises: when the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data, according to the flatness of the spectral line data, into the plurality of band units, each classified as tone-like or as noise-like.
5. A method of adjusting playback speed, characterized in that the method comprises:
An auditory perception decoding device receiving audio data, the audio data comprising a plurality of audio frames;
The auditory perception decoding device performing frequency analysis on the plurality of audio frames;
Receiving a speed adjustment signal;
The method being characterized by comprising:
When the speed adjustment signal indicates speeding up the playback of the audio data to (N/(N-M)) times, performing, on each audio frame of N consecutive audio frames among the plurality of audio frames, an adjustment judgment procedure for judging whether the audio frame being processed can be discarded, where N and M are positive integers;
When the adjustment judgment procedure judges that M audio frames among the N consecutive audio frames can be discarded, the auditory perception decoding device discarding at least part of the data of the M audio frames;
When the speed adjustment signal indicates slowing down the playback of the audio data to (N/(N+M)) times, performing, on each audio frame of N consecutive audio frames among the plurality of audio frames, the adjustment judgment procedure for judging whether the audio frame being processed can be copied; and
When the adjustment judgment procedure judges that M audio frames among the N consecutive audio frames can be copied, the auditory perception decoding device copying at least part of the data of the M audio frames;
The adjustment judgment procedure comprising:
Obtaining first frequency-domain analysis data of the frequency analysis corresponding to a first audio frame being processed;
When the speed adjustment signal indicates speeding up the playback of the audio data, judging, according to the first frequency-domain analysis data, whether to discard at least part of the data of the first audio frame; and
When the speed adjustment signal indicates slowing down the playback of the audio data, judging, according to the first frequency-domain analysis data, whether to copy at least part of the data of the first audio frame;
Wherein the first frequency-domain analysis data comprise a window type index indicating the window type used for the first audio frame in the frequency-domain-to-time-domain conversion of the auditory perception decoding device;
Wherein judging, according to the first frequency-domain analysis data, whether to discard the first audio frame when the speed adjustment signal indicates speeding up the playback of the audio data comprises: when the speed adjustment signal indicates speeding up the playback of the audio data and the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame, or the data in the first audio frame belonging to the long window type, can be discarded; and judging, according to the first frequency-domain analysis data, whether to copy the first audio frame when the speed adjustment signal indicates slowing down the playback of the audio data comprises: when the speed adjustment signal indicates slowing down the playback of the audio data and the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame, or the data in the first audio frame belonging to the long window type, can be copied.
6. The method of claim 5, characterized in that the first frequency-domain analysis data further comprise spectral line data of the first audio frame.
7. The method of claim 6, characterized in that judging, according to the first frequency-domain analysis data, whether to discard the first audio frame when the speed adjustment signal indicates speeding up the playback of the audio data comprises:
When the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data into a plurality of band units;
Calculating a first total energy of the plurality of band units;
Obtaining a second total energy of a second audio frame of the audio data over the plurality of band units, the second audio frame being the audio frame processed by the auditory perception decoding device immediately before the first audio frame;
Calculating a total energy difference, where the total energy difference = the second total energy - the first total energy;
When at least one of the following three conditions is met, judging that the first audio frame, or the data in the first audio frame belonging to the long window type, can be discarded: the absolute value of the total energy difference is less than a first threshold value related to the total energy difference of tone-like signals; the total energy difference is greater than a second threshold value related to auditory post-masking; and the second total energy is less than a third threshold value related to silent signals and the first total energy is less than the third threshold value; and
Judging, according to the first frequency-domain analysis data, whether to copy the first audio frame when the speed adjustment signal indicates slowing down the playback of the audio data comprises:
When the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data into the plurality of band units;
Calculating the first total energy;
Obtaining the second total energy;
Calculating the total energy difference;
When at least one of the following three conditions is met, judging that the first audio frame, or the data in the first audio frame belonging to the long window type, can be copied: the absolute value of the total energy difference is less than the first threshold value related to the total energy difference of tone-like signals; the total energy difference is greater than the second threshold value related to auditory post-masking; and the second total energy is less than the third threshold value related to silent signals and the first total energy is less than the third threshold value.
8. The method of claim 7, characterized in that dividing the spectral line data into the plurality of band units when the window type index indicates that the first audio frame belongs to a long window type comprises: when the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data into the plurality of band units according to the flatness of the spectral line data, each band unit being classified into a tone-like signal class or a noise-like class.
9. The method of claim 5, characterized in that the method of adjusting playback speed further comprises:
When the adjustment judgment procedure judges that fewer than K × M audio frames can be discarded in K groups of N consecutive audio frames, the auditory perception decoding device discarding at least part of the data of at least one audio frame following the K groups of N consecutive audio frames, where K is a positive integer; and
When the adjustment judgment procedure judges that fewer than K × M audio frames can be copied in K groups of N consecutive audio frames, the auditory perception decoding device copying at least part of the data of at least one audio frame following the K groups of N consecutive audio frames.
10. A method of speeding up playback, characterized in that the method comprises:
An auditory perception decoding device receiving audio data;
The auditory perception decoding device performing frequency analysis on a first audio frame of the audio data;
Obtaining first frequency-domain analysis data from the frequency analysis;
Receiving a speed-up adjustment signal;
The method being characterized by comprising:
Judging, according to the first frequency-domain analysis data, whether to discard the first audio frame; and
When the first audio frame is judged discardable, the auditory perception decoding device discarding at least part of the data of the first audio frame according to a playback speed indicated by the speed-up adjustment signal;
Wherein the first frequency-domain analysis data comprise a window type index indicating the window type used for the first audio frame in the frequency-domain-to-time-domain conversion of the auditory perception decoding device;
Wherein judging, according to the first frequency-domain analysis data, whether to discard the first audio frame comprises: when the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame can be discarded.
11. The method of claim 10, characterized in that the first frequency-domain analysis data further comprise spectral line data of the first audio frame.
12. The method of claim 11, characterized in that judging, according to the first frequency-domain analysis data, whether to discard the first audio frame comprises:
When the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data into a plurality of band units;
Calculating a first total energy of the plurality of band units;
Obtaining a second total energy of a second audio frame of the audio data over the plurality of band units, the second audio frame being the audio frame processed by the auditory perception decoding device immediately before the first audio frame;
Calculating a total energy difference, where the total energy difference = the second total energy - the first total energy; and
When at least one of the following three conditions is met, judging that the first audio frame, or the data in the first audio frame belonging to the long window type, can be discarded: the absolute value of the total energy difference is less than a first threshold value related to the total energy difference of tone-like signals; the total energy difference is greater than a second threshold value related to auditory post-masking; and the second total energy is less than a third threshold value related to silent signals and the first total energy is less than the third threshold value.
13. The method of claim 12, characterized in that dividing the spectral line data into the plurality of band units when the window type index indicates that the first audio frame belongs to a long window type comprises: when the window type index indicates that the first audio frame belongs to a long window type, dividing the spectral line data, according to the flatness of the spectral line data, into the plurality of band units, each classified as tone-like or as noise-like.
14. A method of slowing down playback, characterized in that the method comprises:
An auditory perception decoding device receiving audio data;
The auditory perception decoding device performing frequency analysis on a first audio frame of the audio data;
Obtaining first frequency-domain analysis data from the frequency analysis;
Receiving a slow-down adjustment signal;
The method being characterized by comprising:
Judging, according to the first frequency-domain analysis data, whether to copy the first audio frame; and
When the first audio frame is judged copyable, the auditory perception decoding device copying at least part of the data of the first audio frame according to a playback speed indicated by the slow-down adjustment signal;
Wherein the first frequency-domain analysis data comprise a window type index indicating the window type used for the first audio frame in the frequency-domain-to-time-domain conversion of the auditory perception decoding device;
Wherein judging, according to the first frequency-domain analysis data, whether to copy the first audio frame comprises: when the window type index indicates that the first audio frame belongs to a long window type, judging that the first audio frame can be copied.
15. methods as claimed in claim 14, is characterized in that, described first frequency-domain analysis data separately comprise the spectrum line data of described first audio frequency frame.
16. methods as claimed in claim 15, is characterized in that, judge whether that giving up described first audio frequency frame comprises according to described first frequency-domain analysis data:
When the described first audio frequency frame of described window type index instruction belongs to long window type, described spectrum line Data Placement is gone out multiple band unit;
Calculate one first energy sum total of described multiple band unit;
The one second audio frequency frame obtaining described voice data corresponds to one second energy sum total of described multiple band unit, and described second audio frequency frame is the previous audio frequency frame by the process of described Auditory Perception decoding device of described first audio frequency frame;
Calculate an energy sum total poor, poor=described second energy sum total of described energy sum total-described first energy sum total; And
Absolute value in described energy sum total difference is less than one first threshold value being relevant to class simple signal energy sum total difference, described energy sum total difference be greater than be relevant to Auditory Perception after one second threshold value, the described second energy sum total of covering be less than one the 3rd threshold value that is relevant to quiet audio number and at least one condition that described first energy sum total is less than above three conditions of described 3rd threshold value meets time, judge to copy the data belonging to long window type in described first audio frequency frame or described first audio frequency frame.
17. The method as claimed in claim 16, characterized in that, when the window type index indicates that the first audio frame belongs to the long window type, dividing the spectral line data into the plurality of band units comprises: when the window type index indicates that the first audio frame belongs to the long window type, dividing the spectral line data, according to the flatness of the spectral line data, into the plurality of band units, each classified as tone-like or noise-like.
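One possible reading of the flatness-based classification in claim 17 is sketched below. The claim does not define how flatness is measured or how band boundaries are chosen. This sketch assumes the common spectral flatness measure (geometric mean of the power spectrum divided by its arithmetic mean), a set of predefined candidate band edges, and a hypothetical `flatness_threshold`. None of these come from the patent text.

```python
import math


def spectral_flatness(power, eps=1e-12):
    """Geometric mean divided by arithmetic mean of a list of power values.

    Values near 1 indicate a flat (noise-like) spectrum; values near 0
    indicate energy concentrated in a few lines (tone-like).
    """
    if not power:
        raise ValueError("band unit must contain at least one spectral line")
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arith_mean + eps)


def classify_bands(spectral_lines, band_edges, flatness_threshold=0.5):
    """Label each band unit as 'tone-like' or 'noise-like'.

    band_edges and flatness_threshold are illustrative assumptions.
    """
    labels = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        power = [x * x for x in spectral_lines[start:end]]
        flat = spectral_flatness(power)
        labels.append("noise-like" if flat >= flatness_threshold else "tone-like")
    return labels
```

With this measure, a band whose lines all have equal magnitude is labeled noise-like, while a band dominated by a single strong line is labeled tone-like.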
CN201210038338.7A 2012-02-20 2012-02-20 The method of adjustment broadcasting speed Active CN103258552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210038338.7A CN103258552B (en) 2012-02-20 2012-02-20 The method of adjustment broadcasting speed

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210038338.7A CN103258552B (en) 2012-02-20 2012-02-20 The method of adjustment broadcasting speed

Publications (2)

Publication Number Publication Date
CN103258552A CN103258552A (en) 2013-08-21
CN103258552B true CN103258552B (en) 2015-12-16

Family

ID=48962421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210038338.7A Active CN103258552B (en) 2012-02-20 2012-02-20 The method of adjustment broadcasting speed

Country Status (1)

Country Link
CN (1) CN103258552B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105810219B (en) * 2016-03-11 2018-03-16 宇龙计算机通信科技(深圳)有限公司 Player method, play system and the voice frequency terminal of multimedia file
CN113643728B (en) * 2021-08-12 2023-08-22 荣耀终端有限公司 Audio recording method, electronic equipment, medium and program product

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1149739A (en) * 1995-09-30 1997-05-14 三星电子株式会社 Speed changeable voice signal regenerator
CN1213935A (en) * 1997-09-03 1999-04-14 松下电器产业株式会社 Apparatus of layered picture coding apparatus of picture decoding, methods of picture decoding, apparatus of recoding for digital broadcasting signal, and apparatus of picture and audio decoding
CN1270356A (en) * 1999-04-08 2000-10-18 英业达股份有限公司 Method for changing pronunciation speed
TW442740B (en) * 1998-12-18 2001-06-23 Inventec Corp Method for changing articulation speed
CN1359231A (en) * 2000-12-19 2002-07-17 株式会社考斯默坦 Audio signal reproducing method and apparatus without changing tone in fast or slow speed replaying mode
TW499653B (en) * 1998-10-08 2002-08-21 Sony Electronics Inc Apparatus and method for implementing a variable-speed audio data playback system
CN1493072A (en) * 2001-01-22 2004-04-28 卡纳斯数据株式会社 Encoding method and decoding method for digital data
CN1525435A (en) * 2003-02-24 2004-09-01 国际商业机器公司 Method and apparatus for estimating pitch frequency of voice signal
CN1579093A (en) * 2001-10-31 2005-02-09 汤姆森特许公司 Changing a playback speed for video presentation recorded in a modified film format
CN1600027A (en) * 2000-11-07 2005-03-23 松下电器产业株式会社 Video signal producing system and video signal recording/ reproducing device in that system
CN1700757A (en) * 2004-05-13 2005-11-23 美国博通公司 System and method for high-quality variable speed playback of audio-visual media
JP2005331588A (en) * 2004-05-18 2005-12-02 Nippon Telegr & Teleph Corp <Ntt> Method and program to adjust voice reproducing speed and recording medium which stores the program
CN101203907A (en) * 2005-06-23 2008-06-18 松下电器产业株式会社 Audio encoding apparatus, audio decoding apparatus and audio encoding information transmitting apparatus
TWI314015B (en) * 2005-10-28 2009-08-21 Novatek Microelectronics Corp Digital audio/video playback system capable of controlling audio and video playback speed
CN101567188A (en) * 2009-04-30 2009-10-28 上海大学 Multi-pitch estimation method for mixed audio signals with combined long frame and short frame
CN101753942A (en) * 2008-12-12 2010-06-23 日立民用电子株式会社 Video reproducing apparatus, a video system, and a reproduction speed converting method of video
CN102074239A (en) * 2010-12-23 2011-05-25 福建星网视易信息系统有限公司 Sound speed change method
CN102341846A (en) * 2009-03-04 2012-02-01 韩国科亚电子股份有限公司 Quantization for audio encoding

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7464028B2 (en) * 2004-03-18 2008-12-09 Broadcom Corporation System and method for frequency domain audio speed up or slow down, while maintaining pitch
KR101298658B1 (en) * 2007-03-16 2013-08-21 삼성전자주식회사 Audio playback device having control function of playback speed and method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant