CN105245496B - Method and apparatus for playing audio data - Google Patents
Method and apparatus for playing audio data
- Publication number
- CN105245496B (application CN201510536538.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- audio
- played
- duration
- pitch period
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/04—Time compression or expansion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M7/00—Arrangements for interconnection between switching centres
- H04M7/0024—Services and arrangements where telephone services are combined with data services
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Telephone Function (AREA)
- Reverberation, Karaoke And Other Acoustics (AREA)
Abstract
The invention discloses a method and apparatus for playing audio data, belonging to the field of Internet technology. The method includes: during a voice call, detecting the amount of to-be-played audio data stored in a jitter buffer; if the amount of audio data is below a preset first threshold, performing duration-extension processing on the audio frames in the to-be-played audio data; if the amount of audio data is above a preset second threshold, performing duration-shortening processing on the audio frames in the to-be-played audio data, where the first threshold is less than the second threshold; and playing the processed audio data in playback order. The invention can prevent playback gaps and missing words.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and apparatus for playing audio data.
Background
With the development of Internet and communication technology, voice calls based on VoIP (Voice over Internet Protocol), which carries voice over packet switching, are increasingly popular with users.
A typical VoIP voice call works as follows: of the two terminals in the call, either terminal sends compressed voice packets (each of which may contain multiple frames of audio data); the peer terminal receives each voice packet, decompresses it, stores the audio data in a jitter buffer, and then plays the audio frames in the jitter buffer one by one.
In the course of implementing the present invention, the inventor found that the prior art has at least the following problem:
With the call method above, when the network is unstable, the receiving terminal may receive no voice packets from the sending terminal for a long time, leaving the jitter buffer without audio data. Alternatively, it may receive a large number of voice packets in an instant, so that the jitter buffer overflows and audio data is lost. Either case leads to playback gaps or missing words.
Summary of the invention
To solve the problem in the prior art, embodiments of the present invention provide a method and apparatus for playing audio data. The technical solution is as follows:
In a first aspect, a method for playing audio data is provided, the method comprising:
during a voice call, detecting the amount of to-be-played audio data stored in a jitter buffer;
if the amount of audio data is below a preset first threshold, performing duration-extension processing on the audio frames in the to-be-played audio data; if the amount of audio data is above a preset second threshold, performing duration-shortening processing on the audio frames in the to-be-played audio data, where the first threshold is less than the second threshold; and
playing the processed to-be-played audio data in playback order.
Optionally, the method further comprises:
obtaining the pitch period of each audio frame in the to-be-played audio data;
and the step of performing duration-extension processing if the amount of audio data is below the preset first threshold, and duration-shortening processing if the amount of audio data is above the preset second threshold, comprises:
if the amount of audio data is below the preset first threshold, extending each audio frame in the to-be-played audio data by one pitch period of that frame; if the amount of audio data is above the preset second threshold, shortening each audio frame in the to-be-played audio data by one pitch period of that frame.
Optionally, the step of extending each audio frame by one pitch period if the amount of audio data is below the preset first threshold, and shortening each audio frame by one pitch period if the amount of audio data is above the preset second threshold, comprises:
if the amount of audio data is below the preset first threshold, then, in each audio frame in the to-be-played audio data, merging the data of the first pitch period and the second pitch period into the data of a single pitch period, and inserting the merged data between the first pitch period and the second pitch period;
if the amount of audio data is above the preset second threshold, then, in each audio frame in the to-be-played audio data, merging the data of the first pitch period and the second pitch period into the data of a single pitch period, and replacing the data of the first and second pitch periods with the merged data.
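The insert-or-replace scheme above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text at this point does not say how the two pitch periods are merged, so the linear cross-fade, the fade direction chosen for each case, and the names `crossfade`, `extend_frame`, and `shorten_frame` are all assumptions. Frames are plain lists of samples and `period` is the pitch period in samples.

```python
def crossfade(a, b):
    """Blend two equal-length segments, fading linearly from `a` to `b`."""
    n = len(a)
    if n == 0 or len(b) != n:
        raise ValueError("segments must be non-empty and equal in length")
    # Weight on `b` ramps from 0 towards 1 across the segment (assumed linear).
    return [(1 - i / n) * a[i] + (i / n) * b[i] for i in range(n)]


def extend_frame(frame, period):
    """Lengthen a frame by one pitch period: first, merged, second, rest."""
    if period <= 0 or len(frame) < 2 * period:
        return list(frame)  # too short to hold two pitch periods; leave as is
    first, second = frame[:period], frame[period:2 * period]
    merged = crossfade(second, first)  # starts like `second`, ends like `first`
    return list(first) + merged + list(frame[period:])


def shorten_frame(frame, period):
    """Shorten a frame by one pitch period: merged replaces first and second."""
    if period <= 0 or len(frame) < 2 * period:
        return list(frame)
    first, second = frame[:period], frame[period:2 * period]
    merged = crossfade(first, second)  # starts like `first`, ends like `second`
    return merged + list(frame[2 * period:])
```

The fade directions are chosen so that each joint in the output resembles a joint that already existed in the input: in the extended frame the merged segment follows the first period and precedes the second, so it begins like the second period and ends like the first.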
Optionally, the method further comprises:
obtaining the pitch period of each audio frame in the to-be-played audio data;
and the step of performing duration-extension processing if the amount of audio data is below the preset first threshold, and duration-shortening processing if the amount of audio data is above the preset second threshold, comprises:
if the amount of audio data is below the preset first threshold, determining, according to a preset extension duration, a unit processing duration for each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and extending each audio frame in the to-be-played audio data by its unit processing duration;
if the amount of audio data is above the preset second threshold, determining, according to a preset shortening duration, a unit processing duration for each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and shortening each audio frame in the to-be-played audio data by its unit processing duration.
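As a rough illustration of deriving a unit processing duration from a preset duration, the sketch below picks a whole number of pitch periods. The text only requires an integer multiple; rounding down and never going below one pitch period are assumptions, as is the function name. Both arguments are in samples.

```python
def unit_processing_duration(preset_duration, pitch_period):
    """Return a whole multiple of `pitch_period` close to `preset_duration`.

    Assumption: round down to the nearest multiple, but use at least one
    pitch period so that some processing always takes place.
    """
    if preset_duration <= 0 or pitch_period <= 0:
        raise ValueError("durations must be positive")
    multiple = max(1, preset_duration // pitch_period)
    return multiple * pitch_period
```

With a preset duration of 160 samples and a pitch period of 50 samples, this yields 150 samples (three pitch periods).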
Optionally, extending each audio frame in the to-be-played audio data by its unit processing duration comprises:
in each audio frame in the to-be-played audio data, merging the data of the first unit processing duration and the second unit processing duration into the data of a single unit processing duration, and inserting the merged data between the first unit processing duration and the second unit processing duration;
and shortening each audio frame in the to-be-played audio data by its unit processing duration comprises:
in each audio frame in the to-be-played audio data, merging the data of the first unit processing duration and the second unit processing duration into the data of a single unit processing duration, and replacing the data of the first and second unit processing durations with the merged data.
Optionally, obtaining the pitch period of each audio frame in the to-be-played audio data comprises:
if the audio frames in the to-be-played audio data record a pitch period, obtaining the pitch period of each audio frame from the frame itself; if the audio frames in the to-be-played audio data do not record a pitch period, determining the pitch period of each audio frame from the decoded audio frame using a pitch period search algorithm.
In a second aspect, an apparatus for playing audio data is provided, the apparatus comprising:
a detection module, configured to detect, during a voice call, the amount of to-be-played audio data stored in a jitter buffer;
a processing module, configured to perform duration-extension processing on the audio frames in the to-be-played audio data if the amount of audio data is below a preset first threshold, and to perform duration-shortening processing on the audio frames in the to-be-played audio data if the amount of audio data is above a preset second threshold, where the first threshold is less than the second threshold; and
a playing module, configured to play the processed to-be-played audio data in playback order.
Optionally, the apparatus further comprises an obtaining module, configured to:
obtain the pitch period of each audio frame in the to-be-played audio data;
and the processing module is configured to:
if the amount of audio data is below the preset first threshold, extend each audio frame in the to-be-played audio data by one pitch period of that frame; if the amount of audio data is above the preset second threshold, shorten each audio frame in the to-be-played audio data by one pitch period of that frame.
Optionally, the processing module comprises:
a first processing submodule, configured to, if the amount of audio data is below the preset first threshold, merge, in each audio frame in the to-be-played audio data, the data of the first pitch period and the second pitch period into the data of a single pitch period, and insert the merged data between the first pitch period and the second pitch period;
a second processing submodule, configured to, if the amount of audio data is above the preset second threshold, merge, in each audio frame in the to-be-played audio data, the data of the first pitch period and the second pitch period into the data of a single pitch period, and replace the data of the first and second pitch periods with the merged data.
Optionally, the obtaining module is configured to:
obtain the pitch period of each audio frame in the to-be-played audio data;
the first processing submodule is configured to:
if the amount of audio data is below the preset first threshold, determine, according to a preset extension duration, a unit processing duration for each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and extend each audio frame in the to-be-played audio data by its unit processing duration;
and the second processing submodule is configured to:
if the amount of audio data is above the preset second threshold, determine, according to a preset shortening duration, a unit processing duration for each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and shorten each audio frame in the to-be-played audio data by its unit processing duration.
Optionally, the first processing submodule is configured to:
in each audio frame in the to-be-played audio data, merge the data of the first unit processing duration and the second unit processing duration into the data of a single unit processing duration, and insert the merged data between the first unit processing duration and the second unit processing duration;
and the second processing submodule is configured to:
in each audio frame in the to-be-played audio data, merge the data of the first unit processing duration and the second unit processing duration into the data of a single unit processing duration, and replace the data of the first and second unit processing durations with the merged data.
Optionally, the obtaining module is configured to:
if the audio frames in the to-be-played audio data record a pitch period, obtain the pitch period of each audio frame from the frame itself; if the audio frames in the to-be-played audio data do not record a pitch period, determine the pitch period of each audio frame from the decoded audio frame using a pitch period search algorithm.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
In the embodiments of the present invention, during a voice call, the amount of to-be-played audio data stored in the jitter buffer is detected. If the amount is below a preset first threshold, duration-extension processing is performed on the audio frames in the to-be-played audio data; if the amount is above a preset second threshold, duration-shortening processing is performed on those audio frames, where the first threshold is less than the second threshold. The processed audio data is then played in playback order. As a result, when the jitter buffer holds little data, its audio data is played more slowly, which gives an unstable network more time to deliver new audio data into the buffer. When the jitter buffer holds a lot of data, its audio data is played more quickly, which keeps as much space free as possible for audio data that arrives in a burst and prevents the jitter buffer from overflowing. Playback gaps and missing words can therefore be prevented.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for playing audio data provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of processing according to the amount of data in the jitter buffer, provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of duration-extension processing provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of duration-shortening processing provided by an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of an apparatus for playing audio data provided by an embodiment of the present invention;
Fig. 6 is a structural schematic diagram of an apparatus for playing audio data provided by an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of an apparatus for playing audio data provided by an embodiment of the present invention;
Fig. 8 is a structural schematic diagram of a terminal provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment one
An embodiment of the present invention provides a method for playing audio data. As shown in Fig. 1, the processing flow of the method can include the following steps:
Step 101: during a voice call, detect the amount of to-be-played audio data stored in the jitter buffer.
Step 102: if the amount of audio data is below a preset first threshold, perform duration-extension processing on the audio frames in the to-be-played audio data; if the amount of audio data is above a preset second threshold, perform duration-shortening processing on the audio frames in the to-be-played audio data, where the first threshold is less than the second threshold.
Step 103: play the processed to-be-played audio data in playback order.
In this embodiment of the present invention, during a voice call, the amount of to-be-played audio data stored in the jitter buffer is detected. If the amount is below a preset first threshold, duration-extension processing is performed on the audio frames in the to-be-played audio data; if the amount is above a preset second threshold, duration-shortening processing is performed on those audio frames, where the first threshold is less than the second threshold. The processed audio data is then played in playback order. As a result, when the jitter buffer holds little data, its audio data is played more slowly, which gives an unstable network more time to deliver new audio data into the buffer. When the jitter buffer holds a lot of data, its audio data is played more quickly, which keeps as much space free as possible for audio data that arrives in a burst and prevents the jitter buffer from overflowing. Playback gaps and missing words can therefore be prevented.
Embodiment two
An embodiment of the present invention provides a method for playing audio data, executed by a terminal. The terminal may be a desktop terminal or a mobile terminal such as a mobile phone or tablet computer. The terminal may include a processor, a memory, a transceiver, and a loudspeaker. The processor can detect the amount of to-be-played audio data in the jitter buffer and process that audio data according to the detection result; the memory can store the data needed and generated during the processing described below; the transceiver can send and receive data; and the loudspeaker can play the processed audio data. The terminal may also include a decoder, which decodes the encoded audio frames it receives, as well as a microphone, which captures the user's speech during a voice call, and an encoder, which encodes the speech signal captured by the terminal.
The processing flow shown in Fig. 1 is described in detail below with reference to a specific implementation:
Step 101: during a voice call, detect the amount of to-be-played audio data stored in the jitter buffer.
The jitter buffer can be used to store the to-be-played audio data received by the terminal.
In an implementation, during a voice call based on voice packet switching, after the terminal at one end of the call (which may be called the sending end) sends a voice packet, the terminal at the other end can receive it. A voice packet may contain multiple frames of audio data. The receiving terminal unpacks each received voice packet and stores the audio data it contains in the jitter buffer.
A voice packet sent by the sending end may carry its sending time and its sequence number. Each time the terminal receives a voice packet, it unpacks the packet to obtain the audio frames and the packet's sequence number, and determines whether that sequence number immediately follows the sequence number of the previously received voice packet. If it does, the audio frames in the packet can be stored in the jitter buffer. If other sequence numbers lie between the two, the terminal can wait for a certain time before storing the packet's audio frames in the jitter buffer. If, during the wait, the terminal receives the voice packet whose sequence number immediately follows that of the previously received packet, it stores that packet's audio frames in the jitter buffer first and then stores the audio frames of the currently received packet. If the terminal receives that packet only after the wait has ended, it can still store the packet's audio frames in the jitter buffer, placed immediately after the stored audio data of the previously received packet. The to-be-played audio data in the jitter buffer can be stored in playback order, that is, in the order in which the sending terminal generated it.
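The sequence-number handling described above can be sketched as follows. This is a simplified illustration under several assumptions: the class and method names are hypothetical, the waiting time is not modelled (the caller signals its expiry by calling `wait_expired`), and the sketch never removes frames once they have been played.

```python
import bisect


class ReorderingBuffer:
    def __init__(self):
        self.stored = []      # (seq, frames), kept sorted by seq = playback order
        self.pending = {}     # packets held back while a gap is outstanding
        self.last_seq = None  # highest sequence number stored so far

    def _store(self, seq, frames):
        bisect.insort(self.stored, (seq, list(frames)))
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq

    def receive(self, seq, frames):
        if self.last_seq is None or seq <= self.last_seq + 1:
            # Next packet in order, or a late packet slotted into its place.
            self._store(seq, frames)
            # Release any held packets that are now contiguous.
            while self.last_seq + 1 in self.pending:
                nxt = self.last_seq + 1
                self._store(nxt, self.pending.pop(nxt))
        else:
            # A gap precedes this packet: hold it until the gap fills
            # or the caller reports that the wait has expired.
            self.pending[seq] = list(frames)

    def wait_expired(self):
        """Give up on missing packets and store whatever is being held."""
        for seq in sorted(self.pending):
            self._store(seq, self.pending.pop(seq))

    def frames(self):
        """All stored audio frames in playback order."""
        return [f for _, packet in self.stored for f in packet]
```

Keeping `stored` sorted by sequence number means a packet that arrives after the wait has ended still lands directly after the data of the packet that preceded it.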
A detection period for checking the amount of data in the jitter buffer can be preset. During a voice call, the terminal can check the amount of to-be-played audio data stored in the jitter buffer once per detection period.
Step 102: if the amount of audio data is below a preset first threshold, perform duration-extension processing on the audio frames in the to-be-played audio data; if the amount of audio data is above a preset second threshold, perform duration-shortening processing on the audio frames in the to-be-played audio data.
The first threshold is less than the second threshold.
In an implementation, during a voice call, after the to-be-played audio data is stored in the jitter buffer, the audio frames can be decoded one by one in storage order, yielding for each audio frame the decoded audio frame and the corresponding coding parameters.
Two thresholds on the amount of data in the jitter buffer can be preset, called the first threshold and the second threshold, where the first threshold is less than the second threshold. As shown in Fig. 2, if the terminal detects that the amount of audio data stored in the jitter buffer is below the first threshold, it can extend the duration of the to-be-played audio frames stored in the jitter buffer (these may be decoded audio frames), which slows down their playback. This effectively prevents playback gaps when the network is unstable and the terminal may receive no voice packets for a long time. If the terminal detects that the amount of audio data stored in the jitter buffer is above the second threshold, it can shorten the duration of the to-be-played audio frames stored in the jitter buffer (these may be decoded audio frames), which speeds up their playback. This effectively prevents the jitter buffer from overflowing and causing playback to stutter when the network is unstable and the terminal receives a large number of voice packets in an instant. If the terminal detects that the amount of audio data stored in the jitter buffer lies between the first threshold and the second threshold, it can leave the duration of the to-be-played audio frames unchanged.
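The three cases above reduce to a small decision function. The sketch below is illustrative only; the function name and the string labels it returns are assumptions, and the unit of `buffered` (frames, bytes, or milliseconds) is left to the caller as long as all three arguments use the same one.

```python
def choose_action(buffered, first_threshold, second_threshold):
    """Pick the duration processing for the current jitter-buffer level."""
    if first_threshold >= second_threshold:
        raise ValueError("first threshold must be less than second threshold")
    if buffered < first_threshold:
        return "extend"   # buffer running low: slow playback down
    if buffered > second_threshold:
        return "shorten"  # buffer filling up: speed playback up
    return "none"         # between the thresholds: play frames unchanged
```

A level exactly equal to either threshold falls into the unchanged case, matching the "below" and "above" wording of the text.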
Optionally, after the audio frames in the to-be-played audio data have been decoded one by one, the pitch period of each audio frame can be obtained. The corresponding processing can be as follows: obtain the pitch period of each audio frame in the to-be-played audio data.
The pitch period is the reciprocal of the frequency at which the vocal cords vibrate (the fundamental frequency). Speech samples spaced one pitch period apart are maximally correlated. The pitch period is an intrinsic parameter of an audio signal.
In an implementation, during a voice call, once an audio frame in the to-be-played audio data has been decoded, the pitch period corresponding to that frame can be obtained.
Optionally, the pitch period of each audio frame in the to-be-played audio data can be obtained in various ways. Several feasible ways are given below:
Mode one: if the audio frames in the to-be-played audio data record a pitch period, obtain the pitch period of each audio frame from the frame itself.
In an implementation, if, in the coding parameters obtained by decoding each audio frame, the status flag indicating whether the frame records a pitch period has the value 1, the audio frames in the to-be-played audio data record a pitch period, and the pitch period can be read directly from each decoded audio frame.
Mode two: if the audio frames in the to-be-played audio data do not record a pitch period, determine the pitch period of each audio frame from the decoded audio frame using a pitch period search algorithm.
In an implementation, if, in the coding parameters obtained by decoding each audio frame, the status flag indicating whether the frame records a pitch period has the value 0, the audio frames in the to-be-played audio data do not record a pitch period. In that case a pitch period search algorithm, such as the autocorrelation method or the average magnitude difference method, can be applied to each decoded audio frame to compute its pitch period.
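As an illustration of the autocorrelation method named above, the sketch below searches a range of candidate lags and returns the one at which the frame correlates best with a shifted copy of itself. The text does not give the details of the search, so the normalisation, the caller-supplied lag bounds, and the function name are assumptions. Lags are in samples.

```python
import math


def estimate_pitch_period(samples, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest normalised
    autocorrelation, taken as the pitch period in samples."""
    n = len(samples)
    if not (0 < min_lag <= max_lag < n):
        raise ValueError("need 0 < min_lag <= max_lag < len(samples)")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        head = samples[:n - lag]  # the frame without its last `lag` samples
        tail = samples[lag:]      # the same frame shifted by `lag` samples
        corr = sum(x * y for x, y in zip(head, tail))
        energy = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
        score = corr / energy if energy > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Restricting the search to a plausible lag range keeps the estimate away from very short lags, where any smooth signal correlates strongly with itself, and away from multiples of the true period.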
Optionally, once the pitch period of each audio frame in the to-be-played audio data has been obtained, step 102 can be carried out in various ways, depending on how the duration by which each audio frame is extended or shortened is chosen. Several feasible ways are given below:
Mode one: use the pitch period of each audio frame in the to-be-played audio data as the duration by which that frame is extended or shortened. The corresponding processing can be as follows: if the amount of audio data is below the preset first threshold, extend each audio frame in the to-be-played audio data by one pitch period of that frame; if the amount of audio data is above the preset second threshold, shorten each audio frame in the to-be-played audio data by one pitch period of that frame.
In implementation, when the audio frames of the audio data to be played stored in the jitter buffer are extended or shortened, each audio frame can be extended or shortened by one of its own pitch periods. If the terminal detects that the data volume of the audio data to be played stored in the jitter buffer is lower than the first threshold, it can extend each audio frame in the audio data to be played by the duration of that frame's pitch period; the audio frames may therefore be extended by different durations, but each is extended by exactly one corresponding pitch period. If the terminal detects that the data volume of the audio data to be played stored in the jitter buffer is higher than the second threshold, it can shorten each audio frame in the audio data to be played by the duration of that frame's pitch period; again, the audio frames may be shortened by different durations, but each is shortened by exactly one corresponding pitch period. In this way, each audio frame is extended or shortened only by its own pitch period. The pitch period of the audio frame is not changed, so its fundamental frequency is not changed, and because the fundamental frequency is unchanged the tone is not altered. The original audio frames are thus played at a changed speed without a change of pitch.
Optionally, where each audio frame is extended or shortened by one corresponding pitch period, the extension or shortening can be obtained by merging the data of the first two pitch periods. Correspondingly, the processing can be as follows: if the data volume of the audio data is lower than the preset first threshold, then in each audio frame in the audio data to be played, the data of the first pitch period and the second pitch period are merged into the data of one pitch period, and the merged data is inserted between the first pitch period and the second pitch period; if the data volume of the audio data is higher than the preset second threshold, then in each audio frame in the audio data to be played, the data of the first pitch period and the second pitch period are merged into the data of one pitch period, and the merged data replaces the data of the first pitch period and the second pitch period.
In implementation, if the data volume of the audio data is lower than the preset first threshold, the data of the first pitch period and the data of the second pitch period in each audio frame in the audio data to be played can be superimposed point by point. A first weight and a second weight, corresponding to the data of the first pitch period and the data of the second pitch period respectively, can be preset for the superposition; the first weight and the second weight sum to 1, and each can be 0.5. As shown in Fig. 3, the superposition yields the data of one pitch period synthesized from the data of the first pitch period and the data of the second pitch period. This data can be inserted between the first pitch period and the second pitch period, and the resulting audio frame, lengthened by one pitch period, is taken as the processed audio frame corresponding to the original audio frame. If the data volume of the audio data is higher than the preset second threshold, the data of the first pitch period and the data of the second pitch period in each audio frame in the audio data to be played can likewise be superimposed point by point, with a preset first weight and second weight that sum to 1 and can each be 0.5. As shown in Fig. 4, the superposition yields the data of one pitch period synthesized from the data of the first pitch period and the data of the second pitch period. This data can replace the data of the first pitch period and the second pitch period, and the resulting audio frame, shortened by one pitch period, is taken as the processed audio frame corresponding to the original audio frame.
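The merge, insert, and replace operations described above can be sketched as follows. This is a minimal illustration that assumes a frame is a list of samples containing at least two pitch periods and that the pitch period is given in samples; the function names are hypothetical, and the default weights of 0.5 follow the text.

```python
def merge_periods(first, second, w1=0.5, w2=0.5):
    """Weighted point-by-point superposition of two pitch periods (w1 + w2 = 1)."""
    return [w1 * a + w2 * b for a, b in zip(first, second)]

def _split(frame, period):
    # Assumption: the frame holds at least two full pitch periods.
    if period <= 0 or len(frame) < 2 * period:
        raise ValueError("frame must contain at least two pitch periods")
    return frame[:period], frame[period:2 * period]

def extend_frame(frame, period, w1=0.5, w2=0.5):
    """Insert the merged period between the first and second pitch periods."""
    first, second = _split(frame, period)
    merged = merge_periods(first, second, w1, w2)
    return first + merged + frame[period:]

def shorten_frame(frame, period, w1=0.5, w2=0.5):
    """Replace the first two pitch periods with the single merged period."""
    first, second = _split(frame, period)
    merged = merge_periods(first, second, w1, w2)
    return merged + frame[2 * period:]
```

With a frame of five samples and a pitch period of two samples, `extend_frame` returns seven samples and `shorten_frame` returns three, so the frame length changes by exactly one pitch period in each direction.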
Mode two: the duration by which each audio frame is actually extended or shortened is chosen according to a preset duration by which audio frames need to be extended or shortened. The corresponding processing can be as follows: if the data volume of the audio data is lower than the preset first threshold, the unit processing duration corresponding to each audio frame is determined according to the preset extension duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and each audio frame in the audio data to be played is extended by its corresponding unit processing duration; if the data volume of the audio data is higher than the preset second threshold, the unit processing duration corresponding to each audio frame is determined according to the preset shortening duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and each audio frame in the audio data to be played is shortened by its corresponding unit processing duration.
In implementation, an extension duration and a shortening duration can be preset for the audio frames. If the data volume of the audio data is lower than the preset first threshold, the preset extension duration can be divided by the pitch period of each audio frame to obtain a quotient. If the quotient is an integer, it can be multiplied by the pitch period of the audio frame to obtain the unit processing duration by which that audio frame is actually to be extended; the quotient is then the multiple that the unit processing duration represents. The quotient may also not be an integer. In that case, the integer part of the quotient (the quotient rounded down) can be multiplied by the pitch period of the audio frame to obtain the unit processing duration by which the frame is actually to be extended, or the integer part plus 1 (the quotient rounded up) can be multiplied by the pitch period of the audio frame to obtain it. Either way, the unit processing duration is an integer multiple of the pitch period of the audio frame. For example, if the preset extension duration is 7 ms and the pitch period of a certain audio frame is 3 ms, rounding down gives a unit processing duration of 2 pitch periods, i.e. 6 ms, and rounding up gives a unit processing duration of 3 pitch periods, i.e. 9 ms. Each audio frame in the audio data to be played can then be extended by its corresponding unit processing duration. If the data volume of the audio data is higher than the preset second threshold, the preset shortening duration can be divided by the pitch period of each audio frame to obtain a quotient. If the quotient is not an integer, it can be rounded down or up: the integer part of the quotient (the quotient rounded down) can be multiplied by the pitch period of the audio frame to obtain the unit processing duration by which the frame is actually to be shortened, or the integer part plus 1 (the quotient rounded up) can be multiplied by the pitch period of the audio frame to obtain it. Each audio frame in the audio data to be played can then be shortened by its corresponding unit processing duration.
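The calculation of the unit processing duration can be sketched in a few lines. The function name and the `round_up` parameter are illustrative assumptions; durations are taken in milliseconds, as in the example in the text.

```python
import math

def unit_processing_duration(preset_ms, pitch_ms, round_up=False):
    """Return the integer multiple of the pitch period closest to the preset
    duration from below (round_up=False) or from above (round_up=True)."""
    quotient = preset_ms / pitch_ms
    # For an integer quotient, floor and ceil agree, matching the text.
    multiple = math.ceil(quotient) if round_up else math.floor(quotient)
    # A multiple of 0 (preset shorter than one pitch period, rounding down)
    # means the frame is left unchanged.
    return multiple * pitch_ms
```

For the example in the text, `unit_processing_duration(7.0, 3.0)` gives 6.0 and `unit_processing_duration(7.0, 3.0, round_up=True)` gives 9.0. The same function applies to the preset shortening duration.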
Optionally, so that the durations by which the audio frames are extended tend toward the same value, the difference between the unit processing duration of the previous audio frame and the preset extension duration can also be taken into account when the unit processing duration of the current audio frame is determined. Correspondingly, the processing can be as follows: for the first audio frame in the audio data to be played, the unit processing duration of the first audio frame is determined according to the preset extension duration; for each other audio frame in the audio data to be played, the unit processing duration of that audio frame is determined according to the preset extension duration and the difference between the unit processing duration of the previous audio frame and the preset extension duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame.
In implementation, if the data volume of the audio data is lower than the preset first threshold, then when the unit processing duration of each audio frame is determined, the unit processing duration of the first audio frame in the audio data to be played can be determined from the preset extension duration by the method described above. For each other audio frame in the audio data to be played, in the round-down case, the difference between the unit processing duration of the previous audio frame and the preset extension duration can be added to the preset extension duration. The sum is the duration by which that audio frame should be extended, and from it the unit processing duration of that audio frame can be determined by the method of mode two above, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame. For example, if the preset extension duration is 7 ms and the pitch period of a certain audio frame is 3 ms, the unit processing duration is determined to be 6 ms, which differs from the preset extension duration by 1 ms. The duration by which the next audio frame should be extended is then this difference plus the preset extension duration, i.e. 8 ms (this value can be treated as the preset extension duration, and the unit processing duration can be determined from it and the pitch period of the audio frame). If the pitch period of the next audio frame is 2.5 ms, rounding down gives a unit processing duration of 3 pitch periods, i.e. 7.5 ms. In the round-up case, the difference between the unit processing duration of the previous audio frame and the preset extension duration can be subtracted from the preset extension duration. The result is the duration by which that audio frame should be extended, and from it the unit processing duration of that audio frame can be determined by the method of mode two above, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame. For example, if the preset extension duration is 7 ms and the pitch period of a certain audio frame is 3 ms, the unit processing duration is determined to be 9 ms, which differs from the preset extension duration by 2 ms. The duration by which the next audio frame should be extended is then the preset extension duration minus this difference, i.e. 5 ms (this value can be treated as the preset extension duration, and the unit processing duration can be determined from it and the pitch period of the audio frame). If the pitch period of the next audio frame is 3.5 ms, rounding up gives a unit processing duration of 2 pitch periods, i.e. 7 ms.
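The carry-over of the difference from one audio frame to the next can be sketched as below. The function name is hypothetical. The sketch uses a signed difference, preset + (preset - previous unit duration), which reproduces both worked examples above; how the source intends to handle a round-down frame whose unit processing duration exceeds the preset duration is not stated, so the signed form is an assumption.

```python
import math

def unit_durations_with_carry(preset_ms, pitch_periods_ms, round_up=False):
    """Unit processing duration per frame, compensating each frame for the
    previous frame's deviation from the preset duration."""
    durations = []
    target = preset_ms  # the first frame uses the preset duration directly
    for pitch in pitch_periods_ms:
        quotient = target / pitch
        multiple = math.ceil(quotient) if round_up else math.floor(quotient)
        unit = max(0, multiple) * pitch  # never a negative duration
        durations.append(unit)
        # Assumption: signed carry. A shortfall raises the next target,
        # an excess lowers it.
        target = preset_ms + (preset_ms - unit)
    return durations
```

With a preset duration of 7 ms, pitch periods of 3 ms and 2.5 ms give 6 ms and 7.5 ms when rounding down, and pitch periods of 3 ms and 3.5 ms give 9 ms and 7 ms when rounding up, as in the text.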
Likewise, so that the durations by which the audio frames are shortened tend toward the same value, the difference between the unit processing duration of the previous audio frame and the preset shortening duration can also be taken into account when the unit processing duration of the current audio frame is determined. Correspondingly, the processing can be as follows: for the first audio frame in the audio data to be played, the unit processing duration of the first audio frame is determined according to the preset shortening duration; for each other audio frame in the audio data to be played, the unit processing duration of that audio frame is determined according to the preset shortening duration and the difference between the unit processing duration of the previous audio frame and the preset shortening duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame.
In implementation, if the data volume of the audio data is higher than the preset second threshold, then when the unit processing duration of each audio frame is determined, the unit processing duration of the first audio frame in the audio data to be played can be determined from the preset shortening duration by the method described above. For each other audio frame in the audio data to be played, in the round-down case, the difference between the unit processing duration of the previous audio frame and the preset shortening duration can be added to the preset shortening duration. The sum is the duration by which that audio frame should be shortened, and from it the unit processing duration of that audio frame can be determined by the method of mode two above, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame. In the round-up case, the difference between the unit processing duration of the previous audio frame and the preset shortening duration can be subtracted from the preset shortening duration. The result is the duration by which that audio frame should be shortened, and from it the unit processing duration of that audio frame can be determined by the method of mode two above, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame.
Optionally, where each audio frame is extended by a unit processing duration, the extension can be obtained by merging the data of the first two unit processing durations. Correspondingly, the processing can be as follows: in each audio frame in the audio data to be played, the data of the first unit processing duration and the second unit processing duration are merged into the data of one unit processing duration, and the merged data is inserted between the first unit processing duration and the second unit processing duration.
In implementation, if the data volume of the audio data is lower than the preset first threshold, the data of the first unit processing duration and the data of the second unit processing duration in each audio frame in the audio data to be played can be superimposed point by point. A first weight and a second weight, corresponding to the data of the first unit processing duration and the data of the second unit processing duration respectively, can be preset for the superposition; the first weight and the second weight sum to 1, and each can be 0.5. The superposition yields the data of one unit processing duration synthesized from the data of the first unit processing duration and the data of the second unit processing duration. This data can be inserted between the first unit processing duration and the second unit processing duration, and the resulting audio frame, lengthened by one unit processing duration, is taken as the processed audio frame corresponding to the original audio frame.
Where each audio frame is shortened by a unit processing duration, the shortening can be obtained by merging the data of the first two unit processing durations. Correspondingly, the processing can be as follows: in each audio frame in the audio data to be played, the data of the first unit processing duration and the second unit processing duration are merged into the data of one unit processing duration, and the merged data replaces the data of the first unit processing duration and the second unit processing duration.
In implementation, if the data volume of the audio data is higher than the preset second threshold, the data of the first unit processing duration and the data of the second unit processing duration in each audio frame in the audio data to be played can be superimposed point by point. A first weight and a second weight, corresponding to the data of the first unit processing duration and the data of the second unit processing duration respectively, can be preset for the superposition; the first weight and the second weight sum to 1, and each can be 0.5. The superposition yields the data of one unit processing duration synthesized from the data of the first unit processing duration and the data of the second unit processing duration. This data can replace the data of the first unit processing duration and the second unit processing duration, and the resulting audio frame, shortened by one unit processing duration, is taken as the processed audio frame corresponding to the original audio frame.
Step 103: the processed audio data to be played is played according to the playback timing.
In implementation, after the audio frames in the audio data to be played are decoded in turn, the decoded audio frames, both those that have not undergone extension or shortening processing and those that have, are stored in the playback buffer, and the system plays the audio data to be played in the buffer in playback order.
In the embodiment of the present invention, during voice communication, the data volume of the audio data to be played stored in the jitter buffer is detected. If the data volume of the audio data is lower than the preset first threshold, duration extension processing is performed on the audio frames in the audio data to be played; if the data volume of the audio data is higher than the preset second threshold, duration shortening processing is performed on the audio frames in the audio data to be played, where the first threshold is less than the second threshold. The processed audio data to be played is then played according to the playback timing. In this way, when there is little data in the jitter buffer, the audio data in the jitter buffer is played more slowly, which allows more time for new audio data to be stored in the buffer when the network is unstable. When there is much data in the jitter buffer, the audio data in the jitter buffer is played out sooner, keeping as much free space in the jitter buffer as possible so that a large amount of audio data received in a burst can be stored and the jitter buffer is prevented from overflowing. Playback gaps and dropped words can thus be prevented.
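The two-threshold decision summarized above can be sketched as follows. The function name, the returned labels, and the choice of measuring the data volume in milliseconds of buffered audio are assumptions for illustration; the source only requires that the first threshold be less than the second.

```python
def choose_processing(buffered_ms, first_threshold_ms, second_threshold_ms):
    """Decide how the frames to be played are processed, based on how much
    audio is currently held in the jitter buffer."""
    if first_threshold_ms >= second_threshold_ms:
        raise ValueError("first threshold must be less than second threshold")
    if buffered_ms < first_threshold_ms:
        return "extend"   # buffer running low: play more slowly
    if buffered_ms > second_threshold_ms:
        return "shorten"  # buffer filling up: play faster
    return "none"         # between the thresholds: play unchanged
```

Keeping a gap between the two thresholds leaves a range in which frames are played unmodified, so the playback speed does not flip between extension and shortening on small fluctuations in buffer level.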
Embodiment three
Based on the same technical concept, an embodiment of the present invention also provides an apparatus for playing audio data. As shown in Fig. 5, the apparatus includes:
a detection module 510, configured to detect, during voice communication, the data volume of the audio data to be played stored in the jitter buffer;
a processing module 520, configured to perform duration extension processing on the audio frames in the audio data to be played if the data volume of the audio data is lower than a preset first threshold, and to perform duration shortening processing on the audio frames in the audio data to be played if the data volume of the audio data is higher than a preset second threshold, where the first threshold is less than the second threshold;
a playing module 530, configured to play the processed audio data to be played according to the playback timing.
Optionally, as shown in Fig. 6, the apparatus further includes an acquisition module 540, configured to:
obtain the pitch period of each audio frame in the audio data to be played;
the processing module 520 is configured to:
if the data volume of the audio data is lower than the preset first threshold, extend each audio frame in the audio data to be played by one corresponding pitch period; if the data volume of the audio data is higher than the preset second threshold, shorten each audio frame in the audio data to be played by one corresponding pitch period.
Optionally, as shown in Fig. 7, the processing module 520 includes:
a first processing submodule 5201, configured to, if the data volume of the audio data is lower than the preset first threshold, merge, in each audio frame in the audio data to be played, the data of the first pitch period and the second pitch period into the data of one pitch period, and insert the merged data between the first pitch period and the second pitch period;
a second processing submodule 5202, configured to, if the data volume of the audio data is higher than the preset second threshold, merge, in each audio frame in the audio data to be played, the data of the first pitch period and the second pitch period into the data of one pitch period, and replace the data of the first pitch period and the second pitch period with the merged data.
Optionally, the acquisition module 540 is configured to:
obtain the pitch period of each audio frame in the audio data to be played;
the first processing submodule 5201 is configured to:
if the data volume of the audio data is lower than the preset first threshold, determine the unit processing duration corresponding to each audio frame according to the preset extension duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and extend each audio frame in the audio data to be played by its corresponding unit processing duration;
the second processing submodule 5202 is configured to:
if the data volume of the audio data is higher than the preset second threshold, determine the unit processing duration corresponding to each audio frame according to the preset shortening duration, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame, and shorten each audio frame in the audio data to be played by its corresponding unit processing duration.
Optionally, the first processing submodule 5201 is configured to:
in each audio frame in the audio data to be played, merge the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and insert the merged data between the first unit processing duration and the second unit processing duration;
the second processing submodule 5202 is configured to:
in each audio frame in the audio data to be played, merge the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and replace the data of the first unit processing duration and the second unit processing duration with the merged data.
Optionally, the acquisition module 540 is configured to:
if the audio frames in the audio data to be played have pitch periods recorded, obtain the pitch period of each audio frame from that audio frame in the audio data to be played; if the audio frames in the audio data to be played have no pitch period recorded, determine the pitch period of each audio frame based on a pitch period search algorithm and each decoded audio frame.
In the embodiment of the present invention, during voice communication, the data volume of the audio data to be played stored in the jitter buffer is detected. If the data volume of the audio data is lower than the preset first threshold, duration extension processing is performed on the audio frames in the audio data to be played; if the data volume of the audio data is higher than the preset second threshold, duration shortening processing is performed on the audio frames in the audio data to be played, where the first threshold is less than the second threshold. The processed audio data to be played is then played according to the playback timing. In this way, when there is little data in the jitter buffer, the audio data in the jitter buffer is played more slowly, which allows more time for new audio data to be stored in the buffer when the network is unstable. When there is much data in the jitter buffer, the audio data in the jitter buffer is played out sooner, keeping as much free space in the jitter buffer as possible so that a large amount of audio data received in a burst can be stored and the jitter buffer is prevented from overflowing. Playback gaps and dropped words can thus be prevented.
It should be understood that, when the apparatus for playing audio data provided by the above embodiment plays audio data, the division into the functional modules described above is only an example. In practical applications, the functions described above can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for playing audio data provided by the above embodiment and the method embodiments for playing audio data belong to the same concept; the specific implementation process is described in the method embodiments and is not repeated here.
Embodiment four
Referring to Fig. 8, which shows a schematic structural diagram of the terminal involved in the embodiment of the present invention, the terminal can be used to implement the method of playing audio data provided in the above embodiments. Specifically:
The terminal 800 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (wireless fidelity) module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. Those skilled in the art will understand that the terminal structure shown in Fig. 8 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:
The RF circuit 110 can be used to receive and send signals while information is being sent and received or during a call. In particular, after receiving downlink information from a base station, it passes the information to one or more processors 180 for processing; it also sends uplink data to the base station. Generally, the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like. In addition, the RF circuit 110 can communicate with networks and other devices by wireless communication. The wireless communication can use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
The memory 120 can be used to store software programs and modules, and the processor 180 executes various functional applications and data processing by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, where the program storage area can store an operating system, application programs required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area can store data created through use of the terminal 800 (such as audio data or a phone book), and the like. In addition, the memory 120 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.
The input unit 130 can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Specifically, the input unit 130 may include a touch-sensitive surface 131 and other input devices 132. The touch-sensitive surface 131, also called a touch display screen or touchpad, collects touch operations by the user on or near it (for example, operations performed by the user on or near the touch-sensitive surface 131 with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected device according to a preset program. Optionally, the touch-sensitive surface 131 may include two parts, a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 180, and can receive and execute commands sent by the processor 180. Furthermore, the touch-sensitive surface 131 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 131, the input unit 130 can also include other input devices 132. Specifically, the other input devices 132 can include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 140 can be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the terminal 800; these graphical user interfaces can be composed of graphics, text, icons, video, and any combination thereof. The display unit 140 may include a display panel 141, which can optionally be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 131 can cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, it passes the operation to the processor 180 to determine the type of touch event, and the processor 180 then provides the corresponding visual output on the display panel 141 according to the type of touch event. Although in Fig. 8 the touch-sensitive surface 131 and the display panel 141 realize the input and output functions as two independent components, in some embodiments the touch-sensitive surface 131 and the display panel 141 can be integrated to realize the input and output functions.
Terminal 800 may also include at least one sensor 150, such as an optical sensor, a motion sensor, and other sensors. Specifically, the optical sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of display panel 141 according to the brightness of the ambient light, and the proximity sensor can turn off display panel 141 and/or the backlight when terminal 800 is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the posture of the phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that terminal 800 may also be equipped with, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described further here.
Audio circuit 160, loudspeaker 161, and microphone 162 can provide an audio interface between the user and terminal 800. Audio circuit 160 can convert received audio data into an electrical signal and transmit it to loudspeaker 161, which converts it into a sound signal for output. In the other direction, microphone 162 converts the collected sound signal into an electrical signal, which audio circuit 160 receives and converts into audio data; the audio data is then output to processor 180 for processing and afterwards sent, for example, to another terminal through RF circuit 110, or output to memory 120 for further processing. Audio circuit 160 may also include an earphone jack to provide communication between a peripheral earphone and terminal 800.
WiFi is a short-range wireless transmission technology. Through WiFi module 170, terminal 800 can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Fig. 8 shows WiFi module 170, it can be understood that the module is not an essential component of terminal 800 and may be omitted as needed without changing the essence of the invention.
Processor 180 is the control center of terminal 800. It connects the various parts of the whole phone through various interfaces and lines, and it performs the various functions of terminal 800 and processes data by running or executing the software programs and/or modules stored in memory 120 and calling the data stored in memory 120, thereby monitoring the phone as a whole. Optionally, processor 180 may include one or more processing cores. Preferably, processor 180 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into processor 180.
Terminal 800 further includes power supply 190 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to processor 180 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. Power supply 190 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, terminal 800 may also include a camera, a Bluetooth module, and the like, which are not described here. Specifically, in this embodiment, the display unit of terminal 800 is a touch-screen display, and terminal 800 further includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
During voice communication, detecting the data volume of the audio data to be played that is stored in the jitter buffer;
If the data volume of the audio data is lower than a preset first threshold, performing duration extension processing on the audio frames in the audio data to be played; if the data volume of the audio data is higher than a preset second threshold, performing duration shortening processing on the audio frames in the audio data to be played, where the first threshold is less than the second threshold;
Playing the processed audio data to be played according to the playback order.
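The three operations above amount to a threshold check on the jitter buffer. The following is a minimal Python sketch of that check only; the names `Action` and `choose_action`, the use of milliseconds as the unit of data volume, and the example threshold values are illustrative assumptions and do not come from the patent text.

```python
from enum import Enum


class Action(Enum):
    EXTEND = "extend"    # buffer is running low: lengthen the audio frames
    SHORTEN = "shorten"  # buffer is filling up: shorten the audio frames
    NONE = "none"        # buffer level is between the thresholds: no change


def choose_action(buffered_ms: float,
                  first_threshold_ms: float,
                  second_threshold_ms: float) -> Action:
    """Choose the duration processing for the audio waiting in the jitter buffer."""
    if first_threshold_ms >= second_threshold_ms:
        raise ValueError("the first threshold must be less than the second threshold")
    if buffered_ms < first_threshold_ms:
        return Action.EXTEND
    if buffered_ms > second_threshold_ms:
        return Action.SHORTEN
    return Action.NONE
```

For example, with assumed thresholds of 40 ms and 120 ms, a buffer holding 20 ms of audio would select extension and a buffer holding 200 ms would select shortening.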
Optionally, the method further includes:
Obtaining the pitch period of each audio frame in the audio data to be played;
The step of performing duration extension processing on the audio frames in the audio data to be played if the data volume of the audio data is lower than the preset first threshold, and performing duration shortening processing on the audio frames in the audio data to be played if the data volume of the audio data is higher than the preset second threshold, includes:
If the data volume of the audio data is lower than the preset first threshold, extending each audio frame in the audio data to be played by one corresponding pitch period; if the data volume of the audio data is higher than the preset second threshold, shortening each audio frame in the audio data to be played by one corresponding pitch period.
Optionally, the step of extending each audio frame in the audio data to be played by one corresponding pitch period if the data volume of the audio data is lower than the preset first threshold, and shortening each audio frame in the audio data to be played by one corresponding pitch period if the data volume of the audio data is higher than the preset second threshold, includes:
If the data volume of the audio data is lower than the preset first threshold, then in each audio frame in the audio data to be played, merging the data of the first pitch period and the second pitch period into the data of one pitch period, and inserting the merged data between the first pitch period and the second pitch period;
If the data volume of the audio data is higher than the preset second threshold, then in each audio frame in the audio data to be played, merging the data of the first pitch period and the second pitch period into the data of one pitch period, and replacing the data of the first pitch period and the second pitch period with the merged data.
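The insert and replace operations on the first two pitch periods can be sketched as follows. This passage does not say how the two periods are merged, so the sketch assumes a linear cross-fade, with the fade direction chosen so that the joins stay smooth; the function names `merge_periods`, `extend_frame`, and `shorten_frame` are hypothetical, and the pitch period is given in samples.

```python
from typing import List, Sequence


def merge_periods(lead: Sequence[float], tail: Sequence[float]) -> List[float]:
    """Blend two equal-length periods into one that starts like `lead` and ends close to `tail`.

    Linear cross-fade; this merging rule is an assumption, not taken from the patent.
    """
    n = len(lead)
    if n == 0 or len(tail) != n:
        raise ValueError("periods must be non-empty and equal in length")
    return [((n - i) * a + i * b) / n for i, (a, b) in enumerate(zip(lead, tail))]


def _split(frame: Sequence[float], pitch: int):
    """Return the first and second pitch periods of a frame."""
    if pitch <= 0 or len(frame) < 2 * pitch:
        raise ValueError("frame must contain at least two pitch periods")
    return list(frame[:pitch]), list(frame[pitch:2 * pitch])


def extend_frame(frame: Sequence[float], pitch: int) -> List[float]:
    """Lengthen a frame by one pitch period: insert merged data between periods 1 and 2."""
    first, second = _split(frame, pitch)
    # Fade from the second period to the first so the inserted data joins
    # smoothly after the first period and before the second.
    merged = merge_periods(second, first)
    return first + merged + list(frame[pitch:])


def shorten_frame(frame: Sequence[float], pitch: int) -> List[float]:
    """Shorten a frame by one pitch period: replace periods 1 and 2 with merged data."""
    first, second = _split(frame, pitch)
    merged = merge_periods(first, second)
    return merged + list(frame[2 * pitch:])
```

With an 8-sample frame and a 3-sample pitch period, `extend_frame` returns 11 samples and `shorten_frame` returns 5, so each operation changes the frame length by exactly one pitch period.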
Optionally, the method further includes:
Obtaining the pitch period of each audio frame in the audio data to be played;
The step of performing duration extension processing on the audio frames in the audio data to be played if the data volume of the audio data is lower than the preset first threshold, and performing duration shortening processing on the audio frames in the audio data to be played if the data volume of the audio data is higher than the preset second threshold, includes:
If the data volume of the audio data is lower than the preset first threshold, determining, according to a preset extension duration, the unit processing duration corresponding to each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and extending each audio frame in the audio data to be played by the corresponding unit processing duration;
If the data volume of the audio data is higher than the preset second threshold, determining, according to a preset shortening duration, the unit processing duration corresponding to each audio frame, where each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and shortening each audio frame in the audio data to be played by the corresponding unit processing duration.
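The text requires only that the unit processing duration be an integer multiple of the frame's pitch period and that it be determined from the preset extension or shortening duration. One possible rule is sketched below, under two assumptions that are not in the patent: the multiple is the one nearest to the preset duration (at least one), and it is capped so that the frame still holds two units to merge. The function name `unit_duration` and the sample-based units are hypothetical.

```python
def unit_duration(preset_samples: int, pitch_samples: int, frame_samples: int) -> int:
    """Return a unit processing duration (in samples) that is a whole number of pitch periods."""
    if pitch_samples <= 0:
        raise ValueError("pitch period must be positive")
    # The frame must hold two units, since the first and second unit are merged.
    max_multiples = frame_samples // (2 * pitch_samples)
    if max_multiples < 1:
        raise ValueError("frame is too short to hold two pitch periods")
    # Nearest multiple of the pitch period (halves round up), but never zero.
    nearest = (2 * preset_samples + pitch_samples) // (2 * pitch_samples)
    multiples = min(max(1, nearest), max_multiples)
    return multiples * pitch_samples
```

For a 320-sample frame with a 70-sample pitch period and a preset duration of 160 samples, this rule yields 140 samples, that is, two pitch periods.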
Optionally, extending each audio frame in the audio data to be played by the corresponding unit processing duration includes:
In each audio frame in the audio data to be played, merging the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and inserting the merged data between the first unit processing duration and the second unit processing duration;
Shortening each audio frame in the audio data to be played by the corresponding unit processing duration includes:
In each audio frame in the audio data to be played, merging the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and replacing the data of the first unit processing duration and the second unit processing duration with the merged data.
Optionally, obtaining the pitch period of each audio frame in the audio data to be played includes:
If the audio frames in the audio data to be played record a pitch period, obtaining the pitch period of each audio frame from that audio frame in the audio data to be played; if the audio frames in the audio data to be played do not record a pitch period, determining the pitch period of each audio frame based on a pitch period search algorithm and each decoded audio frame.
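The two ways of obtaining the pitch period can be sketched as below. The patent does not name a specific pitch period search algorithm, so this sketch substitutes a plain normalized autocorrelation search as a stand-in; the function names and the lag range (in samples) are illustrative assumptions.

```python
import math
from typing import Optional, Sequence


def search_pitch_period(samples: Sequence[float], min_lag: int, max_lag: int) -> int:
    """Return the lag in [min_lag, max_lag] with the highest normalized autocorrelation."""
    if not 0 < min_lag <= max_lag < len(samples):
        raise ValueError("need 0 < min_lag <= max_lag < len(samples)")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        head, shifted = samples[:-lag], samples[lag:]
        corr = sum(x * y for x, y in zip(head, shifted))
        energy = math.sqrt(sum(x * x for x in head) * sum(y * y for y in shifted))
        score = corr / energy if energy > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def get_pitch_period(recorded_period: Optional[int],
                     decoded_samples: Sequence[float],
                     min_lag: int = 20,
                     max_lag: int = 160) -> int:
    """Use the pitch period recorded in the frame if there is one, otherwise search for it."""
    if recorded_period is not None:
        return recorded_period
    return search_pitch_period(decoded_samples, min_lag, max_lag)
```

A real decoder would likely restrict the lag range to the pitch range of speech at its sampling rate; the defaults here are placeholders only.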
In the embodiments of the present invention, during voice communication, the data volume of the audio data to be played that is stored in the jitter buffer is detected. If the data volume of the audio data is lower than the preset first threshold, duration extension processing is performed on the audio frames in the audio data to be played; if the data volume of the audio data is higher than the preset second threshold, duration shortening processing is performed on the audio frames in the audio data to be played, where the first threshold is less than the second threshold; and the processed audio data to be played is played according to the playback order. In this way, when the jitter buffer holds little data, the audio data in it is played more slowly, so that when the network is unstable there is more time for new audio data to be stored in the buffer. When the jitter buffer holds a lot of data, the audio data in it is played out sooner, which keeps as much free space in the jitter buffer as possible for storing a large amount of audio data received in a burst and prevents the audio data in the jitter buffer from overflowing. Gaps in playback and missing words can therefore both be prevented.
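The effect on playback time can be illustrated with simple arithmetic. The sketch below assumes every frame is extended or shortened by exactly one pitch period; the function name `playout_ms` and the figures in the example (20 ms frames, 5 ms pitch period) are illustrative assumptions, not values from the patent.

```python
def playout_ms(frame_count: int, frame_ms: float, pitch_ms: float, stretch: int) -> float:
    """Total playback time of the buffered frames.

    stretch = +1 extends each frame by one pitch period,
    stretch = -1 shortens each frame by one pitch period,
    stretch = 0 leaves the frames unchanged.
    """
    if stretch not in (-1, 0, 1):
        raise ValueError("stretch must be -1, 0, or +1")
    return frame_count * (frame_ms + stretch * pitch_ms)
```

Under these assumed figures, three buffered frames last 60 ms unchanged, 75 ms when extended (leaving 15 ms more for new data to arrive), and 45 ms when shortened (freeing buffer space 15 ms sooner).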
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A method of playing audio data, characterized in that the method comprises:
During voice communication, detecting the data volume of the audio data to be played that is stored in a jitter buffer;
If the data volume of the audio data is lower than a preset first threshold, performing duration extension processing on the audio frames in the audio data to be played; if the data volume of the audio data is higher than a preset second threshold, performing duration shortening processing on the audio frames in the audio data to be played, wherein the first threshold is less than the second threshold;
Playing the processed audio data to be played according to the playback order;
The method further comprises:
Obtaining the pitch period of each audio frame in the audio data to be played;
Wherein performing duration extension processing on the audio frames in the audio data to be played if the data volume of the audio data is lower than the preset first threshold, and performing duration shortening processing on the audio frames in the audio data to be played if the data volume of the audio data is higher than the preset second threshold, comprises:
If the data volume of the audio data is lower than the preset first threshold, determining, according to a preset extension duration, the unit processing duration corresponding to each audio frame, wherein each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and extending each audio frame in the audio data to be played by the corresponding unit processing duration;
If the data volume of the audio data is higher than the preset second threshold, determining, according to a preset shortening duration, the unit processing duration corresponding to each audio frame, wherein each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and shortening each audio frame in the audio data to be played by the corresponding unit processing duration.
2. The method according to claim 1, characterized in that the method further comprises:
Obtaining the pitch period of each audio frame in the audio data to be played;
Wherein performing duration extension processing on the audio frames in the audio data to be played if the data volume of the audio data is lower than the preset first threshold, and performing duration shortening processing on the audio frames in the audio data to be played if the data volume of the audio data is higher than the preset second threshold, comprises:
If the data volume of the audio data is lower than the preset first threshold, extending each audio frame in the audio data to be played by one corresponding pitch period; if the data volume of the audio data is higher than the preset second threshold, shortening each audio frame in the audio data to be played by one corresponding pitch period.
3. The method according to claim 2, characterized in that extending each audio frame in the audio data to be played by one corresponding pitch period if the data volume of the audio data is lower than the preset first threshold, and shortening each audio frame in the audio data to be played by one corresponding pitch period if the data volume of the audio data is higher than the preset second threshold, comprises:
If the data volume of the audio data is lower than the preset first threshold, then in each audio frame in the audio data to be played, merging the data of the first pitch period and the second pitch period into the data of one pitch period, and inserting the merged data between the first pitch period and the second pitch period;
If the data volume of the audio data is higher than the preset second threshold, then in each audio frame in the audio data to be played, merging the data of the first pitch period and the second pitch period into the data of one pitch period, and replacing the data of the first pitch period and the second pitch period with the merged data.
4. The method according to claim 1, characterized in that extending each audio frame in the audio data to be played by the corresponding unit processing duration comprises:
In each audio frame in the audio data to be played, merging the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and inserting the merged data between the first unit processing duration and the second unit processing duration;
And shortening each audio frame in the audio data to be played by the corresponding unit processing duration comprises:
In each audio frame in the audio data to be played, merging the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and replacing the data of the first unit processing duration and the second unit processing duration with the merged data.
5. The method according to any one of claims 1 and 2, characterized in that obtaining the pitch period of each audio frame in the audio data to be played comprises:
If the audio frames in the audio data to be played record a pitch period, obtaining the pitch period of each audio frame from that audio frame in the audio data to be played; if the audio frames in the audio data to be played do not record a pitch period, determining the pitch period of each audio frame based on a pitch period search algorithm and each decoded audio frame.
6. A device for playing audio data, characterized in that the device comprises:
A detection module, configured to detect, during voice communication, the data volume of the audio data to be played that is stored in a jitter buffer;
A processing module, configured to: if the data volume of the audio data is lower than a preset first threshold, perform duration extension processing on the audio frames in the audio data to be played; if the data volume of the audio data is higher than a preset second threshold, perform duration shortening processing on the audio frames in the audio data to be played, wherein the first threshold is less than the second threshold;
A playing module, configured to play the processed audio data to be played according to the playback order;
An obtaining module, configured to obtain the pitch period of each audio frame in the audio data to be played;
The processing module comprises a first processing submodule and a second processing submodule;
Wherein the first processing submodule is configured to: if the data volume of the audio data is lower than the preset first threshold, determine, according to a preset extension duration, the unit processing duration corresponding to each audio frame, wherein each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and extend each audio frame in the audio data to be played by the corresponding unit processing duration;
The second processing submodule is configured to: if the data volume of the audio data is higher than the preset second threshold, determine, according to a preset shortening duration, the unit processing duration corresponding to each audio frame, wherein each unit processing duration is an integer multiple of the pitch period of the corresponding audio frame; and shorten each audio frame in the audio data to be played by the corresponding unit processing duration.
7. The device according to claim 6, characterized in that the device further comprises an obtaining module, configured to:
Obtain the pitch period of each audio frame in the audio data to be played;
The processing module is configured to:
If the data volume of the audio data is lower than the preset first threshold, extend each audio frame in the audio data to be played by one corresponding pitch period; if the data volume of the audio data is higher than the preset second threshold, shorten each audio frame in the audio data to be played by one corresponding pitch period.
8. The device according to claim 7, characterized in that the first processing submodule is configured to: if the data volume of the audio data is lower than the preset first threshold, then in each audio frame in the audio data to be played, merge the data of the first pitch period and the second pitch period into the data of one pitch period, and insert the merged data between the first pitch period and the second pitch period;
The second processing submodule is configured to: if the data volume of the audio data is higher than the preset second threshold, then in each audio frame in the audio data to be played, merge the data of the first pitch period and the second pitch period into the data of one pitch period, and replace the data of the first pitch period and the second pitch period with the merged data.
9. The device according to claim 6, characterized in that the first processing submodule is configured to:
In each audio frame in the audio data to be played, merge the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and insert the merged data between the first unit processing duration and the second unit processing duration;
The second processing submodule is configured to:
In each audio frame in the audio data to be played, merge the data of the first unit processing duration and the second unit processing duration into the data of one unit processing duration, and replace the data of the first unit processing duration and the second unit processing duration with the merged data.
10. The device according to any one of claims 6 and 7, characterized in that the obtaining module is configured to:
If the audio frames in the audio data to be played record a pitch period, obtain the pitch period of each audio frame from that audio frame in the audio data to be played; if the audio frames in the audio data to be played do not record a pitch period, determine the pitch period of each audio frame based on a pitch period search algorithm and each decoded audio frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510536538.9A CN105245496B (en) | 2015-08-26 | 2015-08-26 | A kind of method and apparatus of playing audio-fequency data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105245496A CN105245496A (en) | 2016-01-13 |
CN105245496B true CN105245496B (en) | 2019-03-12 |
Family
ID=55042996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510536538.9A Active CN105245496B (en) | 2015-08-26 | 2015-08-26 | A kind of method and apparatus of playing audio-fequency data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105245496B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106411849A (en) * | 2016-08-30 | 2017-02-15 | 王竞 | Network access information sharing method |
CN109963184B (en) * | 2017-12-14 | 2022-04-29 | 阿里巴巴集团控股有限公司 | Audio and video network playing method and device and electronic equipment |
CN108495177B (en) * | 2018-03-30 | 2021-07-13 | 北京世纪好未来教育科技有限公司 | Audio frequency speed change processing method and device |
CN111225418B (en) * | 2018-11-27 | 2022-05-24 | 华为技术有限公司 | Data transmission method and device |
WO2020237569A1 (en) * | 2019-05-30 | 2020-12-03 | 深圳市大疆创新科技有限公司 | Method, device and system for processing audio data, and storage medium |
CN111083555B (en) * | 2019-10-12 | 2022-09-06 | 广州市保伦电子有限公司 | IP network multimedia communication method, terminal, system and storage medium |
CN111580777B (en) * | 2020-05-06 | 2024-03-08 | 北京达佳互联信息技术有限公司 | Audio processing method, device, electronic equipment and storage medium |
CN111787268B (en) * | 2020-07-01 | 2022-04-22 | 广州视源电子科技股份有限公司 | Audio signal processing method and device, electronic equipment and storage medium |
CN112435678B (en) * | 2020-11-17 | 2024-06-25 | 广州安凯微电子股份有限公司 | Audio playing processing method |
CN112887776B (en) * | 2021-03-18 | 2024-04-23 | 努比亚技术有限公司 | Method, equipment and computer readable storage medium for reducing audio delay |
CN113539295B (en) * | 2021-06-10 | 2024-04-23 | 联想(北京)有限公司 | Voice processing method and device |
CN113436639B (en) * | 2021-08-26 | 2021-12-03 | 北京百瑞互联技术有限公司 | Audio stream compensation method, device, storage medium and equipment |
CN114401472B (en) * | 2021-12-02 | 2023-06-23 | 联想(北京)有限公司 | Electronic equipment and information processing method |
CN115102931B (en) * | 2022-05-20 | 2023-12-19 | 阿里巴巴(中国)有限公司 | Method for adaptively adjusting audio delay and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1870134A (en) * | 2005-05-24 | 2006-11-29 | 北京大学科技开发部 | Voice time length prolonging method of digital deaf-aid for presbycusis |
CN101175104A (en) * | 2006-10-31 | 2008-05-07 | 华为技术有限公司 | Dithering caching device and its management method |
CN101523822A (en) * | 2006-09-28 | 2009-09-02 | 京瓷株式会社 | Voice transmission apparatus |
CN101894558A (en) * | 2010-08-04 | 2010-11-24 | 华为技术有限公司 | Lost frame recovering method and equipment as well as speech enhancing method, equipment and system |
CN101924683A (en) * | 2009-06-09 | 2010-12-22 | 华为技术有限公司 | Method, device and electronic equipment for dynamically adjusting jitter buffer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20231007. Address after: 31a, 15 / F, building 30, maple mall, bangrang Road, Brazil, Singapore. Patentee after: Baiguoyuan Technology (Singapore) Co.,Ltd. Address before: 511442 25 / F, building B-1, Wanda Plaza North, Wanbo business district, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province. Patentee before: GUANGZHOU BAIGUOYUAN NETWORK TECHNOLOGY Co.,Ltd. |