WO2017124904A1

WO2017124904A1 - Audio playing method and device

Info

Publication number: WO2017124904A1
Application number: PCT/CN2016/113252
Authority: WO
Inventors: 张龙华
Original assignee: 广州视睿电子科技有限公司
Priority date: 2016-01-22
Filing date: 2016-12-29
Publication date: 2017-07-27
Also published as: CN105704554A

Abstract

Disclosed in the present invention is an audio playing method, comprising: receiving audio data packets and buffering them in a first buffer area; determining the number of audio data packets buffered in the first buffer area, and adjusting a decoding speed in real time according to that number; and reading audio data packets from the first buffer area at the decoding speed for decoding and playing. Correspondingly, an audio playing device is also disclosed. The embodiments of the present invention can improve the fluency of audio playing.

Description

Audio playing method and device

Technical field

The present invention relates to the field of network technologies, and in particular, to an audio playing method and apparatus.

Background art

In a wide area network, ideal interactive audio communication would have the receiving end receive data packets at the same interval at which the transmitting end sends them, so that the receiving end can decode and play each data packet as soon as it arrives. In reality, however, the data packets are transmitted over UDP (User Datagram Protocol), and in a wide area network UDP transmission may deliver packets out of order or lose them, and each data packet may reach the receiving end with a different delay, so the interval at which the receiving end receives data packets is not fixed. If the network improves, that interval becomes smaller, and the receiving end can decode and play the data packets normally. If the network degrades, the interval becomes larger, and decoding and playback inevitably have to wait, which makes playback discontinuous. The receiving end thus receives data sometimes quickly and sometimes slowly, and the audio is severely distorted.

Summary of the invention

The embodiment of the invention provides an audio playing method and device, which can improve the smoothness of audio playing.

An embodiment of the present invention provides an audio playing method, including:

Receiving an audio data packet and buffering the audio data packet into a first buffer;

Detecting the number of audio data packets buffered in the first buffer, and adjusting the decoding speed in real time according to the number of the audio data packets;

Reading audio data packets from the first buffer at the decoding speed for decoding and playback.

Further, detecting the number of audio data packets buffered in the first buffer and adjusting the decoding speed in real time according to the number of the audio data packets specifically includes:

Detecting the number of audio data packets buffered in the first buffer;

If the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold, adjusting the decoding speed to a first speed;

If the number of the audio data packets is less than a preset first threshold, adjusting the decoding speed to a second speed; the second speed is less than the first speed;

If the number of the audio data packets is greater than a preset second threshold, adjusting the playback speed to a third speed; the third speed is greater than the first speed.

Preferably, the first speed means decoding the next audio data packet after each audio data packet is played; the second speed means sleeping for a preset duration after each audio data packet is played and then decoding the next audio data packet; and the third speed means discarding the next audio data packet after each audio data packet is played and then decoding the audio data packet that follows the discarded one.

Further, the audio playing method further includes:

When the decoding speed is adjusted to the second speed, timing is started, and it is cyclically detected whether the decoding speed is still the second speed;

If yes, when the duration of the timing reaches a preset duration threshold, the decoding operation is suspended, and the first buffer is expanded into a second buffer to buffer the received audio data packet;

After detecting that the number of audio data packets buffered in the second buffer reaches a preset third threshold, reading the audio data packets from the second buffer again at the first speed for decoding and playing.

Preferably, the second threshold is the number of audio data packets that can be buffered in the first buffer, and the first threshold is half of the number of audio data packets that can be buffered in the first buffer. The third threshold is the number of audio data packets that can be buffered in the second buffer.

Accordingly, the present invention also provides an audio playback device, including:

a cache module, configured to receive an audio data packet, and cache the audio data packet into a first buffer;

a detecting module, configured to detect a number of audio data packets buffered in the first buffer, and adjust a decoding speed in real time according to the number of the audio data packets; and

a playing module, configured to read the audio data packet from the first buffer according to the decoding speed for decoding and playing.

Further, the detecting module specifically includes:

a detecting unit, configured to detect the number of audio data packets buffered in the first buffer;

a first adjusting unit, configured to adjust the decoding speed to a first speed when the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold;

a second adjusting unit, configured to adjust the decoding speed to a second speed when the number of the audio data packets is less than a preset first threshold; the second speed is smaller than the first speed;

a third adjusting unit, configured to adjust the playing speed to a third speed when the number of the audio data packets is greater than a preset second threshold; the third speed is greater than the first speed.

Preferably, the first speed means decoding the next audio data packet after each audio data packet is played; the second speed means sleeping for a preset duration after each audio data packet is played and then decoding the next audio data packet; and the third speed means discarding the next audio data packet after each audio data packet is played and then decoding the audio data packet that follows the discarded one.

Further, the audio playback device further includes:

a loop detection module, configured to start timing when the decoding speed is adjusted to the second speed, and cyclically detect whether the decoding speed is still the second speed;

a buffer expansion module, configured to, if the decoding speed is still the second speed when the timed duration reaches a preset duration threshold, suspend the decoding operation and expand the first buffer into a second buffer to buffer the received audio data packets; and

a replay module, configured to, upon detecting that the number of audio data packets buffered in the second buffer reaches a preset third threshold, read the audio data packets from the second buffer again at the first speed for decoding and playing.

Embodiments of the present invention have the following beneficial effects:

The audio playing method and device provided by the embodiments of the present invention buffer the received audio data packets and adjust the decoding and playback speed in real time according to the number of buffered audio data packets, so as to adapt to different network conditions, ensure the smoothness of audio playback, and improve the user experience.

Moreover, when the network condition is good, the audio data packets in the buffer are decoded and played at an accelerated rate, and when the network condition is poor, they are decoded and played at a decelerated rate, so that the normal decoding and playback speed can be restored as soon as possible and the fluency of audio playback is improved. When the network condition is very poor, the buffer capacity is expanded so that more audio data packets are cached before decoding and playback, which improves the user experience.

Brief description of the drawings

FIG. 1 is a schematic flow chart of an embodiment of an audio playing method provided by the present invention;

FIG. 2 is a schematic flow chart of an embodiment of step S2 in the audio playing method provided by the present invention;

FIG. 3 is a schematic structural diagram of an embodiment of an audio playback device provided by the present invention;

FIG. 4 is a schematic structural diagram of an embodiment of a detection module in an audio playback device provided by the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Referring to FIG. 1, which is a schematic flowchart of an embodiment of the audio playing method provided by the present invention, the method includes:

S1. Receive an audio data packet, and buffer the audio data packet into a first buffer.

S2. Detect the number of audio data packets buffered in the first buffer, and adjust the decoding speed in real time according to the number of the audio data packets.

S3. Read an audio data packet from the first buffer according to the decoding speed for decoding and playing.

It should be noted that a buffer is initialized when the program starts, and received audio data packets are cached in the buffer and queued. Once the buffer is full, that is, once the number of buffered audio data packets reaches the buffer's capacity, the audio data packets are read from the buffer in the order in which they were cached for decoding and playing. During decoding and playing, the number of audio data packets in the buffer is detected in real time and the decoding speed is adjusted accordingly, so that the audio data packets are decoded and played at that decoding speed. Adjusting the decoding speed in real time to match the network condition ensures the smoothness of audio playback and improves the user experience.
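The start-up behaviour described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the class name `PrebufferQueue`, its method names, and the choice of `collections.deque` as the queue are assumptions introduced here.

```python
from collections import deque


class PrebufferQueue:
    """Hypothetical FIFO buffer that withholds packets until it has filled once."""

    def __init__(self, capacity):
        self.capacity = capacity   # number of packets the buffer is sized for
        self.packets = deque()     # packets queued in arrival order
        self.started = False       # becomes True once the buffer has filled

    def push(self, packet):
        """Queue a received packet; allow playback once capacity is reached."""
        self.packets.append(packet)
        if len(self.packets) >= self.capacity:
            self.started = True

    def pop(self):
        """Return the oldest packet, or None before start-up or when empty."""
        if not self.started or not self.packets:
            return None
        return self.packets.popleft()

    def count(self):
        """Number of packets currently buffered (the quantity checked in S2)."""
        return len(self.packets)
```

The queue is deliberately left unbounded in this sketch, because the description allows the buffered count to exceed the second threshold.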

Further, as shown in FIG. 2, detecting the number of audio data packets buffered in the first buffer and adjusting the decoding speed in real time according to the number of the audio data packets specifically includes:

S21. Detect the number of audio data packets buffered in the first buffer.

S22. If the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold, adjust the decoding speed to a first speed.

S23. If the number of the audio data packets is less than a preset first threshold, adjust the decoding speed to a second speed; the second speed is smaller than the first speed.

S24. If the number of the audio data packets is greater than a preset second threshold, adjust the playing speed to a third speed; the third speed is greater than the first speed.

It should be noted that, because the network is unstable, audio data packets are received sometimes quickly and sometimes slowly, so the network condition is judged from the number of audio data packets buffered in the first buffer. If the number is greater than the preset first threshold and less than the preset second threshold, the network condition is normal, and decoding proceeds at the first speed, that is, the normal decoding speed. If the number is less than the preset first threshold, the network condition is poor, and the audio data packets need to be decoded and played at a decelerated rate, so the decoding speed is adjusted to the second speed. If the number is greater than the preset second threshold, the network condition is good, and the audio data packets need to be decoded and played at an accelerated rate, so the decoding speed is adjusted to the third speed.
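The threshold comparison in steps S21 to S24 might be expressed as a single function. The sketch below is illustrative only: the names `Speed` and `select_speed` are hypothetical, and treating a count exactly equal to either threshold as the normal case is an assumption, since the text defines only strict inequalities.

```python
from enum import Enum


class Speed(Enum):
    FIRST = "normal"   # network normal: decode at the normal rate
    SECOND = "slow"    # network poor: decelerated decoding
    THIRD = "fast"     # network good: accelerated decoding


def select_speed(buffered, first_threshold, second_threshold):
    """Map the number of buffered packets to a decoding speed."""
    if buffered < first_threshold:
        return Speed.SECOND
    if buffered > second_threshold:
        return Speed.THIRD
    # Assumption: counts equal to a threshold are handled as normal.
    return Speed.FIRST
```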

For example, suppose the playback time of each audio data packet is 10 ms. The first speed is to read and decode an audio data packet from the first buffer every 10 ms. The second speed is to sleep for 10 ms after each 10 ms audio data packet is played and then read and play the next audio data packet from the first buffer, so that the speech the user hears is slowed down, which achieves deceleration. The third speed is to discard one 10 ms audio data packet from the first buffer after each 10 ms audio data packet is played and then read the 10 ms audio data packet that follows the discarded one, so that the speech the user hears is sped up, which achieves acceleration.
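To make the three speeds concrete, the sketch below computes when each packet would start playing instead of actually sleeping or decoding. The function name `plan_playback`, the string labels for the speeds, and the default `sleep_ms` of 10 are assumptions made for illustration.

```python
def plan_playback(packets, speed, packet_ms=10, sleep_ms=10):
    """Return (start_ms, packet) pairs for the packets that would be played."""
    if speed not in ("first", "second", "third"):
        raise ValueError("speed must be 'first', 'second' or 'third'")
    schedule = []
    clock_ms = 0
    index = 0
    while index < len(packets):
        schedule.append((clock_ms, packets[index]))
        clock_ms += packet_ms        # time spent playing this packet
        if speed == "second":
            clock_ms += sleep_ms     # sleep before decoding the next packet
            index += 1
        elif speed == "third":
            index += 2               # discard the next packet, take the one after
        else:
            index += 1               # first speed: decode the next packet at once
    return schedule
```

For four packets, the first speed starts a packet every 10 ms, the second speed spaces the starts 20 ms apart, and the third speed plays only every other packet.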

Further, the audio playing method further includes:

It should be noted that timing starts when decelerated decoding of the audio data packets begins. If the decoding speed has not returned to the normal speed when the timed duration reaches the preset duration threshold, for example 5 s, that is, the audio data packets are still being decoded at the decelerated rate, the network condition is very poor. In that case the decoding operation is first suspended and the capacity of the first buffer is expanded, generally doubled, so that the first buffer becomes the second buffer. Decoding restarts once the number of audio data packets buffered in the second buffer reaches the preset third threshold. If the decoding speed still cannot return to the normal speed after decoding restarts, the capacity of the second buffer continues to be expanded. If the decoding speed returns to the normal speed before the timed duration reaches the preset duration threshold, the network condition has improved, and the audio data packets can be decoded at the normal speed.
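The timing and expansion logic described above could be tracked with a small state machine. The following is a sketch under stated assumptions rather than the embodiment's implementation: the class `ExpansionController`, its `update` method, and passing the current time in as `now_s` (so the logic can be exercised without a real clock) are all hypothetical. It also assumes the third threshold equals the expanded capacity and that each expansion doubles the capacity.

```python
class ExpansionController:
    """Hypothetical tracker for how long decoding has stayed at the second speed."""

    def __init__(self, capacity, duration_threshold_s=5.0):
        self.capacity = capacity                  # current buffer capacity in packets
        self.duration_threshold_s = duration_threshold_s
        self.slow_since = None                    # when decelerated decoding began
        self.paused = False                       # True while decoding is suspended

    def update(self, is_second_speed, buffered, now_s):
        """Advance the state; return True if decoding may proceed."""
        if self.paused:
            # Resume once the expanded buffer has filled (third threshold).
            if buffered >= self.capacity:
                self.paused = False
                self.slow_since = None
            return not self.paused
        if not is_second_speed:
            self.slow_since = None                # recovered before the deadline
            return True
        if self.slow_since is None:
            self.slow_since = now_s               # decelerated decoding just began
        elif now_s - self.slow_since >= self.duration_threshold_s:
            self.capacity *= 2                    # expand the buffer (assumed doubling)
            self.paused = True
            return False
        return True
```

Because the timer is cleared on resume, a further stretch of decelerated decoding after restart would trigger another expansion, which mirrors the "continue to expand" behaviour in the text.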

For example, each audio data packet has a play duration of 10 ms and the first buffer can hold 10 audio data packets, so the second threshold is set to 10 and the first threshold is set to 5. If the number of audio data packets buffered in the first buffer stays between 5 and 10, the network condition is normal, and the audio data packets are decoded and played normally. If the number is less than 5, the network condition is poor and there is some delay, so the audio data packets need to be decoded and played at a decelerated rate. If the number exceeds 10, the network condition is good, and the audio data packets need to be decoded and played at an accelerated rate. When decelerated decoding and playback has lasted 5 s, the first buffer is expanded into the second buffer, which can hold 20 audio data packets, that is, the third threshold is set to 20, and the decoding operation is suspended while audio data packets continue to be received. When the number of audio data packets buffered in the second buffer reaches 20, the audio data packets in the second buffer are read again at the normal speed for decoding and playing.
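The numbers in this example follow from the buffer capacity alone, which the short sketch below makes explicit. The function name `derive_thresholds` is hypothetical, and using integer division for the halving and a factor of two for the expansion are assumptions (the text says only that the capacity is generally doubled).

```python
def derive_thresholds(first_capacity, growth_factor=2):
    """Return (first, second, third) thresholds for a given first-buffer capacity."""
    second_threshold = first_capacity                  # packets the first buffer can hold
    first_threshold = first_capacity // 2              # half of that capacity
    third_threshold = first_capacity * growth_factor   # capacity of the second buffer
    return first_threshold, second_threshold, third_threshold


first, second, third = derive_thresholds(10)
# With 10 ms packets, the thresholds correspond to this much buffered audio:
buffered_ms = [count * 10 for count in (first, second, third)]
```

With a capacity of 10 packets the thresholds come out as 5, 10 and 20, matching the example, and correspond to 50 ms, 100 ms and 200 ms of buffered audio.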

The audio playing method provided by the embodiment of the invention buffers the received audio data packets and adjusts the decoding and playback speed in real time according to the number of buffered audio data packets, so as to adapt to different network conditions, ensure the fluency of audio playback, and improve the user experience. Moreover, when the network condition is good, the audio data packets in the buffer are decoded and played at an accelerated rate, and when the network condition is poor, they are decoded and played at a decelerated rate, so that the normal decoding and playback speed can be restored as soon as possible and the fluency of audio playback is improved. When the network condition is very poor, the buffer capacity is expanded so that more audio data packets are cached before decoding and playback, which improves the user experience.

Correspondingly, the present invention also provides an audio playback device capable of implementing all the processes of the audio playback method in the above embodiments.

Referring to FIG. 3, which is a schematic structural diagram of an embodiment of the audio playback device provided by the present invention, the device includes:

a cache module 1 configured to receive an audio data packet and buffer the audio data packet into a first buffer;

The detecting module 2 is configured to detect the number of audio data packets buffered in the first buffer, and adjust the decoding speed in real time according to the number of the audio data packets; and

The playing module 3 is configured to read the audio data packet from the first buffer according to the decoding speed for decoding and playing.

Further, the detecting module 2 specifically includes:

The detecting unit 21 is configured to detect the number of audio data packets buffered in the first buffer.

The first adjusting unit 22 is configured to adjust the decoding speed to a first speed when the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold;

The second adjusting unit 23 is configured to adjust the decoding speed to a second speed when the number of the audio data packets is less than a preset first threshold; the second speed is smaller than the first speed;

The third adjusting unit 24 is configured to adjust the playing speed to a third speed when the number of the audio data packets is greater than a preset second threshold; the third speed is greater than the first speed.

Further, the audio playback device further includes:

a loop detection module, configured to start timing when the decoding speed is adjusted to the second speed, and cyclically detect whether the decoding speed is still the second speed;

The audio playing device provided by the embodiment of the invention buffers the received audio data packets and adjusts the decoding and playback speed in real time according to the number of buffered audio data packets, so as to adapt to different network conditions, ensure the fluency of audio playback, and improve the user experience. Moreover, when the network condition is good, the audio data packets in the buffer are decoded and played at an accelerated rate, and when the network condition is poor, they are decoded and played at a decelerated rate, so that the normal decoding and playback speed can be restored as soon as possible and the fluency of audio playback is improved. When the network condition is very poor, the buffer capacity is expanded so that more audio data packets are cached before decoding and playback, which improves the user experience.

The above is a preferred embodiment of the present invention. It should be noted that those skilled in the art can make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements are also regarded as falling within the scope of protection of the present invention.

Claims

An audio playing method, comprising:

Receiving an audio data packet and buffering the audio data packet into a first buffer;

Detecting the number of audio data packets buffered in the first buffer, and adjusting the decoding speed in real time according to the number of the audio data packets;

Reading audio data packets from the first buffer at the decoding speed for decoding and playback.
The audio playing method according to claim 1, wherein detecting the number of audio data packets buffered in the first buffer and adjusting the decoding speed in real time according to the number of the audio data packets specifically comprises:

Detecting the number of audio data packets buffered in the first buffer;

If the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold, adjusting the decoding speed to a first speed;

If the number of the audio data packets is less than a preset first threshold, adjusting the decoding speed to a second speed; the second speed is less than the first speed;

And if the number of the audio data packets is greater than a preset second threshold, adjusting the playing speed to a third speed; the third speed is greater than the first speed.
The audio playing method according to claim 2, wherein the first speed means decoding the next audio data packet each time an audio data packet has been played; the second speed means sleeping for a preset duration each time an audio data packet has been played and then decoding the next audio data packet; and the third speed means discarding the next audio data packet each time an audio data packet has been played and then decoding the audio data packet that follows the discarded one.
The audio playing method according to claim 2 or 3, wherein the audio playing method further comprises:

When the decoding speed is adjusted to the second speed, timing is started, and it is cyclically detected whether the decoding speed is still the second speed;

If yes, when the duration of the timing reaches a preset duration threshold, the decoding operation is suspended, and the first buffer is expanded into a second buffer to buffer the received audio data packet;

After detecting that the number of audio data packets buffered in the second buffer reaches a preset third threshold, reading the audio data packets from the second buffer again at the first speed for decoding and playing.
The audio playing method according to claim 4, wherein the second threshold is the number of audio data packets that can be buffered in the first buffer, the first threshold is half of the number of audio data packets that can be buffered in the first buffer, and the third threshold is the number of audio data packets that can be buffered in the second buffer.
An audio playback device, comprising:

a cache module, configured to receive an audio data packet, and cache the audio data packet into a first buffer;

a detecting module, configured to detect a number of audio data packets buffered in the first buffer, and adjust a decoding speed in real time according to the number of the audio data packets; and

a playing module, configured to read the audio data packet from the first buffer according to the decoding speed for decoding and playing.
The audio playback device of claim 6, wherein the detecting module comprises:

a detecting unit, configured to detect the number of audio data packets buffered in the first buffer;

a first adjusting unit, configured to adjust the decoding speed to a first speed when the number of the audio data packets is greater than a preset first threshold and less than a preset second threshold;

a second adjusting unit, configured to adjust the decoding speed to a second speed when the number of the audio data packets is less than a preset first threshold; the second speed is smaller than the first speed;

a third adjusting unit, configured to adjust the playing speed to a third speed when the number of the audio data packets is greater than a preset second threshold; the third speed is greater than the first speed.
The audio playback device according to claim 7, wherein the first speed means decoding the next audio data packet each time an audio data packet has been played; the second speed means sleeping for a preset duration each time an audio data packet has been played and then decoding the next audio data packet; and the third speed means discarding the next audio data packet each time an audio data packet has been played and then decoding the audio data packet that follows the discarded one.
The audio playback device of claim 7 or 8, wherein the audio playback device further comprises:

a loop detection module, configured to start timing when the decoding speed is adjusted to the second speed, and cyclically detect whether the decoding speed is still the second speed;

a buffer expansion module, configured to, if the decoding speed is still the second speed when the timed duration reaches a preset duration threshold, suspend the decoding operation and expand the first buffer into a second buffer to buffer the received audio data packets; and

a replay module, configured to, upon detecting that the number of audio data packets buffered in the second buffer reaches a preset third threshold, read the audio data packets from the second buffer again at the first speed for decoding and playing.
The audio playback device of claim 9, wherein the second threshold is the number of audio data packets that can be buffered in the first buffer, the first threshold is half of the number of audio data packets that can be buffered in the first buffer, and the third threshold is the number of audio data packets that can be buffered in the second buffer.