CN109361494B - Audio data processing method, device, equipment and storage medium

Info

Publication number: CN109361494B
Application number: CN201811446634.4A
Authority: CN
Inventors: 刘丽; 朱敏; 成家雄; 曾泽兴; 张帆
Original assignee: Guangzhou Baiguoyuan Information Technology Co Ltd
Current assignee: Bigo Technology Pte Ltd
Priority date: 2018-11-29
Filing date: 2018-11-29
Publication date: 2021-06-29
Anticipated expiration: 2038-11-29
Also published as: CN109361494A

Abstract

The embodiments of the invention disclose a method, an apparatus, a device, and a storage medium for processing audio data. The method comprises: acquiring a network bandwidth value and a network capacity value; packing audio data into data packets according to the network bandwidth value; determining a sending mode for the data packets according to the network capacity value; and sending the data packets to an audio receiving device in that sending mode. Because the audio data is packed according to the network bandwidth value, a better-suited coding rate can be selected; the sending mode is determined according to the network capacity value and the data packets are sent to the audio receiving device in that mode, which improves voice call quality while keeping call delay low.

Description

Audio data processing method, device, equipment and storage medium

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing audio data.

Background

With the continuing development of Internet technology, applications for Internet-based voice calls are receiving increasing attention.

In a voice call, the audio sending device encodes the captured audio data and then sends it to the audio receiving device over the network. Call quality depends on the coding rate of the audio data and on network delay: a high coding rate combined with low network delay gives better call quality.

For a network with high random packet loss and high network delay, the audio sending device encodes at a low coding rate because it observes high delay and a high packet loss rate, so the coding rate cannot be raised. For a small-bandwidth network with low network delay, the audio sending device encodes at a high coding rate because it observes low delay; the small bandwidth then causes network congestion, and if the observed network delay does not fall within the corresponding threshold range, the audio sending device does not lower the coding rate, which leads to high network delay. The existing approach of adjusting the audio coding rate by setting multiple thresholds therefore cannot effectively improve voice call quality.

Disclosure of Invention

The embodiment of the invention provides a method, a device, equipment and a storage medium for processing audio data, which aim to solve the problem that the processing method of the audio data cannot effectively improve the call quality in voice call.

In a first aspect, an embodiment of the present invention provides an audio data processing method applied to an audio sending device, including:

acquiring a network bandwidth value and a network capacity value;

packing the audio data into a data packet according to the network bandwidth value;

determining a sending mode of the data packet according to the network capacity value;

and sending the data packet to audio receiving equipment by adopting the sending mode.

In a second aspect, an embodiment of the present invention provides an apparatus for processing audio data, which is applied to an audio transmitting device, and includes:

the bandwidth value and capacity value acquisition module is used for acquiring a network bandwidth value and a network capacity value;

the audio data packaging module is used for packaging the audio data into a data packet according to the network bandwidth value;

a sending mode determining module, configured to determine a sending mode of the data packet according to the network capacity value;

and the data packet sending module is used for sending the data packet to the audio receiving equipment by adopting the sending mode.

In a third aspect, an embodiment of the present invention provides an apparatus, where the apparatus includes:

one or more processors;

a storage device for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors implement the audio data processing method according to any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the audio data processing method according to any embodiment of the present invention.

In the audio data processing method provided by the embodiments of the invention, a network bandwidth value and a network capacity value are acquired, the audio data is packed into data packets according to the network bandwidth value, a sending mode for the data packets is determined according to the network capacity value, and the data packets are sent to the audio receiving device in that sending mode. Packing the audio data according to the network bandwidth value allows a better-suited coding rate to be selected, and sending the data packets in a mode determined by the network capacity value improves voice call quality while keeping call delay low.

Drawings

Fig. 1 is a schematic diagram of a method for processing audio data according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a method for processing audio data according to another embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an apparatus for processing audio data according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a device according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Fig. 1 is a flowchart of an audio data processing method provided in an embodiment of the present invention. The method is applicable to the case where an audio sending device sends audio data during a voice call. It may be executed by an audio data processing apparatus, which may be implemented in software and/or hardware and integrated in the device that executes the method. Specifically, as shown in Fig. 1, the method may include the following steps:

S101, acquiring a network bandwidth value and a network capacity value.

In the embodiment of the present invention, the audio sending device may, in response to a user's voice call operation in a network call APP, collect audio data, encode it, and send the encoded audio data to the audio receiving device over the network. While sending the audio data, the audio sending device may calculate the network bandwidth value and the network capacity value of the network in real time. The network bandwidth value may refer to the amount of data that can be transmitted within a fixed time on the transmission link of the audio data, and the network capacity value may refer to the maximum amount of data that the audio sending device can transmit on that transmission link.

Specifically, the audio sending device may estimate the network bandwidth value and the network capacity value from the network state. The network state may be represented by network quality parameters, which may include, but are not limited to, the sending rate, receiving rate, network delay, and packet loss rate of the network. The audio sending device may monitor these parameters in real time and estimate the two values from them.

S102, packing the audio data into a data packet according to the network bandwidth value.

In order to adapt the coding rate to the network bandwidth value to obtain good call quality, a target coding rate may be determined according to the original coding rate and the network bandwidth value, for example, the original coding rate may be increased when the network bandwidth value is large to obtain good call quality, and the original coding rate may be decreased when the network bandwidth value is small to reduce network delay.

S103, determining the sending mode of the data packet according to the network capacity value.

Specifically, the data packets may include audio packets, retransmission packets (resent after a data packet fails to be sent), redundant packets, and placeholder packets. The sending mode may be determined according to the network capacity value; a sending mode specifies which data packets are sent and their sending priority. For example, depending on the network capacity value, the sending mode may be to send audio packets first and then retransmission packets, to send only audio packets, or to send audio packets and placeholder packets. In implementing the embodiment of the present invention, a person skilled in the art may set other sending modes according to actual needs, which is not limited in the embodiment of the present invention.

S104, sending the data packet to the audio receiving device in the sending mode.

After determining the sending mode of the data packet, the audio sending device may send the data packet to the audio receiving device according to the sending mode of the data packet, so as to decode and play audio data after the audio receiving device receives the data packet, thereby implementing voice call between the audio sending device and the audio receiving device.

In this embodiment, a network bandwidth value and a network capacity value are acquired, the audio data is packed into data packets according to the network bandwidth value, a sending mode for the data packets is determined according to the network capacity value, and the data packets are sent to the audio receiving device in that sending mode. Packing the audio data according to the network bandwidth value allows a better-suited coding rate to be selected, and sending the data packets in a mode determined by the network capacity value improves voice call quality while keeping call delay low.

Fig. 2 is a flowchart of a method for processing audio data according to another embodiment of the present invention, which is optimized based on the foregoing embodiments. Specifically, as shown in fig. 2, the method provided by the embodiment of the present invention may include the following steps:

S201, acquiring a network bandwidth value and a network capacity value.

In the embodiment of the present invention, the network bandwidth value may be calculated from the sending rate and the receiving rate. Specifically, the sending rate at which the audio sending device sends audio data and the receiving rate at which that audio data is received may be obtained for each preset rate calculation period, and a reference transmission rate for that period is determined from the two rates. The multiple reference transmission rates obtained within a preset bandwidth value calculation period are then used to determine the network bandwidth value.

For example, the preset rate calculation period may be one RTT (Round-Trip Time), which is the total time from when the audio sending device sends a data packet to when it receives the acknowledgement information fed back by the audio receiving device. Specifically, the audio sending device records the sending time and the receiving time of each data packet, where the receiving time is the time at which the audio sending device receives the acknowledgement fed back by the audio receiving device; the difference between the two is the round-trip time. Within one RTT, the audio sending device may send multiple data packets and receive acknowledgements for multiple data packets. The sending rate is the total number of bytes of the data packets sent within the RTT divided by the RTT, and the receiving rate is the total number of bytes of the data packets acknowledged as received divided by the RTT.

In the embodiment of the present invention, the network bandwidth value may be calculated according to a preset bandwidth value calculation period, and optionally, a plurality of RTTs (for example, 10 RTTs) may be used as one preset bandwidth value calculation period, a reference transmission rate is determined in each RTT, and a maximum value of the reference transmission rates calculated by the plurality of RTTs is used as the network bandwidth value.
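The per-RTT rate calculation and the maximum-over-window step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `RttSample` structure and the function names are hypothetical, and because the text does not state how the sending rate and receiving rate are combined into a reference transmission rate, the sketch assumes the smaller of the two is used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RttSample:
    """Byte counts observed during one RTT (hypothetical structure)."""
    rtt_seconds: float  # measured round-trip time for this period
    bytes_sent: int     # total bytes of data packets sent in the period
    bytes_acked: int    # total bytes acknowledged by the receiver in the period


def reference_rate(sample: RttSample) -> float:
    """Reference transmission rate for one RTT, in bytes per second.

    ASSUMPTION: the source does not say how the two rates are combined;
    taking the smaller of the two is one conservative choice.
    """
    send_rate = sample.bytes_sent / sample.rtt_seconds
    recv_rate = sample.bytes_acked / sample.rtt_seconds
    return min(send_rate, recv_rate)


def network_bandwidth(samples: List[RttSample]) -> float:
    """Network bandwidth value: the maximum reference rate over the period."""
    if not samples:
        raise ValueError("at least one RTT sample is required")
    return max(reference_rate(s) for s in samples)
```

In this sketch the caller would collect one `RttSample` per RTT and pass, for example, the last 10 samples to `network_bandwidth`.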

After the network bandwidth value is obtained through calculation, a plurality of network delays in a preset bandwidth value calculation period can be obtained, a network capacity value is calculated according to the network delays and the network bandwidth value, optionally, a minimum network delay can be selected from the plurality of network delays, and the network capacity value is calculated through the product of the minimum network delay and the network bandwidth value.
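The product of a bandwidth and a minimum delay is what networking literature commonly calls the bandwidth-delay product. A minimal sketch of the calculation follows; the function name and the choice of units (bytes per second and seconds) are assumptions made for illustration.

```python
from typing import Iterable


def network_capacity(bandwidth_bytes_per_s: float, delays_s: Iterable[float]) -> float:
    """Network capacity value in bytes: bandwidth times the minimum delay."""
    delays = list(delays_s)
    if not delays:
        raise ValueError("at least one delay sample is required")
    return bandwidth_bytes_per_s * min(delays)
```

Using the minimum delay rather than an average keeps queueing delay out of the estimate, so the result approximates how much data the link itself can hold in flight.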

According to the embodiment of the invention, the network bandwidth value is determined according to the reference transmission rates of the plurality of preset rate calculation cycles, so that the accuracy of the calculated network bandwidth value and the calculated network capacity value can be improved.

S202, determining the original coding rate currently set by the audio data.

In the embodiment of the invention, the audio data may be encoded with PCM, WMA, MP3, AAC, or a similar coding format. When the audio sending device sends audio data to the audio receiving device, the audio data is encoded at a certain coding rate, and that current coding rate is taken as the original coding rate.

S203, determining a required network bandwidth value for sending the audio data according to the original coding rate.

Specifically, the sending of the audio data may occupy a certain network bandwidth, where the network bandwidth is related to the original coding rate, and a value of the occupied network bandwidth may be obtained as a required network bandwidth value, for example, the required network bandwidth value may be obtained by calculating the sending rate and the original coding rate.

S204, if the network bandwidth values within a preset duration are all greater than, or all less than, the required network bandwidth value, adjusting the original coding rate according to a preset bandwidth coefficient and the network bandwidth value to obtain a target coding rate.

In practical application, if the network bandwidth value is greater than the required network bandwidth value, the original coding rate can be raised to improve the clarity of the voice call; if the network bandwidth value is less than the required network bandwidth value, the original coding rate can be lowered to avoid the network delay caused by network congestion.

Preferably, the original coding rate is adjusted only if the network bandwidth values within the preset duration are all greater than, or all less than, the required network bandwidth value. Specifically, the network bandwidth value is calculated once per preset bandwidth value calculation period. If the consecutive network bandwidth values calculated within the preset duration are all greater than the required network bandwidth value, a coding rate higher than the original coding rate may be selected from a preset bandwidth-to-coding-rate association list as the target coding rate. This avoids adjusting the coding rate every time a single network bandwidth value rises above or falls below the required value, which would make adjustments too frequent; it reduces the burden of rate adjustment and makes each adjustment more effective. Furthermore, adjusting the original coding rate according to the preset bandwidth coefficient and the network bandwidth value ensures that the coding rate is raised only when network delay is low, which further improves voice call quality.
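One way to picture the "all greater than or all less than" rule is the sketch below. It is illustrative only: the function name, the rate ladder, the default coefficient of 0.8, and the rule that the target rate must fit within the coefficient times the latest bandwidth value are all assumptions, since the text does not spell out how the coefficient and the association list are applied.

```python
from typing import Sequence


def adjust_bitrate(
    original_bps: int,
    required_bps: float,
    recent_bandwidth_bps: Sequence[float],
    ladder_bps: Sequence[int],
    bandwidth_coefficient: float = 0.8,
) -> int:
    """Return the target coding rate, or the original rate if no change is due.

    A change happens only when every recent bandwidth value lies on the
    same side of the required bandwidth value.
    """
    if not recent_bandwidth_bps:
        return original_bps
    all_above = all(b > required_bps for b in recent_bandwidth_bps)
    all_below = all(b < required_bps for b in recent_bandwidth_bps)
    if not (all_above or all_below):
        return original_bps  # mixed readings: keep the current rate

    # ASSUMPTION: the budget is the latest bandwidth scaled by the coefficient.
    budget = bandwidth_coefficient * recent_bandwidth_bps[-1]
    fitting = sorted(r for r in ladder_bps if r <= budget)
    if all_above:
        higher = [r for r in fitting if r > original_bps]
        return higher[-1] if higher else original_bps
    lower = [r for r in fitting if r < original_bps]
    if lower:
        return lower[-1]
    # Nothing fits the budget: fall back to the lowest known rate.
    return min(list(ladder_bps) + [original_bps])
```

Requiring agreement across the whole window is what keeps a single noisy bandwidth reading from triggering a rate change.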

S205, packaging the audio data into a data packet by adopting the target coding rate.

Specifically, the audio sending device encodes the audio data at the target coding rate in a preset coding format to obtain multiple data packets. Optionally, the data packets may include at least one of an audio packet, a retransmission packet, a redundant packet, and a placeholder packet. A retransmission packet is a data packet that failed to be sent and must be sent again. A redundant packet is a data packet added to the sent audio data for error recovery of the sent audio packets, so that the audio data can be played normally on the audio receiving device. A placeholder packet is a data packet used to make up the amount of data to be sent when the audio data alone provides too little.

S206, in the process of sending the data packet, counting the unacknowledged data amount, that is, the amount of data sent but not yet acknowledged, within a specified duration.

In practical application, because of network delay, the audio sending device does not receive the acknowledgement fed back by the audio receiving device immediately after sending a data packet. The total number of bytes of the sent data packets for which no feedback has yet been received can be counted as the unacknowledged data amount, which reflects how data packets are faring on the current network. A large unacknowledged data amount indicates network congestion, so fewer data packets can be sent; a small one indicates that the network is clear, so more data packets can be sent. The specified duration may be one RTT; that is, the unacknowledged data amount of the data packets that have not received feedback within one RTT may be counted.

S207, determining a capacity threshold for the network capacity value according to a preset capacity coefficient.

Specifically, the capacity threshold is the product of the preset capacity coefficient and the network capacity value; the preset capacity coefficient may, for example, be greater than 1. When the unacknowledged data amount is greater than the capacity threshold, the network may be congested; when it is less than the capacity threshold, the network may be clear.
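Counting the bytes that have been sent but not yet acknowledged can be sketched with a small bookkeeping class. The class name, the method names, and the use of a per-packet sequence number are hypothetical. As a simplification, the sketch tracks all outstanding packets and leaves it to the caller to read the total once per RTT.

```python
from typing import Dict


class InFlightTracker:
    """Tracks bytes that were sent but not yet acknowledged."""

    def __init__(self) -> None:
        # sequence number -> packet size in bytes
        self._pending: Dict[int, int] = {}

    def on_sent(self, seq: int, size_bytes: int) -> None:
        """Record a packet at the moment it is sent."""
        self._pending[seq] = size_bytes

    def on_ack(self, seq: int) -> None:
        """Drop a packet once its acknowledgement arrives.

        Duplicate or unknown acknowledgements are ignored.
        """
        self._pending.pop(seq, None)

    def unacked_bytes(self) -> int:
        """Total bytes currently awaiting acknowledgement."""
        return sum(self._pending.values())
```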

S208, if the unacknowledged data amount is greater than the capacity threshold, determining that the sending mode of the data packet is a first current-limiting sending mode, and executing S211.

In the embodiment of the present invention, when the unacknowledged data amount is greater than the capacity threshold, the sending mode of the data packet is determined to be the first current-limiting sending mode. Optionally, in the first current-limiting sending mode, audio packets, retransmission packets, and redundant packets are sent, which reduces the amount of data to be sent and thereby the network load and network delay.

S209, if the unacknowledged data amount is less than the capacity threshold, determining that the sending mode of the data packet is a first sending mode, and executing S212.

In the embodiment of the present invention, when the unacknowledged data amount is less than the capacity threshold, the sending mode of the data packet is determined to be the first sending mode. Optionally, in the first sending mode, audio packets, retransmission packets, redundant packets, and placeholder packets are sent; that is, all data packets are sent as in the normal state.

S210, if the unacknowledged data amount is greater than the capacity threshold in each of a plurality of consecutive specified durations, determining that the sending mode of the data packet is a second current-limiting sending mode, and executing S213.

Specifically, after the audio sending device enters the first current-limiting sending mode, it counts the unacknowledged data amount for each of several specified durations. If the unacknowledged data amount exceeds the capacity threshold in every one of several consecutive specified durations, for example in one RTT and again in each of the next two consecutive RTTs, network congestion has not improved under the first current-limiting sending mode, and the sending mode of the data packet is determined to be the second current-limiting sending mode. Optionally, in the second current-limiting sending mode, only audio packets and retransmission packets are sent, which further reduces the amount of data to be sent and thereby the network load and network delay.
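The decisions in S207 to S210 can be sketched as a small state machine that is updated once per specified duration. The class and enum names are hypothetical, and the default coefficient of 1.2 is an assumed value (the text only says it may be greater than 1). The default of three consecutive over-threshold periods follows the example above. The case where the unacknowledged data amount equals the threshold is not covered by the text; the sketch treats it as not congested.

```python
from enum import Enum


class SendMode(Enum):
    FIRST = "first sending mode"
    FIRST_LIMIT = "first current-limiting sending mode"
    SECOND_LIMIT = "second current-limiting sending mode"


class ModeSelector:
    """Chooses a sending mode from the unacknowledged data amount."""

    def __init__(self, capacity_coefficient: float = 1.2, escalate_after: int = 3):
        self.capacity_coefficient = capacity_coefficient
        self.escalate_after = escalate_after  # consecutive congested periods
        self._over_count = 0

    def update(self, unacked_bytes: float, capacity_bytes: float) -> SendMode:
        """Call once per specified duration (for example, once per RTT)."""
        threshold = self.capacity_coefficient * capacity_bytes
        if unacked_bytes > threshold:
            self._over_count += 1
        else:
            # ASSUMPTION: equality counts as "not congested".
            self._over_count = 0

        if self._over_count == 0:
            return SendMode.FIRST
        if self._over_count >= self.escalate_after:
            return SendMode.SECOND_LIMIT
        return SendMode.FIRST_LIMIT
```

Resetting the counter on any non-congested period means the stricter mode is only entered when congestion persists without interruption.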

S211, screening out an audio packet, a retransmission packet and a redundancy packet from the data packet, and sending the audio packet, the retransmission packet and the redundancy packet to audio receiving equipment.

Specifically, if the sending mode of the data packet is the first current-limiting sending mode, the audio packets, retransmission packets, and redundant packets are screened out from the data packets and sent to the audio receiving device.

S212, sending the audio packet, the retransmission packet, the redundant packet and the placeholder packet to the audio receiving device.

Specifically, if the sending mode of the data packet is the first sending mode, the audio packets, retransmission packets, redundant packets, and placeholder packets are all sent to the audio receiving device. Because the first sending mode is the unrestricted mode, all data packets are sent as normal, which can improve voice call quality.

S213, screening out the audio packets and the retransmission packets from the data packets, and sending the audio packets and the retransmission packets to the audio receiving device.

In the embodiment of the invention, if the sending mode of the data packet is the second current-limiting sending mode, the audio packets and retransmission packets are screened out from the data packets and sent to the audio receiving device.
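The screening in S211 to S213 amounts to filtering the outgoing queue by packet type. The sketch below shows one way to express it; the `Packet` structure, the enum names, and the lookup table are hypothetical, and only the three modes described in these steps are included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List


class PacketKind(Enum):
    AUDIO = "audio"
    RETRANSMISSION = "retransmission"
    REDUNDANT = "redundant"
    PLACEHOLDER = "placeholder"


class SendMode(Enum):
    FIRST = "first sending mode"
    FIRST_LIMIT = "first current-limiting sending mode"
    SECOND_LIMIT = "second current-limiting sending mode"


# Packet kinds each mode is allowed to send, as described in S211 to S213.
ALLOWED: Dict[SendMode, FrozenSet[PacketKind]] = {
    SendMode.FIRST: frozenset(PacketKind),
    SendMode.FIRST_LIMIT: frozenset(
        {PacketKind.AUDIO, PacketKind.RETRANSMISSION, PacketKind.REDUNDANT}
    ),
    SendMode.SECOND_LIMIT: frozenset(
        {PacketKind.AUDIO, PacketKind.RETRANSMISSION}
    ),
}


@dataclass
class Packet:
    kind: PacketKind
    payload: bytes


def select_packets(packets: List[Packet], mode: SendMode) -> List[Packet]:
    """Keep only the packets the given mode sends, preserving queue order."""
    allowed = ALLOWED[mode]
    return [p for p in packets if p.kind in allowed]
```

Keeping the mode-to-kinds mapping in a table makes it straightforward to add further sending modes without changing the filter itself.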

In an optional embodiment of the present invention, the audio sending device may obtain the real-time sending rate while sending data packets. If the real-time sending rate is lower than the sending rate calculated for the preset rate calculation period, the sending mode of the data packet is determined to be a second sending mode, and the audio packets and placeholder packets are screened out from the data packets and sent to the audio receiving device.

Specifically, while the audio sending device is sending data packets in any of the first current-limiting sending mode, the second current-limiting sending mode, or the first sending mode, if the real-time sending rate is lower than the sending rate calculated for the preset rate calculation period, the current sending mode can be switched to the second sending mode, in which audio packets and placeholder packets are sent. This addresses the case where there are still too few data packets after the coding rate has been raised to the highest coding rate: adding placeholder packets supplies enough data packets to send, which improves the accuracy of the audio data decoded by the audio receiving device.

In the embodiment of the invention, if the audio sending device detects, while sending data packets, that a plurality of consecutive audio packets are silence packets, the sending mode of the data packet is determined to be a third sending mode, and the audio packets are screened out from the data packets and sent to the audio receiving device.

In practical application, the audio data may include silent sections and non-silent sections, where a silent section carries no valid voice information; accordingly, silence packets and non-silence packets are obtained after encoding.
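Detecting a run of consecutive silence packets can be sketched with a simple counter. The class name and the default run length of 5 are assumptions; the text only says "a plurality of consecutive audio packets". How a single packet is classified as silent is not described, so the sketch takes that classification as an input.

```python
class SilenceDetector:
    """Signals the third sending mode after a run of silence packets."""

    def __init__(self, min_consecutive: int = 5) -> None:
        # ASSUMPTION: the required run length is a tunable parameter.
        self.min_consecutive = min_consecutive
        self._run = 0

    def observe(self, is_silent: bool) -> bool:
        """Feed one audio packet; return True while the run is long enough."""
        self._run = self._run + 1 if is_silent else 0
        return self._run >= self.min_consecutive
```

A single non-silent packet resets the run, so the sender leaves the third sending mode as soon as speech resumes.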

It should be noted that, although the embodiment of the present invention describes that the transmission modes of the data packet include the above multiple modes and the data packet correspondingly transmitted in the transmission modes, in implementing the embodiment of the present invention, a person skilled in the art may also set other transmission modes and data packets correspondingly transmitted in the transmission modes according to actual needs.

Fig. 3 is a schematic structural diagram of an audio data processing apparatus according to an embodiment of the present invention, and as shown in fig. 3, the apparatus specifically includes:

a bandwidth value and capacity value obtaining module 301, configured to obtain a network bandwidth value and a network capacity value;

an audio data packing module 302, configured to pack the audio data into a data packet according to the network bandwidth value;

a sending mode determining module 303, configured to determine a sending mode of the data packet according to the network capacity value;

a data packet sending module 304, configured to send the data packet to the audio receiving device in the sending mode.

Optionally, the bandwidth value and capacity value obtaining module 301 includes:

a sending rate and receiving rate obtaining submodule, configured to obtain a sending rate and a receiving rate at which the audio sending device sends the audio data according to a preset rate calculation period;

a reference transmission rate determining submodule, configured to determine a reference transmission rate of the preset rate calculation period based on the sending rate and the receiving rate;

a reference transmission rate obtaining submodule, configured to obtain multiple reference transmission rates in a preset bandwidth value calculation period;

a network bandwidth value determining submodule, configured to determine a network bandwidth value of the preset bandwidth value calculation period by using the multiple reference transmission rates;

and the network capacity value calculating submodule is used for calculating the network capacity value by adopting the network bandwidth value.

Optionally, the network capacity value calculation sub-module includes:

a network delay acquisition unit, configured to acquire the network delay of the network;

and the network capacity value calculating unit is used for calculating the network capacity value by adopting the network delay and the network bandwidth value.

Optionally, the audio data packing module 302 includes:

an original coding rate determining submodule, configured to determine an original coding rate currently set for the audio data;

a required network bandwidth value determining submodule, configured to determine a required network bandwidth value for sending the audio data according to the original coding rate;

the coding rate adjusting submodule is used for adjusting the original coding rate according to a preset bandwidth coefficient and the network bandwidth value to obtain a target coding rate if the network bandwidth values are both greater than or less than the required network bandwidth value within a preset duration;

an audio data packing submodule, configured to pack the audio data into a data packet by using the target coding rate.

Optionally, the sending mode determining module 303 includes:

a capacity threshold determining submodule, configured to determine a capacity threshold of the network capacity value according to a preset capacity coefficient;

a first current-limiting sending mode determining submodule, configured to determine that the sending mode of the data packet is a first current-limiting sending mode if the unacknowledged data amount is greater than the capacity threshold;

and a first sending mode determining submodule, configured to determine that the sending mode of the data packet is the first sending mode if the unacknowledged data amount is less than the capacity threshold.

Optionally, the data packets include an audio packet, a retransmission packet, a redundant packet, and a placeholder packet, and the data packet sending module 304 includes:

a first sending submodule, configured to screen out an audio packet, a retransmission packet, and a redundant packet from the data packet if the sending mode is a first current-limiting sending mode, and send the audio packet, the retransmission packet, and the redundant packet to the audio receiving device;

and a second sending submodule, configured to send the audio packet, the retransmission packet, the redundant packet, and the placeholder packet to the audio receiving device if the sending mode is the first sending mode.

Optionally, the sending mode determining module 303 further includes:

a second current-limiting sending mode determining submodule, configured to determine that the sending mode of the data packet is a second current-limiting sending mode if the unacknowledged data amount is greater than the capacity threshold in each of a plurality of consecutive specified durations;

the packet sending module 304 includes:

the data packet processing device is used for screening out audio packets and retransmission packets from the data packets;

and the third sending submodule is used for screening out an audio packet and a retransmission packet from the data packet and sending the audio packet and the retransmission packet to audio receiving equipment.
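The escalation from the first to the second current limiting sending mode can be sketched with a counter of consecutive over-threshold periods. The class name, the string mode labels, and the default of three consecutive periods are assumptions; the text only speaks of "a plurality of" consecutive specified durations.

```python
class CurrentLimitEscalator:
    """Tracks how many consecutive periods exceeded the capacity threshold."""

    def __init__(self, capacity_threshold: float, required_consecutive: int = 3):
        self.capacity_threshold = capacity_threshold
        # Hypothetical value for "a plurality of" consecutive periods.
        self.required_consecutive = required_consecutive
        self._over_count = 0

    def update(self, unacked_bytes: float) -> str:
        """Feed the un-fed-back amount for one period and return the mode."""
        if unacked_bytes > self.capacity_threshold:
            self._over_count += 1
        else:
            self._over_count = 0
        if self._over_count >= self.required_consecutive:
            # Only audio and retransmission packets are sent in this mode.
            return "second_current_limiting"
        if self._over_count > 0:
            return "first_current_limiting"
        return "first_sending"
```

Resetting the counter on the first period that falls back under the threshold is a design choice of this sketch; the source does not describe how the device leaves the second current limiting sending mode.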

Optionally, the sending mode determining module 303 includes:

a real-time sending rate obtaining submodule, configured to obtain a real-time sending rate in a process of sending the data packet;

a second sending mode determining submodule, configured to determine that the sending mode of the data packet is a second sending mode if the real-time sending rate is smaller than the sending rate calculated in the preset rate calculation period;

the packet sending module 304 includes:

and the fourth sending submodule is used for screening out the audio packets and the placeholder packets from the data packets, and sending the audio packets and the placeholder packets to the audio receiving equipment.

Optionally, the sending mode determining module 303 includes:

a third sending mode determining submodule, configured to determine that the sending mode of the data packet is a third sending mode if it is detected that a plurality of consecutive audio packets are silence packets;

the packet sending module 304 includes:

and the fifth sending submodule is used for screening out the audio packets from the data packets and sending the audio packets to the audio receiving equipment.
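Detecting a run of consecutive silence packets can be sketched with a fixed-length window. The sketch assumes that each audio packet already carries a silence flag (for example, set by the encoder); how a packet is classified as silent, the class name, and the default run length of five are assumptions not stated in the text.

```python
from collections import deque


class SilenceDetector:
    """Reports when the last min_consecutive audio packets were all silent."""

    def __init__(self, min_consecutive: int = 5):
        # Hypothetical run length for "a plurality of" consecutive packets.
        self._recent = deque(maxlen=min_consecutive)

    def observe(self, is_silence: bool) -> bool:
        """Record one audio packet; return True if the third sending mode applies."""
        self._recent.append(is_silence)
        window_full = len(self._recent) == self._recent.maxlen
        return window_full and all(self._recent)
```

When `observe` returns True, the sender would keep only the audio packets, as the fifth sending submodule describes.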

The audio data processing device provided by the embodiment of the invention can execute the audio data processing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Referring to fig. 4, a schematic diagram of a device in one example of the invention is shown. As shown in fig. 4, the device may specifically include: a processor 40, a memory 41, a display screen 42 with a touch function, an input device 43, an output device 44, and a communication device 45. The number of processors 40 in the device may be one or more, and one processor 40 is taken as an example in fig. 4. The number of memories 41 in the device may be one or more, and one memory 41 is taken as an example in fig. 4. The processor 40, the memory 41, the display screen 42, the input device 43, the output device 44, and the communication device 45 of the device may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 4.

The memory 41 is used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the audio data processing method according to any embodiment of the present invention (for example, the bandwidth value and capacity value acquisition module 301, the audio data packetizing module 302, the transmission mode determination module 303, and the packet transmission module 304 in the audio data processing apparatus). The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created according to use of the device, and the like. Further, the memory 41 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The display screen 42 has a touch function and may be a capacitive screen, an electromagnetic screen, or an infrared screen. In general, the display screen 42 is used for displaying data according to instructions from the processor 40, and is also used for receiving touch operations applied to the display screen 42 and sending corresponding signals to the processor 40 or other devices. Optionally, when the display screen 42 is an infrared screen, it further includes an infrared touch frame disposed around the display screen 42, which may also be configured to receive an infrared signal and send the infrared signal to the processor 40 or other devices.

The communication device 45 is used for establishing communication connection with other devices, and may be a wired communication device and/or a wireless communication device.

The input device 43 may be used for receiving input numeric or character information and generating key signal inputs related to user settings and function control of the device, and may also include a camera for acquiring images and a sound pickup device for acquiring audio data. The output device 44 may include an audio device such as a speaker. It should be noted that the specific composition of the input device 43 and the output device 44 can be set according to actual conditions.

The processor 40 executes various functional applications of the device and data processing, i.e., implements the above-described audio data processing method, by executing software programs, instructions, and modules stored in the memory 41.

Specifically, in this embodiment, when the processor 40 executes one or more programs stored in the memory 41, the following operations are implemented: acquiring a network bandwidth value and a network capacity value; packing the audio data into a data packet according to the network bandwidth value; determining a sending mode of the data packet according to the network capacity value; and sending the data packet to audio receiving equipment by adopting the sending mode.

Embodiments of the present invention further provide a computer-readable storage medium, where instructions in the storage medium, when executed by a processor of a device, enable the device to perform the method for processing audio data according to the foregoing method embodiments. Illustratively, the audio data processing method includes: acquiring a network bandwidth value and a network capacity value; packing the audio data into a data packet according to the network bandwidth value; determining a sending mode of the data packet according to the network capacity value; and sending the data packet to audio receiving equipment by adopting the sending mode.

It should be noted that, as for the embodiments of the apparatus, the device, and the storage medium, since they are basically similar to the embodiments of the method, the description is relatively simple, and in relevant places, reference may be made to the partial description of the embodiments of the method.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a robot, a personal computer, a server, or a network device) to execute the method for processing audio data according to any embodiment of the present invention.

It should be noted that, in the above processing apparatus for audio data, the units and modules included in the processing apparatus are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by suitable instruction execution devices. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A method for processing audio data, applied to an audio transmission device, comprising:

acquiring a network bandwidth value and a network capacity value;

packing the audio data into a data packet according to the network bandwidth value, wherein the data packet comprises an audio packet, a retransmission packet, a redundant packet and a placeholder packet;

determining a sending mode of the data packet according to the network capacity value, wherein the sending mode comprises the data packet selected to be sent and the priority of the data packet selected to be sent;

2. The method of claim 1, wherein the obtaining the network bandwidth value and the network capacity value comprises:

acquiring the sending rate and the receiving rate of the audio data sent by the audio sending equipment according to a preset rate calculation period;

determining a reference transmission rate of the preset rate calculation period based on the sending rate and the receiving rate;

acquiring a plurality of reference transmission rates in a preset bandwidth value calculation period;

determining the network bandwidth value of the preset bandwidth value calculation period by adopting the plurality of reference transmission rates;

and calculating a network capacity value by adopting the network bandwidth value.
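The two-level estimation in claim 2 can be sketched as follows. The claim does not say how the sending rate and receiving rate are combined, nor how the reference transmission rates are aggregated, so the choices here (the minimum of the two rates per rate calculation period, and the maximum reference rate per bandwidth value calculation period) are assumptions, as are the function names.

```python
from typing import Iterable, List, Tuple


def reference_rate(send_rate_bps: float, receive_rate_bps: float) -> float:
    """Reference transmission rate for one rate calculation period.

    Assumption: the path cannot deliver faster than the receiver observes,
    so the smaller of the two rates is used.
    """
    return min(send_rate_bps, receive_rate_bps)


def network_bandwidth(reference_rates_bps: Iterable[float]) -> float:
    """Network bandwidth value for one bandwidth value calculation period.

    Assumption: the largest reference rate seen in the period is taken as
    the bandwidth value.
    """
    rates: List[float] = list(reference_rates_bps)
    if not rates:
        raise ValueError("at least one reference transmission rate is required")
    return max(rates)


def estimate_bandwidth(periods: Iterable[Tuple[float, float]]) -> float:
    """periods holds (send_rate_bps, receive_rate_bps) per rate period."""
    return network_bandwidth(reference_rate(s, r) for s, r in periods)
```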

3. The method of claim 2, wherein said employing said network bandwidth value to calculate a network capacity value comprises:

acquiring network delay of the network;

and calculating a network capacity value by adopting the network delay and the network bandwidth value.
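One common way to combine a delay and a bandwidth into a capacity is the bandwidth-delay product, which approximates how much data can be in flight on the path at once. The claim does not give the formula, so treating the network capacity value as this product is an assumption, and the function name and units are hypothetical.

```python
def network_capacity_bytes(bandwidth_bps: float, delay_ms: float) -> float:
    """Bandwidth-delay product, converted from bits to bytes.

    bandwidth_bps: network bandwidth value in bits per second.
    delay_ms: network delay in milliseconds.
    """
    if bandwidth_bps < 0 or delay_ms < 0:
        raise ValueError("bandwidth and delay must be non-negative")
    # bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    return bandwidth_bps * delay_ms / 8000.0
```

For example, a 64000 bit/s bandwidth value with a 200 ms delay gives `network_capacity_bytes(64000, 200) == 1600.0` bytes.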

4. The method of claim 1, 2 or 3, wherein said packaging the audio data into packets according to the network bandwidth value comprises:

determining the original coding rate currently set by the audio data;

determining a required network bandwidth value for sending the audio data according to the original coding rate;

if, within the preset duration, the network bandwidth values are all larger than or all smaller than the required network bandwidth value, adjusting the original coding rate according to a preset bandwidth coefficient and the network bandwidth values to obtain a target coding rate;

and packaging the audio data into a data packet by adopting the target coding rate.

5. The method of claim 1, 2 or 3, wherein said determining a transmission mode of said data packet based on said network capacity value comprises:

in the process of sending the data packet, counting the amount of data which is not fed back, namely the amount of data of the data packets for which no feedback has been received within a specified time;

determining a capacity threshold value of the network capacity value according to a preset capacity coefficient;

if the amount of the data which is not fed back is larger than the capacity threshold, determining that the sending mode of the data packet is a first current limiting sending mode;

and if the amount of the data which is not fed back is less than the capacity threshold, determining that the sending mode of the data packet is a first sending mode.
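The counting step of claim 5 can be sketched as a small in-flight tracker. The interpretation used here is an assumption: the un-fed-back amount is the total size of packets sent within the specified time for which no feedback has yet arrived. The class and method names are hypothetical, and timestamps are passed in explicitly so the example stays deterministic.

```python
from typing import Dict, Tuple


class InFlightTracker:
    """Counts bytes of recently sent packets that have received no feedback."""

    def __init__(self, window_s: float):
        self._window_s = window_s
        # seq -> (size in bytes, send timestamp in seconds)
        self._sent: Dict[int, Tuple[int, float]] = {}

    def on_sent(self, seq: int, size_bytes: int, now: float) -> None:
        self._sent[seq] = (size_bytes, now)

    def on_feedback(self, seq: int) -> None:
        # Feedback for an unknown or already acknowledged packet is ignored.
        self._sent.pop(seq, None)

    def unacked_bytes(self, now: float) -> int:
        """Sum the sizes of unacknowledged packets sent within the window."""
        cutoff = now - self._window_s
        return sum(size for size, sent_at in self._sent.values() if sent_at >= cutoff)
```

The returned amount is what would be compared with the capacity threshold to choose between the first current limiting sending mode and the first sending mode.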

6. The method of claim 5, wherein said transmitting the data packet to the audio receiving device in the transmit mode comprises:

if the sending mode is a first current limiting sending mode, screening out an audio packet, a retransmission packet and a redundant packet from the data packet, and sending the audio packet, the retransmission packet and the redundant packet to audio receiving equipment;

and if the sending mode is the first sending mode, sending the audio packet, the retransmission packet, the redundant packet and the placeholder packet to audio receiving equipment.

7. The method of claim 6, after determining that the sending mode of the data packet is the first current limiting sending mode, further comprising:

if the quantity of the data which is not fed back is larger than the capacity threshold value in a plurality of continuous specified time lengths, determining that the sending mode of the data packet is a second current limiting sending mode;

the sending the data packet to the audio receiving device in the sending mode includes:

and screening out an audio packet and a retransmission packet from the data packet, and sending the audio packet and the retransmission packet to an audio receiving device.

8. The method of claim 2 or 3, wherein said determining a transmission mode for said data packet based on said network capacity value comprises:

acquiring a real-time sending rate in the process of sending the data packet;

if the real-time sending rate is smaller than the sending rate calculated by the preset rate calculation period, determining that the sending mode of the data packet is a second sending mode;

and screening out an audio packet and a placeholder packet from the data packet, and sending the audio packet and the placeholder packet to audio receiving equipment.

9. The method of claim 7, wherein said determining a transmission mode for said packet based on said network capacity value comprises:

if a plurality of consecutive audio packets are detected to be silence packets, determining that the sending mode of the data packet is a third sending mode;

and screening out audio packets from the data packets and sending the audio packets to an audio receiving device.

10. An apparatus for processing audio data, applied to an audio transmission device, comprising:

the audio data packaging module is used for packaging the audio data into data packets according to the network bandwidth value, wherein the data packets comprise audio packets, retransmission packets, redundant packets and placeholder packets;

a sending mode determining module, configured to determine a sending mode of the data packet according to the network capacity value, where the sending mode includes a data packet selected for sending and a priority of the data packet selected for sending;

11. An apparatus, characterized in that the apparatus comprises:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of processing audio data as claimed in any one of claims 1-9.

12. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method of processing audio data according to any one of claims 1 to 9.