WO2019052582A1 - Audio encoding method, decoding method, device, and audio codec system - Google Patents

Audio encoding method, decoding method, device, and audio codec system

Info

Publication number
WO2019052582A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
audio
encoded data
redundant data
packet
Prior art date
Application number
PCT/CN2018/106298
Other languages
English (en)
French (fr)
Inventor
王兴鹤 (Wang Xinghe)
Original Assignee
杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Priority to US16/648,626 priority Critical patent/US11355130B2/en
Priority to EP18856710.1A priority patent/EP3686885A4/en
Publication of WO2019052582A1 publication Critical patent/WO2019052582A1/zh

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G10L 19/04 - using predictive techniques
    • G10L 19/16 - Vocoder architecture
    • G10L 19/18 - Vocoders using multiple modes
    • G10L 19/22 - Mode decision, i.e. based on audio signal content versus external parameters
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 43/00 - Arrangements for monitoring or testing data switching networks
    • H04L 43/08 - Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L 43/0823 - Errors, e.g. transmission errors
    • H04L 43/0829 - Packet loss

Definitions

  • the embodiments of the present invention relate to the field of audio technologies, and in particular, to an audio encoding method, a decoding method, a device, and an audio codec system.
  • the transmitting end encodes and compresses the audio source data to obtain compressed data and sends the compressed data to the receiving end; the receiving end decodes the received compressed data to obtain the audio source data.
  • the audio source data is composed of consecutive audio frames; an audio frame is 20 ms or 40 ms of audio data, and the audio frames are encoded and compressed to obtain compressed data packets.
  • some of the compressed data packets may be lost during transmission, so some audio frames in the audio data received by the receiving end are missing, resulting in discontinuous or stuttering sound.
  • in one scheme, after transmitting a group of compressed data packets (corresponding to audio frame D1, audio frame D2, and audio frame D3), the transmitting end sends a redundant data packet (F); the redundant data packet is used to recover lost audio frames among the compressed data packets.
  • the compressed data packet corresponding to the audio frame D1 is lost, and the receiving end continues to receive the compressed data packet (corresponding to the audio frame D2) and the compressed data packet (corresponding to the audio frame D3).
  • when the redundant data packet (F) arrives, the corresponding data is found in the redundant data packet (F) according to the timestamp of the lost audio frame D1, and the lost audio frame D1 is restored.
  • the receiving end needs to wait for the redundant data packet to arrive before recovering decoding.
  • assume an audio frame is 20 ms. If audio frame D1 is lost, the receiving end needs to wait 60 ms before the redundant data packet can be used to restore audio frame D1, resulting in a large delay.
  • At least one embodiment of the present disclosure provides an audio encoding method, a decoding method, an apparatus, and an audio encoding and decoding system.
  • At least one embodiment of the present disclosure provides an audio encoding method, where the method includes:
  • the packing of the i-th encoded data and the at most m redundant data before the i-th redundant data into the i-th audio data packet includes at least one of the following steps:
  • when i = 1, the first encoded data is packed into the first audio data packet;
  • when m < i ≤ n, the i-th encoded data and the buffered (i-m)-th to (i-1)-th redundant data are packed into an i-th audio data packet.
  • the method further includes:
  • the method further includes:
  • the sampling rate and/or the compression ratio used when encoding subsequent audio frames is adjusted according to the packet loss rate, wherein the sampling rate is positively correlated with the packet loss rate and the compression ratio is negatively correlated with the packet loss rate.
  • the encoding the ith audio frame to obtain the ith encoded data, and buffering the ith encoded data to obtain the ith redundant data including:
  • the i-th audio frame is encoded by the second encoding method to obtain an i-th second encoded data, and the i-th second encoded data is buffered as the i-th redundant data.
  • the method further includes:
  • the method further includes:
  • Signal acquisition is performed on the sound signal to obtain audio source data, the audio source data including n consecutive audio frames.
  • At least one embodiment of the present disclosure provides an audio decoding method, where the method includes:
  • the method further includes:
  • the selecting the target redundant data from the redundant data in the current audio data packet includes:
  • the decoding the target redundant data and the current encoded data includes:
  • the target redundant data and the current encoded data are decoded sequentially in ascending order of timestamp to obtain w+1 audio frames.
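The selection-and-decoding steps above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: `decode_frame`, the dict-based packet layout, and the timestamp bookkeeping are all assumptions made for the sketch.

```python
def recover_and_decode(packet, received_timestamps, decode_frame):
    """Select target redundant data from the current packet and decode it
    together with the current encoded data, in ascending timestamp order."""
    # Target redundant data: entries whose timestamp has not been seen yet,
    # i.e. entries corresponding to lost audio frames.
    targets = [r for r in packet["redundant"]
               if r["ts"] not in received_timestamps]
    # Sort the w target entries plus the current encoded data by timestamp
    # and decode them in order, yielding w + 1 audio frames.
    entries = sorted(targets + [packet["encoded"]], key=lambda e: e["ts"])
    frames = []
    for e in entries:
        received_timestamps.add(e["ts"])
        frames.append(decode_frame(e["data"]))
    return frames
```

With frames at timestamps 0 and 20 received and the frame at 40 lost, a packet carrying encoded data for timestamp 60 plus redundant copies of 20 and 40 yields exactly the missing frame followed by the current one.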
  • the method further includes:
  • the current packet loss rate is counted every predetermined time period
  • At least one embodiment of the present disclosure provides an audio encoding apparatus, where the apparatus includes:
  • An encoding module configured to acquire an i-th audio frame of n consecutive audio frames and obtain an i-th encoded data and an i-th redundant data based on the i-th audio frame, where the i-th encoded data is obtained by encoding the i-th audio frame, the i-th redundant data is obtained based on the i-th audio frame, i is a positive integer, n is a positive integer, and 1 ≤ i ≤ n;
  • a packing module configured to pack the i-th encoded data obtained by the encoding module and at most m redundant data before the i-th redundant data into an i-th audio data packet, where m is a preset positive integer.
  • the packaging module includes:
  • a second packing unit configured to pack, when 1 < i ≤ m, the i-th encoded data and the buffered 1st to (i-1)-th redundant data into an i-th audio data packet;
  • a third packing unit configured to pack, when m < i ≤ n, the i-th encoded data and the buffered (i-m)-th to (i-1)-th redundant data into an i-th audio data packet.
  • the device further includes:
  • a first receiving module configured to receive at least one of a packet loss rate and continuous packet loss information sent by the receiving end, where the continuous packet loss information indicates the number of consecutively lost packets;
  • a first determining module configured to determine the value of m according to at least one of the packet loss rate and the continuous packet loss information received by the first receiving module, where the value of m is positively correlated with the packet loss rate.
  • the device further includes:
  • a second receiving module configured to receive a packet loss rate sent by the receiving end
  • an adjustment module configured to adjust, according to the packet loss rate received by the second receiving module and after the current audio data packet is packed, the sampling rate and/or compression ratio used when encoding subsequent audio frames, where the sampling rate is positively correlated with the packet loss rate and the compression ratio is negatively correlated with the packet loss rate.
  • the encoding module includes:
  • a first encoding unit configured to encode the i-th audio frame using a first encoding mode to obtain an i-th first encoded data;
  • a second encoding unit configured to encode the i-th audio frame using a second encoding mode to obtain an i-th second encoded data;
  • a buffer unit configured to buffer the i-th second encoded data obtained by the second encoding unit as the i-th redundant data.
  • the device further includes:
  • a third receiving module configured to receive a packet loss rate sent by the receiving end
  • a second determining module configured to determine the second encoding mode according to the packet loss rate received by the third receiving module.
  • the device further includes:
  • An acquisition module is configured to perform signal acquisition on the sound signal to obtain audio source data, where the audio source data includes n consecutive audio frames.
  • At least one embodiment of the present disclosure provides an audio decoding apparatus, where the apparatus includes:
  • a receiving module configured to receive a current audio data packet
  • a selecting module configured to select target redundant data from redundant data in the current audio data packet when there is a lost audio frame before an audio frame corresponding to current encoded data in the current audio data packet,
  • the audio frame corresponding to the target redundant data is the same as the lost audio frame;
  • a decoding module configured to decode the target redundant data and the current encoded data.
  • the device further includes: a determining module configured to determine that a lost audio frame exists before the current encoded data when the timestamp of the current encoded data is not continuous with the timestamps of the received encoded data and the received redundant data.
  • the selecting module is configured to select, as the target redundant data, redundant data in the current audio data packet whose timestamp does not repeat the timestamps of the received encoded data and the received redundant data.
  • the decoding module includes:
  • a sorting unit configured to sort the target redundant data and the current encoded data according to the timestamp of the target redundant data and the timestamp of the current encoded data, where the quantity of the target redundant data is w, and w is a positive integer;
  • a decoding unit configured to decode the target redundant data and the current encoded data sequentially in ascending order of timestamp to obtain w+1 audio frames.
  • the device further includes:
  • a statistics module configured to calculate a current packet loss rate every predetermined duration
  • a sending module configured to send the current packet loss rate to the sending end.
  • At least one embodiment of the present disclosure provides an audio codec system, the system comprising an audio encoding device and an audio decoding device, wherein the audio encoding device is the device according to the third aspect, and the audio decoding device is the device according to the fourth aspect.
  • At least one embodiment of the present disclosure provides a computer device including a processor and a memory, the memory storing at least one instruction, the at least one instruction being loaded and executed by the processor to implement the audio encoding method described in the first aspect.
  • At least one embodiment of the present disclosure provides a computer device including a processor and a memory, the memory storing at least one instruction, the at least one instruction being loaded and executed by the processor to implement the audio decoding method described in the second aspect.
  • In the embodiments, the i-th encoded data is obtained by encoding the i-th audio frame, and the i-th encoded data is buffered to obtain the i-th redundant data. When packing, the i-th encoded data and at most m redundant data before the i-th redundant data are packed into the i-th audio data packet, which is then sent to the receiving end. When the i-th audio data packet is lost or the i-th encoded data fails to be decoded, the receiving end can acquire the redundant data corresponding to the i-th encoded data from a subsequent audio data packet and decode that redundant data to obtain the i-th audio frame. Because the encoded data of the current frame is transmitted together with the redundant data of the previous frames, a lost audio frame can be recovered from the redundant data in the next audio data packet as soon as that packet arrives, without waiting for a separate redundant data packet sent after a group of compressed data packets, reducing the latency caused by audio packet loss.
  • FIG. 1 is a schematic diagram of an implementation environment involved in audio encoding and audio decoding, according to some embodiments
  • FIG. 2 is a flowchart of an audio encoding method according to an embodiment of the present disclosure
  • FIG. 3 is a flowchart of an audio encoding method according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of an audio data packet provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of audio data encoding provided by an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of an audio decoding method provided by an embodiment of the present disclosure
  • FIG. 7 is a schematic diagram of audio data transmission provided by an embodiment of the present disclosure.
  • FIG. 8 is a structural block diagram of an audio encoding apparatus according to an embodiment of the present disclosure.
  • FIG. 9 is a structural block diagram of an audio decoding apparatus according to an embodiment of the present disclosure.
  • FIG. 10 is a structural block diagram of an audio codec system according to an embodiment of the present disclosure.
  • FIG. 11 is a structural block diagram of a terminal according to an embodiment of the present disclosure.
  • Audio source data Uncompressed digital audio data obtained by sampling and quantizing an analog signal corresponding to a sound signal.
  • the audio source data may be pulse code modulation (PCM) data.
  • Sampling rate The number of samples taken per second from a continuous analog signal to form a discrete signal.
  • Compression ratio The ratio of the compressed file size of the audio data to the file size before compression.
  • FIG. 1 is a schematic diagram of an implementation environment involved in audio encoding and audio decoding, in accordance with some embodiments.
  • the implementation environment mainly includes a transmitting end 110, a receiving end 120, and a communication network 130.
  • the transmitting end 110 is configured to perform signal acquisition on the sound signal to obtain audio source data after receiving or acquiring the sound signal, and then encode and compress the audio source data, and package the encoded compressed data into an audio data packet for transmission.
  • the receiving end 120 is configured to receive an audio data packet, decode the encoded compressed data in the audio data packet, and obtain the audio source data, and then send the audio source data to the sound card for playing.
  • Communication network 130 can be a wired communication network or a wireless communication network.
  • the physical implementation of the communication network 130 is not limited in this embodiment.
  • FIG. 2 is a flowchart of an audio encoding method according to an embodiment of the present disclosure.
  • the audio encoding method is illustrated by being applied to the transmitting end 110 shown in FIG. 1.
  • the audio encoding method may include the following steps:
  • step 201 signal acquisition is performed on the sound signal to obtain audio source data, the audio source data includes n consecutive audio frames, and n is a positive integer.
  • the signal acquisition of the sound signal refers to sampling and quantizing the analog signal corresponding to the sound signal, and the obtained digital audio data is audio source data.
  • the audio source data is PCM data.
  • the audio frame is part of the audio source data, and the audio frame is audio source data corresponding to a predetermined duration, and the predetermined duration is usually 20 ms or 40 ms.
  • step 202 the i-th audio frame in the n consecutive audio frames is obtained, the i-th audio frame is encoded to obtain the i-th encoded data, and the i-th encoded data is buffered to obtain the i-th redundant data.
  • i is a positive integer, 1 ≤ i ≤ n.
  • the encoded data is data obtained by encoding and compressing audio source data, and the redundant data is data buffered after encoding the audio source data.
  • the encoded data and the redundant data are in the same encoding manner.
  • the i-th encoded data can be directly buffered as the i-th redundant data, so that only one encoding is required for one audio frame.
  • the encoding mode used by the transmitting end may be Advanced Audio Coding (AAC).
  • the encoded data and the redundant data are encoded differently.
  • step 202 may be replaced with the step shown in FIG. 3:
  • step 202a the i-th audio frame is encoded by the first encoding method to obtain the i-th encoded data.
  • the i-th audio frame is encoded using the first encoding mode to obtain the i-th first encoded data, and the i-th first encoded data is used as the i-th encoded data during subsequent packing.
  • the first encoding mode typically remains unchanged after the determination.
  • step 202b the i-th audio frame is encoded and buffered by the second encoding method to obtain the i-th redundant data.
  • the i-th audio frame is encoded by the second encoding method to obtain the i-th second encoded data, and the i-th second encoded data is buffered as the i-th redundant data.
  • the contents of the i-th second encoded data and the i-th redundant data are identical.
  • the second encoding mode is an encoding mode different from the first encoding mode. Since the adjustable range of the encoding parameters of a single encoding mode is limited, using multiple different encoding modes gives the encoding parameters a larger adjustable range.
  • the encoding parameters include at least one of a compression ratio and a sampling rate.
  • the redundant data encoded by the second encoding mode is buffered, and the buffered redundant data is used as redundant data for subsequent audio frames at packing time.
  • the cached redundant data can be directly obtained, thereby improving the efficiency of packaging.
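The dual-encode-and-buffer scheme of steps 202a and 202b can be sketched as follows. This is an illustrative sketch: `primary` and `secondary` stand in for arbitrary first and second codecs (e.g. a low-compression and a high-compression encoder), and the dict packet layout is an assumption.

```python
from collections import deque

class DualEncoder:
    """Encode each frame twice: once with the primary codec (encoded data)
    and once with the secondary codec (redundant data), buffering at most
    m redundant entries for later packing."""
    def __init__(self, primary, secondary, m):
        self.primary = primary
        self.secondary = secondary
        self.cache = deque(maxlen=m)  # the m most recent redundant entries

    def encode(self, ts, frame):
        encoded = self.primary(frame)
        redundant = list(self.cache)  # redundant data buffered before this frame
        # buffer this frame's second encoding for inclusion in later packets
        self.cache.append((ts, self.secondary(frame)))
        return {"ts": ts, "encoded": encoded, "redundant": redundant}
```

Because the buffer is consulted before the current frame's redundant data is appended, packet i carries only redundant data of earlier frames, matching the packing rule above.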
  • step 203 the i-th encoded data and up to m redundant data before the i-th redundant data are packed into an i-th audio data packet, where m is a preset positive integer.
  • the i-th encoded data obtained by encoding the i-th audio frame is packed together with the redundant data before the i-th redundant data.
  • when two encoding modes are used, the i-th first encoded data obtained by encoding the i-th audio frame using the first encoding mode is packed together with the redundant data before the i-th redundant data.
  • step 203 may be replaced with any one or a combination of at least two of step 203a, step 203b, and step 203c in FIG.
  • when i = 1, after encoding the first audio frame to obtain the first encoded data, the transmitting end packs the first encoded data into the first audio data packet. There is no redundant data corresponding to other audio frames in the first audio data packet.
  • step 203b when 1 < i ≤ m, the i-th encoded data and the buffered 1st to (i-1)-th redundant data are packed into the i-th audio data packet.
  • that is, the transmitting end packs the 1st to (i-1)-th redundant data together with the encoded data of the current audio frame into one audio data packet.
  • step 203c when m < i ≤ n, the i-th encoded data and the buffered (i-m)-th to (i-1)-th redundant data are packed into the i-th audio data packet, where m is a positive integer.
  • the redundant data corresponding to the current frame is obtained; it refers to the data buffered after encoding the previous m frames of the current frame.
  • the data size of one audio frame is usually small; even if an audio data packet includes both encoded data and redundant data, its size usually does not exceed the network maximum transmission unit (MTU).
  • the audio data packet 10 includes three parts: a data packet header 11, redundant data 12, and encoded data 13.
  • the data packet header 11 defines parameters of the audio data packet 10, such as sequence number, delay, identification, timestamp, and the like.
  • the redundant data 12 includes definition parameters of redundant data and encoded redundant data blocks.
  • the definition parameters of the redundant data include the encoding mode, the offset value, the redundant data block length, and the like.
  • the offset value refers to the offset of the redundant data relative to the encoded data, for example, one audio frame before the audio frame corresponding to the encoded data, or two audio frames before the audio frame corresponding to the encoded data.
  • the definition parameter of the redundant data may also directly include the time stamp, and the time stamp of the redundant data is the same as the time stamp of the encoded data having the same content as the redundant data.
  • the encoded data 13 includes definition parameters of the encoded data and encoded encoded data blocks.
  • the definition parameters of the encoded data include the encoding mode, the timestamp, and the like. Since the timestamp of the encoded data is the same as the timestamp of the audio data frame, the definition parameters of the encoded data may omit the timestamp, and the timestamp of the audio data frame is used directly as the timestamp of the encoded data.
  • each of the redundant data 12 includes a definition parameter of the redundant data and a redundant data block encoded by the frame audio frame.
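The packet layout described above (header, redundant blocks each with definition parameters, then the encoded data block) can be sketched with a simple serializer. All field widths here are illustrative assumptions; the disclosure does not specify a wire format.

```python
import struct

def pack_audio_packet(seq, ts, redundant, encoded):
    """Serialize the described layout: header, then per-block redundant
    definition parameters (coding mode, offset value, block length) and
    payload, then the encoded data block. Field widths are illustrative."""
    header = struct.pack("!HIB", seq, ts, len(redundant))  # seq, timestamp, block count
    body = b""
    for mode, offset, blob in redundant:
        # definition parameters of each redundant block: encoding mode,
        # offset (frames before the encoded data), redundant block length
        body += struct.pack("!BBH", mode, offset, len(blob)) + blob
    # encoded data block: encoding mode + length + payload; its timestamp
    # equals the packet timestamp, so it is not repeated here
    mode, blob = encoded
    body += struct.pack("!BH", mode, len(blob)) + blob
    return header + body
```

A packet with one 2-byte redundant block and a 3-byte encoded block serializes to 7 + 6 + 6 = 19 bytes under these assumed widths.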
  • Referring to FIG. 5, a schematic diagram of the encoding process is shown.
  • For each audio frame, encoding and buffering are performed: the i-th audio frame is encoded to obtain the i-th encoded data, and the i-th encoded data is buffered to obtain the i-th redundant data; the (i+m-1)-th audio frame is encoded to obtain the (i+m-1)-th encoded data, and the (i+m-1)-th encoded data is buffered to obtain the (i+m-1)-th redundant data; the (i+m)-th audio frame is encoded to obtain the (i+m)-th encoded data, and the (i+m)-th encoded data is buffered to obtain the (i+m)-th redundant data.
  • the transmitting end determines which step to perform according to the relationship between the audio frame's index and m.
  • For example, when m = 3: for the 1st audio frame, the transmitting end encodes the 1st audio frame to obtain the 1st encoded data, buffers the 1st encoded data to obtain the 1st redundant data, and packs the 1st encoded data into the 1st audio data packet; for the 2nd audio frame, the transmitting end encodes the 2nd audio frame to obtain the 2nd encoded data, buffers the 2nd encoded data to obtain the 2nd redundant data, and packs the 2nd encoded data and the 1st redundant data into the 2nd audio data packet; for the 3rd audio frame, the transmitting end encodes the 3rd audio frame to obtain the 3rd encoded data, buffers the 3rd encoded data to obtain the 3rd redundant data, and packs the 3rd encoded data, the 1st redundant data, and the 2nd redundant data into the 3rd audio data packet; for the 4th audio frame, the transmitting end encodes the 4th audio frame to obtain the 4th encoded data, buffers the 4th encoded data to obtain the 4th redundant data, and packs the 4th encoded data and the 1st, 2nd, and 3rd redundant data into the 4th audio data packet; for the 5th audio frame, the transmitting end encodes the 5th audio frame to obtain the 5th encoded data, buffers the 5th encoded data to obtain the 5th redundant data, and packs the 5th encoded data and the 2nd, 3rd, and 4th redundant data into the 5th audio data packet; subsequent audio frames are handled by analogy.
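The worked example above follows a simple selection rule, which can be sketched as a small helper (an illustration of the three packing cases, not code from the disclosure):

```python
def redundant_indices(i, m):
    """Indices of the redundant data packed with the i-th encoded data:
    none for i == 1, the 1st..(i-1)-th for 1 < i <= m,
    and the (i-m)-th..(i-1)-th for m < i."""
    if i == 1:
        return []
    start = 1 if i <= m else i - m
    return list(range(start, i))
```

With m = 3 this reproduces the example exactly: packet 4 carries redundant data 1, 2, 3 and packet 5 carries 2, 3, 4.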
  • step 204 the ith audio data packet is transmitted to the receiving end.
  • step 205 the packet loss rate sent by the receiving end is received.
  • the packet loss rate is the ratio of the number of audio data packets that are lost or fail to be decoded during reception and decoding to the total number of audio data packets.
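The statistic the receiving end reports every predetermined duration can be computed as below. This is a sketch under the assumption that the receiver counts expected packets and successfully decoded packets per window; the disclosure does not specify the bookkeeping.

```python
def packet_loss_rate(expected, received_ok):
    """Loss rate over one statistics window: packets lost or failing to
    decode, divided by the packets expected in the window."""
    if expected == 0:
        return 0.0
    return (expected - received_ok) / expected
```

For example, 95 packets decoded out of 100 expected gives a 5% loss rate to feed back to the transmitting end.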
  • the transmitting end can receive the information fed back by the receiving end, and adjust the related parameters of the encoding according to the feedback information.
  • the related parameters include at least the value of m and the encoding parameters (sampling rate and/or compression ratio).
  • step 206 the value of m is determined according to the packet loss rate, and the value of m is positively correlated with the packet loss rate.
  • the value of m corresponds to the redundancy level; for example, the redundancy level may be one-level, two-level, or three-level redundancy.
  • the receiving end feeds back the continuous packet loss information to the sending end, and the sending end adjusts the redundancy level according to the continuous packet loss information.
  • the continuous packet loss information indicates the number of consecutively lost packets. The greater the number of consecutively lost packets, the larger the value of m, and the value of m is set greater than the number of consecutively lost packets. For example, when three audio data packets are lost consecutively, m is adjusted to 4 and four-level redundancy is used; when four audio data packets are lost consecutively, m is adjusted to 5 and five-level redundancy is used.
  • otherwise, if more packets are lost consecutively than the redundancy covers, later audio data packets will not include redundant data for the previously transmitted audio frames, and the lost audio frames cannot be recovered; therefore the number of audio frames covered by redundant data needs to be increased, making the audio data packets more fault-tolerant and increasing data reliability.
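The adjustment rule described here (m must exceed the observed run of consecutive losses) can be sketched as:

```python
def adjust_redundancy_level(m, consecutive_losses):
    """Raise m so it exceeds the number of consecutively lost packets
    (e.g. 3 consecutive losses -> m = 4, four-level redundancy).
    The choice to never lower m here is an assumption of this sketch."""
    return max(m, consecutive_losses + 1)
```

So a burst of three lost packets raises m from 3 to 4, while a shorter burst leaves the current level unchanged.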
  • step 207 after the current audio data packet is packed, the sampling rate and/or compression ratio used when encoding subsequent audio frames is adjusted according to the packet loss rate, wherein the sampling rate is positively correlated with the packet loss rate and the compression ratio is negatively correlated with the packet loss rate.
  • the transmitting end starts encoding with initial encoding parameters; the encoding parameters include at least one of a sampling rate and a compression ratio.
  • the sampling rate is positively correlated with the packet loss rate, and the compression ratio is negatively correlated with the packet loss rate.
  • either the sampling rate or the compression ratio may be adjusted alone, or both may be adjusted at the same time.
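The stated correlations (sampling rate up, compression ratio down as the loss rate grows) can be sketched as a tiered lookup. The thresholds and candidate values below are illustrative assumptions, not values from the disclosure.

```python
def adjust_encoding_params(loss_rate,
                           rates=(8000, 16000, 32000),
                           ratios=(0.5, 0.3, 0.1)):
    """Pick a sampling rate (positively correlated with the packet loss
    rate) and a compression ratio (negatively correlated with it)."""
    if loss_rate < 0.05:
        tier = 0
    elif loss_rate < 0.15:
        tier = 1
    else:
        tier = 2
    return rates[tier], ratios[tier]
```

Either returned parameter may be applied alone, or both together, mirroring the adjustment options above.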
  • step 206 and step 207 may both be performed, or only one of them may be performed; this is not limited by the embodiments of the present disclosure.
  • when the transmitting end encodes using both the first encoding mode and the second encoding mode, the related parameters further include the second encoding mode; as shown in FIG. 3, for the adjustment of the second encoding mode, refer to step 208.
  • step 208 the second encoding mode is determined according to the packet loss rate.
  • common encoding modes include the G711U encoding mode, the AAC encoding mode, and the alphabet encoding mode.
  • the AAC encoding mode has a relatively high compression ratio and the G711U encoding mode has a low compression ratio, while the alphabet encoding mode has a higher compression ratio than the AAC encoding mode and is closer to the original audio data.
  • since the encoded data is used for decoding and playback when the audio data packet is not lost, while the redundant data does not normally participate in decoding and playback, the encoded data may adopt the G711U encoding mode with a lower compression ratio, improving decoding efficiency at the receiving end and reducing distortion of the decoded audio frames; the redundant data may adopt the AAC encoding mode with a higher compression ratio, reducing the amount of redundant data as much as possible, thereby reducing the size of the audio data packet and facilitating its transmission.
  • the second encoding mode is adjusted to an encoding mode with less distortion, for example, from the AAC encoding mode to the alphabet encoding mode.
  • a packet loss rate threshold may be set. When the packet loss rate received by the sending end reaches the packet loss rate threshold, the sending end adjusts the second encoding mode.
  • if the transmitting end uses one encoding mode, one encoder is started for encoding; if the transmitting end uses multiple encoding modes, multiple encoders are started for encoding, each encoder corresponding to one encoding mode.
  • The audio codec capability supported by the receiving end can be negotiated between the transmitting end and the receiving end through the Session Description Protocol. The audio codec capability includes at least one of coding parameters, coding modes, and redundancy levels; the transmitting end selects the encoding mode, encoding parameters, or redundancy level according to the audio codec capability supported by the receiving end.
  • Step 206, step 207, and step 208 may all be performed at the same time, or only one of them may be performed, or any two of them may be performed. Steps 205 to 208 may be performed at any time during the encoding process and are not limited to being performed after step 204. Of the adjustable items, any one, any two, any three, or all of them may be adjusted.
  • In summary, the audio coding method provided by this embodiment obtains the i-th encoded data by encoding the i-th audio frame, caches the i-th encoded data to obtain the i-th redundant data, packs the i-th encoded data together with at most m redundant data preceding the i-th redundant data into the i-th audio data packet, and sends the packet to the receiving end. In this way, when the i-th audio data packet is lost, the receiving end can acquire, from a subsequent audio data packet, the redundant data corresponding to the i-th encoded data and decode that redundant data to recover the lost audio frame. In addition, during encoding the transmitting end receives the packet loss rate fed back by the receiving end and, according to the packet loss rate, adjusts the value of m, the sampling rate, the compression ratio, or the second encoding mode used at the time of encoding, so that the transmitting end can adaptively adjust the encoding parameters or the encoding mode according to the actual transmission condition of the audio data packets over the network.
  • FIG. 6 is a flowchart of a method for decoding an audio provided by an embodiment of the present disclosure.
  • the audio decoding method is illustrated by being applied to the receiving end 120 shown in FIG. 1.
  • the audio decoding method may include the following steps:
  • In step 301, the current audio data packet is received.
  • the transmitting end sequentially sends an audio data packet to the receiving end, and the receiving end receives the audio data packet sent by the transmitting end.
  • the current audio data packet refers to the audio data packet currently received by the receiving end, and the current audio data packet may be any audio data packet sent by the transmitting end.
  • If the current audio data packet is the first audio data packet sent by the transmitting end, the first audio data packet includes the first encoded data, and the first encoded data is obtained by encoding the first audio frame.
  • If the current audio data packet is the i-th (i > 1) audio data packet sent by the transmitting end, the i-th audio data packet includes the i-th encoded data and at least one redundant data. It is assumed that the maximum number of redundant data added by the transmitting end to an audio data packet is m, where m is a preset positive integer. When 1 < i ≤ m, the i-th audio data packet includes the i-th encoded data and the first to (i-1)-th redundant data; when m < i ≤ n, the i-th audio data packet includes the i-th encoded data and the (i-m)-th to (i-1)-th redundant data, where the i-th redundant data is obtained after the i-th audio frame is encoded.
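The packet layout just described (the i-th encoded data plus up to m of the redundant data that precede the i-th) can be sketched as follows. The dictionary layout and field names are illustrative assumptions, not the actual packet format of the disclosure.

```python
def pack_audio_packet(i, encoded, redundant_cache, m):
    """Build the i-th audio data packet (i is 1-indexed): the i-th encoded
    data plus at most m redundant data preceding the i-th redundant data.

    redundant_cache[j-1] is assumed to hold the j-th redundant data.
    """
    start = max(0, i - 1 - m)             # include at most m predecessors
    redundancy = redundant_cache[start:i - 1]
    return {"seq": i, "encoded": encoded, "redundant": redundancy}
```

For m = 2 this yields no redundancy in packet 1, the first redundant data in packet 2, and the (i-2)-th and (i-1)-th redundant data in every packet with i > 2, matching the 1 < i ≤ m and m < i ≤ n cases above.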
  • In step 302, when there is lost encoded data before the current encoded data in the current audio data packet, target redundant data is selected from the redundant data in the current audio data packet; the audio frame corresponding to the target redundant data is the same as the audio frame corresponding to the lost encoded data.
  • During transmission, audio data packets may be lost, so that the receiving end cannot receive one or some of the audio data packets. Therefore, after receiving the current audio data packet, the receiving end needs to determine whether there is lost encoded data before the current encoded data in the current audio data packet.
  • the above judgment process includes the following possible implementation manners:
  • In one manner, the timestamp of the current encoded data received by the receiving end and the timestamps of the previously received encoded data should be consecutive. For example, if the timestamps of the received encoded data are 1, 2, and 3 in sequence, the timestamp of the next encoded data should be 4. If the timestamp of the current encoded data is 4, it is determined that there is no lost encoded data before the current encoded data; if the timestamp of the current encoded data is 5, it is determined that there is lost encoded data before the current encoded data, that is, the encoded data with timestamp 4 is lost.
  • In another manner, it is checked whether the timestamp of each redundant data repeats a timestamp of the received encoded data. For example, the timestamps of the received encoded data are 1, 2, and 3. Assuming that each audio data packet carries at most 2 redundant data, the next audio data packet should carry the encoded data with timestamp 4, the redundant data with timestamp 2, and the redundant data with timestamp 3. If the current audio data packet received by the receiving end includes the encoded data with timestamp 4, the redundant data with timestamp 2, and the redundant data with timestamp 3, it is determined that there is no lost encoded data before the current encoded data. If the current audio data packet includes the encoded data with timestamp 5, the redundant data with timestamp 3, and the redundant data with timestamp 4, then since timestamp 4 does not repeat any timestamp of the received encoded data, it is determined that there is lost encoded data before the current encoded data, that is, the encoded data with timestamp 4 is lost.
  • When the timestamp of the current encoded data is not consecutive with the timestamps of the received encoded data, it is further necessary to determine whether the received redundant data contains redundant data corresponding to the unreceived encoded data. If such redundant data exists, the unreceived encoded data can be obtained from it; it is then determined that the unreceived encoded data is not lost, and the redundant data in the current audio data packet does not need to be decoded. If no redundant data corresponding to the unreceived encoded data exists, the encoded data is lost. For example, if the encoded data with timestamp 4 is not received and the received redundant data has timestamps 2 and 3, then since there is no redundant data with timestamp 4, the encoded data with timestamp 4 is lost.
  • Optionally, it may be determined that there is lost encoded data before the current encoded data when the timestamp of the current encoded data and the timestamps of the received encoded data are not consecutive. Optionally, the timestamp of the current encoded data may be compared separately with the timestamps of the received encoded data and the timestamps of the received redundant data; when the timestamp of the current encoded data is not consecutive with the timestamps of the received encoded data and the received redundant data, it is determined that there is lost encoded data before the current encoded data.
  • The receiving end may determine the interval between the timestamps of two adjacent encoded data (or redundant data) according to the frame rate, and thereby determine whether the timestamps of two encoded data (or redundant data) are consecutive. For example, if the difference between the timestamps of the two encoded data (or redundant data) is equal to the above interval, they are consecutive; if the difference between the timestamps is not equal to the above interval, they are not consecutive.
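The timestamp-based loss check described above can be sketched as follows. The helper names and the assumption that expected timestamps advance by a fixed frame interval are illustrative; the disclosure does not prescribe this exact procedure.

```python
def timestamps_continuous(prev_ts, curr_ts, interval):
    """Two timestamps are consecutive when they differ by exactly the
    frame interval derived from the frame rate."""
    return curr_ts - prev_ts == interval

def detect_lost(last_received_ts, current_ts, redundant_ts_in_packet, interval):
    """Return timestamps of encoded data that are actually lost: expected
    timestamps between the last received encoded data and the current one
    that are covered neither by encoded data nor by redundancy."""
    expected = range(last_received_ts + interval, current_ts, interval)
    covered = set(redundant_ts_in_packet)
    return [ts for ts in expected if ts not in covered]
```

With received timestamps 1, 2, 3 and a current packet carrying encoded data 5 plus redundancy 3 and 4, nothing is unrecoverable; if the packet carries redundancy 2 and 3 instead, the encoded data with timestamp 4 is lost, as in the examples above.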
  • The receiving end selects the target redundant data from the redundant data in the current audio data packet; the audio frame corresponding to the target redundant data is the same as the audio frame corresponding to the lost encoded data, that is, the timestamp corresponding to the target redundant data is the same as the timestamp corresponding to the lost encoded data.
  • In one manner, the receiving end selects, according to the timestamps of the redundant data in the current audio data packet, redundant data whose timestamp does not repeat any timestamp of the received encoded data as the target redundant data.
  • For example, the timestamps of the received encoded data are 1, 2, and 3. If the current audio data packet received by the receiving end includes the encoded data with timestamp 5, the redundant data with timestamp 3, and the redundant data with timestamp 4, the receiving end uses the redundant data with timestamp 4 as the target redundant data. This manner is suitable for the case where a single audio data packet includes one redundant data.
  • In some embodiments, the timestamps of the redundant data in different audio data packets are different; therefore, the timestamp of the target redundant data selected from the current audio data packet does not repeat the timestamps of the redundant data received before the current audio data packet.
  • In another manner, the receiving end selects, according to the timestamps of the redundant data in the current audio data packet, redundant data whose timestamp repeats neither a timestamp of the received encoded data nor a timestamp of the received redundant data as the target redundant data. This manner is suitable for the case where a single audio data packet includes at least two redundant data.
  • the number of target redundant data may be one or more, which is determined by the number of encoded data actually lost.
  • The timestamp of the redundant data may be directly carried in the definition parameter of the redundant data, or may be calculated based on the offset value corresponding to the redundant data and the timestamp of the encoded data in the same audio data packet as the redundant data.
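The second selection manner above can be sketched as follows. The field name `ts` and the subtraction convention for deriving a timestamp from an offset are assumptions for illustration; the disclosure leaves the exact offset arithmetic open.

```python
def select_target_redundancy(packet_redundancy, received_encoded_ts,
                             received_redundant_ts):
    """Pick redundant data whose timestamp duplicates neither a received
    encoded-data timestamp nor a previously received redundancy timestamp."""
    seen = set(received_encoded_ts) | set(received_redundant_ts)
    return [r for r in packet_redundancy if r["ts"] not in seen]

def redundancy_timestamp(encoded_ts, offset):
    """Derive a redundancy timestamp from the encoded data in the same
    packet; subtracting the offset is an assumed example convention."""
    return encoded_ts - offset
```

With received encoded timestamps 1, 2, 3 and a packet carrying redundancy 3 and 4, only the redundancy with timestamp 4 is selected, matching the example in the text.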
  • In step 303, the target redundant data and the current encoded data are decoded.
  • The receiving end sorts the target redundant data and the current encoded data according to the timestamps of the target redundant data and the timestamp of the current encoded data, where the number of target redundant data is w and w is a positive integer. The receiving end then decodes the target redundant data and the current encoded data sequentially in order of timestamp from small to large to obtain w + 1 audio frames.
  • For example, the current encoded data has a timestamp of 5, and the number of target redundant data is 2: one target redundant data has a timestamp of 3 and the other has a timestamp of 4. The receiving end sorts the above data in ascending order of timestamp, obtaining in sequence the redundant data with timestamp 3, the redundant data with timestamp 4, and the encoded data with timestamp 5, and then decodes the data in that order to obtain three audio frames.
  • Sorting and decoding in order of timestamp from small to large ensures that the audio frames obtained after decoding can be played accurately.
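The sort-then-decode step can be sketched as follows; the `decode` callable and the `ts` field are placeholders standing in for the actual decoder and timestamp representation.

```python
def decode_in_order(target_redundancy, current_encoded, decode):
    """Sort the w target redundant data together with the current encoded
    data by ascending timestamp, then decode each item in order to obtain
    w + 1 audio frames."""
    items = sorted(target_redundancy + [current_encoded],
                   key=lambda item: item["ts"])
    return [decode(item) for item in items]
```

In the example above (redundancy 3 and 4, encoded data 5), this produces three frames in the order 3, 4, 5, so playback order is preserved.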
  • the method provided in this embodiment further includes the following steps:
  • In step 304, the current packet loss rate is counted every predetermined time period.
  • The receiving end counts the current packet loss rate every predetermined time period; for example, the ratio of the number of currently lost data packets to the number of transmitted data packets is counted every second.
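The per-window statistic can be sketched as the ratio of lost packets to packets the sender transmitted in the window; the function name and zero-packet guard are illustrative.

```python
def packet_loss_rate(expected_count, received_count):
    """Fraction of packets lost in the last window (e.g. one second):
    (transmitted - received) / transmitted."""
    if expected_count == 0:
        return 0.0
    return (expected_count - received_count) / expected_count
```

For a one-second window in which 100 packets were sent and 90 arrived, the reported rate is 0.1, which the receiving end would then feed back to the transmitting end in step 305.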
  • In step 305, the current packet loss rate is sent to the transmitting end.
  • After counting the packet loss rate, the receiving end sends the packet loss rate to the transmitting end in real time, so that the transmitting end can adjust at least one of the value of m, the sampling rate at the time of encoding, the compression ratio at the time of encoding, and the second encoding mode according to the packet loss rate.
  • The packet loss rate can be sent by the receiving end to the transmitting end through the standard RTP Control Protocol (RTCP).
  • Optionally, the receiving end may set a predetermined threshold. When continuous packet loss occurs, the receiving end sends continuous packet loss information to the transmitting end, and the transmitting end may further adjust, according to the continuous packet loss information, at least one of the value of m, the sampling rate at the time of encoding, the compression ratio at the time of encoding, and the second encoding mode.
  • Step 304 and step 305 may be performed once every predetermined time period after decoding starts, and are not limited to being performed after step 303.
  • In summary, in the audio decoding method provided by this embodiment, when it is determined that there is lost encoded data before the current encoded data in the current audio data packet, target redundant data is selected from the redundant data in the current audio data packet, and then the target redundant data and the current encoded data are decoded. Since the encoded data of the current frame is transmitted together with the redundant data of one or more previous frames, a lost audio frame can be recovered from the redundant data in the current audio data packet as soon as the loss is detected, without waiting for a separate redundant packet following a compressed packet to arrive, thereby reducing the delay caused by the loss of audio packets.
  • In addition, the receiving end counts the current packet loss rate every predetermined time period and feeds the packet loss rate back to the transmitting end, so that the transmitting end can adjust at least one of the value of m, the sampling rate at the time of encoding, the compression ratio at the time of encoding, and the second encoding mode according to the packet loss rate. The transmitting end can thus adaptively adjust the encoding parameters or the encoding mode according to the actual transmission condition of the audio data packets over the network.
  • FIG. 7 exemplarily shows a schematic diagram of audio data transmission. The audio data packet is transmitted to the receiving end 120 through the network; in the receiving end 120, the audio data packet passes through the redundant encoding parser 24, the decoder 25, and the player 26, and the sound signal is finally played out at the receiving end.
  • FIG. 8 is a block diagram showing the structure of an audio encoding apparatus according to an embodiment of the present disclosure.
  • the audio encoding apparatus is illustrated by being applied to the transmitting end 110 shown in FIG. 1.
  • the audio encoding device may include an encoding module 410 and a packaging module 420.
  • The encoding module 410 is configured to obtain the i-th audio frame of the n consecutive audio frames and obtain the i-th encoded data and the i-th redundant data based on the i-th audio frame, where the i-th encoded data is obtained by encoding the i-th audio frame, the i-th redundant data is obtained after the i-th audio frame is encoded, i and n are positive integers, and 1 ≤ i ≤ n.
  • the packing module 420 is configured to package the ith encoded data obtained by the encoding module 410 and at most m redundant data before the ith redundant data into an ith audio data packet, where m is a preset positive integer.
  • the packaging module 420 includes:
  • a second packing unit configured to package the i-th encoded data and the cached first to (i-1)-th redundant data into the i-th audio data packet when 1 < i ≤ m;
  • a third packing unit configured to package the i-th encoded data and the cached (i-m)-th to (i-1)-th redundant data into the i-th audio data packet when m < i ≤ n.
  • the device further includes:
  • a first receiving module configured to receive a packet loss rate sent by the receiving end
  • the first determining module is configured to determine the value of m according to the packet loss rate received by the first receiving module, where the value of m is positively correlated with the packet loss rate.
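One way to realize "the value of m is positively correlated with the packet loss rate" is a monotone mapping from loss rate to redundancy depth. The specific step size and cap below are assumptions for illustration; the disclosure only requires that m not decrease as the loss rate rises.

```python
def redundancy_depth(packet_loss_rate, m_max=4):
    """Map the reported packet loss rate to a redundancy depth m.
    The mapping (one extra redundant copy per 10% of loss, capped at
    m_max) is an illustrative assumption, not the disclosed formula."""
    if packet_loss_rate <= 0.0:
        return 1
    return min(m_max, 1 + int(packet_loss_rate * 10))
```

A higher loss rate then yields more redundant copies per packet, at the cost of larger packets, which is the trade-off the adjusting modules manage.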
  • the device further includes:
  • a second receiving module configured to receive a packet loss rate sent by the receiving end
  • The adjusting module is configured to adjust, after the current audio data packet is packaged, the sampling rate and/or the compression ratio used when encoding subsequent audio frames according to the packet loss rate received by the second receiving module, where the sampling rate is positively correlated with the packet loss rate and the compression ratio is negatively correlated with the packet loss rate.
  • the encoding module 410 includes:
  • a first coding unit configured to encode the i-th audio frame by using a first coding manner to obtain an i-th first encoded data
  • a second coding unit configured to encode the i-th audio frame by using a second coding manner to obtain an i-th second encoded data
  • a buffer unit configured to cache the ith second encoded data obtained by the second coding unit as the i-th redundant data.
  • the encoding module 410 is configured to encode the ith audio frame to obtain the ith encoded data, and cache the ith encoded data to obtain the ith redundant data.
  • the device further includes:
  • a third receiving module configured to receive a packet loss rate sent by the receiving end
  • the second determining module is configured to determine the second encoding mode according to the packet loss rate received by the third receiving module.
  • the device further includes:
  • the acquisition module is configured to perform signal acquisition on the sound signal to obtain audio source data, where the audio source data includes n consecutive audio frames.
  • In summary, the audio encoding apparatus provided by this embodiment obtains the i-th encoded data by encoding the i-th audio frame, caches the i-th encoded data to obtain the i-th redundant data, packs the i-th encoded data together with at most m redundant data preceding the i-th redundant data into the i-th audio data packet, and sends the packet to the receiving end. In this way, when the i-th audio data packet is lost, the receiving end can acquire, from a subsequent audio data packet, the redundant data corresponding to the i-th encoded data and decode that redundant data to recover the lost audio frame. In addition, during encoding the transmitting end receives the packet loss rate fed back by the receiving end and adjusts the value of m, the sampling rate, the compression ratio, or the second encoding mode used at the time of encoding according to the packet loss rate, so that the transmitting end can adaptively adjust the encoding parameters or the encoding mode according to the actual transmission condition of the audio data packets over the network.
  • FIG. 9 is a block diagram showing the structure of an audio decoding apparatus according to an embodiment of the present disclosure.
  • the audio decoding apparatus is illustrated by being applied to the receiving end 120 shown in FIG. 1.
  • the audio decoding device may include: a receiving module 510, a determining module 520, a selecting module 530, and a decoding module 540.
  • the receiving module 510 is configured to receive a current audio data packet.
  • the determining module 520 is configured to determine whether there is missing encoded data before the current encoded data in the current audio data packet received by the receiving module 510.
  • The selecting module 530 is configured to, when the determining module 520 determines that there is lost encoded data before the current encoded data in the current audio data packet, select target redundant data from the redundant data in the current audio data packet; the audio frame corresponding to the target redundant data is the same as the audio frame corresponding to the lost encoded data.
  • the decoding module 540 is configured to decode the target redundant data and the current encoded data.
  • the determining module 520 is configured to determine that there is missing encoded data before the current encoded data when the timestamp of the current encoded data and the timestamp of the received encoded data are not consecutive.
  • The determining module 520 is further configured to determine that there is a lost audio frame before the current encoded data when the timestamp of the lost encoded data does not repeat any timestamp of the received redundant data, or when the timestamp of the current encoded data is not consecutive with the timestamps of the received redundant data.
  • the selecting module 530 is configured to select, as the target redundant data, redundant data that does not overlap the timestamp of the received encoded data according to the timestamp of the redundant data in the current audio data packet.
  • the decoding module 540 includes:
  • a sorting unit configured to sort the target redundant data and the current encoded data according to the timestamp of the target redundant data and the timestamp of the current encoded data, where the number of the target redundant data is w, w is a positive integer;
  • a decoding unit configured to decode the target redundant data and the current encoded data sequentially in order of timestamp from small to large to obtain w + 1 audio frames.
  • Optionally, the encoding modes used when the transmitting end encodes the current audio frame include a first encoding mode and a second encoding mode, where the first encoding mode is used to encode the current audio frame to obtain the current encoded data, and the second encoding mode is used to encode the current audio frame to obtain the current redundant data.
  • the device also includes:
  • a statistics module configured to calculate a current packet loss rate every predetermined duration
  • a sending module configured to send a current packet loss rate to the sending end
  • The packet loss rate is used to determine at least one of the following parameters: the value of m, the sampling rate at the time of encoding, the compression ratio at the time of encoding, and the second encoding mode.
  • In summary, in the audio decoding apparatus provided by this embodiment, when it is determined that there is lost encoded data before the current encoded data in the current audio data packet, target redundant data is selected from the redundant data in the current audio data packet, and then the target redundant data and the current encoded data are decoded. Since the encoded data of the current frame is transmitted together with the redundant data of one or more previous frames, a lost audio frame can be recovered from the redundant data in the current audio data packet as soon as the loss is detected, without waiting for a separate redundant packet following a compressed packet to arrive, thereby reducing the delay caused by the loss of audio packets. In addition, the receiving end counts the current packet loss rate every predetermined time period and feeds the packet loss rate back to the transmitting end, so that the transmitting end can adjust at least one of the value of m, the sampling rate at the time of encoding, the compression ratio at the time of encoding, and the second encoding mode according to the packet loss rate, and can thus adaptively adjust the encoding parameters or the encoding mode according to the actual transmission condition of the audio data packets over the network.
  • FIG. 10 is a block diagram showing the structure of an audio codec system according to an embodiment of the present disclosure.
  • the audio codec system 600 includes an audio encoding device 610 and an audio decoding device 620.
  • The audio encoding device 610 is the audio encoding device provided by the embodiment shown in FIG. 8, and the audio decoding device 620 is the audio decoding device provided by the embodiment shown in FIG. 9.
  • the above audio encoding device 610 may be an encoder or a transmitting device having an encoder, and the audio decoding device may be a decoder or a receiving device having a decoder.
  • a computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the audio encoding method described in FIG. 2 or FIG. 3.
  • a computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the audio decoding method described in FIG. 6.
  • a computer device comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the audio encoding method described in FIG. 2 or FIG. 3.
  • a computer device comprising a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the audio decoding method described in FIG. 6.
  • the terminal 700 may be the transmitting end 110 shown in FIG. 1 or the receiving end 120 shown in FIG. 1 , and the terminal 700 is used to implement the audio encoding method or the audio decoding method provided by the foregoing embodiment.
  • The terminal 700 in the present disclosure may include one or more of the following components: a processor for executing computer program instructions to perform various processes and methods; random access memory (RAM) and read-only memory (ROM) for storing information and program instructions; a memory for storing data and information; and I/O devices, interfaces, antennas, and the like.
  • The terminal 700 may include a radio frequency (RF) circuit 710, a memory 720, an input unit 730, a display unit 740, a sensor 750, an audio circuit 760, a wireless fidelity (WiFi) module 770, a processor 780, a power supply 782, a camera 790, and other components. It will be understood by those skilled in the art that the terminal structure shown in FIG. 11 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine certain components, or use a different arrangement of components.
  • The RF circuit 710 can be used for receiving and transmitting signals during the sending and receiving of information or during a call. Specifically, after receiving downlink information from a base station, the RF circuit delivers the information to the processor 780 for processing; in addition, uplink data is sent to the base station.
  • Generally, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like.
  • RF circuitry 710 can also communicate with the network and other devices via wireless communication.
  • The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, and Short Messaging Service (SMS).
  • the memory 720 can be used to store software programs and modules, and the processor 780 executes various functional applications and data processing of the terminal 700 by running software programs and modules stored in the memory 720.
  • The memory 720 may mainly include a storage program area and a storage data area. The storage program area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the storage data area may store data created according to the use of the terminal 700 (such as audio data, a phone book, etc.), and the like.
  • In addition, the memory 720 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the input unit 730 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the terminal 700.
  • the input unit 730 may include a touch panel 731 and other input devices 732.
  • The touch panel 731, also referred to as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user on or near the touch panel 731 using a finger, a stylus, or the like) and drive the corresponding connecting device according to a preset program.
  • the touch panel 731 can include two parts: a touch detection device and a touch controller.
  • The touch detection device detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 780, and can also receive commands from the processor 780 and execute them.
  • the touch panel 731 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 730 may also include other input devices 732.
  • other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 740 can be used to display information input by the user or information provided to the user and various menus of the terminal 700.
  • the display unit 740 may include a display panel 741.
  • The display panel 741 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
  • Further, the touch panel 731 can cover the display panel 741. When the touch panel 731 detects a touch operation on or near it, the touch panel 731 transmits the operation to the processor 780 to determine the type of the touch event, and the processor 780 then provides a corresponding visual output on the display panel 741 according to the type of the touch event.
  • Although the touch panel 731 and the display panel 741 are shown as two independent components to implement the input and output functions of the terminal 700 in FIG. 11, in some embodiments, the touch panel 731 can be integrated with the display panel 741 to implement the input and output functions of the terminal 700.
  • Terminal 700 can also include at least one type of sensor 750, such as a gyro sensor, a magnetic induction sensor, a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 741 according to the brightness of the ambient light, and the proximity sensor may close the display panel 741 when the terminal 700 moves to the ear. / or backlight.
  • as one type of motion sensor, the acceleration sensor can detect the magnitude of acceleration in each direction (usually three axes); when stationary it can detect the magnitude and direction of gravity, and it can be used in applications that identify the terminal's attitude (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection).
  • other sensors such as barometers, hygrometers, thermometers, infrared sensors, etc., which can also be configured in the terminal 700, are not described here.
  • An audio circuit 760, a speaker 761, and a microphone 762 can provide an audio interface between the user and the terminal 700.
  • on one hand, the audio circuit 760 can convert received audio data into an electrical signal and transmit it to the speaker 761, which converts it into a sound signal for output; on the other hand, the microphone 762 converts a collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data; after the audio data is processed by the processor 780, it is sent via the RF circuit 710 to, for example, another terminal, or output to the memory 720 for further processing.
  • WiFi is a short-range wireless transmission technology; through the WiFi module 770, the terminal 700 can help users send and receive e-mails, browse web pages, and access streaming media, providing users with wireless broadband Internet access.
  • although FIG. 11 shows the WiFi module 770, it can be understood that it is not an essential part of the terminal 700 and may be omitted as needed without changing the essence of the disclosure.
  • the processor 780 is the control center of the terminal 700; it connects the various parts of the entire terminal using various interfaces and lines, and performs the various functions of the terminal 700 and processes data by running or executing software programs and/or modules stored in the memory 720 and recalling data stored in the memory 720, thereby monitoring the terminal as a whole.
  • the processor 780 may include one or more processing units; preferably, the processor 780 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It will be appreciated that the above described modem processor may also not be integrated into the processor 780.
  • the terminal 700 also includes a power source 782 (such as a battery) for powering the various components.
  • the power source can be logically coupled to the processor 780 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • the camera 790 is generally composed of a lens, an image sensor, an interface, a digital signal processor, a central processing unit (English: Central Processing Unit, CPU), a display screen, and the like.
  • the lens is fixed above the image sensor, and the focus can be changed by manually adjusting the lens;
  • the image sensor is equivalent to the "film" of a conventional camera, and is the heart of the camera's image capture;
  • the interface connects the camera to the terminal motherboard via a flex cable, a board-to-board connector, or a spring-type connection, and sends the captured images to the memory 720;
  • the digital signal processor processes the acquired images through mathematical operations, converting the collected analog image into a digital image and sending it through the interface to the memory 720.
  • the terminal 700 may further include a Bluetooth module or the like, and details are not described herein again.
  • a person skilled in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or may be instructed by a program to execute related hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.


Abstract

一种音频编码方法、解码方法、装置及音频编解码系统。方法包括:获取n个连续的音频帧中的第i个音频帧,基于第i个音频帧得到第i个编码数据和第i个冗余数据,第i个编码数据是对所述第i个音频帧编码得到的,第i个冗余数据是对所述第i个音频帧编码后缓存得到的,i为正整数,n为正整数,1≤i≤n;(202)将第i个编码数据和第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数(203)。该方法可以降低音频数据包丢失时造成的时延。

Description

音频编码方法、解码方法、装置及音频编解码系统
本申请要求于2017年9月18日提交、申请号为201710840992.2、发明名称为“音频编码方法、解码方法、装置及音频编解码系统”的中国专利申请的优先权,其全部内容通过引用结合在本公开中。
技术领域
本申请实施例涉及音频技术领域,特别涉及一种音频编码方法、解码方法、装置及音频编解码系统。
背景技术
为了减小音频数据在传输时的数据量,发送端对音频源数据进行编码压缩得到压缩数据,将压缩数据发送给接收端,接收端对接收到的压缩数据进行解码得到音频源数据。
音频源数据由连续的音频帧组成,音频帧是20ms或40ms的音频数据,依次将音频帧编码压缩后得到压缩数据包。在网络较差的情况下,压缩数据包在传输过程中可能会丢失部分数据包,则接收端接收到的音频数据中部分音频帧会丢失,导致播放出的声音不连续或卡顿。为了解决数据包丢失的问题,相关技术中,发送端在一组压缩数据包(对应音频帧D1、音频帧D2、音频帧D3)发送之后,会发送一个冗余数据包(F),冗余数据包用于恢复这组压缩数据包中丢失的音频帧,比如:音频帧D1对应的压缩数据包丢失,接收端继续接收压缩数据包(对应音频帧D2)、压缩数据包(对应音频帧D3)和冗余数据包(F),当冗余数据包(F)到达后,根据丢失的音频帧D1对应的时间戳到冗余数据包(F)中查找对应的数据,对丢失的音频帧D1进行恢复。
由于压缩数据包在丢失之后,接收端需要等待冗余数据包到达后才能进行恢复解码,假设一个音频帧为20ms,在音频帧D1丢失的情况下,需要等待60ms才能利用冗余数据包对音频帧D1进行恢复,从而导致较大的时延。
发明内容
本公开至少一实施例提供了一种音频编码方法、解码方法、装置及音频编 解码系统。
本公开至少一实施例提供了一种音频编码方法,所述方法包括:
获取n个连续的音频帧中的第i个音频帧,基于所述第i个音频帧得到第i个编码数据和第i个冗余数据,所述第i个编码数据是对所述第i个音频帧编码得到的,所述第i个冗余数据是对所述第i个音频帧编码后缓存得到的,i为正整数,n为正整数,1≤i≤n;
将所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
可选的,所述将所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,包括以下步骤中的至少一个:
当i=1时,将第1个编码数据打包为第1个音频数据包;
当1<i≤m时,将所述第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包;
当m<i≤n时,将所述第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包。
可选的,所述方法还包括:
接收所述接收端发送的丢包率和连续丢包信息中的至少一个,所述连续丢包信息用于指示连续丢包的数量;
根据所述丢包率和所述连续丢包信息中的至少一个,确定所述m的取值,所述m的取值与所述丢包率呈正相关。
可选的,所述方法还包括:
接收所述接收端发送的丢包率;
在对当前音频数据包打包后,根据所述丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,所述采样率与所述丢包率呈正相关,所述压缩率与所述丢包率呈负相关。
可选的,所述对所述第i个音频帧进行编码得到第i个编码数据,将所述第i个编码数据进行缓存得到第i个冗余数据,包括:
通过第一编码方式对所述第i个音频帧进行编码得到第i个第一编码数据;
通过第二编码方式对所述第i个音频帧进行编码得到第i个第二编码数据,并将所述第i个第二编码数据进行缓存,作为第i个冗余数据。
可选的,所述方法还包括:
接收所述接收端发送的丢包率;
根据所述丢包率确定所述第二编码方式。
可选的,所述方法还包括:
对声音信号进行信号采集得到音频源数据,所述音频源数据包括n个连续的音频帧。
本公开至少一实施例提供了一种音频解码方法,所述方法包括:
接收当前音频数据包;
在所述当前音频数据包中的当前编码数据对应的音频帧之前存在丢失的音频帧时,从所述当前音频数据包中的冗余数据中选取目标冗余数据,所述目标冗余数据对应的音频帧和所述丢失的音频帧相同;
对所述目标冗余数据和所述当前编码数据进行解码。
可选的,所述方法还包括:
在所述当前编码数据的时间戳与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不连续时,确定所述当前编码数据之前存在所述丢失的音频帧。
可选的,所述从所述当前音频数据包中的冗余数据中选取目标冗余数据,包括:
根据所述当前音频数据包中的冗余数据的时间戳,选取与所述已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不重复的冗余数据作为所述目标冗余数据。
可选的,所述对所述目标冗余数据和所述当前编码数据进行解码,包括:
根据所述目标冗余数据的时间戳和所述当前编码数据的时间戳,对所述目标冗余数据和所述当前编码数据进行排序,其中,所述目标冗余数据的数量为w,w为正整数;
按照时间戳由小到大的顺序对所述目标冗余数据和所述当前编码数据依次进行解码,得到w+1个音频帧。
可选的,所述方法还包括:
每隔预定时长统计当前的丢包率;
将所述当前的丢包率发送给所述发送端。
本公开至少一实施例提供了一种音频编码装置,所述装置包括:
编码模块,用于获取n个连续的音频帧中的第i个音频帧,基于所述第i 个音频帧得到第i个编码数据和第i个冗余数据,所述第i个编码数据是对所述第i个音频帧编码得到的,所述第i个冗余数据是对所述第i个音频帧编码后缓存得到的,i为正整数,n为正整数,1≤i≤n;
打包模块,用于将所述编码模块得到的所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
可选的,所述打包模块,包括:
第一打包单元,用于当i=1时,将第1个编码数据打包为第1个音频数据包;
第二打包单元,用于当1<i≤m时,将所述第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包;
第三打包单元,用于当m<i≤n时,将所述第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包。
可选的,所述装置还包括:
第一接收模块,用于接收所述接收端发送的丢包率和连续丢包信息中的至少一个,所述连续丢包信息用于指示连续丢包的数量;
第一确定模块,用于根据所述第一接收模块接收到的所述丢包率和所述连续丢包信息中的至少一个,确定所述m的取值,所述m的取值与所述丢包率呈正相关。
可选的,所述装置还包括:
第二接收模块,用于接收所述接收端发送的丢包率;
调整模块,用于在对当前音频数据包打包后,根据所述第二接收模块接收到的所述丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,所述采样率与所述丢包率呈正相关,所述压缩率与所述丢包率呈负相关。
可选的,所述编码模块,包括:
第一编码单元,用于通过第一编码方式对所述第i个音频帧进行编码得到第i个第一编码数据;
第二编码单元,用于通过第二编码方式对所述第i个音频帧进行编码得到第i个第二编码数据;
缓存单元,用于将所述第二编码单元得到的所述第i个第二编码数据进行缓存,作为第i个冗余数据。
可选的,所述装置还包括:
第三接收模块,用于接收所述接收端发送的丢包率;
第二确定模块,用于根据所述第三接收模块接收到的所述丢包率确定所述第二编码方式。
可选的,所述装置还包括:
采集模块,用于对声音信号进行信号采集得到音频源数据,所述音频源数据包括n个连续的音频帧。
本公开至少一实施例提供了一种音频解码装置,所述装置包括:
接收模块,用于接收当前音频数据包;
选取模块,用于在所述当前音频数据包中的当前编码数据对应的音频帧之前存在丢失的音频帧时,从所述当前音频数据包中的冗余数据中选取目标冗余数据,所述目标冗余数据对应的音频帧和所述丢失的音频帧相同;
解码模块,用于对所述目标冗余数据和所述当前编码数据进行解码。
可选的,所述装置还包括:判断模块,用于在所述当前编码数据的时间戳与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不连续时,确定所述当前编码数据之前存在所述丢失的音频帧。
可选的,所述选取模块,用于根据所述当前音频数据包中的冗余数据的时间戳,选取与所述已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不重复的冗余数据作为所述目标冗余数据。
可选的,所述解码模块,包括:
排序单元,用于根据所述目标冗余数据的时间戳和所述当前编码数据的时间戳,对所述目标冗余数据和所述当前编码数据进行排序,其中,所述目标冗余数据的数量为w,w为正整数;
解码单元,用于按照时间戳由小到大的顺序对所述目标冗余数据和所述当前编码数据依次进行解码,得到w+1个音频帧。
可选的,所述装置还包括:
统计模块,用于每隔预定时长统计当前的丢包率;
发送模块,用于将所述当前的丢包率发送给所述发送端。
本公开至少一实施例提供了一种音频编解码系统,所述系统包括:音频编码装置和音频解码装置;其中,所述音频编码装置是如第三方面所述的装置;所述音频解码装置是如第四方面所述的装置。
本公开至少一实施例提供了一种计算机设备,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如第一方面所述的音频编码方法。
本公开至少一实施例提供了一种计算机设备,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如第二方面所述的音频解码方法。
本公开实施例提供的技术方案至少能带来如下有益效果:
通过对第i个音频帧进行编码得到第i个编码数据,并将第i个编码数据缓存得到第i个冗余数据,在打包音频数据包时,将第i个编码数据与第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包后发送给接收端,从而使得接收端在第i个音频数据包丢失或第i个编码数据解码失败时,可以获取后一个音频数据包中与第i个编码数据对应的最后一个冗余数据,接收端通过对最后一个冗余数据解码得到第i个音频帧,由于当前帧的编码数据与前一帧或前几帧的冗余数据一起传输,从而使得在当前帧丢失的情况下,能够通过下一个音频数据包中的冗余数据尽快恢复丢失的音频帧,不需要等待一组压缩数据包之后的冗余数据包到达之后才能恢复,从而降低了音频数据包丢失时造成的时延。
附图说明
图1是根据部分实施例示出的音频编码和音频解码所涉及的实施环境示意图;
图2是本公开一个实施例提供的音频编码方法的方法流程图;
图3是本公开一个实施例提供的音频编码方法的方法流程图;
图4是本公开一个实施例提供的音频数据包的示意图;
图5是本公开一个实施例提供的音频数据编码的示意图;
图6是本公开一个实施例提供的音频解码方法的方法流程图;
图7是本公开一个实施例提供的音频数据传输的示意图;
图8是本公开一个实施例提供的音频编码装置的结构方框图;
图9是本公开一个实施例提供的音频解码装置的结构方框图;
图10是本公开一个实施例提供的音频编解码系统的结构方框图;
图11是本公开一个实施例提供的终端的结构方框图。
具体实施方式
为使本公开的原理和优点更加清楚,下面将结合附图对本公开实施方式作进一步地详细描述。
为了便于对本公开各实施例的理解,首先对相关名词进行解释:
音频源数据:对声音信号对应的模拟信号进行采样、量化得到的未经压缩的数字音频数据。音频源数据可以是脉冲编码调制(英文:Pulse Code Modulation,简称:PCM)数据。
采样率:每秒从连续的模拟信号中提取并组成离散信号的采样个数。
压缩率:音频数据编码压缩后的文件大小与压缩之前的文件大小之比。
图1是根据部分实施例示出的音频编码和音频解码所涉及的实施环境示意图。如图1所示,该实施环境主要包括发送端110、接收端120和通信网络130。
发送端110用于在接收或获取到声音信号后,对声音信号进行信号采集得到音频源数据,然后对音频源数据进行编码压缩,将编码压缩后的数据打包成音频数据包发送。
接收端120用于接收音频数据包,对音频数据包中编码压缩的数据进行解码,得到音频源数据后,可以将音频源数据送入声卡中播放。
通信网络130可以为有线通信网络,也可以是无线通信网络。本实施例中不对通信网络130的物理实现方式进行限定。
图2是本公开一个实施例提供的音频编码方法的方法流程图,该音频编码方法以应用在图1所示的发送端110中进行举例说明。如图2所示,该音频编码方法可以包括以下步骤:
在步骤201中,对声音信号进行信号采集得到音频源数据,音频源数据包括n个连续的音频帧,n为正整数。
对声音信号进行信号采集是指对声音信号对应的模拟信号进行采样、量化,得到的数字音频数据是音频源数据。可选的,音频源数据是PCM数据。
音频帧是音频源数据的一部分,音频帧是对应预定时长的音频源数据,预定时长通常为20ms或40ms。
在步骤202中,获取n个连续的音频帧中的第i个音频帧,对第i个音频帧进行编码得到第i个编码数据,将第i个编码数据进行缓存得到第i个冗余数据,i为正整数,1≤i≤n。
编码数据是对音频源数据编码压缩后得到的数据,冗余数据是对音频源数据编码压缩后缓存的数据。
在本实施例中,编码数据与冗余数据采用相同的编码方式。在这种情况下,可以直接将第i个编码数据缓存作为第i个冗余数据,这样针对一个音频帧只需要编码一次。
示例性地,发送端在编码时采用的编码方式可以是高级音频编码(英文:Advanced Audio Coding,简称:AAC)。
可选的,编码数据与冗余数据采用不同的编码方式。
可选的,发送端在编码时采用的编码方式包括第一编码方式和第二编码方式,则步骤202可以替换成图3所示的步骤:
在步骤202a中,通过第一编码方式对第i个音频帧进行编码得到第i个编码数据。
例如,通过第一编码方式对第i个音频帧进行编码得到第i个第一编码数据,将第i个第一编码数据作为后续打包时的第i个编码数据。
在一个可选实施例中,第一编码方式在确定之后通常保持不变。
在步骤202b中,通过第二编码方式对第i个音频帧进行编码并缓存得到第i个冗余数据。
例如,通过第二编码方式对第i个音频帧进行编码得到第i个第二编码数据,并将第i个第二编码数据缓存,作为第i个冗余数据。在这种情况下,第i个第二编码数据和第i个冗余数据的内容一致。
可选的,第二编码方式选择与第一编码方式不同的编码方式。由于一种编码方式的编码参数的可调范围有限,采用多种不同的编码方式可以使得编码参数在调整时有更大的可调范围。这里,编码参数包括压缩率和采样率中的至少一种。
将通过第二编码方式编码得到的冗余数据进行缓存,将缓存的冗余数据作为后面的音频帧在打包时的冗余数据。
通过缓存编码后的数据,使得在对后面的音频帧打包时,能够直接获取已缓存的冗余数据,从而提高打包的效率。
在步骤203中,将第i个编码数据和第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
在对第i个音频数据包打包时,将第i个音频帧编码得到的第i个编码数据与第i个冗余数据之前的冗余数据一起打包。
在一个可选实施例中,在对第i个音频数据包打包时,将第i个音频帧通过第一编码方式编码得到的第i个第一编码数据与第i个冗余数据之前的冗余数据一起打包。
可选的,步骤203可以被替换成图3中的步骤203a、步骤203b和步骤203c中的任意一个或者至少两个的组合。
在步骤203a中,当i=1时,将第1个编码数据打包为第1个音频数据包。
对于第一个音频帧,由于该音频帧之前没有其他音频帧,因此发送端对第一个音频帧编码得到第1个编码数据后,直接将第1个编码数据打包为第1个音频数据包,第1个音频数据包中没有其他音频帧对应的冗余数据。
在步骤203b中,当1<i≤m时,将第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包。
若i≤m,当前音频帧之前所有的音频帧的数量小于m,则对当前音频帧打包时,将当前音频帧之前的所有音频帧对应的冗余数据与当前音频帧的编码数据一起打包为音频数据包,也就是说,发送端将第1个冗余数据至第i-1个冗余数据与当前音频帧的编码数据一起打包为音频数据包。
在步骤203c中,当m<i≤n时,将第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包,m为正整数。
在对当前帧进行打包时,获取当前帧对应的冗余数据,当前帧对应的冗余数据是指当前帧的前m帧音频帧编码后缓存的数据。
由于一帧音频帧对应的时长通常为20ms或40ms,所以一帧音频帧的数据大小通常较小,即使音频数据包中包括编码数据和冗余数据,音频数据包的大小通常不会超过网络的最大传输单元。
结合参考图4,其示例性地示出了音频数据包的格式,如图4所示,音频数据包10包括数据包头11、冗余数据12和编码数据13三部分。数据包头11定义了音频数据包11的参数,比如序号、时延、标识、时间戳等。冗余数据12包括冗余数据的定义参数和编码后的冗余数据块。冗余数据的定义参数包括编码方式、偏移值、冗余数据块长度等。偏移值是指冗余数据相对于编码数据的偏移,比如是编码数据对应的音频帧之前的第一帧音频帧,或者是编码数据 对应的音频帧之前的第二帧音频帧。当然,冗余数据的定义参数也可以直接包括时间戳,冗余数据的时间戳和与该冗余数据的内容相同的编码数据的时间戳相同。编码数据13包括编码数据的定义参数和编码后的编码数据块。编码数据的定义参数包括编码方式、时间戳等。由于编码数据的时间戳和音频数据帧的时间戳相同,所以编码数据的定义参数中也可以不包括时间戳,而直接将音频数据帧的时间戳作为编码数据的时间戳。
需要说明的是,若音频数据包10中包括的冗余数据12不止一个,则每一个冗余数据12分别包括该冗余数据的定义参数和该帧音频帧编码后的冗余数据块。
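As a reading aid, the packet layout described above (a header with sequence number and timestamp, one or more redundant-data entries each carrying a coding scheme, an offset, and a data block, then the encoded data) can be sketched with Python dataclasses. All field names here are illustrative assumptions, not the patent's actual wire format:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RedundantData:
    codec: str        # coding scheme used for this redundant block
    offset: int       # frames before the encoded data (1 = previous frame)
    payload: bytes    # cached encoded bytes of that earlier frame

@dataclass
class AudioPacket:
    seq: int          # packet sequence number (header field)
    timestamp: int    # timestamp of the encoded frame (header field)
    codec: str        # coding scheme of the encoded data
    encoded: bytes    # encoded data of the current frame
    redundant: List[RedundantData] = field(default_factory=list)

    def redundant_timestamp(self, r: RedundantData, frame_interval: int) -> int:
        # As noted above, a redundant block's timestamp can be derived
        # from its offset and the packet's own timestamp.
        return self.timestamp - r.offset * frame_interval
```

With a 20 ms frame interval, a packet stamped 100 carrying a redundant block at offset 1 implies that block's timestamp is 80.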
结合参考图5,其示出了编码的示意图。如图5所示,对于每一帧音频帧,都经过编码和缓存的过程。示例性的,在对一个音频帧通过一种编码方式编码的情况下,第i个音频帧编码得到第i个编码数据,第i个编码数据缓存得到第i个冗余数据,第i+m-1个音频帧编码得到第i+m-1个编码数据,第i+m-1个编码数据缓存得到第i+m-1个冗余数据,第i+m个音频帧编码得到第i+m个编码数据,第i+m个编码数据缓存得到第i+m个冗余数据。在对第i+m个音频帧编码得到第i+m个编码数据之后,获取之前m个音频帧对应的冗余数据,即第i个冗余数据至第i+m-1个冗余数据,将第i个冗余数据至第i+m-1个冗余数据与第i+m个编码数据打包成第i+m个音频数据包。
需要说明的是,步骤203a至步骤203c是并列的三种情况,在实现时发送端针对音频帧的帧数与m之间的大小关系来确定所要执行的步骤。
举例说明,假设m取值为3,对于第1个音频帧,发送端对第1个音频帧进行编码得到第1个编码数据,并将第1个编码数据缓存得到第1个冗余数据,然后将第1个编码数据打包为第1个音频数据包;对于第2个音频帧,发送端对第2个音频帧进行编码得到第2个编码数据,并将第2个编码数据缓存得到第2个冗余数据,然后将第2个编码数据与第1个冗余数据打包为第2个音频数据包;对于第3个音频帧,发送端对第3个音频帧进行编码得到第3个编码数据,并将第3个编码数据缓存得到第3个冗余数据,然后将第3个编码数据与第1个冗余数据和第2个冗余数据打包为第3个音频数据包;对于第4个音频帧,发送端对第4个音频帧进行编码得到第4个编码数据,并将第4个编码数据进行缓存得到第4个冗余数据,然后将第4个编码数据与第1个冗余数据、第2个冗余数据和第3个冗余数据打包为第4个音频数据包;对于第5个音频 帧,发送端对第5个音频帧进行编码得到第5个编码数据,并将第5个编码数据进行缓存得到第5个冗余数据,然后将第5个编码数据与第2个冗余数据、第3个冗余数据和第4个冗余数据打包为第5个音频数据包,后面的音频帧依次类推。
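Under this reading, the three packing cases of steps 203a–203c collapse into one rule: attach the last min(i-1, m) cached redundant blocks to the i-th encoded data. A minimal sketch, with a pass-through `encode` placeholder standing in for the real encoder:

```python
def pack_stream(frames, m, encode=lambda f: f):
    """Pack each frame's encoded data with up to m preceding redundant blocks."""
    cache = []    # cached coded frames, oldest first (the redundancy pool)
    packets = []
    for frame in frames:
        coded = encode(frame)
        # cases 203a-203c: at most the last m cached entries ride along
        packets.append((coded, list(cache[-m:])))
        cache.append(coded)   # cache this frame's coded data as redundancy
    return packets

packets = pack_stream(["f1", "f2", "f3", "f4", "f5"], m=3)
```

This reproduces the m=3 walk-through above: packet 1 carries only f1's encoded data, packet 3 carries f3 plus redundant copies of f1 and f2, and packet 5 carries f5 plus redundant copies of f2, f3, and f4.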
在步骤204中,将第i个音频数据包发送给接收端。
在步骤205中,接收接收端发送的丢包率。
丢包率是接收端在解码过程中统计的丢失或解码失败的音频数据包的数量占已传输的音频数据包总数的比值。
通常网络状况越差,丢包率越高。
发送端在编码过程中,可以接收接收端反馈的信息,并根据反馈的信息调整编码的相关参数。可选的,相关参数至少包括m的取值和编码参数(采样率和/或压缩率),其中,对m的取值的调整请参见步骤206,对编码参数的调整请参见步骤207。
在步骤206中,根据丢包率确定m的取值,m的取值与丢包率呈正相关。
可选的,m的取值与冗余级别对应,冗余级别是指一级冗余、二级冗余、三级冗余等,丢包率越高,m的取值越大,冗余级别越高。
举例说明,当丢包率小于20%时,m=1,使用一级冗余;当丢包率在20%至40%之间时,m=2,使用二级冗余;当丢包率在40%至60%之间时,m=3,使用三级冗余。
可选的,在实际应用中,当出现连续丢包时,接收端将连续丢包信息反馈给发送端,发送端根据连续丢包信息对冗余级别进行调整。连续丢包信息用于指示连续丢包的数量,连续丢包的数量越大,m的取值越大,且m的取值大于连续丢包的数量。比如:连续丢失3个音频数据包时,将m调整为4,使用四级冗余;连续丢失4个音频数据包时,将m调整为5,使用五级冗余。
由于当前帧的编码数据与前m个音频帧的冗余数据打包在一起传输,若出现连续丢包,则可能导致后面接收到的音频数据包中不包含之前传输的音频帧的冗余数据,则无法对丢失的音频帧进行恢复,因此需要将冗余数据对应的音频帧的数量增加,使得音频数据包的容错性更高,增加了数据可靠性。
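The mapping from loss rate to redundancy level, together with the rule that m must exceed any observed run of consecutive losses, can be sketched as follows. The thresholds follow the example above; the behaviour above a 60% loss rate is an assumption:

```python
def redundancy_level(loss_rate, burst_len=0):
    """Pick m from the loss rate; raise it above any observed burst length."""
    if loss_rate < 0.20:
        m = 1        # first-level redundancy
    elif loss_rate < 0.40:
        m = 2        # second-level redundancy
    elif loss_rate < 0.60:
        m = 3        # third-level redundancy
    else:
        m = 4        # assumed tier for very high loss
    # with consecutive losses, m must be larger than the burst length
    return max(m, burst_len + 1)
```

For example, a 10% loss rate gives m=1, but observing 3 consecutive lost packets forces m up to 4, matching the example above.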
在步骤207中,在对当前音频数据包打包后,根据丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,采样率与丢包率呈正相关,压缩率与丢包率呈负相关。
可选的,发送端在编码时对应初始的编码参数,编码参数包括采样率和压缩率中的至少一种,采样率与丢包率呈正相关,压缩率与丢包率呈负相关。
丢包率越高,采样率越高,压缩率越低,从而使得编码后的数据在恢复时失真度较低。
可选的,在实际应用中,可以只调整采样率和压缩率中的一个,也可以同时对采样率和压缩率进行调整。
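A tiny sketch of step 207's parameter adjustment, following the stated correlations (the sampling rate rises with the loss rate, the compression ratio falls). The tier boundaries and the concrete rate/ratio values are illustrative assumptions:

```python
def adjust_params(loss_rate):
    """Higher loss -> higher sampling rate and lower compression ratio,
    so frames recovered from redundancy distort less (step 207)."""
    # (sample_rate_hz, compression_ratio) tiers -- illustrative values
    tiers = [(8000, 0.5), (16000, 0.3), (32000, 0.2)]
    tier = 0 if loss_rate < 0.2 else (1 if loss_rate < 0.4 else 2)
    return tiers[tier]
```

In practice only one of the two parameters may be adjusted, as the text notes; the sketch returns both for brevity.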
需要说明的是,步骤206与步骤207可以同时调整,也可以只调整其中一种,本公开实施例对此不进行限定。
可选的,当发送端编码时的编码方式包括第一编码方式和第二编码方式时,相关参数还包括第二编码方式,如图3所示,对于第二编码方式的调整请参见步骤208。
在步骤208中,根据丢包率确定第二编码方式。
举例说明,常见的编码方式包括G711U编码方式、AAC编码方式、opus编码方式等。其中,AAC编码方式的压缩率较高,G711U编码方式的压缩率较低,opus编码方式与AAC编码方式相比压缩率更高且更接近原始音频数据。
由于在音频数据包没有丢失的情况下,编码数据用于解码播放,而冗余数据不参与解码播放,因此编码数据采用压缩率较低的G711U编码方式,从而使得接收端对音频解码的效率提高,且解码得到的音频帧的失真度较低,而冗余数据采用压缩率较高的AAC编码方式,使得冗余数据的数据量尽量缩小,从而减小音频数据包的大小,便于音频数据包的传输。
当丢包率较高时,表明冗余数据被用于解码的概率提高,则将第二编码方式调整为失真度较小的编码方式,比如:将AAC编码方式调整为opus编码方式。可选的,在实际应用中,可以设置一个丢包率阈值,当发送端接收到的丢包率达到丢包率阈值时,发送端对第二编码方式进行调整。
可选的,在实际应用中,若发送端使用一种编码方式,则启动一个编码器进行编码;若发送端使用多种编码方式,则启动多个编码器进行编码,每个编码器对应一种编码方式。
可选的,发送端与接收端之间可以通过会话描述协议(英文:Session Description Protocol,简称:SDP)协商支持的音频编解码能力,这里的音频编解码能力包括编码参数、编码方式和冗余级别中的至少一种,发送端根据接收端支持的音频编解码能力选择编码方式、编码参数或冗余级别。
需要说明的是,步骤206、步骤207和步骤208可以同时调整,也可以只调整其中一种,或者调整其中任意两种。另外,步骤205至步骤208可以在编码过程中的任意时刻执行,不限定在步骤204之后。
可选的,发送端在根据丢包率对m的取值、采样率、丢包率和第二编码方式的调整时,可以调整其中的任意一个、任意两个、任意三个或全部。
综上所述,本公开实施例提供的音频编码方法,通过对第i个音频帧进行编码得到第i个编码数据,并将第i个编码数据缓存得到第i个冗余数据,在打包音频数据包时,将第i个编码数据与第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包后发送给接收端,从而使得接收端在第i个音频数据包丢失或第i个编码数据解码失败时,可以获取第i个音频数据包的后一个音频数据包中与第i个编码数据对应的最后一个冗余数据,接收端通过对最后一个冗余数据解码得到第i个音频帧,由于当前帧的编码数据与前一帧或前几帧的冗余数据一起传输,从而使得在当前帧丢失的情况下,能够通过下一个音频数据包中的冗余数据尽快恢复丢失的音频帧,不需要等待一组压缩数据包之后的冗余数据包到达之后才能恢复,从而降低了音频数据包丢失时造成的时延。
针对步骤205至步骤208,由于发送端在编码过程中接收发送端反馈的丢包率,并根据丢包率对编码时的m的取值或采样率或压缩率或所使用的第二编码方式进行调整,使得发送端能够根据音频数据包在网络传输中的实际传输情况对编码参数或编码方式进行自适应地调整。
图6是本公开一个实施例提供的音频解码方法的方法流程图,该音频解码方法以应用在图1所示的接收端120中进行举例说明。如图6所示,该音频解码方法可以包括以下步骤:
在步骤301中,接收当前音频数据包。
发送端向接收端依次发送音频数据包,接收端接收发送端发送的音频数据包。当前音频数据包是指接收端当前接收到的音频数据包,当前音频数据包可以是发送端发送的任意一个音频数据包。
在本公开实施例中,若当前音频数据包为发送端发送的第1个音频数据包,则该第1个音频数据包中包括第1个编码数据,该第1个编码数据是对第1个音频帧进行编码得到的。若当前音频数据包为发送端发送的第i(i>1)个音频 数据包,则该第i个编码数据包中包括第i个编码数据和至少一个冗余数据。假设发送端在一个音频数据包中添加的冗余数据的最大个数为m,m为预设正整数,则当1<i≤m时,该第i个音频数据包中包括第i个编码数据和第i个冗余数据之前的i-1个冗余数据;当m<i≤n时,该第i个音频数据包中包括第i个编码数据和第i-m个冗余数据至第i-1个冗余数据;其中,第i个冗余数据是对第i个音频帧编码得到的。
在步骤302中,在当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从当前音频数据包中的冗余数据中选取目标冗余数据,目标冗余数据对应的音频帧和丢失的编码数据对应的音频帧相同。
发送端在向接收端发送音频数据包的过程中,可能会出现音频数据包丢失的情况,导致接收端无法接收到某个或某些音频数据包。因此,接收端在接收到当前音频数据包之后,需要判断当前音频数据包中的当前编码数据之前是否存在丢失的编码数据。示例性的,上述判断过程包括如下几种可能的实现方式:
在一种可能的实现方式中,在当前编码数据的时间戳与已接收到的编码数据的时间戳不连续时,确定当前编码数据之前存在丢失的编码数据;
在没有出现丢包的情况下,接收端接收到的当前编码数据的时间戳与已接收到的编码数据的时间戳应当是连续的,例如已接收到的各个编码数据的时间戳依次为1、2、3,则下一个编码数据的时间戳应当是4。若接收端接收到的当前编码数据的时间戳为4,则确定当前编码数据之前不存在丢失的编码数据;若接收端接收到的当前编码数据的时间戳为5,则确定当前编码数据之前存在丢失的编码数据,也即时间戳为4的编码数据丢失。
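The timestamp-continuity check in this first implementation can be sketched as: compute the timestamps expected between the newest received frame and the current one; any gap means encoded data was lost. The `interval` parameter corresponds to the frame-rate-derived timestamp spacing discussed below:

```python
def missing_timestamps(received, current_ts, interval=1):
    """Timestamps expected between the last received frame and the current
    one; a non-empty result means encoded data was lost before it."""
    if not received:
        return []
    last = max(received)
    return list(range(last + interval, current_ts, interval))
```

With timestamps 1, 2, 3 already received, a current timestamp of 4 yields no gap, while 5 flags timestamp 4 as lost, matching the example above.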
在另一种可能的实现方式中,在当前音频数据包所包括的冗余数据中,存在时间戳与已接收到的编码数据的时间戳不重复的冗余数据时,确定当前编码数据之前存在丢失的编码数据。
在没有出现丢包的情况下,接收端接收到的当前音频数据包所包括的冗余数据中,各个冗余数据的时间戳均与已接收到的编码数据的时间戳重复,例如已接收到的各个编码数据的时间戳依次为1、2、3,假设每个音频数据包中最多携带2个冗余数据,则下一个音频数据包中应当携带时间戳为4的编码数据、时间戳为2的冗余数据以及时间戳为3的冗余数据。若接收端接收到的当前音频数据包中包括时间戳为4的编码数据、时间戳为2的冗余数据以及时间戳为3的冗余数据,则确定当前编码数据之前不存在丢失的编码数据;若接收端接 收到的当前音频数据包中包括时间戳为5的编码数据、时间戳为3的冗余数据以及时间戳为4的冗余数据,由于时间戳4与已接收到的编码数据的时间戳不重复,则确定当前编码数据之前存在丢失的编码数据,也即时间戳为4的编码数据丢失。
在又一种可能的实现方式中,在当前编码数据的时间戳与已接收到的编码数据的时间戳不连续时,还需要判断已接收到的冗余数据中是否存在未接收到的编码数据对应的冗余数据。若存在未接收到的编码数据对应的冗余数据,则表示未接收到的编码数据可以通过已接收到的冗余数据中与未接收到的编码数据对应的冗余数据获得,确定未接收到的编码数据未丢失,无需对当前音频数据包中的冗余数据进行解码。若不存在未接收到的编码数据对应的冗余数据,则表示编码数据丢失。例如,如前文所述,时间戳为4的编码数据未接收到,而接收到的冗余数据为时间戳为2和3的冗余数据,不存在时间戳为4的冗余数据,则表示时间戳为4的编码数据丢失。
需要说明的是,若每个音频数据包中仅包括一个冗余数据,可以在当前编码数据的时间戳与已接收到的编码数据的时间戳不连续时,确定当前编码数据之前存在丢失的编码数据。
此外,在另一种可能实现方式中,可以分别判断当前编码数据的时间戳与已接收到的编码数据的时间戳、已接收到的冗余数据的时间戳是否连续,在当前编码数据的时间戳与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不连续时,确定当前编码数据之前存在丢失的编码数据。
另外,接收端可以根据帧率确定相邻两个编码数据(或者冗余数据)的时间戳的间隔,进而确定出两个编码数据(或者冗余数据)的时间戳是否连续。例如,若两个编码数据(或者冗余数据)的时间戳的差值等于上述间隔,则连续;若两个编码数据(或者冗余数据)的时间戳的差值不等于上述间隔,则不连续。
在判断出当前音频数据包中的当前编码数据之前存在丢失的编码数据时,接收端从当前音频数据包中的冗余数据中选取目标冗余数据,目标冗余数据对应的音频帧和丢失的编码数据对应的音频帧相同,也即,目标冗余数据对应的时间戳和丢失的编码数据对应的时间戳相同。
在一种可能的实施方式中,接收端根据当前音频数据包中的冗余数据的时间戳,选取与已接收到的编码数据的时间戳不重复的冗余数据作为目标冗余数 据。仍然以上述例子为例,已接收到的各个编码数据的时间戳依次为1、2、3,若接收端接收到的当前音频数据包中包括时间戳为5的编码数据、时间戳为3的冗余数据以及时间戳为4的冗余数据,则接收端将时间戳为4的冗余数据作为目标冗余数据。此种方式适用于单个音频数据包中包括一个冗余数据的情况。需要说明的是,在这种方式中,不同的音频数据包中的冗余数据的时间戳均不相同,因此,从当前音频数据包中选取的目标冗余数据的时间戳与当前音频数据包之前已接收到的冗余数据的时间戳均不重复。
在另一种可能的实施方式中,接收端根据当前音频数据包中的冗余数据的时间戳,选取与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不重复的冗余数据作为目标冗余数据。此种方式适用于单个音频数据中包括至少两个冗余数据的情况。目标冗余数据的数量可以是1个,也可以是多个,这由实际丢失的编码数据的数量决定。
在本实施例中,冗余数据的时间戳可以直接携带在冗余数据的定义参数中,也可以根据冗余数据对应的偏移值和与该冗余数据在同一个音频数据包的编码数据的时间戳计算得到。
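Selecting the target redundant data then amounts to filtering the current packet's redundant blocks against every timestamp already seen (received encoded data and received redundant data alike). A minimal sketch, representing each redundant block as a hypothetical (timestamp, payload) pair:

```python
def pick_targets(packet_redundant, seen_ts):
    """From the current packet's redundant blocks, keep those whose
    timestamp duplicates neither received encoded data nor received
    redundant data -- these are the blocks that recover lost frames."""
    return [(ts, data) for ts, data in packet_redundant if ts not in seen_ts]
```

Continuing the running example: with timestamps {1, 2, 3} already seen, a packet carrying redundant blocks stamped 3 and 4 yields the block stamped 4 as the target redundant data.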
在步骤303中,对目标冗余数据和当前编码数据进行解码。
可选地,接收端根据目标冗余数据的时间戳和当前编码数据的时间戳,对目标冗余数据和当前编码数据进行排序,其中,目标冗余数据的数量为w,w为正整数。而后,接收端按照时间戳由小到大的顺序对目标冗余数据和当前编码数据依次进行解码,得到w+1个音频帧。
例如,当前编码数据的时间戳为5,目标冗余数据的数量为2,其中一个目标冗余数据的时间戳为3,另一个目标冗余数据的时间戳为4,则接收端按照时间戳由小到大的顺序对上述数据进行排序,依次为时间戳为3的冗余数据、时间戳为4的冗余数据、时间戳为5的编码数据。之后,接收端按照时间戳由小到大的顺序对上述数据依次进行解码,得到3个音频帧。
通过上述方式,按照时间戳由小到大的顺序进行排序后解码,能够确保解码后得到的音频帧能够被准确播放。
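Step 303's ordering and decoding can be sketched as: merge the w target redundant blocks with the current encoded data, sort by timestamp, and decode smallest-first to obtain w+1 audio frames. The pass-through `decode` placeholder stands in for the real decoder:

```python
def decode_in_order(targets, current, decode=lambda d: d):
    """Sort w target redundant blocks plus the current encoded data by
    timestamp, then decode in ascending order, yielding w+1 frames.
    Each item is a (timestamp, data) pair."""
    items = sorted(targets + [current], key=lambda item: item[0])
    return [decode(data) for _, data in items]
```

With target blocks stamped 3 and 4 and current encoded data stamped 5, decoding proceeds 3, 4, 5 and produces three frames, as in the example above.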
可选的,本实施例提供的方法还包括以下步骤:
在步骤304中,每隔预定时长统计当前的丢包率。
接收端在解码过程中,每隔预定时长统计一次当前的丢包率,比如:每隔1秒统计一次当前丢失的数据包占已传输的数据包的数量的比值。
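A sketch of the receiver-side statistic in step 304: track sequence numbers over each reporting window and report lost/transmitted, resetting for the next window. The sequence-number bookkeeping is an assumption; the text only specifies the ratio itself:

```python
class LossMeter:
    """Track sequence numbers over a reporting window and compute the
    current loss rate (lost / transmitted), resetting once reported."""
    def __init__(self):
        self.first = None
        self.last = None
        self.received = 0

    def on_packet(self, seq):
        if self.first is None:
            self.first = seq
        self.last = seq
        self.received += 1

    def report(self):
        if self.first is None:
            return 0.0
        expected = self.last - self.first + 1   # gaps count as losses
        rate = (expected - self.received) / expected
        self.first = self.last = None           # start the next window
        self.received = 0
        return rate
```

Receiving sequence numbers 1, 2, 4, 5 in one window means 5 packets were transmitted and 4 arrived, so the reported loss rate is 0.2.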
在步骤305中,将当前的丢包率发送给发送端。
接收端统计丢包率后,将丢包率实时发送给发送端,使得发送端能够根据丢包率对m的取值、编码时的采样率、编码时的压缩率和第二编码方式中的至少一种进行调整。
可选的,丢包率可以通过标准RTP控制协议(英文:RTP Control Protocol,简称:RTCP)由接收端发送给发送端。
可选的,若出现连续丢包的情况,接收端可以设置预定阈值,当连续丢包的数量达到预定阈值时,接收端向发送端发送连续丢包信息,发送端还可以根据连续丢包信息对m的取值、编码时的采样率、编码时的压缩率和第二编码方式中的至少一种进行调整。
需要说明的是,步骤304至步骤305可以在解码开始之后每隔预定时长执行一次,不限定在步骤303之后执行。
综上所述,本公开实施例提供的音频解码方法,通过在判断出当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从当前音频数据包中的冗余数据中选取目标冗余数据,而后对目标冗余数据和当前编码数据进行解码,由于当前帧的编码数据与前一帧或前几帧的冗余数据一起传输,从而使得在当前帧丢失的情况下,能够通过下一个音频数据包中的冗余数据尽快恢复丢失的音频帧,不需要等待一组压缩数据包之后的冗余数据包到达之后才能恢复,从而降低了音频数据包丢失时造成的时延。
针对步骤304至步骤305,通过接收端每隔预定时长统计当前的丢包率,并将丢包率情况反馈给发送端,使得发送端能够根据丢包率对m的取值、编码时的采样率、编码时的压缩率和第二编码方式中的至少一种进行调整,从而使得发送端能够根据音频数据包在网络传输中的实际传输情况对编码参数或编码方式进行自适应地调整。
结合参考图7,其示例性地示出了音频数据传输的示意图,如图7所示,发送端110中的声音信号经过采集器21、编码器22和冗余编码打包器23后,音频数据包通过网络传输给接收端120,接收端120中的音频数据包经过冗余编码解析器24、解码器25和播放器26,最终将声音信号在接收端播放出来。
图8是本公开一个实施例提供的音频编码装置的结构方框图,该音频编码装置以应用在图1所示的发送端110中进行举例说明。如图8所示,该音频编码装置可以包括:编码模块410和打包模块420。
编码模块410,用于获取n个连续的音频帧中的第i个音频帧,基于第i个音频帧得到第i个编码数据和第i个冗余数据,第i个编码数据是对第i个音频帧编码得到的,第i个冗余数据是对第i个音频帧编码后缓存得到的,i为正整数,n为正整数,1≤i≤n。
打包模块420,用于将编码模块410得到的第i个编码数据和第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
可选的,打包模块420,包括:
第一打包单元,用于当i=1时,将第1个编码数据打包为第1个音频数据包;
第二打包单元,用于当1<i≤m时,将第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包;
第三打包单元,用于当m<i≤n时,将第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包。
可选的,所述装置还包括:
第一接收模块,用于接收接收端发送的丢包率;
第一确定模块,用于根据第一接收模块接收到的丢包率确定m的取值,m的取值与丢包率呈正相关。
可选的,所述装置还包括:
第二接收模块,用于接收接收端发送的丢包率;
调整模块,用于在对当前音频数据包打包后,根据第二接收模块接收到的丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,采样率与丢包率呈正相关,压缩率与丢包率呈负相关。
可选的,编码模块410,包括:
第一编码单元,用于通过第一编码方式对第i个音频帧进行编码得到第i个第一编码数据;
第二编码单元,用于通过第二编码方式对第i个音频帧进行编码得到第i个第二编码数据;
缓存单元,用于将第二编码单元得到的第i个第二编码数据进行缓存,作为第i个冗余数据。
可选的,编码模块410,用于对第i个音频帧进行编码得到第i个编码数据,将第i个编码数据进行缓存得到第i个冗余数据。
可选的,所述装置还包括:
第三接收模块,用于接收接收端发送的丢包率;
第二确定模块,用于根据第三接收模块接收到的丢包率确定第二编码方式。
可选的,所述装置还包括:
采集模块,用于对声音信号进行信号采集得到音频源数据,音频源数据包括n个连续的音频帧。
综上所述,本公开实施例提供的音频编码装置,通过对第i个音频帧进行编码得到第i个编码数据,并将第i个编码数据缓存得到第i个冗余数据,在打包音频数据包时,将第i个编码数据与第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包后发送给接收端,从而使得接收端在第i个音频数据包丢失或第i个编码数据解码失败时,可以获取第i个音频数据包的后一个音频数据包中与第i个编码数据对应的最后一个冗余数据,接收端通过对最后一个冗余数据解码得到第i个音频帧,由于当前帧的编码数据与前一帧或前几帧的冗余数据一起传输,从而使得在当前帧丢失的情况下,能够通过下一个音频数据包中的冗余数据尽快恢复丢失的音频帧,不需要等待一组压缩数据包之后的冗余数据包到达之后才能恢复,从而降低了音频数据包丢失时造成的时延。
由于发送端在编码过程中接收发送端反馈的丢包率,并根据丢包率对编码时的m的取值或采样率或压缩率或所使用的第二编码方式进行调整,使得发送端能够根据音频数据包在网络传输中的实际传输情况对编码参数或编码方式进行自适应地调整。
图9是本公开一个实施例提供的音频解码装置的结构方框图,该音频解码装置以应用在图1所示的接收端120中进行举例说明。如图9所示,该音频解码装置可以包括:接收模块510、判断模块520、选取模块530和解码模块540。
接收模块510,用于接收当前音频数据包。
判断模块520,用于判断接收模块510接收到的当前音频数据包中的当前编码数据之前是否存在丢失的编码数据。
选取模块530,用于在判断模块520判断出当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从当前音频数据包中的冗余数据中选取目标冗余数据,目标冗余数据对应的音频帧和丢失的编码数据对应的音频帧相同。
解码模块540,用于对目标冗余数据和当前编码数据进行解码。
可选的,判断模块520,用于在当前编码数据的时间戳与已接收到的编码数据的时间戳不连续时,确定当前编码数据之前存在丢失的编码数据。
可选的,判断模块520还用于在丢失的编码数据的时间戳与已接收到的冗余数据的时间戳不同时或者在当前编码数据的时间戳与已接收到的冗余数据的时间戳不连续时,确定当前编码数据之前存在丢失的音频帧。
可选的,选取模块530,用于根据当前音频数据包中的冗余数据的时间戳,选取与已接收到的编码数据的时间戳不重复的冗余数据作为目标冗余数据。
可选的,解码模块540,包括:
排序单元,用于根据目标冗余数据的时间戳和当前编码数据的时间戳,对目标冗余数据和当前编码数据进行排序,其中,目标冗余数据的数量为w,w为正整数;
解码单元,用于按照时间戳由小到大的顺序对目标冗余数据和当前编码数据依次进行解码,得到w+1个音频帧。
可选的,发送端在对当前音频帧进行编码时的编码方式包括第一编码方式和第二编码方式,其中,第一编码方式用于对当前音频帧进行编码得到所述当前编码数据;第二编码方式用于对当前音频帧进行编码得到当前冗余数据;
所述装置还包括:
统计模块,用于每隔预定时长统计当前的丢包率;
发送模块,用于将当前的丢包率发送给发送端,
丢包率用于确定下述参数中的至少一种:
在当前音频数据包中添加的冗余数据的最大个数m的取值、编码时的采样率、编码时的压缩率、第二编码方式;m的取值与丢包率呈正相关;采样率与丢包率呈正相关,压缩率与丢包率呈负相关。
综上所述,本公开实施例提供的音频解码装置,通过在判断出当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从当前音频数据包中的冗余数据中选取目标冗余数据,而后对目标冗余数据和当前编码数据进行解码,由于当前帧的编码数据与前一帧或前几帧的冗余数据一起传输,从而使得在当 前帧丢失的情况下,能够通过下一个音频数据包中的冗余数据尽快恢复丢失的音频帧,不需要等待一组压缩数据包之后的冗余数据包到达之后才能恢复,从而降低了音频数据包丢失时造成的时延。
通过接收端每隔预定时长统计当前的丢包率,并将丢包率情况反馈给发送端,使得发送端能够根据丢包率对m的取值、编码时的采样率、编码时的压缩率和第二编码方式中的至少一种进行调整,从而使得发送端能够根据音频数据包在网络传输中的实际传输情况对编码参数或编码方式进行自适应地调整。
图10是本公开一个实施例提供的音频编解码系统的结构方框图,该音频编解码系统600包括音频编码装置610和音频解码装置620。
其中,音频编码装置610如上图8所示实施例提供的音频编码装置,音频解码装置620如上图9所示实施例提供的音频解码装置。
上述音频编码装置610可以是编码器或者具有编码器的发送端设备,音频解码装置可以是解码器或者具有解码器的接收端设备。
需要说明的是,上述实施例提供的装置,在实现其功能时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的装置与方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
在示例性实施例中,还提供了一种计算机可读存储介质,该存储介质中存储有至少一条指令,该至少一条指令由处理器加载并执行以实现如图2或图3所描述的音频编码方法。
在示例性实施例中,还提供了一种计算机可读存储介质,该存储介质中存储有至少一条指令,该至少一条指令由处理器加载并执行以实现如图6所描述的音频解码方法。
在示例性实施例中,还提供了一种计算机设备,该计算机设备包括处理器和存储器,存储器中存储有至少一条指令,该至少一条指令由处理器加载并执行以实现如图2或图3所描述的音频编码方法。
在示例性实施例中,还提供了一种计算机设备,该计算机设备包括处理器和存储器,存储器中存储有至少一条指令,该至少一条指令由处理器加载并执行以实现如图6所描述的音频解码方法。
请参见图11所示,其示出了本公开部分实施例中提供的终端的结构方框图。该终端700可以是图1所示的发送端110,也可以是图1所示的接收端120,该终端700用于实施上述实施例提供的音频编码方法或音频解码方法。本公开中的终端700可以包括一个或多个如下组成部分:用于执行计算机程序指令以完成各种流程和方法的处理器,用于信息和存储程序指令随机接入存储器(英文:random access memory,简称:RAM)和只读存储器(英文:read-only memory,简称:ROM),用于存储数据和信息的存储器,I/O设备,界面,天线等。具体来讲:
终端700可以包括射频(英文:Radio Frequency,简称:RF)电路710、存储器720、输入单元730、显示单元740、传感器750、音频电路760、无线保真(英文:wireless fidelity,简称:WiFi)模块770、处理器780、电源782、摄像头790等部件。本领域技术人员可以理解,图11中示出的终端结构并不构成对终端的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图11对终端700的各个构成部件进行具体的介绍:
RF电路710可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器780处理;另外,将设计上行的数据发送给基站。通常,RF电路包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(英文:Low Noise Amplifier,简称:LNA)、双工器等。此外,RF电路710还可以通过无线通信与网络和其他设备通信。所述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(英文:Global System of Mobile communication,简称GSM)、通用分组无线服务(英文:General Packet Radio Service,简称:GPRS)、码分多址(英文:Code Division Multiple Access,简称:CDMA)、宽带码分多址(英文:Wideband Code Division Multiple Access,简称:WCDMA)、长期演进(英文:Long Term Evolution,简称:LTE)、电子邮件、短消息服务(英文:Short Messaging Service,简称:SMS)等。
存储器720可用于存储软件程序以及模块,处理器780通过运行存储在存储器720的软件程序以及模块,从而执行终端700的各种功能应用以及数据处理。存储器720可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据终端700的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器720可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元730可用于接收输入的数字或字符信息,以及产生与终端700的用户设置以及功能控制有关的键信号输入。具体地,输入单元730可包括触控面板731以及其他输入设备732。触控面板731,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板731上或在触控面板731附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板731可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器780,并能接收处理器780发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板731。除了触控面板731,输入单元730还可以包括其他输入设备732。具体地,其他输入设备732可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元740可用于显示由用户输入的信息或提供给用户的信息以及终端700的各种菜单。显示单元740可包括显示面板741,可选的,可以采用液晶显示器(英文:Liquid Crystal Display,简称:LCD)、有机发光二极管(英文:Organic Light-Emitting Diode,简称:OLED)等形式来配置显示面板741。进一步的,触控面板731可覆盖显示面板741,当触控面板731检测到在其上或附近的触摸操作后,传送给处理器780以确定触摸事件的类型,随后处理器780根据触摸事件的类型在显示面板741上提供相应的视觉输出。虽然在图11中,触控面板731与显示面板741是作为两个独立的部件来实现终端700的输入和输入功能,但是在某些实施例中,可以将触控面板731与显示面板741集成而实现终端700的输入和输出功能。
终端700还可包括至少一种传感器750,比如陀螺仪传感器、磁感应传感器、光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板741的亮度,接近传感器可在终端700移动到耳边时,关闭显示面板741和/或背光。作为运动传感器的一种,加速度传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别终端姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于终端700还可配置的气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路760、扬声器761,传声器762可提供用户与终端700之间的音频接口。音频电路760可将接收到的音频数据转换后的电信号,传输到扬声器761,由扬声器761转换为声音信号输出;另一方面,传声器762将收集的声音信号转换为电信号,由音频电路760接收后转换为音频数据,再将音频数据输出处理器780处理后,经RF电路710以发送给比如另一终端,或者将音频数据输出至存储器720以便进一步处理。
WiFi属于短距离无线传输技术,终端700通过WiFi模块770可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图11示出了WiFi模块770,但是可以理解的是,其并不属于终端700的必须构成,完全可以根据需要在不改变公开的本质的范围内而省略。
处理器780是终端700的控制中心,利用各种接口和线路连接整个终端的各个部分,通过运行或执行存储在存储器720内的软件程序和/或模块,以及调用存储在存储器720内的数据,执行终端700的各种功能和处理数据,从而对终端进行整体监控。可选的,处理器780可包括一个或多个处理单元;优选的,处理器780可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器780中。
终端700还包括给各个部件供电的电源782(比如电池),优选的,电源可以通过电源管理系统与处理器780逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
摄像头790一般由镜头、图像传感器、接口、数字信号处理器、中央处理单元(英文:Central Processing Unit,简称:CPU)、显示屏幕等组成。其中, 镜头固定在图像传感器的上方,可以通过手动调节镜头来改变聚焦;图像传感器相当于传统相机的“胶卷”,是摄像头采集图像的心脏;接口用于把摄像头利用排线、板对板连接器、弹簧式连接方式与终端主板连接,将采集的图像发送给所述存储器720;数字信号处理器通过数学运算对采集的图像进行处理,将采集的模拟图像转换为数字图像并通过接口发送给存储器720。
尽管未示出,终端700还可以包括蓝牙模块等,在此不再赘述。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为对本公开实施例的举例说明,并不用以限制本公开的范围。凡在本公开的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本公开所附权利要求的保护范围之内。

Claims (27)

  1. 一种音频编码方法,包括:
    获取n个连续的音频帧中的第i个音频帧,基于所述第i个音频帧得到第i个编码数据和第i个冗余数据,所述第i个编码数据是对所述第i个音频帧编码得到的,所述第i个冗余数据是对所述第i个音频帧编码后缓存得到的,n为正整数,1≤i≤n;
    将所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
  2. 根据权利要求1所述的方法,其中,所述将所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,包括以下步骤中的至少一个:
    当i=1时,将第1个编码数据打包为第1个音频数据包;
    当1<i≤m时,将所述第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包;
    当m<i≤n时,将所述第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包。
  3. 根据权利要求1所述的方法,还包括:
    接收所述接收端发送的丢包率和连续丢包信息中的至少一个,所述连续丢包信息用于指示连续丢包的数量;
    根据所述丢包率和所述连续丢包信息中的至少一个,确定所述m的取值,所述m的取值与所述丢包率呈正相关。
  4. 根据权利要求1所述的方法,还包括:
    接收所述接收端发送的丢包率;
    在对当前音频数据包打包后,根据所述丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,所述采样率与所述丢包率呈正相关,所述压缩率与所述丢包率呈负相关。
  5. 根据权利要求1所述的方法,其中,所述对所述第i个音频帧进行编码得到第i个编码数据,将所述第i个编码数据进行缓存得到第i个冗余数据,包括:
    通过第一编码方式对所述第i个音频帧进行编码得到第i个第一编码数据;
    通过第二编码方式对所述第i个音频帧进行编码得到第i个第二编码数据,并将所述第i个第二编码数据进行缓存,作为第i个冗余数据。
  6. 根据权利要求5所述的方法,还包括:
    接收所述接收端发送的丢包率;
    根据所述丢包率确定所述第二编码方式。
  7. 根据权利要求1至6任一所述的方法,还包括:
    对声音信号进行信号采集得到音频源数据,所述音频源数据包括所述n个连续的音频帧。
  8. 一种音频解码方法,包括:
    接收当前音频数据包;
    在所述当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从所述当前音频数据包中的冗余数据中选取目标冗余数据,所述目标冗余数据对应的音频帧和所述丢失的编码数据对应的音频帧相同;
    对所述目标冗余数据和所述当前编码数据进行解码。
  9. 根据权利要求8所述的方法,还包括:
    在所述当前编码数据的时间戳与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不连续时,确定所述当前编码数据之前存在所述丢失的编码数据。
  10. 根据权利要求8或9所述的方法,其中,所述从所述当前音频数据包中的冗余数据中选取目标冗余数据,包括:
    根据所述当前音频数据包中的冗余数据的时间戳,选取与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不重复的冗余数据作为所述目 标冗余数据。
  11. 根据权利要求8所述的方法,其中,所述对所述目标冗余数据和所述当前编码数据进行解码,包括:
    根据所述目标冗余数据的时间戳和所述当前编码数据的时间戳,对所述目标冗余数据和所述当前编码数据进行排序,其中,所述目标冗余数据的数量为w,w为正整数;
    按照时间戳由小到大的顺序对所述目标冗余数据和所述当前编码数据依次进行解码,得到w+1个音频帧。
  12. 根据权利要求8所述的方法,还包括:
    每隔预定时长统计当前的丢包率;
    将所述当前的丢包率发送给所述发送端。
  13. 一种音频编码装置,包括:
    编码模块,用于获取n个连续的音频帧中的第i个音频帧,基于所述第i个音频帧得到第i个编码数据和第i个冗余数据,所述第i个编码数据是对所述第i个音频帧编码得到的,所述第i个冗余数据是对所述第i个音频帧编码后缓存得到的,i为正整数,n为正整数,1≤i≤n;
    打包模块,用于将所述编码模块得到的所述第i个编码数据和所述第i个冗余数据之前的至多m个冗余数据打包为第i个音频数据包,其中,m为预设正整数。
  14. 根据权利要求13所述的装置,其中,所述打包模块,包括:
    第一打包单元,用于当i=1时,将第1个编码数据打包为第1个音频数据包;
    第二打包单元,用于当1<i≤m时,将所述第i个编码数据和已缓存的第i个冗余数据之前的i-1个冗余数据打包为第i个音频数据包;
    第三打包单元,用于当m<i≤n时,将所述第i个编码数据和已缓存的第i-m个冗余数据至第i-1个冗余数据打包为第i个音频数据包。
  15. 根据权利要求13所述的装置,还包括:
    第一接收模块,用于接收所述接收端发送的丢包率和连续丢包信息中的至少一个,所述连续丢包信息用于指示连续丢包的数量;
    第一确定模块,用于根据所述第一接收模块接收到的所述丢包率和所述连续丢包信息中的至少一个,确定所述m的取值,所述m的取值与所述丢包率呈正相关。
  16. 根据权利要求13所述的装置,还包括:
    第二接收模块,用于接收所述接收端发送的丢包率;
    调整模块,用于在对当前音频数据包打包后,根据所述第二接收模块接收到的所述丢包率,调整对后续音频帧进行编码时的采样率和/或压缩率,其中,所述采样率与所述丢包率呈正相关,所述压缩率与所述丢包率呈负相关。
  17. 根据权利要求13所述的装置,其中,所述编码模块,包括:
    第一编码单元,用于通过第一编码方式对所述第i个音频帧进行编码得到第i个第一编码数据;
    第二编码单元,用于通过第二编码方式对所述第i个音频帧进行编码得到第i个第二编码数据;
    缓存单元,用于将所述第二编码单元得到的所述第i个第二编码数据进行缓存,作为第i个冗余数据。
  18. 根据权利要求17所述的装置,还包括:
    第三接收模块,用于接收所述接收端发送的丢包率;
    第二确定模块,用于根据所述第三接收模块接收到的所述丢包率确定所述第二编码方式。
  19. 根据权利要求13至18任一所述的装置,还包括:
    采集模块,用于对声音信号进行信号采集得到音频源数据,所述音频源数据包括n个连续的音频帧。
  20. 一种音频解码装置,包括:
    接收模块,用于接收当前音频数据包;
    选取模块,用于在所述当前音频数据包中的当前编码数据之前存在丢失的编码数据时,从所述当前音频数据包中的冗余数据中选取目标冗余数据,所述目标冗余数据对应的音频帧和所述丢失的编码数据对应的音频帧相同;
    解码模块,用于对所述目标冗余数据和所述当前编码数据进行解码。
  21. 根据权利要求20所述的装置,还包括:
    判断模块,用于在所述当前编码数据的时间戳与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不连续时,确定所述当前编码数据之前存在所述丢失的编码数据。
  22. 根据权利要求20或21所述的装置,其中,
    所述选取模块,用于根据所述当前音频数据包中的冗余数据的时间戳,选取与已接收到的编码数据的时间戳和已接收到的冗余数据的时间戳均不重复的冗余数据作为所述目标冗余数据。
  23. 根据权利要求20所述的装置,其中,所述解码模块,包括:
    排序单元,用于根据所述目标冗余数据的时间戳和所述当前编码数据的时间戳,对所述目标冗余数据和所述当前编码数据进行排序,其中,所述目标冗余数据的数量为w,w为正整数;
    解码单元,用于按照时间戳由小到大的顺序对所述目标冗余数据和所述当前编码数据依次进行解码,得到w+1个音频帧。
  24. 根据权利要求20所述的装置,还包括:
    统计模块,用于每隔预定时长统计当前的丢包率;
    发送模块,用于将所述当前的丢包率发送给所述发送端。
  25. 一种音频编解码系统,包括:音频编码装置和音频解码装置;
    其中,所述音频编码装置是如权利要求13至19中任一所述的装置;
    所述音频解码装置是如权利要求20至24中任一所述的装置。
  26. 一种计算机设备,包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如权利要求1至7任一所述的音频编码方法。
  27. 一种计算机设备,包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如权利要求8至12任一所述的音频解码方法。
PCT/CN2018/106298 2017-09-18 2018-09-18 音频编码方法、解码方法、装置及音频编解码系统 WO2019052582A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/648,626 US11355130B2 (en) 2017-09-18 2018-09-18 Audio coding and decoding methods and devices, and audio coding and decoding system
EP18856710.1A EP3686885A4 (en) 2017-09-18 2018-09-18 METHOD AND DEVICE FOR AUDIO CODING AND DECODING, AND AUDIO CODING AND DECODING SYSTEM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710840992.2A CN109524015B (zh) 2017-09-18 2017-09-18 音频编码方法、解码方法、装置及音频编解码系统
CN201710840992.2 2017-09-18

Publications (1)

Publication Number Publication Date
WO2019052582A1 true WO2019052582A1 (zh) 2019-03-21

Family

ID=65723232

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106298 WO2019052582A1 (zh) 2017-09-18 2018-09-18 音频编码方法、解码方法、装置及音频编解码系统

Country Status (4)

Country Link
US (1) US11355130B2 (zh)
EP (1) EP3686885A4 (zh)
CN (1) CN109524015B (zh)
WO (1) WO2019052582A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112291762A (zh) * 2020-06-11 2021-01-29 珠海市杰理科技股份有限公司 蓝牙通信中的数据收发方法、装置、设备及系统

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110691218B (zh) * 2019-09-09 2021-05-14 苏州臻迪智能科技有限公司 音频数据传输方法、装置、电子设备及可读存储介质
CN111193966A (zh) * 2019-12-25 2020-05-22 北京佳讯飞鸿电气股份有限公司 音频数据传输方法、装置、计算机设备及存储介质
CN111314335B (zh) * 2020-02-10 2021-10-08 腾讯科技(深圳)有限公司 数据传输方法、装置、终端、存储介质和系统
CN111277864B (zh) 2020-02-18 2021-09-10 北京达佳互联信息技术有限公司 直播数据的编码方法、装置、流转系统及电子设备
WO2021163954A1 (zh) * 2020-02-20 2021-08-26 深圳市汇顶科技股份有限公司 数据传输方法、装置、设备、系统及介质
CN111128203B (zh) * 2020-02-27 2022-10-04 北京达佳互联信息技术有限公司 音频数据编码、解码方法、装置、电子设备及存储介质
CN111385637B (zh) * 2020-03-18 2022-05-20 Oppo广东移动通信有限公司 媒体数据编码方法、装置及电子设备
EP4014233A1 (en) * 2020-04-01 2022-06-22 Google LLC Audio packet loss concealment via packet replication at decoder input
CN113936669A (zh) * 2020-06-28 2022-01-14 腾讯科技(深圳)有限公司 数据传输方法、系统、装置、计算机可读存储介质及设备
CN111818231B (zh) * 2020-07-06 2021-02-09 全时云商务服务股份有限公司 丢包补偿方法、装置、数据报文传输系统和存储介质
CN112601077B (zh) * 2020-12-11 2022-07-26 杭州当虹科技股份有限公司 一种基于音频的编码器延时的自动测量方法
CN113096670B (zh) * 2021-03-30 2024-05-14 北京字节跳动网络技术有限公司 音频数据的处理方法、装置、设备及存储介质
CN113192519B (zh) * 2021-04-29 2023-05-23 北京达佳互联信息技术有限公司 音频编码方法和装置以及音频解码方法和装置
CN113744744B (zh) * 2021-08-04 2023-10-03 杭州网易智企科技有限公司 一种音频编码方法、装置、电子设备及存储介质
CN114301884B (zh) * 2021-08-27 2023-12-05 腾讯科技(深圳)有限公司 音频数据的发送方法、接收方法、装置、终端及存储介质
CN114866856B (zh) * 2022-05-06 2024-01-02 北京达佳互联信息技术有限公司 音频信号的处理方法、音频生成模型的训练方法及装置
CN114640853B (zh) * 2022-05-18 2022-07-29 滨州市人防工程与指挥保障中心 一种无人机巡航图像处理系统
CN115662448B (zh) * 2022-10-17 2023-10-20 深圳市超时代软件有限公司 音频数据编码格式转换的方法及装置
CN116320536B (zh) * 2023-05-16 2023-08-18 瀚博半导体(上海)有限公司 视频处理方法、装置、计算机设备及计算机可读存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1056297A2 (en) * 1999-05-24 2000-11-29 Agilent Technologies Inc. Multimedia decoder with error detection
CN101039260A (zh) * 2006-03-17 2007-09-19 富士通株式会社 数据传送方法以及应用该方法的通信系统和程序
CN101777960A (zh) * 2008-11-17 2010-07-14 华为终端有限公司 音频编码方法、音频解码方法、相关装置及通信系统
CN102025717A (zh) * 2010-09-10 2011-04-20 香港城市大学深圳研究院 一种传输多媒体数据的方法
CN103957222A (zh) * 2014-05-20 2014-07-30 艾诺通信系统(苏州)有限责任公司 一种基于fec算法的视频传输自适应方法
CN106130696A (zh) * 2016-09-22 2016-11-16 杭州龙境科技有限公司 一种前向纠错的方法、装置及电子设备

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6728924B1 (en) 1999-10-21 2004-04-27 Lucent Technologies Inc. Packet loss control method for real-time multimedia communications
US6757654B1 (en) * 2000-05-11 2004-06-29 Telefonaktiebolaget Lm Ericsson Forward error correction in speech coding
US6934676B2 (en) * 2001-05-11 2005-08-23 Nokia Mobile Phones Ltd. Method and system for inter-channel signal redundancy removal in perceptual audio coding
US7668712B2 (en) * 2004-03-31 2010-02-23 Microsoft Corporation Audio encoding and decoding with intra frames and adaptive forward error correction
DE602005020130D1 (de) * 2004-05-10 2010-05-06 Nippon Telegraph & Telephone E, sendeverfahren, empfangsverfahren und einrichtung und programm dafür
EP1817859A1 (en) * 2004-12-02 2007-08-15 THOMSON Licensing Adaptive forward error correction
CN101047604A (zh) 2006-05-17 2007-10-03 华为技术有限公司 一种数据冗余发送方法及系统
EP2381580A1 (en) 2007-04-13 2011-10-26 Global IP Solutions (GIPS) AB Adaptive, scalable packet loss recovery
CN101227482B (zh) * 2008-02-02 2011-11-30 中兴通讯股份有限公司 一种网络电话通话中媒体协商方法、装置及系统
CN103050123B (zh) * 2011-10-17 2015-09-09 多玩娱乐信息技术(北京)有限公司 一种传输语音信息的方法和系统
US9236053B2 (en) * 2012-07-05 2016-01-12 Panasonic Intellectual Property Management Co., Ltd. Encoding and decoding system, decoding apparatus, encoding apparatus, encoding and decoding method
CN104756424A (zh) * 2012-09-28 2015-07-01 数码士控股有限公司 使用跨层优化自适应地传送fec奇偶校验数据的方法
CN102915736B (zh) * 2012-10-16 2015-09-02 广东威创视讯科技股份有限公司 混音处理方法和混音处理系统
CN103280222B (zh) * 2013-06-03 2014-08-06 腾讯科技(深圳)有限公司 音频编码、解码方法及其系统
GB201316575D0 (en) * 2013-09-18 2013-10-30 Hellosoft Inc Voice data transmission with adaptive redundancy
CN103532936A (zh) * 2013-09-28 2014-01-22 福州瑞芯微电子有限公司 一种蓝牙音频自适应传输方法
CN104917671B (zh) * 2015-06-10 2017-11-21 腾讯科技(深圳)有限公司 基于移动终端的音频处理方法和装置
US10049682B2 (en) * 2015-10-29 2018-08-14 Qualcomm Incorporated Packet bearing signaling information indicative of whether to decode a primary coding or a redundant coding of the packet
CN106571893B (zh) * 2016-11-10 2022-05-24 深圳市潮流网络技术有限公司 一种语音数据的编解码方法


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3686885A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112291762A (zh) * 2020-06-11 2021-01-29 珠海市杰理科技股份有限公司 蓝牙通信中的数据收发方法、装置、设备及系统
CN112291762B (zh) * 2020-06-11 2024-01-30 珠海市杰理科技股份有限公司 蓝牙通信中的数据收发方法、装置、设备及系统

Also Published As

Publication number Publication date
US11355130B2 (en) 2022-06-07
EP3686885A1 (en) 2020-07-29
CN109524015A (zh) 2019-03-26
CN109524015B (zh) 2022-04-15
EP3686885A4 (en) 2020-11-25
US20200219519A1 (en) 2020-07-09

Similar Documents

Publication Publication Date Title
WO2019052582A1 (zh) 音频编码方法、解码方法、装置及音频编解码系统
CN108011686B (zh) 信息编码帧丢失恢复方法和装置
CN106664161B (zh) 基于冗余的包传输错误恢复的系统和方法
WO2021098405A1 (zh) 数据传输方法、装置、终端及存储介质
US9445150B2 (en) Asynchronously streaming video of a live event from a handheld device
WO2015058656A1 (zh) 直播控制方法,及主播设备
WO2021197153A1 (zh) 传输处理方法及设备
WO2014079382A1 (zh) 语音传输方法、终端、语音服务器及语音传输系统
EP1855483A2 (en) Apparatus and method for transmitting and receiving moving pictures using near field communication
WO2019105340A1 (zh) 视频传输方法、装置、系统及计算机可读存储介质
CN112423076B (zh) 一种音频投屏同步控制方法、设备及计算机可读存储介质
CN107682360B (zh) 一种语音通话的处理方法及移动终端
US20120154678A1 (en) Receiving device, screen frame transmission system and method
CN111800826B (zh) 一种rohc反馈处理方法及用户设备
CN109587497B (zh) Flv流的音频数据传输方法、装置和系统
WO2020107168A1 (zh) 视频解码方法、装置、电子设备、计算机可读存储介质
CN113472479B (zh) 一种传输处理方法及设备
CN113225162B (zh) 信道状态信息csi上报方法、终端及计算机可读存储介质
CN101483748A (zh) 一种针对3g电路交换网络上实时视频通话应用的音视频同步方法及装置
CN117498892B (zh) 基于uwb的音频传输方法、装置、终端及存储介质
WO2014177185A1 (en) Operating a terminal device in a communication system
JP4717327B2 (ja) 映像入力装置
CN110933513A (zh) 一种音视频数据传输方法及装置
CN109003313B (zh) 一种传输网页图片的方法、装置和系统
KR100916469B1 (ko) 미디어 장치 및 그 제어방법

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18856710

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018856710

Country of ref document: EP

Effective date: 20200420