WO2022100414A1 - Audio encoding and decoding method and apparatus - Google Patents

Audio encoding and decoding method and apparatus

Info

Publication number
WO2022100414A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
decoding
bit rate
data
rate
Prior art date
Application number
PCT/CN2021/125760
Other languages
English (en)
French (fr)
Inventor
王卓
王萌
杜春晖
范泛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022100414A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Definitions

  • the present application relates to audio coding and decoding technologies, and in particular, to an audio coding and decoding method and apparatus.
  • the Bluetooth audio codec is mainly used between Bluetooth-connected devices (headphones, speakers, smart wearable devices, etc.) to provide high-quality music transmission and playback in different scenarios.
  • there are two kinds of audio coding and decoding technologies: one is the high-bit-rate coding and decoding technology, which can be applied to scenarios with high demands on the transmission quality of the Bluetooth channel, and the other is the low-bit-rate coding and decoding technology, which can be applied to scenarios with high demands on sound quality.
  • the present application provides an audio encoding and decoding method and device that seamlessly integrate low-bit-rate and high-bit-rate encoding and decoding processing, so as to preserve audio quality as far as possible while complying with the data transmission size limit of the Bluetooth channel. This improves the anti-interference ability of the Bluetooth channel and brings users a better audio experience.
  • the present application provides an audio encoding and decoding method, including: an audio sending device obtains a set bit rate of a current audio frame to be encoded and a final encoding mode of a previous audio frame, where the final encoding mode is one of a first bit rate encoding mode, a second bit rate encoding mode, a switching mode from first bit rate encoding to second bit rate encoding, or a switching mode from second bit rate encoding to first bit rate encoding, and the first bit rate is lower than the second bit rate; determines the final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame; and encodes the current audio frame according to the final encoding mode of the current audio frame.
  • the audio sending device sends the stream information to the audio receiving device.
  • the audio receiving device obtains the code stream information and parses it to obtain the decoding mode and the encoded code stream. The encoded code stream includes a first bit rate encoded code stream and/or a second bit rate encoded code stream, and the decoding mode is one of a first bit rate decoding mode, a second bit rate decoding mode, a switching mode from first bit rate decoding to second bit rate decoding, or a switching mode from second bit rate decoding to first bit rate decoding. When the decoding mode is the first bit rate decoding mode, the encoded code stream includes the first bit rate encoded code stream; when the decoding mode is the second bit rate decoding mode, the encoded code stream includes the second bit rate encoded code stream; when the decoding mode is either switching mode, the encoded code stream includes both the first bit rate encoded code stream and the second bit rate encoded code stream.
  • the audio frame can be any frame of audio sent by the audio sending device to the audio receiving device.
  • the object of each encoding in this application may be one audio frame of the audio; that is, the audio encoding method provided in this application operates on one audio frame, and the method for determining the encoding mode below applies to every audio frame of the audio. Therefore, to distinguish them, the audio frame being encoded by the audio sending device is called the audio frame or the current audio frame, and the audio frame encoded by the audio sending device immediately before it is called the previous audio frame.
  • the audio frame may be represented in the form of pulse code modulation (PCM) data.
  • PCM pulse code modulation
  • the set bit rate may be a target encoding bit rate preset by the user according to the current channel state.
  • the set bit rate may be, for example, 192kbps, 256kbps, 400kbps or 600kbps.
  • the final encoding mode refers to the encoding mode actually used by the audio sending device to encode the audio frame. It can be the first bit rate encoding mode, the second bit rate encoding mode, the switching mode from first bit rate encoding to second bit rate encoding, or the switching mode from second bit rate encoding to first bit rate encoding, where the first bit rate is lower than the second bit rate.
  • for example, the first bit rate may be 64kbps, 128kbps, 192kbps, 256kbps, 400kbps, or 600kbps, and the second bit rate may be 128kbps, 192kbps, 256kbps, 400kbps, or 600kbps, as long as the first bit rate is lower than the second bit rate.
  • the first code rate may also be referred to as a low code rate
  • the second code rate may be referred to as a high code rate.
  • This application involves two encoding processes, a first bit rate encoding process and a second bit rate encoding process, that is, a low bit rate encoding process and a high bit rate encoding process
  • the low bit rate encoding process may include, for example, advanced audio coding (AAC) or the low complexity communication codec (LC3), the default low-power, low-latency codec of next-generation Bluetooth.
  • the high bit rate encoding process may include, for example, the low-latency hi-definition audio codec (LHDC) or low complexity communication codec plus (LC3plus), the high bit rate version of LC3.
  • the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process
  • the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process.
  • the final encoding mode of the current audio frame is the low bit rate encoding mode
  • the final encoding mode of the previous audio frame is the high bit rate encoding mode
  • determine that the final encoding mode of the current audio frame is the encoding mode switched from high bit rate encoding to low bit rate encoding
  • the final encoding mode of the current audio frame is determined to be the switching mode from high bit rate encoding to low bit rate encoding; or,
  • the final encoding mode of the previous audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding
  • determine that the final encoding mode of the current audio frame is the low bit rate encoding mode
  • the final encoding mode of the previous audio frame is the low bit rate encoding mode
  • the final encoding method of the previous audio frame is the high-bit-rate encoding method
  • determine that the final encoding method of the current audio frame is the high-bit-rate encoding method
  • the final encoding mode of the previous audio frame is the encoding mode switching from low-bit rate encoding to high-bit-rate encoding
  • determine that the final encoding mode of the current audio frame is the high-bit-rate encoding mode
  • the final encoding mode of the current audio frame is determined to be the switching mode from low bit rate encoding to high bit rate encoding.
  • the value of the above-mentioned set threshold is associated with the number of channels of the audio frame. For example, when the number of channels of the audio frame is mono, the set threshold may be 150kbps, and when the number of channels of the audio frame is two channels, the set threshold may be 300kbps.
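The mode decision for the case where both encoders share the same frame length and total delay can be sketched as a small state function. This is a minimal illustration, not the application's implementation: the names `Mode`, `set_threshold_kbps` and `next_mode` are hypothetical, the strict "less than" comparison against the threshold is an assumption, and so is the handling of a bit rate reversal that arrives while a switch is in progress (the fragments above do not cover it).

```python
from enum import Enum


class Mode(Enum):
    LOW = "low"
    HIGH = "high"
    LOW_TO_HIGH = "low->high"
    HIGH_TO_LOW = "high->low"


def set_threshold_kbps(channels: int) -> int:
    # Example values from the text: 150 kbps for mono, 300 kbps for two channels.
    return {1: 150, 2: 300}[channels]


def next_mode(set_rate_kbps: int, channels: int, prev: Mode) -> Mode:
    """Pick the final encoding mode of the current frame (one switching frame)."""
    # Assumption: a set bit rate below the threshold selects low-bit-rate encoding.
    if set_rate_kbps < set_threshold_kbps(channels):
        if prev in (Mode.LOW, Mode.HIGH_TO_LOW):
            return Mode.LOW
        # prev is HIGH (from the text) or LOW_TO_HIGH (assumption).
        return Mode.HIGH_TO_LOW
    if prev in (Mode.HIGH, Mode.LOW_TO_HIGH):
        return Mode.HIGH
    # prev is LOW (from the text) or HIGH_TO_LOW (assumption).
    return Mode.LOW_TO_HIGH
```

With equal frame lengths and delays a single switching frame sits between the two steady modes, which is why a previous switching mode always resolves to the corresponding steady mode here.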
  • when the final encoding mode of the previous audio frame is the low bit rate encoding mode, determine that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • the initial value of the first counter is the first set value, and the first counter is terminated when the value is 0.
  • the first counter counts the processing of switching frames. It is started when the first switching frame is processed, with its initial value set to the calculated number of switching frames (the first set value), and it is decremented by 1 each time a switching frame is processed.
  • when the value of the first counter is 0, the switching frames have been completely encoded, and the first counter is terminated.
  • when the value of the first counter is the first set value, the first switching frame is currently being processed.
  • when the value of the first counter is 1, the last switching frame is currently being processed.
  • when the value of the first counter is less than the first set value and greater than 1, an intermediate switching frame is currently being processed; or,
  • the final encoding method of the previous audio frame is the high-bit-rate encoding method
  • determine that the final encoding method of the current audio frame is the high-bit-rate encoding method
  • the final encoding mode of the previous audio frame is the low bit rate encoding mode
  • start the second counter; the initial value of the second counter is the first set value
  • the second counter is terminated when the value is 0.
  • the second counter counts the processing of switching frames. It is started when the first switching frame is processed, with its initial value set to the calculated number of switching frames (the first set value), and it is decremented by 1 each time a switching frame is processed.
  • when the value of the second counter is 0, the switching frames have been completely encoded, and the second counter is terminated.
  • when the value of the second counter is the first set value, the first switching frame is currently being processed; when the value of the second counter is 1, the last switching frame is currently being processed; when the value of the second counter is less than the first set value and greater than 1, an intermediate switching frame is currently being processed; or,
  • the value of the first counter is decremented by 1; if the first counter is still running, it is determined that the final encoding mode of the current audio frame is the switching mode from high bit rate encoding to low bit rate encoding; or, if the first counter has terminated (that is, the value of the first counter is 0), it is determined that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • the value of the second counter is decremented by 1; if the second counter is still running, it is determined that the final encoding mode of the current audio frame is the switching mode from low bit rate encoding to high bit rate encoding; or, if the second counter has terminated (that is, the value of the second counter is 0), it is determined that the final encoding mode of the current audio frame is the high bit rate encoding mode;
  • the value of the above-mentioned set threshold is associated with the number of channels of the audio frame. For example, when the audio frame is mono, the set threshold may be 165kbps, and when the audio frame has two channels, the set threshold may be 330kbps.
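The counter-driven decision for encoders with different frame lengths or delays can be sketched as follows. This is an illustration under stated assumptions: `SwitchTracker` is a hypothetical name, a single `counter` field stands in for both the first and the second counter (only one can run at a time in this sketch), the comparison against the threshold is assumed to be a strict "less than", and restarting the counter when the set bit rate reverses in the middle of a switch is an assumption not covered by the text.

```python
class SwitchTracker:
    """Tracks the final encoding mode across frames using a switching-frame counter."""

    def __init__(self, switch_frames: int, threshold_kbps: int):
        self.switch_frames = switch_frames  # the "first set value" (D)
        self.threshold_kbps = threshold_kbps
        self.counter = 0                    # 0 means the counter is terminated
        self.prev = "low"                   # assumed starting mode

    def step(self, set_rate_kbps: int) -> str:
        if set_rate_kbps < self.threshold_kbps:
            steady, switching = "low", "high->low"
        else:
            steady, switching = "high", "low->high"

        if self.prev == steady:
            mode = steady
        elif self.prev == switching:
            # One more switching frame has been processed.
            self.counter -= 1
            mode = switching if self.counter > 0 else steady
        else:
            # Coming from the other steady mode: start the counter at D.
            self.counter = self.switch_frames
            mode = switching

        self.prev = mode
        return mode
```

For example, with `switch_frames=3` the tracker emits exactly three switching frames (counter values 3, 2, 1) before settling into the new steady mode.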
  • the number D of the above-mentioned switching frames can be obtained as follows, where A represents the low bit rate encoding mode, B represents the high bit rate encoding mode, A→B represents the switching mode from low bit rate encoding to high bit rate encoding, and B→A represents the switching mode from high bit rate encoding to low bit rate encoding.
  • D = rounding((max(total encoding and decoding delay at the low bit rate, total encoding and decoding delay at the high bit rate) + overlap length from the low bit rate to the high bit rate + processing frame length - 1) / processing frame length)
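The switching-frame count can be worked through numerically. The sketch below assumes that all quantities are expressed in samples and that "rounding" means truncation to an integer, so that adding "processing frame length - 1" before dividing yields a ceiling; neither point is stated explicitly in the text. The function name and the example figures are hypothetical.

```python
def switch_frame_count(delay_low: int, delay_high: int,
                       overlap: int, frame_len: int) -> int:
    """Number of switching frames D, with all arguments in samples (assumption)."""
    # Integer division truncates; the "+ frame_len - 1" term turns it into a ceiling.
    return (max(delay_low, delay_high) + overlap + frame_len - 1) // frame_len
```

With illustrative values `delay_low=600`, `delay_high=720`, `overlap=128` and `frame_len=480`, the delay plus overlap (848 samples) spans more than one frame, so two switching frames are needed.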
  • the audio sending device encodes the current audio frame, so the following situations are possible:
  • the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process, and the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process.
  • the final encoding method of the current audio frame is the low bit rate encoding method
  • the audio sending device performs low bit rate encoding processing on the current audio frame.
  • the audio sending device may first determine whether the low bit rate encoding process supports the sampling rate of the current audio frame. When it does, the low bit rate encoding process can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled to a sampling rate that the low bit rate encoding process supports, and the low bit rate encoding process is then performed on the down-sampled or up-sampled current audio frame.
  • the low bit rate coding process does not support the sampling rates of 88.2kHz and 96kHz.
  • the audio sending device can use a quadrature mirror filter (QMF) for down-sampling, and the frequency band (0 to 44.1kHz) corresponding to 88.2kHz is down-sampled.
  • QMF quadrature mirror filter
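The sampling-rate check described above can be sketched as a small planning step that runs before encoding. The function name, the set of supported rates and the restriction to a factor of two (as a two-band QMF split would give) are assumptions made for illustration; the filter itself is not implemented here.

```python
def plan_sample_rate(frame_rate_hz: int, supported_rates_hz: set) -> tuple:
    """Return (rate to encode at, resampling action) for one audio frame."""
    if frame_rate_hz in supported_rates_hz:
        return frame_rate_hz, "none"
    # Assumption: resampling is by a factor of two only.
    if frame_rate_hz % 2 == 0 and frame_rate_hz // 2 in supported_rates_hz:
        return frame_rate_hz // 2, "down"
    if frame_rate_hz * 2 in supported_rates_hz:
        return frame_rate_hz * 2, "up"
    raise ValueError(f"no supported rate reachable from {frame_rate_hz} Hz")
```

For a hypothetical low bit rate encoder that supports 44.1kHz and 48kHz, an 88.2kHz frame would be down-sampled to 44.1kHz and a 96kHz frame to 48kHz.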
  • the final encoding method of the current audio frame is the high bit rate encoding method
  • the audio sending device performs high bit rate encoding processing on the current audio frame.
  • the audio sending device can first determine whether the high bit rate encoding process supports the sampling rate of the current audio frame. When it does, the high bit rate encoding process can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled to a sampling rate that the high bit rate encoding process supports, and the high bit rate encoding process is then performed on the down-sampled or up-sampled current audio frame.
  • the final encoding mode of the current audio frame is to switch the encoding mode from low bit rate encoding to high bit rate encoding or switch encoding mode from high bit rate encoding to low bit rate encoding
  • the audio sending device can perform low bit rate encoding processing and high bit rate encoding processing on the current audio frame. Similarly, the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame lengths of the low-bit-rate encoding processing and the high-bit-rate encoding processing are different, or the frame lengths of the low-bit-rate encoding processing and the high-bit-rate encoding processing are the same and the total encoding and decoding delays are different.
  • the final encoding method of the current audio frame is the low bit rate encoding method
  • the audio sending device can first determine whether the low bit rate encoding process supports the sampling rate of the current audio frame. When it does, the low bit rate encoding process can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled to a sampling rate that the low bit rate encoding process supports, and the low bit rate encoding process is then performed on the down-sampled or up-sampled current audio frame.
  • the final encoding method of the current audio frame is the high bit rate encoding method
  • the audio sending device can first determine whether the high bit rate encoding process supports the sampling rate of the current audio frame. When it does, the high bit rate encoding process can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled to a sampling rate that the high bit rate encoding process supports, and the high bit rate encoding process is then performed on the down-sampled or up-sampled current audio frame.
  • the final encoding mode of the current audio frame is to switch the encoding mode from low bit rate encoding to high bit rate encoding
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the function of the second counter is to count the processing of the switching frame.
  • the second counter is in the start state, indicating that the current processing is still the switching frame.
  • the value of the second counter is greater than 1 (indicating that the current frame is still a switching frame and not the last of the switching frames)
  • perform low-bit-rate encoding processing on the current audio frame; perform high-bit-rate encoding processing on the current audio frame.
  • the value of the second counter is equal to 1 (indicating that the current processing is the last frame in the switching frame)
  • the high-bit-rate encoding process is performed on the current audio frame.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the function of the second counter is to count the processing of the switching frame.
  • the second counter is in the activated state, indicating that the current processing is still the switching frame.
  • the value of the second counter is equal to the first set value (indicating that the current frame is the first of the switching frames), perform low-bit-rate encoding processing on the current audio frame; perform high-bit-rate encoding processing on the current audio frame.
  • the value of the second counter is less than the first set value (indicating that the current processing is still the switching frame and not the first frame in the switching frame), perform high-bit-rate encoding processing on the current audio frame.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the final encoding mode of the current audio frame is to switch the encoding mode from high bit rate encoding to low bit rate encoding
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the function of the first counter is to count the processing status of the switching frame.
  • the first counter is in the activated state, indicating that the current processing is still the switching frame.
  • the value of the first counter is equal to the first set value (indicating that the current frame is the first of the switching frames)
  • the current audio frame is subjected to low bit rate encoding processing; the current audio frame is subjected to high bit rate encoding processing.
  • the value of the first counter is smaller than the first set value (indicating that the current processing is still the switching frame and not the first frame in the switching frame)
  • low bit rate encoding processing is performed on the current audio frame.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the function of the first counter is to count the processing of the switching frame.
  • the first counter is in the active state, indicating that the current processing is still the switching frame.
  • the value of the first counter is greater than 1 (indicating that the current frame is still a switching frame and not the last of the switching frames)
  • perform low-bit-rate encoding processing on the current audio frame; perform high-bit-rate encoding processing on the current audio frame.
  • the value of the first counter is equal to 1 (indicating that the current processing is the last frame in the switching frame)
  • low bit rate encoding processing is performed on the current audio frame.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
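The four switching cases above (two switching directions, each with the low bit rate frame length either greater or smaller than the high bit rate frame length) can be summarized in one dispatch function. This is a sketch with hypothetical names; it covers only unequal frame lengths, because the fragments above do not say which encoders run when the frame lengths match but the delays differ.

```python
def encoders_to_run(mode: str, frame_len_low: int, frame_len_high: int,
                    counter: int, first_set_value: int) -> tuple:
    """Return (run_low_encoder, run_high_encoder) for the current frame."""
    if mode == "low":
        return True, False
    if mode == "high":
        return False, True
    if frame_len_low == frame_len_high:
        raise ValueError("equal frame lengths are not covered by this sketch")

    low_is_longer = frame_len_low > frame_len_high
    if mode == "low->high":
        # The high-rate encoder always runs; the low-rate one runs on some frames.
        run_low = counter > 1 if low_is_longer else counter == first_set_value
        return run_low, True
    if mode == "high->low":
        # The low-rate encoder always runs; the high-rate one runs on some frames.
        run_high = counter == first_set_value if low_is_longer else counter > 1
        return True, run_high
    raise ValueError(f"unknown mode: {mode}")
```

Here `counter` is the value of whichever counter is running (the second counter for low to high, the first counter for high to low), and `first_set_value` is the number of switching frames.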
  • the code stream information corresponding to the encoded current audio frame includes packet header information and a low bit rate encoded code stream and/or a high bit rate encoded code stream, where the packet header information includes the final encoding mode, the sampling rate, the number of channels, and the frame length of the current audio frame, and the length of the low bit rate encoded code stream.
  • if the audio sending device only performs low bit rate encoding processing on the current audio frame, the code stream information only contains the low bit rate encoded code stream; if the audio sending device only performs high bit rate encoding processing on the current audio frame, the code stream information only contains the high bit rate encoded code stream; if the audio sending device performs both low bit rate and high bit rate encoding processing on the current audio frame, the code stream information includes both the low bit rate encoded code stream and the high bit rate encoded code stream.
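One way to picture the code stream information is a fixed header followed by the low bit rate and high bit rate payloads. The field widths, byte order, numeric mode codes and function names below are all hypothetical; the text only lists which fields the header carries. Carrying the low bit rate stream length presumably lets the receiver split the two payloads, which is how this sketch uses it.

```python
import struct

# Hypothetical layout: mode (1 byte), sampling rate (4), channels (1),
# frame length (2), low-bit-rate stream length (2), big-endian, no padding.
HEADER = struct.Struct(">BIBHH")


def pack_frame(mode: int, sample_rate: int, channels: int, frame_len: int,
               low_stream: bytes = b"", high_stream: bytes = b"") -> bytes:
    header = HEADER.pack(mode, sample_rate, channels, frame_len, len(low_stream))
    return header + low_stream + high_stream


def unpack_frame(packet: bytes) -> tuple:
    mode, sample_rate, channels, frame_len, low_len = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    # The first low_len bytes are the low-rate stream; the rest is the high-rate stream.
    return mode, sample_rate, channels, frame_len, body[:low_len], body[low_len:]
```

A frame encoded in only one mode simply leaves the other payload empty.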
  • the audio sending device can send the stream information to the audio receiving device through communication methods such as Bluetooth connection.
  • the decoding end needs to adopt the corresponding decoding method to decode the encoded code stream. Therefore, there are the following decoding processing methods:
  • the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process, and the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process.
  • the decoding method is the low bit rate decoding method
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream.
  • the audio receiving device can first determine whether the low-bit-rate decoding process supports the sampling rate corresponding to the low-bit-rate encoded code stream. When it does, the low-bit-rate decoding process can be performed on the low-bit-rate encoded code stream directly; when it does not, the low-bit-rate encoded code stream can first be decoded at the low bit rate, and the decoded data is then up-sampled or down-sampled to obtain the target audio frame.
  • the up-sampling or down-sampling performed by the encoding end and the decoding end correspond to each other: if the encoding end uses down-sampling, the decoding end can use up-sampling; if the encoding end uses up-sampling, the decoding end can use down-sampling.
  • the audio sending device down-samples the audio frame and takes only the low subband for encoding, so the data of the high subband is not available; the target audio frame is therefore obtained by up-sampling with zero filling.
  • the decoding method is a high bit rate decoding method
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded stream.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate corresponding to the high-bit-rate encoded code stream. When it does, the high-bit-rate decoding process can be performed on the high-bit-rate encoded code stream directly; when it does not, the high-bit-rate encoded code stream can first be decoded at the high bit rate, and the decoded data is then up-sampled or down-sampled to obtain the target audio frame.
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate coded code stream to obtain second data, and performs high-bit-rate decoding processing on the high-bit-rate coded code stream to obtain first data.
  • the audio receiving device can smooth the tail of the second data and the head of the first data to ensure a smooth switch from the low bit rate to the high bit rate.
  • the smoothed data is N samples long: the last N samples of the second data and the first N samples of the first data are combined by weighted averaging to obtain N smoothed samples, and the target audio frame is obtained from the second data excluding its last N samples together with the N smoothed samples.
  • the audio receiving device may first determine whether the two decoding processes of the high and low bit rates support the sampling rate as described above, which will not be repeated here.
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded code stream to obtain first data, and performs low-bit-rate decoding processing on the low-bit-rate coded code stream to obtain second data.
  • the audio receiving device can smooth the tail of the first data and the head of the second data to ensure a smooth switch from the high bit rate to the low bit rate.
  • the smoothed data is N samples long: the last N samples of the first data and the first N samples of the second data are combined by weighted averaging to obtain N smoothed samples, and the target audio frame is obtained from the first data excluding its last N samples together with the N smoothed samples.
  • the audio receiving device may first determine whether the two decoding processes of the high and low bit rates support the sampling rate as described above, which will not be repeated here.
  • the decoding method is the low bit rate decoding method
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream.
  • the audio receiving device can first determine whether the low-bit-rate decoding process supports the sampling rate corresponding to the low-bit-rate encoded code stream. When the low-bit-rate decoding process supports the sampling rate, it can directly perform low-bit-rate decoding processing on the low-bit-rate encoded code stream; or, when the low-bit-rate decoding process does not support the sampling rate, it can first perform low-bit-rate decoding processing on the low-bit-rate encoded code stream and then perform up-sampling or down-sampling processing on the decoded data to obtain the target audio frame.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream to obtain second data. Since the frame length of the low-bit-rate decoding process is smaller than that of the high-bit-rate decoding process, in order to align the audio frames obtained by the two decoding processes, after obtaining the second data the audio receiving device can shift M samples out of the head of the second data queue corresponding to the low-bit-rate decoding process, put the second data into the second data queue in first-in, first-out (FIFO) order, and then extract M samples from the head of the second data queue to obtain the target audio frame.
  • the second data queue follows a first-in, first-out principle.
  • the audio receiving device can first determine whether the low bit rate decoding process supports the sampling rate according to the above description, which will not be repeated here.
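The queue-based alignment described above can be sketched as follows. Several details are assumptions made for illustration: the class name `AlignmentQueue`, the initial queue length `queue_len`, the all-zero initial contents, and the reading of "extract" as a non-destructive read of the first M samples (they are shifted out on the next call).

```python
from collections import deque
from itertools import islice
from typing import List, Sequence


class AlignmentQueue:
    """FIFO delay line used to align frames of two decoders (illustrative)."""

    def __init__(self, m: int, queue_len: int) -> None:
        if m <= 0 or queue_len < m:
            raise ValueError("need 0 < m <= queue_len")
        self.m = m
        self.queue = deque([0.0] * queue_len)  # assumed: starts as all zeros

    def push_and_extract(self, decoded: Sequence[float]) -> List[float]:
        # 1. Shift M samples out of the head of the queue.
        for _ in range(min(self.m, len(self.queue))):
            self.queue.popleft()
        # 2. Append the newly decoded data at the tail (FIFO order).
        self.queue.extend(decoded)
        # 3. Read M samples from the head as the target audio frame.
        return list(islice(self.queue, self.m))
```

With `m=2` and `queue_len=4`, each decoded two-sample frame comes out one call later, which is the delay that lines the output up with the other decoder.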
  • the decoding method is a high bit rate decoding method
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate encoded code stream to obtain first data. Since the frame length of the low-bit-rate encoding process is larger than the frame length of the high-bit-rate encoding process, in order to align the audio frames obtained by the low-bit-rate decoding process and the high-bit-rate decoding process, after obtaining the first data the audio receiving device may shift M samples out of the head of the first data queue corresponding to the high-bit-rate decoding process, put the first data at the tail of the first data queue, and then extract M samples from the head of the first data queue to obtain the target audio frame.
  • the first data queue follows the first-in, first-out principle. M is associated with the frame length of the low-bit-rate decoding process.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate according to the above description, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded stream.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate corresponding to the high-bit-rate encoded code stream. When the high-bit-rate decoding process supports the sampling rate, it can directly perform high-bit-rate decoding processing on the high-bit-rate encoded code stream; or, when the high-bit-rate decoding process does not support the sampling rate, it can first perform high-bit-rate decoding processing on the high-bit-rate encoded code stream and then perform up-sampling or down-sampling processing on the decoded data to obtain the target audio frame.
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the first data queue corresponding to the high-bit-rate decoding process is set to all 0, and the first data queue follows the first-in, first-out order.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the low-bit-rate encoded code stream is subjected to low-bit-rate decoding processing to obtain the second data; M samples are shifted out of the head of the second data queue corresponding to the low-bit-rate decoding process, and the second data is put at the tail of the second data queue.
  • the second data queue follows the first-in, first-out principle, and M is associated with the frame length of the high-bit-rate decoding process; M samples are extracted from the head of the second data queue to obtain fourth data; high-bit-rate decoding processing is performed on the high-bit-rate encoded code stream to obtain the first data; a weighted average of the last N samples of the fourth data and the first N samples of the first data is computed to obtain N smoothed samples; the target audio frame is obtained from the fourth data other than its last N samples together with the N smoothed samples.
  • M samples are shifted out of the head of the second data queue; M samples are extracted from the head of the second data queue to obtain the fourth data; high-bit-rate decoding processing is performed on the high-bit-rate encoded code stream to obtain the first data; a weighted average of the last N samples of the fourth data and the first N samples of the first data is computed to obtain N smoothed samples; the target audio frame is obtained from the fourth data other than its last N samples together with the N smoothed samples.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • high-bit-rate decoding processing is performed on the high-bit-rate encoded code stream to obtain the first data; M samples are shifted out of the head of the first data queue corresponding to the high-bit-rate decoding process, and the first data is put at the tail of the first data queue.
  • the first data queue follows the first-in, first-out principle, and M is associated with the frame length of the low-bit-rate decoding process; M samples are extracted from the head of the first data queue to obtain third data; low-bit-rate decoding processing is performed on the low-bit-rate encoded code stream to obtain second data; a weighted average of the last N samples of the third data and the first N samples of the second data is computed to obtain N smoothed samples; the target audio frame is obtained from the third data other than its last N samples together with the N smoothed samples.
  • M samples are shifted out of the head of the first data queue; M samples are extracted from the head of the first data queue to obtain third data; low-bit-rate decoding processing is performed on the low-bit-rate encoded code stream to obtain second data; a weighted average of the last N samples of the third data and the first N samples of the second data is computed to obtain N smoothed samples; the target audio frame is obtained from the third data other than its last N samples together with the N smoothed samples.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the second data queue corresponding to the low-bit-rate decoding process is set to all 0, and the second data queue follows the first-in, first-out order.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the audio sending device may determine, according to the set bit rate of the previous audio frame and the set bit rate of the current audio frame, the first bit rate corresponding to the low-bit-rate encoding process and the second bit rate corresponding to the high-bit-rate encoding process; the sum of the first bit rate and the second bit rate is the set bit rate of the current audio frame.
  • the final encoding mode of the current audio frame is to switch the encoding mode from high bit rate encoding to low bit rate encoding, which may include:
  • the bit rate allocation of the current audio frame is: the first bit rate is brf, and the second bit rate is brp-brf.
  • the bit rate allocation of the current audio frame is: the first bit rate is brp-300kbps for two channels and brp-150kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate is 64kbps for two channels and 32kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the final encoding mode of the current audio frame is to switch the encoding mode from low bit rate encoding to high bit rate encoding, which may include:
  • the bit rate allocation of the current audio frame is: the first bit rate is brp and the second bit rate is brf-brp.
  • the bit rate allocation of the current audio frame is: the first bit rate is brf-300kbps for two channels and brf-150kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate is 64kbps for two channels and 32kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate and the second bit rate are consistent with the previous audio frame allocation.
  • the bit rate allocation of the current audio frame is: the first bit rate is 299kbps for two channels and 149kbps for mono; the second bit rate is brf-299kbps for two channels and brf-149kbps for mono.
  • the present application determines the final encoding mode of the current audio frame based on the set bit rate and the final encoding mode of the previous audio frame, and accordingly sets the bit rates corresponding to the low-bit-rate encoding process and the high-bit-rate encoding process on the audio frame where the switch between high and low bit rates occurs.
  • the encoding end sends the code stream information to the decoding end;
  • the decoding end parses the code stream information to obtain the decoding mode and then decodes the code stream data; in particular, on the audio frame where the switch between high and low bit rates occurs, the data obtained by high-bit-rate and low-bit-rate decoding is smoothed. This achieves seamless switching between low-bit-rate and high-bit-rate encoding and decoding, so that, while complying with the limit on the data transmission size of the Bluetooth channel, audio quality is preserved as far as possible, the anti-interference capability of the Bluetooth channel is improved, and users get a better audio experience.
  • the present application provides an audio encoding device, comprising: an obtaining module configured to obtain the set bit rate of the current audio frame to be encoded and the final encoding mode of the previous audio frame, where the final encoding mode includes a first code rate encoding mode, a second code rate encoding mode, a switching encoding mode from the first code rate encoding to the second code rate encoding, or a switching encoding mode from the second code rate encoding to the first code rate encoding, wherein the first code rate is lower than the second code rate; a determining module for determining the final encoding mode of the current audio frame according to the set code rate and the final encoding mode of the previous audio frame; and an encoding module for encoding the current audio frame according to the final encoding mode of the current audio frame.
  • the determining module is specifically configured to: when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the switching encoding mode from the second bit rate encoding to the first bit rate encoding; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the switching encoding mode from the second bit rate encoding to the first bit rate encoding, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the switching encoding mode from the first bit rate encoding to the second bit rate encoding; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the second bit rate encoding mode; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the switching encoding mode from the first bit rate encoding to the second bit
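The threshold-based decision can be sketched as a small state machine. This is an illustration, not the claimed implementation: the names `Mode`, `settled_state` and `next_mode` are hypothetical. The text only covers some combinations, so two assumptions are made and marked in the code: a switching frame is treated as having already arrived at its target rate, and a set bit rate exactly equal to the threshold is treated like a rate above it.

```python
from enum import Enum


class Mode(Enum):
    FIRST = 1             # first (lower) bit rate encoding mode
    SECOND = 2            # second (higher) bit rate encoding mode
    FIRST_TO_SECOND = 3   # switching from first to second bit rate encoding
    SECOND_TO_FIRST = 4   # switching from second to first bit rate encoding


def settled_state(prev: Mode) -> Mode:
    """Assumption: after a switching frame the encoder sits at the switch target."""
    if prev in (Mode.FIRST, Mode.SECOND_TO_FIRST):
        return Mode.FIRST
    return Mode.SECOND


def next_mode(set_rate: float, threshold: float, prev: Mode) -> Mode:
    """Pick the final encoding mode of the current frame."""
    state = settled_state(prev)
    if set_rate < threshold:
        # Stay at the first rate, or start switching down from the second rate.
        return Mode.FIRST if state is Mode.FIRST else Mode.SECOND_TO_FIRST
    # Assumption: set_rate == threshold is handled like set_rate > threshold.
    return Mode.SECOND if state is Mode.SECOND else Mode.FIRST_TO_SECOND
```

In this variant a switch lasts exactly one frame; the counter-based variant described next keeps the switching mode active for several frames.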
  • the determining module is specifically configured to: when the set code rate is less than the set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set code rate is less than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the switching encoding mode from the second bit rate encoding to the first bit rate encoding, and start the first counter, where the initial value of the first counter is the first set value and the first counter terminates when its value is 0; or, when the set code rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, it is determined that
  • the final encoding mode of the current audio frame is the switching encoding mode from the first bit rate encoding to the second bit rate encoding, and start the second counter, where the initial value of the second counter is the first set value and the second counter terminates when its value is 0; or, when the final encoding mode of the previous audio frame is the switching encoding mode from the second bit rate encoding to the first bit rate encoding and the value of the first counter is greater than 0, decrement the value of the first counter by 1; if the value of the first counter is still greater than 0, determine that the final encoding mode of the current audio frame is the switching encoding mode from the second bit rate encoding to the first bit rate encoding; or, if the value of the first counter is 0, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the
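The counter behaviour for a switch from the second to the first bit rate can be sketched as a short simulation. It is a simplified illustration: `simulate_second_to_first` is a hypothetical helper, the mode names are plain strings, and the set bit rate is assumed to stay below the threshold for every simulated frame.

```python
from typing import List


def simulate_second_to_first(first_set_value: int, n_frames: int) -> List[str]:
    """List the final encoding mode of each frame after the rate drops."""
    modes = []
    counter = 0
    prev = "second"  # the frame before the simulation used the second rate
    for _ in range(n_frames):
        if prev == "second":
            # Switch starts: enter the switching mode and start the first counter.
            mode = "second->first"
            counter = first_set_value
        elif prev == "second->first" and counter > 0:
            # Switch in progress: decrement, then check whether it has finished.
            counter -= 1
            mode = "second->first" if counter > 0 else "first"
        else:
            mode = "first"
        modes.append(mode)
        prev = mode
    return modes
```

With a set value of 3, the switching mode is held for three frames before the encoder settles in the first bit rate encoding mode.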
  • the encoding module is specifically configured to perform first bit rate encoding processing on the current audio frame; and perform second bit rate encoding processing on the current audio frame.
  • the encoding module is specifically configured to: when the value of the first counter is equal to the first set value, perform the first bit rate encoding process on the current audio frame and perform the second bit rate encoding process on the current audio frame; or, when the value of the first counter is less than the first set value, perform the first bit rate encoding process on the current audio frame.
  • the encoding module is specifically configured to: when the value of the second counter is greater than 1, perform the first bit rate encoding process on the current audio frame and perform the second bit rate encoding process on the current audio frame; or, when the value of the second counter is equal to 1, perform the second bit rate encoding process on the current audio frame.
  • the encoding module is specifically configured to: when the value of the first counter is greater than 1, perform the first bit rate encoding process on the current audio frame and perform the second bit rate encoding process on the current audio frame; or, when the value of the first counter is equal to 1, perform the first bit rate encoding process on the current audio frame.
  • the encoding module is specifically configured to: when the value of the second counter is equal to the first set value, perform the first bit rate encoding process on the current audio frame and perform the second bit rate encoding process on the current audio frame; or, when the value of the second counter is less than the first set value, perform the second bit rate encoding process on the current audio frame.
  • the encoding module is specifically configured to: when the first code rate encoding process supports the sampling rate of the current audio frame, perform the first code rate encoding process on the current audio frame; or, when the first code rate encoding process does not support the sampling rate of the current audio frame, perform down-sampling or up-sampling processing on the current audio frame to obtain a down-sampled or up-sampled current audio frame, and perform the first code rate encoding process on the down-sampled or up-sampled current audio frame, where the first code rate encoding process supports the sampling rate of the down-sampled or up-sampled current audio frame.
  • the encoding module is specifically configured to: when the second code rate encoding process supports the sampling rate of the current audio frame, perform the second code rate encoding process on the current audio frame; or, when the second code rate encoding process does not support the sampling rate of the current audio frame, perform down-sampling or up-sampling processing on the current audio frame to obtain a down-sampled or up-sampled current audio frame, and perform the second code rate encoding process on the down-sampled or up-sampled current audio frame, where the second code rate encoding process supports the sampling rate of the down-sampled or up-sampled current audio frame.
  • the determining module is further configured to determine, according to the set code rate of the previous audio frame and the set code rate of the current audio frame, the first code rate corresponding to the first code rate encoding process and the second code rate corresponding to the second code rate encoding process, where the sum of the first code rate and the second code rate is the set code rate of the current audio frame;
  • the encoding module is specifically configured to perform the first code rate encoding process on the current audio frame at the first code rate, and perform the second code rate encoding process on the current audio frame at the second code rate.
  • the code stream information corresponding to the encoded current audio frame includes packet header information and the first code rate encoded code stream and/or the second code rate encoded code stream, wherein the packet header information includes the final encoding mode, sampling rate, number of channels and frame length of the current audio frame, and the length of the first code rate encoded code stream.
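One possible layout for such a packet is sketched below. The text lists the header fields but not their widths, order, or byte order, so the `>BIBHI` layout, the names `pack_packet` and `parse_packet`, and the idea of using the first-stream length to split the payload are all assumptions made for illustration.

```python
import struct
from typing import NamedTuple, Tuple

# Assumed layout (big-endian, no padding): mode (uint8), sampling rate (uint32),
# channels (uint8), frame length (uint16), first-stream length (uint32).
HEADER = struct.Struct(">BIBHI")


class Header(NamedTuple):
    mode: int
    sample_rate: int
    channels: int
    frame_len: int
    first_len: int


def pack_packet(mode: int, sample_rate: int, channels: int, frame_len: int,
                first_stream: bytes = b"", second_stream: bytes = b"") -> bytes:
    """Header followed by the first-rate stream and then the second-rate stream."""
    header = HEADER.pack(mode, sample_rate, channels, frame_len, len(first_stream))
    return header + first_stream + second_stream


def parse_packet(packet: bytes) -> Tuple[Header, bytes, bytes]:
    """Split a packet back into header, first-rate stream and second-rate stream."""
    header = Header(*HEADER.unpack_from(packet, 0))
    body = packet[HEADER.size:]
    return header, body[:header.first_len], body[header.first_len:]
```

Carrying the length of the first code rate stream in the header lets the receiver separate the two streams on a switching frame without any extra delimiter.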
  • the present application provides an audio decoding device, comprising: an obtaining module for obtaining code stream information; a parsing module for parsing the code stream information to obtain a decoding mode and an encoded code stream, where the encoded code stream includes a first code rate encoded code stream and/or a second code rate encoded code stream, and the decoding mode includes a first code rate decoding mode, a second code rate decoding mode, a switching decoding mode from the first code rate decoding to the second code rate decoding, or a switching decoding mode from the second code rate decoding to the first code rate decoding.
  • when the decoding mode is the first bit rate decoding mode, the encoded code stream includes the first bit rate encoded code stream; when the decoding mode is the second bit rate decoding mode, the encoded code stream includes the second bit rate encoded code stream; and when the decoding mode is the switching decoding mode from the first bit rate decoding to the second bit rate decoding or the switching decoding mode from the second bit rate decoding to the first bit rate decoding, the encoded code stream includes the first code rate encoded code stream and the second code rate encoded code stream; the decoding module is configured to decode the encoded code stream according to the decoding mode to obtain the target audio frame.
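The relationship between decoding mode and the streams a packet must carry can be written as a lookup table. The mode keys and the helper `has_required_streams` are hypothetical names used only for this sketch.

```python
# Which encoded code streams each decoding mode needs.
REQUIRED_STREAMS = {
    "first": {"first"},
    "second": {"second"},
    "first_to_second": {"first", "second"},
    "second_to_first": {"first", "second"},
}


def has_required_streams(mode: str, first_stream: bytes, second_stream: bytes) -> bool:
    """Return True if the packet carries every stream the decoding mode needs."""
    present = set()
    if first_stream:
        present.add("first")
    if second_stream:
        present.add("second")
    return REQUIRED_STREAMS[mode] <= present
```

A receiver could run such a check right after parsing the packet header and before dispatching to the decoders.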
  • the decoding module is specifically configured to perform decoding processing on the first bit rate coded bit stream according to the first bit rate decoding method to obtain the target audio frame.
  • when the decoding mode is the first code rate decoding mode and the frame length of the first code rate decoding process is smaller than the frame length of the second code rate decoding process, the decoding module is specifically configured to: decode the first code rate encoded code stream according to the first code rate decoding mode to obtain second data; shift M samples out of the head of the second data queue corresponding to the first code rate decoding process and put the second data into the second data queue in first-in, first-out (FIFO) order, where M is associated with the frame length of the second code rate decoding process; and extract M samples from the head of the second data queue to obtain the target audio frame.
  • the decoding module is specifically configured to perform decoding processing on the second code rate encoded code stream according to the second code rate decoding method to obtain the target audio frame.
  • when the decoding mode is the second code rate decoding mode and the frame length of the first code rate decoding process is greater than the frame length of the second code rate decoding process, the decoding module is specifically configured to: decode the second code rate encoded code stream according to the second code rate decoding mode to obtain first data; shift M samples out of the head of the first data queue corresponding to the second code rate decoding process and put the first data into the first data queue in first-in, first-out (FIFO) order, where M is associated with the frame length of the first code rate decoding process; and extract M samples from the head of the first data queue to obtain the target audio frame.
  • the decoding module is specifically configured to: decode the first code rate encoded code stream according to the first code rate decoding mode to obtain second data; and decode the second code rate encoded code stream according to the second code rate decoding mode to obtain first data.
  • the target audio frame is obtained from the second data other than the last N sample data and the N sample smoothed data.
  • the first data queue corresponding to the second bit rate decoding process is set to all 0s, and the first data queue follows the first-in, first-out principle; the first code rate encoded code stream is decoded according to the first code rate decoding mode to obtain second data; the second code rate encoded code stream is decoded according to the second code rate decoding mode to obtain first data; M samples are shifted out of the head of the first data queue and the first data is put at the tail of the first data queue, where M is associated with the frame length of the first code rate decoding process; M samples are extracted from the head of the first data queue to obtain third data;
  • the decoding module when the decoding mode is switched from decoding at the first bit rate to decoding at the second bit rate, and the frame length of the first bit rate decoding process is smaller than the frame length of the second bit rate decoding process , the decoding module is specifically configured to, when the decoding method of the last audio frame is not the first bit rate decoding to switch the decoding method to the second bit rate decoding, according to the first bit rate decoding method
  • the first code rate encoding code stream is decoded to obtain second data; the second data queue corresponding to the first code rate decoding process overflows M sample data from the head of the queue, and the first-in-first-out FIFO method is used.
  • the second data is put into the second data queue, and M is associated with the frame length of the second code rate decoding process; M sample data are extracted from the head of the second data queue to obtain fourth data ; Carry out decoding processing on the encoded code stream of the second code rate according to the second code rate decoding method to obtain the first data; compare the last N sample data of the fourth data with the first data of the first data.
  • the decoding module is specifically configured to: decode the second code rate encoded code stream according to the second code rate decoding mode to obtain first data; and decode the first code rate encoded code stream according to the first code rate decoding mode to obtain second data.
  • the target audio frame is obtained from the first data other than the last N sample data and the N sample smoothed data.
  • the decoding module when the decoding mode is switched from the second code rate decoding to the first code rate decoding, and the frame length of the first code rate decoding process is greater than the frame length of the second code rate decoding process , the decoding module is specifically configured to, when the decoding method of the last audio frame is not the second bit rate decoding to switch the decoding method to the first bit rate decoding, according to the second bit rate decoding method
  • the second code rate encoding code stream is decoded to obtain the first data; the first data queue corresponding to the second code rate decoding process is overflowed with M sample data from the queue head, and the data is stored in a first-in, first-out FIFO manner.
  • the first data is put into the first data queue, and M is associated with the frame length of the first code rate decoding process; M sample data are extracted from the head of the first data queue to obtain third data ; Carry out decoding processing on the encoded code stream of the first code rate to obtain second data according to the first code rate decoding method; Perform weighted average of N sample data to obtain N sample smooth data; obtain the target audio according to the third data except the last N sample data and the N sample smooth data or, when the decoding mode of the audio frame of the previous frame is to switch the decoding mode from the second bit rate decoding to the first bit rate decoding, the first data queue is overflowed from the queue head by M sample data ; Extract M sample data from the head of the first data queue to obtain third data; perform decoding processing on the first code rate coded stream according to the first code rate decoding method to obtain second data ; Carry out a weighted average of the last N sample point data of the third data and the first N sample point data of the second data to obtain N sample point smoothing data;
  • the second data queue corresponding to the first bit rate decoding process is set to all 0s, and the second data queue follows the first-in, first-out principle; the second code rate encoded code stream is decoded according to the second code rate decoding mode to obtain the first data; the first code rate encoded code stream is decoded according to the first code rate decoding mode to obtain second data; M samples are shifted out of the head of the second data queue and the second data is put at the tail of the second data queue, where M is associated with the frame length of the second code rate decoding process; M samples are extracted from the head of the second data queue to obtain the fourth data; a weighted average of the last N samples of the first data and the first N samples of the fourth data is computed to obtain N smoothed samples; the target audio frame is obtained from the first data other than its last N samples together with the N smoothed samples.
  • the decoding module is specifically configured to determine whether the first code rate decoding process supports the sampling rate corresponding to the first code rate encoded code stream; if the first code rate decoding process supports the sampling rate, perform the first code rate decoding process on the first code rate encoded code stream; or, if the first code rate decoding process does not support the sampling rate, perform the first code rate decoding process on the first code rate encoded code stream to obtain fifth data, and perform up-sampling or down-sampling processing on the fifth data.
  • the decoding module is specifically configured to determine whether the second code rate decoding process supports the sampling rate corresponding to the second code rate encoded code stream; if the second code rate decoding process supports the sampling rate, perform the second code rate decoding process on the second code rate encoded code stream; or, if the second code rate decoding process does not support the sampling rate, perform the second code rate decoding process on the second code rate encoded code stream to obtain sixth data, and perform up-sampling or down-sampling processing on the sixth data.
  • the present application provides an audio encoding device, comprising: one or more processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method performed by the audio sending device according to any one of the implementations of the first aspect above.
  • the present application provides an audio decoding device, comprising: one or more processors; and a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method performed by the audio receiving device according to any one of the implementations of the first aspect above.
  • the present application provides a computer-readable storage medium, including a computer program, which, when executed on a computer, causes the computer to execute the method according to any one of the above-mentioned first aspects.
  • the present application provides a computer-readable storage medium, including code stream information obtained according to any one of the audio encoding methods performed by an audio transmission device in the first aspect above.
  • the present application provides a computer-readable storage medium comprising audio frames obtained according to any one of the audio decoding methods performed by an audio receiving device in the first aspect above.
  • Fig. 1 is an exemplary structural diagram of the audio playback system of the application
  • FIG. 2 is an exemplary structural block diagram of the audio decoding system 10 of the present application.
  • Fig. 3 is an exemplary flow chart of the audio coding and decoding method of the present application.
  • Fig. 5a is an exemplary schematic diagram of the data smoothing processing of the present application.
  • Fig. 5b is an exemplary schematic diagram of the data smoothing processing of the present application.
  • FIG. 6 is an exemplary schematic diagram of the data queue of the present application.
  • FIG. 7 is an exemplary schematic diagram of encoding and decoding processing of an audio frame of the present application.
  • FIG. 8 is a schematic structural diagram of an embodiment of an audio encoding apparatus of the present application.
  • FIG. 9 is a schematic structural diagram of an embodiment of an audio decoding apparatus of the present application.
  • "At least one (item)" refers to one or more, and "a plurality" refers to two or more.
  • "And/or" describes the relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character "/" generally indicates an "or" relationship between the associated objects.
  • "At least one of the following item(s)" or similar expressions refer to any combination of these items, including any combination of a single item or plural items.
  • for example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
  • Audio frame: audio data is streamed.
  • the amount of audio data within a period of time is usually taken as one audio frame. This period is called the "sampling time", and its value can be determined according to the needs of the codec and the specific application; for example, the duration is 2.0 ms to 60 ms, where ms denotes milliseconds.
  • FIG. 1 is an exemplary structural diagram of an audio playback system of the application.
  • the audio playback system includes an audio sending device and an audio receiving device. The audio sending device includes, for example, a mobile phone, a computer (laptop, desktop computer, etc.), a tablet (handheld tablet, car tablet, etc.), or another device that can encode audio and send audio streams; the audio receiving device includes, for example, a TWS headset, an ordinary wireless headset, a speaker, a smart watch, smart glasses, or another device that can receive, decode, and play audio streams.
  • a Bluetooth connection can be established between the audio sending device and the audio receiving device, and the two can support the transmission of voice and music.
  • examples of an audio sending device and an audio receiving device include a mobile phone paired with a TWS headset, a wireless headset, or a wireless neckband headset, or a mobile phone paired with another terminal device (such as a smart speaker, smart watch, smart glasses, or car speaker).
  • the audio sending device can also be a tablet, laptop, or desktop computer, paired with a TWS headset, a wireless headset, a wireless neckband headset, or another terminal device (such as a smart speaker, smart watch, smart glasses, or car speaker).
  • the audio sending device and the audio receiving device may also be connected by other communication methods, such as WiFi connection, wired connection or other wireless connection, which is not specifically limited in this application.
  • FIG. 2 is an exemplary structural block diagram of the audio decoding system 10 of the present application.
  • the audio decoding system 10 may include a source device 12 and a destination device 14, and the source device 12 may be the audio transmitting device in FIG. 1 .
  • the destination device 14 may be the audio receiving device of FIG. 1 .
  • the source device 12 generates encoded stream information, and thus, the source device 12 may be referred to as an audio encoding device.
  • the destination device 14 may decode the encoded bitstream information generated by the source device 12, and thus, the destination device 14 may be referred to as an audio decoding device.
  • the source device 12 includes an encoder 20 and, optionally, an input interface 16 , an audio preprocessor 18 , and a communication interface 22 .
  • the input interface 16 is used for inputting audio pulse code modulation (pulse code modulation, PCM) data and setting the code rate.
  • the audio PCM data can be classified into voice type or music type, and the set bit rate can be preset by the user.
  • the audio preprocessor 18 is used to determine the encoding mode according to the set bit rate input from the input interface 16. The intended behavior is as follows: when the set bit rate is less than the threshold, the audio frame is encoded with the low bit rate encoding process, and when the set bit rate is greater than the threshold, the audio frame is encoded with the high bit rate encoding process. Accordingly, the final encoding mode of an audio frame depends on the set bit rate of the current frame and on the final encoding mode of the previous frame.
  • the encoder 20 is configured to encode the audio frame according to the encoding mode determined by the audio preprocessor 18 to obtain the code stream information.
  • the communication interface 22 in the source device 12 can be used to receive the code stream information and send the code stream to the destination device 14 through the communication channel 13 .
  • the destination device 14 includes a decoder 30 and, optionally, a communication interface 28 , an audio post-processor 32 and a playback device 34 .
  • the communication interface 28 in the destination device 14 is used to receive the code stream directly from the source device 12 and provide the code stream to the decoder 30 .
  • Communication interface 22 and communication interface 28 may be used to send or receive code streams through a communication link between source device 12 and destination device 14, such as a Bluetooth connection, or the like.
  • the communication interface 22 may be used to encapsulate the code stream into a suitable format such as a message, and/or use Bluetooth transfer encoding or processing to process the code stream for transmission over the communication link.
  • the communication interface 28 corresponds to the communication interface 22 and can be used, for example, to receive the transmitted data and process it using the corresponding transmission decoding or processing and/or decapsulation to obtain the code stream.
  • Both the communication interface 22 and the communication interface 28 can be configured as a one-way communication interface as indicated by the arrow in FIG. 2 from the corresponding communication channel 13 of the source device 12 to the destination device 14, or a two-way communication interface, and can be used to send and receive messages etc. to establish a connection, acknowledge and exchange any other information related to a communication link and/or data transfer such as encoded audio data, etc.
  • the decoder 30 is configured to receive the code stream information, and obtain audio data by decoding the code stream in the code stream information according to the decoding mode in the code stream information.
  • the audio post-processor 32 is used for post-processing the decoded audio data to obtain post-processed audio data.
  • the post-processing performed by the audio post-processor 32 may include, for example, trimming or resampling, and the like.
  • the playback device 34 is used to receive the post-processed audio data to play the audio to a user or listener.
  • Playback device 34 may be or include any type of player for playing the reconstructed audio, e.g., integrated or external speakers.
  • for example, the speakers may include loudspeakers, sound boxes, and the like.
  • the present application provides an audio coding and decoding method.
  • FIG. 3 is an exemplary flowchart of the audio coding and decoding method of the present application.
  • the process 300 can be performed by an audio sending device and an audio receiving device in an audio playback system; that is, the audio sending device encodes the audio and sends the code stream information to the audio receiving device, and the audio receiving device decodes the code stream information to obtain the target audio frame.
  • Process 300 is described as a series of steps or operations, and it should be understood that process 300 may be performed in various orders and/or concurrently, and is not limited to the order of execution shown in FIG. 3 .
  • the method includes:
  • Step 301 The audio sending device obtains the set bit rate of the current audio frame to be encoded and the final encoding mode of the previous audio frame.
  • the audio frame can be any frame of audio sent by the audio sending device to the audio receiving device.
  • the object of each encoding pass in this application may be one audio frame of the audio; that is, the audio encoding method provided in this application operates on one audio frame, and the method for determining the encoding mode described below applies to every audio frame in the audio. To distinguish them, the audio frame currently being encoded by the audio sending device is called the audio frame or the current audio frame, and the audio frame that the audio sending device encoded immediately before it is called the previous audio frame.
  • the audio frame can be represented in the form of PCM data.
  • the set code rate may be a target coding code rate preset by the user according to the current channel state.
  • the set bit rate may be, for example, 192kbps, 256kbps, 400kbps or 600kbps.
  • the final encoding mode refers to the encoding mode actually used by the audio sending device to encode the audio frame. It can be the first code rate encoding mode, the second code rate encoding mode, the encoding mode switching from first code rate encoding to second code rate encoding, or the encoding mode switching from second code rate encoding to first code rate encoding, where the first code rate is lower than the second code rate.
  • for example, the first code rate may be 64kbps, 128kbps, 192kbps, 256kbps, 400kbps, or 600kbps, and the second code rate may be 128kbps, 192kbps, 256kbps, 400kbps, or 600kbps, provided that the first code rate is lower than the second code rate.
  • the first code rate may also be referred to as a low code rate
  • the second code rate may be referred to as a high code rate.
  • Step 302 The audio sending device determines the final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame.
  • This application involves two kinds of encoding processing, namely low bit rate encoding processing and high bit rate encoding processing, wherein the low bit rate encoding processing may include, for example, AAC, the default LC3 of next-generation Bluetooth, etc., and the high bit rate encoding processing may include, for example, LHDC, LC3plus, etc.
  • Table 1 exemplarily shows how, in the case that the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process and the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process, the audio sending device determines the final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame.
  • A represents the low bit rate encoding mode
  • B represents the high bit rate encoding mode
  • A→B represents the encoding mode switching from low bit rate encoding to high bit rate encoding
  • B→A represents the encoding mode switching from high bit rate encoding to low bit rate encoding. Therefore, the final encoding mode of the current audio frame can be determined by the following methods:
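  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, determine that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, determine that the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding; or,
  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding, determine that the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding; or,
  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding, determine that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, determine that the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, determine that the final encoding mode of the current audio frame is the high bit rate encoding mode; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding, determine that the final encoding mode of the current audio frame is the high bit rate encoding mode; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding, determine that the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding.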
  • the value of the above-mentioned set threshold is associated with the number of channels of the audio frame. For example, when the number of channels of the audio frame is mono, the set threshold may be 150kbps, and when the number of channels of the audio frame is two channels, the set threshold may be 300kbps.
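  • The mode decision for the equal-frame-length case can be sketched as a small state machine. This is a hedged illustration: the names `Mode`, `set_threshold_kbps`, and `next_mode` are hypothetical, the thresholds are the example values given above, and treating a set bit rate exactly equal to the threshold as "high" is an assumption, since the text only covers the less-than and greater-than cases.

```python
from enum import Enum


class Mode(Enum):
    A = "low"              # low bit rate encoding mode
    B = "high"             # high bit rate encoding mode
    A_TO_B = "low->high"   # switching from low to high bit rate encoding
    B_TO_A = "high->low"   # switching from high to low bit rate encoding


def set_threshold_kbps(channels: int) -> int:
    """Example thresholds from the text: 150 kbps mono, 300 kbps two-channel."""
    if channels == 1:
        return 150
    if channels == 2:
        return 300
    raise ValueError("the text only gives mono and two-channel examples")


def next_mode(set_rate_kbps: float, prev: Mode, channels: int = 2) -> Mode:
    """Final encoding mode of the current frame from the set bit rate and previous mode."""
    if set_rate_kbps < set_threshold_kbps(channels):
        # Low rate requested: stay low, or finish a high->low switch, else start one.
        return Mode.A if prev in (Mode.A, Mode.B_TO_A) else Mode.B_TO_A
    # High rate requested (equality is treated as high here by assumption).
    return Mode.B if prev in (Mode.B, Mode.A_TO_B) else Mode.A_TO_B
```

  • In this sketch a switching mode lasts for exactly one frame when the set bit rate stays on the same side of the threshold, which matches the case where both encoding processes share the same frame length and delay.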
  • Table 2 exemplarily shows how, in the case that the frame length of the low bit rate encoding process is different from the frame length of the high bit rate encoding process, the audio sending device determines the final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame.
  • A represents the low bit rate encoding mode
  • B represents the high bit rate encoding mode
  • A→B represents the encoding mode switching from low bit rate encoding to high bit rate encoding
  • B→A represents the encoding mode switching from high bit rate encoding to low bit rate encoding.
  • switching frames are introduced here: regardless of whether the previous audio frame adopts encoding mode A or B, once the next audio frame is determined to adopt encoding mode A→B or B→A, the D consecutive audio frames starting from that frame are directly determined to be switching frames. The value of D can be obtained as follows:
  • D = rounding((max(total encoding and decoding delay at the low bit rate, total encoding and decoding delay at the high bit rate) + overlap length from low bit rate to high bit rate + processing frame length - 1) / processing frame length)
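  • As a worked illustration of the formula for D, the sketch below assumes that all quantities are integer sample counts and that "rounding" means rounding down, so that adding (processing frame length - 1) before dividing yields a ceiling division. The function name and the numeric values in the usage note are hypothetical.

```python
def switching_frame_count(low_delay: int, high_delay: int,
                          overlap_len: int, frame_len: int) -> int:
    """Number D of switching frames; all arguments are in samples (assumed)."""
    if frame_len <= 0:
        raise ValueError("processing frame length must be positive")
    # Integer floor division of (x + frame_len - 1) equals ceil(x / frame_len).
    return (max(low_delay, high_delay) + overlap_len + frame_len - 1) // frame_len
```

  • For example, with hypothetical delays of 960 and 720 samples, an overlap of 240 samples, and a processing frame length of 480 samples, `switching_frame_count(960, 720, 240, 480)` evaluates (960 + 240 + 479) // 480 and returns 3.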
  • the final encoding method of the current audio frame can be determined by the following methods:
  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, determine that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, determine that the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding, and start the first counter; the initial value of the first counter is the first set value, and the first counter is terminated when its value is 0.
  • the first counter counts the processing of switching frames: it is started when the first switching frame is processed, with its initial value set to the calculated number of switching frames (the first set value), and it is decremented by 1 each time a switching frame is processed.
  • when the value of the first counter is 0, all switching frames have been encoded, and the first counter is terminated. When the value of the first counter equals the first set value, the first switching frame is being processed; when the value is 1, the last switching frame is being processed; when the value is less than the first set value and greater than 1, an intermediate switching frame is being processed; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, determine that the final encoding mode of the current audio frame is the high bit rate encoding mode; or,
  • when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, determine that the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding, and start the second counter; the initial value of the second counter is the first set value, and the second counter is terminated when its value is 0.
  • the second counter counts the processing of switching frames: it is started when the first switching frame is processed, with its initial value set to the calculated number of switching frames (the first set value), and it is decremented by 1 each time a switching frame is processed.
  • when the value of the second counter is 0, all switching frames have been encoded, and the second counter is terminated. When the value of the second counter equals the first set value, the first switching frame is being processed; when the value is 1, the last switching frame is being processed; when the value is less than the first set value and greater than 1, an intermediate switching frame is being processed; or,
  • when the final encoding mode of the previous audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding, decrement the value of the first counter by 1; if the first counter is still in the started state, determine that the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding; or, if the first counter is terminated (that is, the value of the first counter is 0), determine that the final encoding mode of the current audio frame is the low bit rate encoding mode; or,
  • when the final encoding mode of the previous audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding, decrement the value of the second counter by 1; if the second counter is still in the started state, determine that the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding; or, if the second counter is terminated (that is, the value of the second counter is 0), determine that the final encoding mode of the current audio frame is the high bit rate encoding mode.
  • the value of the above-mentioned set threshold is associated with the number of channels of the audio frame. For example, when the audio frame is mono, the set threshold may be 165kbps, and when the audio frame has two channels, the set threshold may be 330kbps.
  • the above-mentioned first setting value is the number D of switching frames.
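  • The counter-based decision for the case with different frame lengths can be sketched as follows. This is a hedged sketch, not the method itself: a single `counter` value stands in for the first and second counters, the function name and string labels are hypothetical, the thresholds are the example values above, and the sketch assumes that a switch, once started, runs for all D frames without consulting the set bit rate.

```python
def next_mode_with_switch_frames(set_rate_kbps, prev_mode, counter, d, channels=2):
    """Return (final mode of current frame, counter value after the decision).

    prev_mode is one of "A", "B", "A->B", "B->A"; d is the number of switching
    frames (the first set value) and must be at least 1.
    """
    if d < 1:
        raise ValueError("d must be at least 1")
    threshold = 165 if channels == 1 else 330  # example thresholds from the text
    if prev_mode in ("A->B", "B->A"):
        counter -= 1                      # one more switching frame has been processed
        if counter > 0:
            return prev_mode, counter     # counter still started: keep switching
        # Counter terminated: the switch is complete.
        return ("B" if prev_mode == "A->B" else "A"), 0
    want_low = set_rate_kbps < threshold
    if prev_mode == "A":
        return ("A", 0) if want_low else ("A->B", d)   # start the second counter
    if prev_mode == "B":
        return ("B->A", d) if want_low else ("B", 0)   # start the first counter
    raise ValueError("unknown mode: %r" % (prev_mode,))
```

  • Starting from mode A with D = 3 and a set bit rate that stays above the threshold, the sketch yields three consecutive A→B frames with counter values 3, 2, and 1, followed by mode B.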
  • Step 303 The audio sending device encodes the current audio frame according to the final encoding mode of the current audio frame.
  • the audio transmission device encodes the current audio frame, so there can be the following situations:
  • the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process, and the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process.
  • when the final encoding mode of the current audio frame is the low bit rate encoding mode, the audio sending device performs low bit rate encoding processing on the current audio frame.
  • the audio sending device may first determine whether the low bit rate encoding process supports the sampling rate of the current audio frame. When it does, low bit rate encoding processing can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled, and low bit rate encoding processing is then performed on the resampled audio frame. For example, suppose the low bit rate encoding process does not support the sampling rates of 88.2kHz and 96kHz.
  • the audio sending device can then use a quadrature mirror filter (QMF) for down-sampling: the frequency band corresponding to 88.2kHz (0 to 44.1kHz) is divided into two subbands, 0 to 22.05kHz and 22.05kHz to 44.1kHz, and the low subband (0 to 22.05kHz) is selected for low bit rate encoding processing; the frequency band corresponding to 96kHz (0 to 48kHz) is divided into two subbands, 0 to 24kHz and 24kHz to 48kHz, and the low subband (0 to 24kHz) is selected for low bit rate encoding processing.
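  • The subband arithmetic and the two-band split can be illustrated with the simplest QMF pair, the two-tap Haar filters. This is a stand-in chosen for brevity, not the filter used by any particular codec, which would normally use much longer filters; the function names are hypothetical.

```python
import math


def subband_edges_khz(sample_rate_khz: float):
    """Edges of the two equal-width subbands of the band 0 .. sample_rate / 2."""
    nyquist = sample_rate_khz / 2.0
    return (0.0, nyquist / 2.0), (nyquist / 2.0, nyquist)


def haar_qmf_split(samples):
    """Split a signal into low and high subbands, each at half the sampling rate.

    Uses the Haar pair h0 = [1, 1] / sqrt(2), h1 = [1, -1] / sqrt(2), followed by
    decimation by 2. The input length must be even.
    """
    if len(samples) % 2:
        raise ValueError("input length must be even")
    s = 1.0 / math.sqrt(2.0)
    low = [(samples[i] + samples[i + 1]) * s for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) * s for i in range(0, len(samples), 2)]
    return low, high
```

  • Only the `low` output would be passed to the low bit rate encoder in the scenario above; for an 88.2kHz input it is a 44.1kHz signal covering roughly 0 to 22.05kHz.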
  • when the final encoding mode of the current audio frame is the high bit rate encoding mode, the audio sending device performs high bit rate encoding processing on the current audio frame.
  • the audio sending device can first determine whether the high bit rate encoding process supports the sampling rate of the current audio frame. When it does, high bit rate encoding processing can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled, and high bit rate encoding processing is then performed on the resampled audio frame.
  • when the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding, or the encoding mode switching from high bit rate encoding to low bit rate encoding, the audio sending device can perform both low bit rate encoding processing and high bit rate encoding processing on the current audio frame. Similarly, the audio sending device may first determine, with reference to the above description, whether the low and high bit rate encoding processes support the sampling rate of the current audio frame, which is not repeated here.
  • the frame lengths of the low-bit-rate encoding processing and the high-bit-rate encoding processing are different, or the frame lengths of the low-bit-rate encoding processing and the high-bit-rate encoding processing are the same and the total encoding and decoding delays are different.
  • when the final encoding mode of the current audio frame is the low bit rate encoding mode, the audio sending device can first determine whether the low bit rate encoding process supports the sampling rate of the current audio frame. When it does, low bit rate encoding processing can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled, and low bit rate encoding processing is then performed on the resampled audio frame.
  • when the final encoding mode of the current audio frame is the high bit rate encoding mode, the audio sending device can first determine whether the high bit rate encoding process supports the sampling rate of the current audio frame. When it does, high bit rate encoding processing can be performed on the current audio frame directly; when it does not, the current audio frame can first be down-sampled or up-sampled, and high bit rate encoding processing is then performed on the resampled audio frame.
  • when the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding and the frame length of the low bit rate encoding process is greater than the frame length of the high bit rate encoding process: the second counter counts the processing of switching frames, and while the second counter is in the started state, the frame being processed is still a switching frame. When the value of the second counter is greater than 1 (the current frame is a switching frame but not the last one), low bit rate encoding processing and high bit rate encoding processing are both performed on the current audio frame. When the value of the second counter is equal to 1 (the current frame is the last switching frame), high bit rate encoding processing is performed on the current audio frame.
  • similarly, the audio sending device can refer to the above description to first determine whether the low and high bit rate encoding processes support the sampling rate of the current audio frame, which is not repeated here.
  • when the final encoding mode of the current audio frame is the encoding mode switching from low bit rate encoding to high bit rate encoding and the frame length of the low bit rate encoding process is smaller than the frame length of the high bit rate encoding process: the second counter counts the processing of switching frames, and while the second counter is in the started state, the frame being processed is still a switching frame. When the value of the second counter is equal to the first set value (the current frame is the first switching frame), low bit rate encoding processing and high bit rate encoding processing are both performed on the current audio frame. When the value of the second counter is less than the first set value (the current frame is a switching frame but not the first one), high bit rate encoding processing is performed on the current audio frame.
  • similarly, the audio sending device can refer to the above description to first determine whether the low and high bit rate encoding processes support the sampling rate of the current audio frame, which is not repeated here.
  • when the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding and the frame length of the low bit rate encoding process is greater than the frame length of the high bit rate encoding process: the first counter counts the processing of switching frames, and while the first counter is in the started state, the frame being processed is still a switching frame. When the value of the first counter is equal to the first set value (the current frame is the first switching frame), low bit rate encoding processing and high bit rate encoding processing are both performed on the current audio frame. When the value of the first counter is less than the first set value (the current frame is a switching frame but not the first one), low bit rate encoding processing is performed on the current audio frame.
  • similarly, the audio sending device can refer to the above description to first determine whether the low and high bit rate encoding processes support the sampling rate of the current audio frame, which is not repeated here.
  • when the final encoding mode of the current audio frame is the encoding mode switching from high bit rate encoding to low bit rate encoding and the frame length of the low bit rate encoding process is smaller than the frame length of the high bit rate encoding process: the first counter counts the processing of switching frames, and while the first counter is in the started state, the frame being processed is still a switching frame. When the value of the first counter is greater than 1 (the current frame is a switching frame but not the last one), low bit rate encoding processing and high bit rate encoding processing are both performed on the current audio frame. When the value of the first counter is equal to 1 (the current frame is the last switching frame), low bit rate encoding processing is performed on the current audio frame.
  • similarly, the audio sending device can refer to the above description to first determine whether the low and high bit rate encoding processes support the sampling rate of the current audio frame, which is not repeated here.
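  • The four switching-frame cases above reduce to a small lookup of which encoding processes run on a given switching frame. The sketch below is illustrative: the function name and string labels are hypothetical, it reads "X encoding processing is performed" as meaning that only that process runs, and it does not cover the case of equal frame lengths with different delays, which the text does not spell out.

```python
def encoders_for_switch_frame(direction, counter, d, low_frame_len, high_frame_len):
    """Return the encoding processes applied to one switching frame.

    direction: "A->B" (low to high) or "B->A" (high to low)
    counter  : current counter value, from d (first switching frame) down to 1 (last)
    d        : number of switching frames (the first set value)
    """
    if direction not in ("A->B", "B->A"):
        raise ValueError("direction must be 'A->B' or 'B->A'")
    if not 1 <= counter <= d:
        raise ValueError("counter must be between 1 and d for a switching frame")
    if low_frame_len == high_frame_len:
        raise ValueError("equal frame lengths are not covered by this sketch")
    low_longer = low_frame_len > high_frame_len
    if direction == "A->B":
        # Low frames longer: both until the last frame; shorter: both on the first only.
        both = counter > 1 if low_longer else counter == d
        target = "high"
    else:
        # Low frames longer: both on the first frame only; shorter: both until the last.
        both = counter == d if low_longer else counter > 1
        target = "low"
    return ("low", "high") if both else (target,)
```

  • For instance, with D = 3, direction A→B, and a low bit rate frame length greater than the high bit rate frame length, counter values 3 and 2 select both encoding processes and counter value 1 selects the high bit rate encoding process.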
  • the audio sending device may determine, according to the set bit rate of the previous audio frame and the set bit rate of the current audio frame, the first bit rate corresponding to the low bit rate encoding process and the second bit rate corresponding to the high bit rate encoding process, where the sum of the first bit rate and the second bit rate is the set bit rate.
  • the final encoding mode of the current audio frame is to switch the encoding mode from high bit rate encoding to low bit rate encoding, which may include:
  • the bit rate allocation of the current audio frame is: the first bit rate is brf, and the second bit rate is brp-brf.
  • the bit rate allocation of the current audio frame is: the first bit rate is brp-300kbps for two channels and brp-150kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate is 64kbps for two channels and 32kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the final encoding mode of the current audio frame is to switch the encoding mode from low bit rate encoding to high bit rate encoding, which may include:
  • the bit rate allocation of the current audio frame is: the first bit rate is brp and the second bit rate is brf-brp.
  • the bit rate allocation of the current audio frame is: the first bit rate is brf-300kbps for two channels and brf-150kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate is 64kbps for two channels and 32kbps for mono; the second bit rate is 300kbps for two channels and 150kbps for mono.
  • the bit rate allocation of the current audio frame is: the first bit rate and the second bit rate are consistent with the previous audio frame allocation.
  • the bit rate allocation of the current audio frame is: the first bit rate is 299kbps for two channels and 149kbps for mono; the second bit rate is brf-299kbps for two channels and brf-149kbps for mono.
  • the code stream information corresponding to the encoded current audio frame includes packet header information and a low code rate encoded code stream and/or a high code rate encoded code stream, wherein the packet header information includes the final encoding mode, sampling rate, number of channels, and frame length of the current audio frame, and the length of the low code rate encoded code stream.
  • the code stream information only contains the low-bit-rate encoding code stream; if the audio sending device only performs high-bit-rate encoding processing on the current audio frame, then the code stream information only contains high-bit-rate encoding code stream; if the audio sending device performs low-bit-rate encoding processing and high-bit-rate encoding processing on the current audio frame, the code stream information includes low-bit-rate encoding code stream and high-bit-rate encoding code stream .
  • FIG. 4 is an exemplary format diagram of the code stream information of the present application. As shown in FIG. 4, the code stream information includes packet header information, a low bit rate encoded code stream and a high bit rate encoded code stream, wherein the packet header information includes the final encoding mode (2 bits), sampling rate (2 bits), number of channels (1 bit), frame length (1 bit), and length of the low bit rate encoded code stream (10 bits). The packet header information is therefore 16 bits, that is, 2 bytes long.
  • the length of the low code rate encoded code stream is the actual length of the low code rate encoded code stream, and the actual data is written in the low code rate encoded code stream; if there is no low code rate encoded code stream in the code stream information, the length of the low code rate encoded code stream is 0, and the low code rate encoded code stream is empty or holds default data; if there is a high code rate encoded code stream in the code stream information, the actual data is written in the high code rate encoded code stream; if there is no high code rate encoded code stream in the code stream information, the high code rate encoded code stream is empty or holds default data.
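The 2-byte packet header described above can be sketched as a packed 16-bit word. This is only an illustration: the text gives the field widths (2 + 2 + 1 + 1 + 10 bits), while the most-significant-bit-first field order, the big-endian byte order, the numeric codes for each field, the byte unit of the length field, and the names `pack_header`, `unpack_header` and `split_payload` are assumptions made for this example.

```python
import struct


def pack_header(mode, rate_idx, channel_flag, frame_len_flag, low_len):
    """Pack the five header fields into 2 bytes (assumed MSB-first layout)."""
    if not (0 <= mode < 4 and 0 <= rate_idx < 4):
        raise ValueError("mode and rate_idx must fit in 2 bits")
    if channel_flag not in (0, 1) or frame_len_flag not in (0, 1):
        raise ValueError("channel_flag and frame_len_flag must fit in 1 bit")
    if not 0 <= low_len < 1024:
        raise ValueError("low_len must fit in 10 bits")
    word = (
        (mode << 14)             # final encoding mode, 2 bits
        | (rate_idx << 12)       # sampling rate code, 2 bits
        | (channel_flag << 11)   # number of channels, 1 bit
        | (frame_len_flag << 10)  # frame length, 1 bit
        | low_len                # length of the low-rate stream, 10 bits
    )
    return struct.pack(">H", word)


def unpack_header(data):
    """Recover the five fields from the first 2 bytes of a packet."""
    (word,) = struct.unpack(">H", data[:2])
    return {
        "mode": word >> 14,
        "rate_idx": (word >> 12) & 0x3,
        "channel_flag": (word >> 11) & 0x1,
        "frame_len_flag": (word >> 10) & 0x1,
        "low_len": word & 0x3FF,
    }


def split_payload(packet):
    """Split a packet into header fields, low-rate stream and high-rate stream.

    Assumes low_len counts bytes and that the high-rate stream is whatever
    follows the low-rate stream; a low_len of 0 means no low-rate stream.
    """
    header = unpack_header(packet)
    low_end = 2 + header["low_len"]
    return header, packet[2:low_end], packet[low_end:]
```

Because the length field is part of the header, a receiver in this sketch can locate the start of the high-rate stream without a second length field.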
  • Step 304 The audio sending device sends the code stream information to the audio receiving device.
  • the audio sending device can send the stream information to the audio receiving device through communication methods such as Bluetooth connection.
  • Step 305 The audio receiving device parses the code stream information to obtain a decoding mode and an encoded code stream.
  • the encoding stream includes a low-rate encoding stream and/or a high-bit-rate encoding stream
  • the decoding methods include a low-bit-rate decoding method, a high-bit-rate decoding method, a method of switching from low-bit-rate decoding to high-bit-rate decoding, and a method of switching from high-bit-rate decoding to low-bit-rate decoding.
  • the actual content of the encoded code stream is related to the final encoding method applied to the audio frame by the audio sending device. Therefore, after parsing the code stream information, the audio receiving device can obtain two pieces of information: one is the decoding method that needs to be used, and the other is the content of the encoded code stream.
  • the encoded code stream only contains the low-bit-rate encoding code stream; when the decoding method is the high-bit-rate decoding method, the encoded code stream only includes the high-bit-rate encoding code stream; when the decoding method is switching from low bit rate decoding to high bit rate decoding or switching from high bit rate decoding to low bit rate decoding, the encoded code stream includes a low bit rate encoded code stream and/or a high bit rate encoded code stream.
  • Step 306 The audio receiving device decodes the encoded code stream according to the decoding method to obtain the target audio frame.
  • the decoding end needs to adopt the corresponding decoding method to decode the encoded code stream. Therefore, there are the following decoding processing methods:
  • the frame length of the low bit rate encoding process is the same as the frame length of the high bit rate encoding process, and the total encoding and decoding delay of the low bit rate encoding process is the same as that of the high bit rate encoding process.
  • the decoding method is the low bit rate decoding method
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream.
  • the audio receiving device can first determine whether the low-bit-rate decoding process supports the sampling rate corresponding to the low-bit-rate encoding code stream. When the low-bit-rate decoding processing supports the sampling rate, the low-bit-rate decoding process can be performed directly on the low-bit-rate encoding code stream; when the low-bit-rate decoding processing does not support the sampling rate, the low-bit-rate encoding code stream can first be subjected to low-bit-rate decoding processing, and the decoded data can then be upsampled or downsampled to obtain the target audio frame.
  • the upsampling or downsampling processing performed by the encoding end and the decoding end correspond to each other, that is, if the encoding end adopts downsampling processing, the decoding end can adopt upsampling processing; if the encoding end adopts upsampling processing, the decoding end can adopt downsampling processing.
  • the audio sending device downsamples the audio frame and then takes the low subband for encoding processing, so the data of the high subband part is absent, and the target audio frame is obtained by upsampling with zero filling.
  • the decoding method is a high bit rate decoding method
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded stream.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate corresponding to the high-bit-rate encoding stream. When the high-bit-rate decoding processing supports the sampling rate, the high-bit-rate decoding processing can be performed directly on the high-bit-rate encoding code stream; when the high-bit-rate decoding processing does not support the sampling rate, the high-bit-rate encoding code stream can first be subjected to high-bit-rate decoding processing, and the decoded data can then be upsampled or downsampled to obtain the target audio frame.
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate coded code stream to obtain second data, and performs high-bit-rate decoding processing on the high-bit-rate coded code stream to obtain first data.
  • the audio receiving device can perform smooth processing on the back-end data of the second data and the front-end data of the first data to ensure smooth switching between low and high bit rates.
  • the length of the smoothed data is N sample point data, that is, a weighted average of the last N sample point data of the second data and the first N sample point data of the first data is taken to obtain N sample point smoothed data, and the target audio frame is obtained according to the data of the second data other than the last N sample point data and the N sample point smoothed data.
  • FIG. 5a is an exemplary schematic diagram of the data smoothing processing of the present application.
  • the sample point data between the two dotted lines is the N sample point data that needs to be smoothed
  • the oblique lines in the N sample point data in the second data represent the weight changes of the second data
  • the oblique lines in the N sample point data in the first data represent the weight changes of the first data.
  • the second data processed by low-bit-rate decoding is first, the first data processed by high-bit-rate decoding is behind, and the data of the target audio frame includes the first part of the second data and the smoothed data of N samples after smoothing.
  • the audio receiving device may first determine whether the two decoding processes of the high and low bit rates support the sampling rate as described above, which will not be repeated here.
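The smoothing at a low-to-high switch can be sketched as a short cross-fade over N samples. The text only says that the last N samples of the outgoing data and the first N samples of the incoming data are weighted and averaged, with oblique lines in FIG. 5a indicating changing weights; the linear ramp used here and the function name `smooth_switch` are assumptions for illustration.

```python
def smooth_switch(outgoing, incoming, n):
    """Cross-fade the tail of `outgoing` into the head of `incoming`.

    outgoing: decoded samples of the codec being switched away from
    incoming: decoded samples of the codec being switched to
    n:        number of samples to smooth
    Returns the outgoing data without its last n samples, followed by the
    n smoothed samples, as described for the target audio frame.
    """
    if n <= 0 or n > len(outgoing) or n > len(incoming):
        raise ValueError("n must be between 1 and the shorter input length")
    head = list(outgoing[:-n])
    tail = outgoing[-n:]
    smoothed = []
    for i in range(n):
        # Assumed linear ramp: the incoming weight rises, the outgoing falls.
        w_in = (i + 1) / (n + 1)
        w_out = 1.0 - w_in
        smoothed.append(w_out * tail[i] + w_in * incoming[i])
    return head + smoothed
```

The same function covers the high-to-low switch by swapping the roles of the two inputs, since the text describes that case symmetrically.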
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded code stream to obtain first data, and performs low-bit-rate decoding processing on the low-bit-rate coded code stream to obtain second data.
  • the audio receiving device can perform smooth processing on the back-end data of the first data and the front-end data of the second data to ensure smooth switching between high and low bit rates. The length of the smoothed data is N sample point data, that is, a weighted average of the last N sample point data of the first data and the first N sample point data of the second data is taken to obtain N sample point smoothed data, and the target audio frame is obtained according to the data of the first data other than the last N sample point data and the N sample point smoothed data.
  • FIG. 5b is an exemplary schematic diagram of the data smoothing processing of the present application.
  • the sample point data between the two dotted lines is the N sample point data that needs to be smoothed
  • the oblique lines in the N sample point data in the first data represent the weight changes of the first data
  • the oblique lines in the N sample point data in the second data represent the weight changes of the second data.
  • the first data processed by high-bit-rate decoding is first, the second data processed by low-bit-rate decoding is behind, and the data of the target audio frame includes the first part of the first data and the smoothed data of N samples after smoothing. For the calculation method of the smoothed data of the N sample points, reference may be made to the above description, which will not be repeated here.
  • the audio receiving device may first determine whether the two decoding processes of the high and low bit rates support the sampling rate as described above, which will not be repeated here.
  • the decoding method is the low bit rate decoding method
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream.
  • the audio receiving device can first determine whether the low-bit-rate decoding process supports the sampling rate corresponding to the low-bit-rate encoding code stream. When the low-bit-rate decoding processing supports the sampling rate, the low-bit-rate decoding process can be performed directly on the low-bit-rate encoding code stream; when the low-bit-rate decoding processing does not support the sampling rate, the low-bit-rate encoding code stream can first be subjected to low-bit-rate decoding processing, and the decoded data can then be upsampled or downsampled to obtain the target audio frame.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the audio receiving device performs low-bit-rate decoding processing on the low-bit-rate encoded code stream to obtain second data. Since the frame length of low-bit-rate decoding processing is smaller than that of high-bit-rate decoding processing, in order to align the audio frames obtained by low-bit-rate decoding processing and high-bit-rate decoding processing, after obtaining the second data, the audio receiving device can overflow M sample data from the head of the second data queue corresponding to the low bit rate decoding process, put the second data into the tail of the second data queue, and then extract M sample data from the head of the second data queue to obtain the target audio frame.
  • the second data queue follows a first-in, first-out principle.
  • FIG. 6 is an exemplary schematic diagram of the data queue of the present application. As shown in FIG. 6, assuming that the length of the second data queue is n+M, according to the first-in, first-out principle, M sample data is overflowed from the head of the queue, so that the position of M data is vacated at the tail of the second data queue; the second data is then placed at the tail of the second data queue, and M sample data is extracted from the head of the second data queue, which is the target audio frame.
  • the number of channels, the length of the second data queue corresponding to the low code rate decoding process B frame length × number of channels + (total delay of high code rate codec - total delay of low code rate codec) × number of channels.
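The queue handling described for FIG. 6 can be sketched with a double-ended queue. The sketch assumes a queue of length n+M that starts out filled with zeros, that each decoded block holds exactly M samples, and that "extracting" M samples reads them from the head without removing them (they are discarded by the overflow on the next call); the class name `AlignQueue` and its method are hypothetical.

```python
from collections import deque


class AlignQueue:
    """FIFO buffer that delays decoded audio by `delay` samples."""

    def __init__(self, delay, frame):
        self.frame = frame  # M: samples overflowed, stored and read per call
        # Queue length is n + M, where n is the delay; zeros assumed at start.
        self.buf = deque([0.0] * (delay + frame))

    def push(self, decoded):
        """Overflow M samples, append the new block, return M head samples."""
        if len(decoded) != self.frame:
            raise ValueError("decoded block must hold exactly M samples")
        for _ in range(self.frame):
            self.buf.popleft()      # overflow M samples from the queue head
        self.buf.extend(decoded)    # place the new data at the queue tail
        # Read (without removing) the M samples now at the queue head.
        return [self.buf[i] for i in range(self.frame)]
```

With `delay=2` and `frame=4`, each output block lags the input by two samples, which is the effect the queue is meant to have when the two codecs have different total delays.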
  • the audio receiving device can first determine whether the low bit rate decoding process supports the sampling rate according to the above description, which will not be repeated here.
  • the decoding method is a high bit rate decoding method
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate encoded code stream to obtain first data. Since the frame length of the low-bit-rate encoding process is larger than the frame length of the high-bit-rate encoding process, in order to align the audio frames obtained by the low-bit-rate decoding processing and the high-bit-rate decoding processing, the audio receiving device may, after obtaining the first data, overflow M sample data from the head of the first data queue corresponding to the high bit rate decoding process, put the first data into the tail of the first data queue, and then extract M sample data from the head of the first data queue to obtain the target audio frame.
  • the first data queue follows a first-in, first-out principle. M is associated with the frame length of the low-bit-rate decoding process.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate according to the above description, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the audio receiving device performs high-bit-rate decoding processing on the high-bit-rate coded stream.
  • the audio receiving device may first determine whether the high-bit-rate decoding process supports the sampling rate corresponding to the high-bit-rate encoding stream. When the high-bit-rate decoding processing supports the sampling rate, the high-bit-rate decoding processing can be performed directly on the high-bit-rate encoding code stream; when the high-bit-rate decoding processing does not support the sampling rate, the high-bit-rate encoding code stream can first be subjected to high-bit-rate decoding processing, and the decoded data can then be upsampled or downsampled to obtain the target audio frame.
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the first data queue corresponding to the high-bit-rate decoding process is set to all 0, and the first data queue follows the first-in, first-out order.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the low-bit-rate decoding process is performed on the low-bit-rate encoding stream to obtain the second data; the second data queue corresponding to the low-bit-rate decoding processing overflows M sample data from the head of the queue, and the second data is put into the tail of the second data queue, where the second data queue follows the first-in, first-out principle and M is related to the frame length of the high-bit-rate decoding process; M sample data is extracted from the head of the second data queue to obtain fourth data; high-bit-rate decoding processing is performed on the high-bit-rate coded stream to obtain the first data; the last N sample point data of the fourth data and the first N sample point data of the first data are weighted and averaged to obtain N sample point smoothed data; the target audio frame is obtained according to the fourth data except the last N sample point data and the N sample point smoothed data.
  • the second data queue is overflowed with M sample data from the queue head; M sample point data is extracted from the queue head of the second data queue to obtain the fourth data; high-bit-rate decoding processing is performed on the high code rate encoded code stream to obtain the first data; the last N sample point data of the fourth data and the first N sample point data of the first data are weighted and averaged to obtain N sample point smoothed data; the target audio frame is obtained according to the fourth data except the last N sample point data and the N sample point smoothed data.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is greater than the frame length processed by high bit rate coding
  • the high-bit-rate decoding process is performed on the high-bit-rate encoding stream to obtain the first data; the first data queue corresponding to the high-bit-rate decoding processing overflows M sample data from the head of the queue, and the first data is put into the tail of the first data queue, where the first data queue follows the first-in, first-out principle and M is related to the frame length of the low-bit-rate decoding process; M sample data is extracted from the head of the first data queue to obtain third data; low-bit-rate decoding processing is performed on the low-rate coded stream to obtain second data; the last N sample point data of the third data and the first N sample point data of the second data are weighted and averaged to obtain N sample point smoothed data; the target audio frame is obtained according to the third data except the last N sample point data and the N sample point smoothed data.
  • the first data queue is overflowed with M sample data from the queue head; M sample point data is extracted from the queue head of the first data queue to obtain third data; low code rate decoding processing is performed on the low code rate encoded code stream to obtain second data; the last N sample point data of the third data and the first N sample point data of the second data are weighted and averaged to obtain N sample point smoothed data; the target audio frame is obtained according to the third data except the last N sample point data and the N sample point smoothed data.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the frame length processed by low bit rate coding is smaller than the frame length processed by high bit rate coding
  • the second data queue corresponding to the low-bit-rate decoding process is set to all 0, and the second data queue follows the first-in, first-out order.
  • the audio sending device can refer to the above description to first determine whether the two encoding processes of high and low bit rates support the sampling rate of the current audio frame, which will not be repeated here.
  • the present application determines the final encoding method of the current audio frame based on the set bit rate and the final encoding method of the previous audio frame, and accordingly sets the bit rates corresponding to the low bit rate encoding process and the high bit rate encoding process on the audio frame where the high and low bit rate switching occurs; the encoding end sends the bit stream information to the decoding end; the decoding end parses the bit stream information to obtain the decoding method, and then decodes the bit stream data. In particular, on the audio frame where the high and low bit rates are switched, the data after high and low bit rate decoding is smoothed to realize the seamless integration of low bit rate encoding and decoding processing and high bit rate encoding and decoding processing. This preserves audio quality as far as possible within the limit on the data transmission size of the Bluetooth channel, improves the anti-interference ability of the Bluetooth channel, and brings users a better audio experience.
  • the low bit rate codec processing supports 64~300kbps, and the high bit rate codec processing supports 300~990kbps.
  • the bit depths supported by the two are 16bit, 24bit, 32bit floating point or 32bit fixed point, and both support mono and dual channel.
  • the supported sample rates for low bit rate codec processing include 44.1kHz and 48kHz, and the unsupported sample rates include 88.2kHz and 96kHz.
  • the frame length of the low-bit rate encoding and decoding processing and the high-bit-rate encoding and decoding processing are the same, the total delay of encoding and decoding is the same, and there is a partial overlap between adjacent audio frames.
  • when the low bit rate encoding process and the high bit rate encoding process are switched, the two encoding processes run at the same time to ensure the continuity of the audio stream.
  • the encoding and decoding processing of the low bit rate is set as A
  • the encoding and decoding processing of the high bit rate is set as B.
  • the final encoding mode of the current audio frame may be determined with reference to Table 1, which will not be repeated here.
  • the sampling rates of A and B can be determined by using the above method, which will not be repeated here.
  • the encoding end can use QMF to perform downsampling processing: divide the frequency band (0~44.1kHz) corresponding to 88.2kHz into two subbands, 0~22.05kHz and 22.05~44.1kHz, and select the low subband 0~22.05kHz for low bit rate coding processing; divide the frequency band (0~48kHz) corresponding to 96kHz into two subbands, 0~24kHz and 24~48kHz, and select the low subband 0~24kHz for low bit rate coding processing.
  • the decoding end can use QMF to fill the high subband of the decoded data (the high subband in 88.2kHz is 22.05~44.1kHz, and the high subband in 96kHz is 24~48kHz) to realize upsampling processing, so that the resulting data conforms to the sample rate of the original audio.
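The band arithmetic in this embodiment can be checked with a few lines of code. This is not a QMF filter bank; it only computes which band the low bit rate codec would code and which band the decoder would fill, assuming a two-way split at half the Nyquist frequency. The function name `low_rate_plan` and the dictionary keys are hypothetical.

```python
# Sample rates the low bit rate codec supports directly in this embodiment.
LOW_RATE_SUPPORTED_HZ = {44100, 48000}


def low_rate_plan(sample_rate_hz):
    """Describe how a frame at `sample_rate_hz` reaches the low-rate codec."""
    nyquist = sample_rate_hz / 2
    if sample_rate_hz in LOW_RATE_SUPPORTED_HZ:
        # Supported rate: code the full band, no resampling needed.
        return {
            "codec_rate_hz": sample_rate_hz,
            "resample": False,
            "coded_band_hz": (0.0, nyquist),
        }
    # Unsupported rate (88.2kHz or 96kHz in the text): split the band in
    # two, code only the low subband, and leave the high subband to be
    # filled at the decoder when it upsamples.
    split = nyquist / 2
    return {
        "codec_rate_hz": sample_rate_hz // 2,
        "resample": True,
        "coded_band_hz": (0.0, split),
        "filled_band_hz": (split, nyquist),
    }
```

For 88.2kHz input this gives a coded band of 0~22.05kHz and a filled band of 22.05~44.1kHz, matching the figures quoted in the text.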
  • the low bit rate codec processing supports 64~300kbps, and the high bit rate codec processing supports 300~990kbps.
  • the bit depths supported by the two are 16bit, 24bit, 32bit floating point or 32bit fixed point, and both support mono and dual-channel audio signals. Both the low-bit-rate codec processing and the high-bit-rate codec processing support sampling rates including 44.1kHz, 48kHz, 88.2kHz and 96kHz.
  • the encoding and decoding processing of the low bit rate is set as A
  • the encoding and decoding processing of the high bit rate is set as B.
  • the frame lengths of the low code rate codec processing and the high code rate codec processing are inconsistent.
  • the frame length of A is 1024 sample data
  • the frame length of B is 256 sample data.
  • the total delay of codec of A and B is inconsistent.
  • B has a delay of 11 samples of data when the sampling rate is greater than or equal to 88.2kHz, and a delay of 5 samples of data when the sampling rate is lower than 88.2kHz. Therefore, a parallel fusion strategy is adopted, that is, when switching between the low bit rate encoding process and the high bit rate encoding process, the two encoding processes run simultaneously to ensure the continuity of the audio stream.
  • the final encoding mode of the current audio frame may be determined with reference to Table 2, which will not be repeated here.
  • the sampling rates of A and B can be determined by using the above method, which will not be repeated here.
  • FIG. 7 is an exemplary schematic diagram of encoding and decoding processing of audio frames of the present application. As shown in FIG. 7 , the number of switching frames is set to 3, the audio frames are monophonic, and each audio frame in the frame sequence has been determined The order of the final encoding is
  • A processes one audio frame (with a length of 1024 samples of data), and B needs to process 4 audio frames (with a length of 256 samples of data) into 1024 samples of data.
  • A consecutively encodes three audio frames according to the above sequence.
  • the next three switch frames (A→B): in the first switch frame, A and B run simultaneously, and B only encodes the second half of the audio frame; in the second switch frame, A and B still run at the same time, the data encoded by A is all set to 0, and B encodes the entire frame of data; in the third switch frame, A stops running, and B encodes the entire frame of data.
  • Next B encodes three audio frames consecutively.
  • A encodes an audio frame.
  • A continuously decodes three audio frames according to the above sequence.
  • the next three switching frames (A→B): for the first switching frame, the first data queue corresponding to B is first set to all 0s, A and B run at the same time on this switching frame, and B only decodes the latter half of the audio frame data.
  • the first data queue is overflowed with 1024 sample point data from the head of the queue, and the first data obtained by decoding B is put into the tail of the first data queue. Extract 1024 sample data from the head of the first data queue and smooth the second data decoded by A to obtain an audio frame; the second switching frame and the third switching frame, A and B are still running at the same time, both are Decode an entire frame of data.
  • the first data queue is overflowed by 1024 sample data from the queue head, and the first data obtained by decoding B is put into the queue tail of the first data queue.
  • Next B successively decodes three audio frames.
  • the first data queue corresponding to B is overflowed by 1024 sample point data from the head of the queue, and the first data obtained by decoding B is put into the tail of the first data queue.
  • Extract 1024 sample data from the head of the first data queue and perform smooth processing with the second data decoded by A to obtain an audio frame; for the second switching frame and the third switching frame, A decodes the whole frame data, B Stop running.
  • the first data queue is overflowed from the queue head by 1024 samples of data on both the second switching frame and the third switching frame. Extract 1024 sample data from the head of the first data queue and perform smoothing processing on the second data obtained by A decoding to obtain an audio frame. Next A decodes an audio frame.
  • FIG. 8 is a schematic structural diagram of an embodiment of an audio encoding apparatus of the present application. As shown in FIG. 8 , the apparatus may be applied to the audio sending device in the above-mentioned embodiment.
  • the encoding apparatus in this embodiment may include: an obtaining module 801, a determining module 802 and an encoding module 803, wherein:
  • the obtaining module 801 is used to obtain the set bit rate of the current audio frame to be encoded and the final encoding mode of the previous audio frame, where the final encoding mode includes a first bit rate encoding mode, a second bit rate encoding mode, a mode of switching from the first bit rate encoding to the second bit rate encoding, or a mode of switching from the second bit rate encoding to the first bit rate encoding, and the first bit rate is lower than the second bit rate; the determining module 802 is used to determine the final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame; the encoding module 803 is used to encode the current audio frame according to the final encoding mode of the current audio frame.
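The decision made by the determining module 802 can be pictured as a small transition function over the four final encoding modes. The sketch below covers the variant without counters. Transitions that this embodiment states in full are marked "stated"; the remaining ones are filled in by symmetry and are assumptions, as are the handling of a set bit rate exactly equal to the threshold and the names `Mode` and `next_mode`.

```python
from enum import Enum


class Mode(Enum):
    LOW = "first bit rate encoding"
    HIGH = "second bit rate encoding"
    LOW_TO_HIGH = "switch from first to second bit rate encoding"
    HIGH_TO_LOW = "switch from second to first bit rate encoding"


def next_mode(set_rate_kbps, threshold_kbps, prev):
    """Return the final encoding mode of the current audio frame."""
    if set_rate_kbps < threshold_kbps:
        if prev is Mode.LOW:
            return Mode.LOW            # stated
        if prev is Mode.HIGH:
            return Mode.HIGH_TO_LOW    # stated
        if prev is Mode.LOW_TO_HIGH:
            return Mode.HIGH_TO_LOW    # stated
        return Mode.LOW                # prev is HIGH_TO_LOW: inferred
    # Treating a rate equal to the threshold as "high" is an assumption.
    if prev is Mode.LOW:
        return Mode.LOW_TO_HIGH        # stated
    if prev is Mode.HIGH:
        return Mode.HIGH               # stated
    if prev is Mode.LOW_TO_HIGH:
        return Mode.HIGH               # assumption, by symmetry
    return Mode.LOW_TO_HIGH            # prev is HIGH_TO_LOW: assumption
```

Keeping the previous frame's mode as the only state means the encoder needs no history beyond one frame in this variant; the counter-based variant adds a counter so that a switch can span several frames.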
  • the determining module 802 is specifically configured to: when the set bit rate is less than the set threshold, and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is less than the set threshold, and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the mode of switching from the second bit rate encoding to the first bit rate encoding; or, when the set bit rate is less than the set threshold, and the final encoding mode of the previous audio frame is the mode of switching from the first bit rate encoding to the second bit rate encoding, determine that the final encoding mode of the current audio frame is the mode of switching from the second bit rate encoding to the first bit rate encoding.
  • the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is greater than the set threshold, and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the mode of switching from the first bit rate encoding to the second bit rate encoding; or, when the set bit rate is greater than the set threshold, and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the second bit rate encoding mode; or, when the set bit rate is greater than the set threshold, and the final encoding mode of the previous audio frame is the encoding mode of the first bit rate en
  • the determining module 802 is specifically configured to: when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the switching encoding mode from second bit rate encoding to first bit rate encoding, and start a first counter, where the initial value of the first counter is a first set value and the first counter terminates when its value is 0; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the second bit rate encoding mode.
  • the encoding module 803 is specifically configured to perform first bit rate encoding processing on the current audio frame; and perform second bit rate encoding processing on the current audio frame.
  • the encoding module 803 is specifically configured to: when the value of the first counter is equal to the first set value, perform first bit rate encoding processing and second bit rate encoding processing on the current audio frame; or, when the value of the first counter is less than the first set value, perform first bit rate encoding processing on the current audio frame.
  • the encoding module 803 is specifically configured to: when the value of the second counter is greater than 1, perform first bit rate encoding processing and second bit rate encoding processing on the current audio frame; or, when the value of the second counter is equal to 1, perform second bit rate encoding processing on the current audio frame.
  • the encoding module 803 is specifically configured to: when the value of the first counter is greater than 1, perform first bit rate encoding processing and second bit rate encoding processing on the current audio frame; or, when the value of the first counter is equal to 1, perform first bit rate encoding processing on the current audio frame.
  • the encoding module 803 is specifically configured to: when the value of the second counter is equal to the first set value, perform first bit rate encoding processing and second bit rate encoding processing on the current audio frame; or, when the value of the second counter is less than the first set value, perform second bit rate encoding processing on the current audio frame.
  • the encoding module 803 is specifically configured to: when the first bit rate encoding processing supports the sampling rate of the current audio frame, perform first bit rate encoding processing on the current audio frame; or, when the first bit rate encoding processing does not support the sampling rate of the current audio frame, down-sample or up-sample the current audio frame to obtain a down-sampled or up-sampled current audio frame, and perform first bit rate encoding processing on it, where the first bit rate encoding processing supports the sampling rate of the down-sampled or up-sampled current audio frame.
  • the encoding module 803 is specifically configured to: when the second bit rate encoding processing supports the sampling rate of the current audio frame, perform second bit rate encoding processing on the current audio frame; or, when the second bit rate encoding processing does not support the sampling rate of the current audio frame, down-sample or up-sample the current audio frame to obtain a down-sampled or up-sampled current audio frame, and perform second bit rate encoding processing on it, where the second bit rate encoding processing supports the sampling rate of the down-sampled or up-sampled current audio frame.
  • the determining module 802 is further configured to determine, according to the set bit rate of the previous audio frame and the set bit rate of the current audio frame, the first bit rate corresponding to the first bit rate encoding processing and the second bit rate corresponding to the second bit rate encoding processing, where the sum of the first bit rate and the second bit rate is the set bit rate of the current audio frame; the encoding module 803 is specifically configured to perform first bit rate encoding processing on the current audio frame at the first bit rate, and perform second bit rate encoding processing on the current audio frame at the second bit rate.
  • the code stream information corresponding to the encoded current audio frame includes packet header information and the first bit rate encoded code stream and/or the second bit rate encoded code stream, where the packet header information includes the final encoding mode, sampling rate, number of channels, and frame length of the current audio frame, and the length of the first bit rate encoded code stream.
  • the apparatus of this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 3 , and its implementation principle and technical effect are similar, and details are not repeated here.
  • FIG. 9 is a schematic structural diagram of an embodiment of an audio decoding apparatus of the present application. As shown in FIG. 9 , the apparatus may be applied to the audio receiving device in the above-mentioned embodiment.
  • the decoding apparatus in this embodiment may include an obtaining module 901, a parsing module 902, and a decoding module 903, where:
  • the obtaining module 901 is configured to obtain code stream information; the parsing module 902 is configured to parse the code stream information to obtain a decoding mode and an encoded code stream, where the encoded code stream includes a first bit rate encoded code stream and/or a second bit rate encoded code stream, and the decoding mode is one of: a first bit rate decoding mode, a second bit rate decoding mode, a switching decoding mode from first bit rate decoding to second bit rate decoding, or a switching decoding mode from second bit rate decoding to first bit rate decoding; when the decoding mode is the first bit rate decoding mode, the encoded code stream includes the first bit rate encoded code stream; when the decoding mode is the second bit rate decoding mode, the encoded code stream includes the second bit rate encoded code stream; when the decoding mode is either switching decoding mode, the encoded code stream includes both the first bit rate encoded code stream and the second bit rate encoded code stream; the decoding module 903 is configured to decode the encoded code stream according to the decoding mode to obtain a target audio frame.
  • the decoding module 903 is specifically configured to decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain the target audio frame.
  • the decoding module 903 is specifically configured to: decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; shift M samples out of the head of the second data queue corresponding to the first bit rate decoding processing, and put the second data into the second data queue in first-in-first-out (FIFO) manner, where M is associated with the frame length of the second bit rate decoding processing; and extract M samples from the head of the second data queue to obtain the target audio frame.
  • the decoding module 903 is specifically configured to decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain the target audio frame.
  • when the decoding mode is the second bit rate decoding mode and the frame length of the first bit rate decoding processing is greater than the frame length of the second bit rate decoding processing, the decoding module 903 is specifically configured to: decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; shift M samples out of the head of the first data queue corresponding to the second bit rate decoding processing, and put the first data into the first data queue in first-in-first-out (FIFO) manner, where M is associated with the frame length of the first bit rate decoding processing; and extract M samples from the head of the first data queue to obtain the target audio frame.
  • the decoding module 903 is specifically configured to: decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; take a weighted average of the last N samples of the second data and the first N samples of the first data to obtain N smoothed samples, where N is a second set value; and obtain the target audio frame from the second data other than its last N samples and the N smoothed samples.
  • when the decoding mode is the switching decoding mode from first bit rate decoding to second bit rate decoding, and the frame length of the first bit rate decoding processing is greater than the frame length of the second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the switching decoding mode from first bit rate decoding to second bit rate decoding, set the first data queue corresponding to the second bit rate decoding processing to all 0s, where the first data queue follows the first-in-first-out principle; decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; shift M samples out of the head of the first data queue and put the first data at the tail of the first data queue, where M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; take a weighted average of the last N samples of the second data and the first N samples of the third data to obtain N smoothed samples; and obtain the target audio frame from the second data other than its last N samples and the N smoothed samples.
  • when the decoding mode is the switching decoding mode from first bit rate decoding to second bit rate decoding, and the frame length of the first bit rate decoding processing is less than the frame length of the second bit rate decoding processing, the decoding module 903 is specifically configured to, when the decoding mode of the previous audio frame is not the switching decoding mode from first bit rate decoding to second bit rate decoding, decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data.
  • the decoding module 903 is specifically configured to: decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; take a weighted average of the last N samples of the first data and the first N samples of the second data to obtain N smoothed samples, where N is a second set value; and obtain the target audio frame from the first data other than its last N samples and the N smoothed samples.
  • the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the switching decoding mode from second bit rate decoding to first bit rate decoding, decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; shift M samples out of the head of the first data queue corresponding to the second bit rate decoding processing, and put the first data into the first data queue in first-in-first-out (FIFO) manner, where M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; take a weighted average of the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the third data other than its last N samples and the N smoothed samples.
  • when the decoding mode is the switching decoding mode from second bit rate decoding to first bit rate decoding, and the frame length of the first bit rate decoding processing is less than the frame length of the second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the switching decoding mode from second bit rate decoding to first bit rate decoding, set the second data queue corresponding to the first bit rate decoding processing to all 0s, where the second data queue follows the first-in-first-out principle; decode the second bit rate encoded code stream according to the second bit rate decoding mode to obtain first data; decode the first bit rate encoded code stream according to the first bit rate decoding mode to obtain second data; shift M samples out of the head of the second data queue and put the second data at the tail of the second data queue, where M is associated with the frame length of the second bit rate decoding processing; extract M samples from the head of the second data queue to obtain fourth data; take a weighted average of the last N samples of the first data and the first N samples of the fourth data to obtain N smoothed samples; and obtain the target audio frame from the first data other than its last N samples and the N smoothed samples.
  • the decoding module 903 is specifically configured to: determine whether the first bit rate decoding processing supports the sampling rate corresponding to the first bit rate encoded code stream; if it supports the sampling rate, perform first bit rate decoding processing on the first bit rate encoded code stream; or, if it does not support the sampling rate, perform first bit rate decoding processing on the first bit rate encoded code stream to obtain fifth data, and up-sample or down-sample the fifth data.
  • the decoding module 903 is specifically configured to: determine whether the second bit rate decoding processing supports the sampling rate corresponding to the second bit rate encoded code stream; if it supports the sampling rate, perform second bit rate decoding processing on the second bit rate encoded code stream; or, if it does not support the sampling rate, perform second bit rate decoding processing on the second bit rate encoded code stream to obtain sixth data, and up-sample or down-sample the sixth data.
  • the apparatus of this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 3 , and its implementation principle and technical effect are similar, and details are not repeated here.
  • each step of the above method embodiments may be completed by a hardware integrated logic circuit in a processor or an instruction in the form of software.
  • the processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the methods disclosed in the embodiments of the present application may be directly embodied as executed by a hardware coding processor, or executed by a combination of hardware and software modules in the coding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • many forms of RAM may be used, for example dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus RAM.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

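An audio encoding and decoding method and apparatus. The method includes: obtaining a set bit rate of a current audio frame to be encoded and a final encoding mode of a previous audio frame, where the final encoding mode is a first bit rate encoding mode, a second bit rate encoding mode, a switching encoding mode from first bit rate encoding to second bit rate encoding, or a switching encoding mode from second bit rate encoding to first bit rate encoding; determining a final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame; and encoding the current audio frame according to the final encoding mode of the current audio frame. An audio sending device sends code stream information to an audio receiving device. The audio receiving device parses the code stream information to obtain a decoding mode and an encoded code stream, and decodes the encoded code stream according to the decoding mode to obtain a target audio frame. Audio quality is ensured while the limit that a Bluetooth channel imposes on the size of transmitted data is met.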

Description

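Audio encoding and decoding method and apparatus
This application claims priority to Chinese Patent Application No. 202011258196.6, filed with the Chinese Patent Office on November 11, 2020 and entitled "Audio encoding and decoding method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to audio encoding and decoding technologies, and in particular to an audio encoding and decoding method and apparatus.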
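Background
With the wide adoption of wireless Bluetooth devices such as true wireless stereo (TWS) earphones, smart speakers, and smart watches in daily life, the demand for a high-quality music playback experience in all kinds of scenarios is becoming increasingly urgent, especially in environments where Bluetooth signals are prone to interference, such as subways, airports, and railway stations. Because a Bluetooth channel limits the size of transmitted data, a music data stream must be compressed by the audio encoder at the sending end of a Bluetooth device before it can be transmitted to the receiving end for decoding, which has also driven the rapid development of various Bluetooth audio codecs.
Bluetooth audio codecs are mainly used between Bluetooth-connected devices (earphones, speakers, smart wearables, and the like) to provide high-quality music transmission and playback under different scenario requirements. Current audio coding technologies fall into two types. One is high bit rate coding, which is applicable to scenarios with high requirements on Bluetooth channel transmission quality; the other is low bit rate coding, which is applicable to scenarios with high requirements on sound quality.
Therefore, achieving a smooth transition between high bit rate coding and low bit rate coding is the key to meeting users' demand for high-quality music in any scenario.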
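Summary
This application provides an audio encoding and decoding method and apparatus that seamlessly combine low bit rate codec processing and high bit rate codec processing, so that, while the limit that a Bluetooth channel imposes on the size of transmitted data is met, audio quality is preserved as far as possible, the anti-interference capability of the Bluetooth channel is improved, and users get a better audio experience.
According to a first aspect, this application provides an audio encoding and decoding method, including: an audio sending device obtains a set bit rate of a current audio frame to be encoded and a final encoding mode of a previous audio frame, where the final encoding mode is a first bit rate encoding mode, a second bit rate encoding mode, a switching encoding mode from first bit rate encoding to second bit rate encoding, or a switching encoding mode from second bit rate encoding to first bit rate encoding, and the first bit rate is lower than the second bit rate; determines a final encoding mode of the current audio frame according to the set bit rate and the final encoding mode of the previous audio frame; and encodes the current audio frame according to the final encoding mode of the current audio frame. The audio sending device sends code stream information to an audio receiving device. The audio receiving device obtains the code stream information and parses it to obtain a decoding mode and an encoded code stream, where the encoded code stream includes a first bit rate encoded code stream and/or a second bit rate encoded code stream, and the decoding mode is a first bit rate decoding mode, a second bit rate decoding mode, a switching decoding mode from first bit rate decoding to second bit rate decoding, or a switching decoding mode from second bit rate decoding to first bit rate decoding. When the decoding mode is the first bit rate decoding mode, the encoded code stream includes the first bit rate encoded code stream; when the decoding mode is the second bit rate decoding mode, the encoded code stream includes the second bit rate encoded code stream; when the decoding mode is either switching decoding mode, the encoded code stream includes both the first bit rate encoded code stream and the second bit rate encoded code stream. The audio receiving device decodes the encoded code stream according to the decoding mode to obtain a target audio frame.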
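An audio frame may be any frame in the audio that the audio sending device sends to the audio receiving device. In this application, the object of each encoding operation may be one audio frame of the audio; that is, the audio encoding method provided in this application operates on a single audio frame, and the method for determining the encoding mode described below applies to every audio frame in the audio. For distinction, the audio frame that the audio sending device is currently encoding is called the audio frame or the current audio frame, and the audio frame encoded immediately before it is called the previous audio frame. Optionally, an audio frame may be represented as pulse code modulation (PCM) data.
The set bit rate may be a target encoding bit rate preset by the user according to the current channel state. The set bit rate may be, for example, 192 kbps, 256 kbps, 400 kbps, or 600 kbps.
The final encoding mode is the encoding mode that the audio sending device actually uses when encoding an audio frame. It may be the first bit rate encoding mode, the second bit rate encoding mode, the switching encoding mode from first bit rate encoding to second bit rate encoding, or the switching encoding mode from second bit rate encoding to first bit rate encoding, where the first bit rate is lower than the second bit rate. For example, the first bit rate may be 64 kbps, 128 kbps, 192 kbps, 256 kbps, 400 kbps, or 600 kbps, and the second bit rate may be 128 kbps, 192 kbps, 256 kbps, 400 kbps, or 600 kbps. It should be noted that the embodiments of the present invention do not limit the specific values of the first bit rate and the second bit rate, as long as the first bit rate is lower than the second bit rate. In the following, the first bit rate is also called the low bit rate and the second bit rate is also called the high bit rate.
This application involves two kinds of encoding processing: first bit rate encoding processing and second bit rate encoding processing, that is, low bit rate encoding processing and high bit rate encoding processing. Low bit rate encoding processing may include, for example, advanced audio coding (AAC) or the low complexity communication codec (LC3), the default low-power, low-latency codec of next-generation Bluetooth. High bit rate encoding processing may include, for example, the low-latency hi-definition audio codec (LHDC) or low complexity communication codec plus (LC3plus), the high bit rate version of the low complexity communication codec.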
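In a possible implementation, when the frame length of low bit rate encoding processing is the same as the frame length of high bit rate encoding processing, and the total codec delay of low bit rate encoding processing is the same as the total codec delay of high bit rate encoding processing: when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the low bit rate encoding mode; or,
when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the high-to-low bit rate switching encoding mode; or,
when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the low-to-high bit rate switching encoding mode, the final encoding mode of the current audio frame is determined to be the high-to-low bit rate switching encoding mode; or,
when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the high-to-low bit rate switching encoding mode, the final encoding mode of the current audio frame is determined to be the low bit rate encoding mode; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the low-to-high bit rate switching encoding mode; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the high bit rate encoding mode; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low-to-high bit rate switching encoding mode, the final encoding mode of the current audio frame is determined to be the high bit rate encoding mode; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the high-to-low bit rate switching encoding mode, the final encoding mode of the current audio frame is determined to be the low-to-high bit rate switching encoding mode.
The value of the set threshold is associated with the number of channels of the audio frame. For example, when the audio frame is mono, the set threshold may be 150 kbps; when the audio frame is stereo, the set threshold may be 300 kbps.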
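The eight cases above form a small state machine, which can be sketched as follows. This is a minimal illustration only: the `Mode` enum and the function name `next_mode` are hypothetical, and treating a set bit rate exactly equal to the threshold as "high" is an assumption, since the text only covers "less than" and "greater than".

```python
from enum import Enum


class Mode(Enum):
    LOW = "A"              # low (first) bit rate encoding mode
    HIGH = "B"             # high (second) bit rate encoding mode
    LOW_TO_HIGH = "A->B"   # low-to-high switching encoding mode
    HIGH_TO_LOW = "B->A"   # high-to-low switching encoding mode


def next_mode(set_rate_kbps, prev_mode, threshold_kbps):
    """Final encoding mode of the current frame (equal frame lengths and delays)."""
    if set_rate_kbps < threshold_kbps:
        # Target is low: stay low if the previous frame was low or just
        # finished switching down; otherwise emit a high-to-low switching frame.
        if prev_mode in (Mode.LOW, Mode.HIGH_TO_LOW):
            return Mode.LOW
        return Mode.HIGH_TO_LOW
    # Target is high (assumption: rate == threshold is treated as high).
    if prev_mode in (Mode.HIGH, Mode.LOW_TO_HIGH):
        return Mode.HIGH
    return Mode.LOW_TO_HIGH
```

With a stereo threshold of 300 kbps, a drop from 400 kbps to 256 kbps therefore produces exactly one switching frame before the encoder settles in the low bit rate mode.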
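In a possible implementation, when the frame length of low bit rate encoding processing differs from the frame length of high bit rate encoding processing: when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the low bit rate encoding mode; or,
when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the high-to-low bit rate switching encoding mode, and a first counter is started, where the initial value of the first counter is a first set value and the first counter terminates when its value is 0. The first counter tracks the processing of switching frames: it is started when the first switching frame is processed, and its initial value is set to the computed number of switching frames (the first set value). Each time a switching frame has been processed, the first counter is decremented by 1; when its value reaches 0, all switching frames have been encoded and the counter terminates. A counter value equal to the first set value means that the first switching frame is being processed, a value of 1 means that the last switching frame is being processed, and a value less than the first set value and greater than 1 means that an intermediate switching frame is being processed; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the high bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the high bit rate encoding mode; or,
when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low bit rate encoding mode, the final encoding mode of the current audio frame is determined to be the low-to-high bit rate switching encoding mode, and a second counter is started, where the initial value of the second counter is the first set value and the second counter terminates when its value is 0. Likewise, the second counter tracks the processing of switching frames: it is started when the first switching frame is processed, and its initial value is set to the computed number of switching frames (the first set value). Each time a switching frame has been processed, the second counter is decremented by 1; when its value reaches 0, all switching frames have been encoded and the counter terminates. A counter value equal to the first set value means that the first switching frame is being processed, a value of 1 means that the last switching frame is being processed, and a value less than the first set value and greater than 1 means that an intermediate switching frame is being processed; or,
when the final encoding mode of the previous audio frame is the high-to-low bit rate switching encoding mode and the first counter is running (that is, its value is greater than 0), the first counter is decremented by 1; if the first counter is still running, the final encoding mode of the current audio frame is determined to be the high-to-low bit rate switching encoding mode; or, if the first counter has terminated (that is, its value is 0), the final encoding mode of the current audio frame is determined to be the low bit rate encoding mode; or,
when the final encoding mode of the previous audio frame is the low-to-high bit rate switching encoding mode and the second counter is running (that is, its value is greater than 0), the second counter is decremented by 1; if the second counter is still running, the final encoding mode of the current audio frame is determined to be the low-to-high bit rate switching encoding mode; or, if the second counter has terminated (that is, its value is 0), the final encoding mode of the current audio frame is determined to be the high bit rate encoding mode.
Likewise, the value of the set threshold is associated with the number of channels of the audio frame. For example, when the audio frame is mono, the set threshold may be 165 kbps; when the audio frame is stereo, the set threshold may be 330 kbps.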
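The counter-driven variant for unequal frame lengths can be sketched as a small stateful selector. The class name `ModeSelector`, the string labels, and the choice of starting state are illustrative assumptions; as before, a set bit rate equal to the threshold is treated as "high" because the text does not cover that case.

```python
class ModeSelector:
    """Tracks the final encoding mode across frames using two countdown counters."""

    def __init__(self, threshold_kbps, num_switch_frames, initial_mode="A"):
        self.threshold = threshold_kbps
        self.first_set_value = num_switch_frames  # number of switching frames D
        self.counter1 = 0   # counts high-to-low ("B->A") switching frames
        self.counter2 = 0   # counts low-to-high ("A->B") switching frames
        self.prev = initial_mode  # assumption: encoder starts in low mode "A"

    def step(self, set_rate_kbps):
        prev = self.prev
        if prev == "B->A" and self.counter1 > 0:
            # A switching run is in progress: the set bit rate is not consulted.
            self.counter1 -= 1
            mode = "B->A" if self.counter1 > 0 else "A"
        elif prev == "A->B" and self.counter2 > 0:
            self.counter2 -= 1
            mode = "A->B" if self.counter2 > 0 else "B"
        elif set_rate_kbps < self.threshold:
            if prev == "A":
                mode = "A"
            else:
                mode = "B->A"
                self.counter1 = self.first_set_value  # start first counter
        else:
            if prev == "B":
                mode = "B"
            else:
                mode = "A->B"
                self.counter2 = self.first_set_value  # start second counter
        self.prev = mode
        return mode
```

For example, with a threshold of 330 kbps and three switching frames, the set bit rates 200, 400, 400, 400, 400, 200 yield the modes A, A->B, A->B, A->B, B, B->A.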
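The number of switching frames D can be obtained as follows, where A denotes the low bit rate encoding mode, B denotes the high bit rate encoding mode, A→B denotes the low-to-high bit rate switching encoding mode, and B→A denotes the high-to-low bit rate switching encoding mode.
(1) Number of switching frames for A→B
D = int((max(total codec delay at the low bit rate, total codec delay at the high bit rate) + low-to-high overlap length + processing frame length - 1) / processing frame length)
(2) Number of switching frames for B→A
D = int((max(total codec delay at the low bit rate, total codec delay at the high bit rate) + high-to-low overlap length + processing frame length - 1) / processing frame length)
Here, int denotes taking the integer part; processing frame length = max(frame length of low bit rate encoding processing, frame length of high bit rate encoding processing); low-to-high overlap length = processing frame length - (total codec delay at the low bit rate % processing frame length); high-to-low overlap length = processing frame length - (total codec delay at the high bit rate % processing frame length); and % denotes the remainder operation.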
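The two formulas above differ only in which delay enters the overlap length, so they can be written as a single function. The sketch below assumes that all quantities are measured in samples, that "taking the integer part" means integer (floor) division, and that the function and parameter names are free to choose; the numbers in the usage note are made up for illustration.

```python
def switch_frame_count(delay_low, delay_high, frame_low, frame_high, direction):
    """Number of switching frames D for direction "A->B" or "B->A"."""
    if direction not in ("A->B", "B->A"):
        raise ValueError("direction must be 'A->B' or 'B->A'")
    frame = max(frame_low, frame_high)          # processing frame length
    # Overlap uses the low-rate delay for A->B and the high-rate delay for B->A.
    delay_from = delay_low if direction == "A->B" else delay_high
    overlap = frame - delay_from % frame
    return (max(delay_low, delay_high) + overlap + frame - 1) // frame
```

With hypothetical delays of 1200 and 700 samples and frame lengths of 480 and 240 samples, this gives D = 3 for A→B and D = 4 for B→A.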
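The audio sending device encodes the current audio frame, and the following cases are possible:
1. The frame length of low bit rate encoding processing is the same as the frame length of high bit rate encoding processing, and the total codec delay of low bit rate encoding processing is the same as that of high bit rate encoding processing
(1) The final encoding mode of the current audio frame is the low bit rate encoding mode
The audio sending device performs low bit rate encoding processing on the current audio frame.
In a possible implementation, the audio sending device may first determine whether low bit rate encoding processing supports the sampling rate of the current audio frame. When it does, the device may perform low bit rate encoding processing on the current audio frame directly; when it does not, the device may first down-sample or up-sample the current audio frame and then perform low bit rate encoding processing on the resampled frame, whose sampling rate the low bit rate encoding processing supports. For example, if low bit rate encoding processing does not support the 88.2 kHz and 96 kHz sampling rates, the audio sending device may down-sample using a quadrature mirror filter (QMF): the band corresponding to 88.2 kHz (0 to 44.1 kHz) is split into two subbands, 0 to 22.05 kHz and 22.05 to 44.1 kHz, and the lower subband, 0 to 22.05 kHz, is selected for low bit rate encoding processing; the band corresponding to 96 kHz (0 to 48 kHz) is split into two subbands, 0 to 24 kHz and 24 to 48 kHz, and the lower subband, 0 to 24 kHz, is selected for low bit rate encoding processing.
(2) The final encoding mode of the current audio frame is the high bit rate encoding mode
The audio sending device performs high bit rate encoding processing on the current audio frame.
In a possible implementation, the audio sending device may first determine whether high bit rate encoding processing supports the sampling rate of the current audio frame. When it does, the device may perform high bit rate encoding processing on the current audio frame directly; when it does not, the device may first down-sample or up-sample the current audio frame and then perform high bit rate encoding processing on the resampled frame, whose sampling rate the high bit rate encoding processing supports.
(3) The final encoding mode of the current audio frame is the low-to-high bit rate switching encoding mode or the high-to-low bit rate switching encoding mode
The audio sending device may perform both low bit rate encoding processing and high bit rate encoding processing on the current audio frame. Likewise, the audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.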
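2. The frame lengths of low bit rate encoding processing and high bit rate encoding processing differ, or the frame lengths are the same but the total codec delays differ
(1) The final encoding mode of the current audio frame is the low bit rate encoding mode
The audio sending device may first determine whether low bit rate encoding processing supports the sampling rate of the current audio frame. When it does, the device may perform low bit rate encoding processing on the current audio frame directly; when it does not, the device may first down-sample or up-sample the current audio frame and then perform low bit rate encoding processing on the resampled frame, whose sampling rate the low bit rate encoding processing supports.
(2) The final encoding mode of the current audio frame is the high bit rate encoding mode
The audio sending device may first determine whether high bit rate encoding processing supports the sampling rate of the current audio frame. When it does, the device may perform high bit rate encoding processing on the current audio frame directly; when it does not, the device may first down-sample or up-sample the current audio frame and then perform high bit rate encoding processing on the resampled frame, whose sampling rate the high bit rate encoding processing supports.
(3) The final encoding mode of the current audio frame is the low-to-high bit rate switching encoding mode
A. The frame length of low bit rate encoding processing is greater than the frame length of high bit rate encoding processing
As described above, the second counter tracks the processing of switching frames, and a running second counter means that the frame being processed is still a switching frame. When the value of the second counter is greater than 1 (the current frame is a switching frame but not the last one), both low bit rate encoding processing and high bit rate encoding processing are performed on the current audio frame. When the value of the second counter is equal to 1 (the current frame is the last switching frame), high bit rate encoding processing is performed on the current audio frame.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.
B. The frame length of low bit rate encoding processing is less than the frame length of high bit rate encoding processing
As described above, the second counter tracks the processing of switching frames, and a running second counter means that the frame being processed is still a switching frame. When the value of the second counter is equal to the first set value (the current frame is the first switching frame), both low bit rate encoding processing and high bit rate encoding processing are performed on the current audio frame. When the value of the second counter is less than the first set value (the current frame is a switching frame but not the first one), high bit rate encoding processing is performed on the current audio frame.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.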
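(4) The final encoding mode of the current audio frame is the high-to-low bit rate switching encoding mode
A. The frame length of low bit rate encoding processing is greater than the frame length of high bit rate encoding processing
As described above, the first counter tracks the processing of switching frames, and a running first counter means that the frame being processed is still a switching frame. When the value of the first counter is equal to the first set value (the current frame is the first switching frame), both low bit rate encoding processing and high bit rate encoding processing are performed on the current audio frame. When the value of the first counter is less than the first set value (the current frame is a switching frame but not the first one), low bit rate encoding processing is performed on the current audio frame.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.
B. The frame length of low bit rate encoding processing is less than the frame length of high bit rate encoding processing
As described above, the first counter tracks the processing of switching frames, and a running first counter means that the frame being processed is still a switching frame. When the value of the first counter is greater than 1 (the current frame is a switching frame but not the last one), both low bit rate encoding processing and high bit rate encoding processing are performed on the current audio frame. When the value of the first counter is equal to 1 (the current frame is the last switching frame), low bit rate encoding processing is performed on the current audio frame.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.
The code stream information corresponding to the encoded current audio frame includes packet header information and a low bit rate encoded code stream and/or a high bit rate encoded code stream, where the packet header information includes the final encoding mode, sampling rate, number of channels, and frame length of the current audio frame, and the length of the low bit rate encoded code stream. If the audio sending device performs only low bit rate encoding processing on the current audio frame, the code stream information contains only the low bit rate encoded code stream; if it performs only high bit rate encoding processing, the code stream information contains only the high bit rate encoded code stream; if it performs both, the code stream information contains both the low bit rate encoded code stream and the high bit rate encoded code stream.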
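The role of the low bit rate stream length in the packet header can be illustrated with a small packing sketch. The text lists the header fields but not their widths, order, or byte order, so the layout below (`"<BIBHH"`, little-endian, low bit rate stream placed first) and the function names are purely assumptions.

```python
import struct

# Hypothetical header layout: mode (1 byte), sampling rate (4 bytes),
# channels (1 byte), frame length (2 bytes), low-rate stream length (2 bytes).
HEADER = struct.Struct("<BIBHH")


def pack_frame(mode, sample_rate, channels, frame_len, low_stream=b"", high_stream=b""):
    """Build code stream information: header, then low and/or high rate streams."""
    header = HEADER.pack(mode, sample_rate, channels, frame_len, len(low_stream))
    return header + low_stream + high_stream


def unpack_frame(packet):
    """Split a packet back into header fields and the two encoded streams."""
    mode, sample_rate, channels, frame_len, low_len = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    # The low-rate length from the header marks where the high-rate stream starts.
    return mode, sample_rate, channels, frame_len, body[:low_len], body[low_len:]
```

Under this layout, carrying the length of the low bit rate stream is enough for the receiver to separate the two streams in a switching frame; a frame with only one stream simply has an empty counterpart.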
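The audio sending device may send the code stream information to the audio receiving device over a communication link such as a Bluetooth connection.
Corresponding to the encoding end, whichever encoding mode the audio sending device uses to encode an audio frame, the decoding end must use the corresponding decoding mode to decode the encoded code stream. The decoding cases are therefore as follows:
1. The frame length of low bit rate encoding processing is the same as the frame length of high bit rate encoding processing, and the total codec delay of low bit rate encoding processing is the same as that of high bit rate encoding processing
(1) The decoding mode is the low bit rate decoding mode
The audio receiving device performs low bit rate decoding processing on the low bit rate encoded code stream.
In a possible implementation, the audio receiving device may first determine whether low bit rate decoding processing supports the sampling rate corresponding to the low bit rate encoded code stream. When it does, the device may perform low bit rate decoding processing on the code stream directly; when it does not, the device may first perform low bit rate decoding processing on the code stream and then up-sample or down-sample the decoded data to obtain the target audio frame. It should be noted that the resampling performed at the encoding end and at the decoding end correspond to each other: if the encoding end down-sampled, the decoding end may up-sample; if the encoding end up-sampled, the decoding end may down-sample. For example, in the description above the audio sending device down-samples the audio frame and encodes only the lower subband; correspondingly, after decoding the low bit rate encoded code stream, the audio receiving device lacks the data of the upper subband and therefore up-samples by zero filling to obtain the target audio frame.
(2) The decoding mode is the high bit rate decoding mode
The audio receiving device performs high bit rate decoding processing on the high bit rate encoded code stream.
In a possible implementation, the audio receiving device may first determine whether high bit rate decoding processing supports the sampling rate corresponding to the high bit rate encoded code stream. When it does, the device may perform high bit rate decoding processing on the code stream directly; when it does not, the device may first perform high bit rate decoding processing on the code stream and then up-sample or down-sample the decoded data to obtain the target audio frame.
(3) The decoding mode is the low-to-high bit rate switching decoding mode
The audio receiving device performs low bit rate decoding processing on the low bit rate encoded code stream to obtain second data, and performs high bit rate decoding processing on the high bit rate encoded code stream to obtain first data. It may then smooth the tail of the second data against the head of the first data to ensure a smooth switch from the low to the high bit rate. The smoothed length is N samples: the last N samples of the second data and the first N samples of the first data are weighted and averaged to obtain N smoothed samples, and the target audio frame is obtained from the second data other than its last N samples together with the N smoothed samples.
Likewise, the audio receiving device may first determine, as described above, whether the two kinds of decoding processing support the sampling rate; details are not repeated here.
(4) The decoding mode is the high-to-low bit rate switching decoding mode
The audio receiving device performs high bit rate decoding processing on the high bit rate encoded code stream to obtain first data, and performs low bit rate decoding processing on the low bit rate encoded code stream to obtain second data. It may then smooth the tail of the first data against the head of the second data to ensure a smooth switch from the high to the low bit rate. The smoothed length is N samples: the last N samples of the first data and the first N samples of the second data are weighted and averaged to obtain N smoothed samples, and the target audio frame is obtained from the first data other than its last N samples together with the N smoothed samples.
Likewise, the audio receiving device may first determine, as described above, whether the two kinds of decoding processing support the sampling rate; details are not repeated here.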
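The smoothing step in cases (3) and (4) is the same operation with the roles of the two decoded signals swapped, so one helper covers both. The text does not specify the weights of the weighted average; the linear ramp below is an assumption, and the name `crossfade` is hypothetical.

```python
def crossfade(outgoing, incoming, n):
    """Return outgoing without its last n samples, followed by n smoothed samples.

    The smoothed samples blend the last n samples of `outgoing` with the
    first n samples of `incoming` using a linear ramp (assumed weighting).
    """
    if n > len(outgoing) or n > len(incoming):
        raise ValueError("n exceeds the available samples")
    split = len(outgoing) - n
    head, tail = outgoing[:split], outgoing[split:]
    smoothed = []
    for i in range(n):
        w = (i + 1) / (n + 1)   # weight of the incoming codec rises across the overlap
        smoothed.append((1 - w) * tail[i] + w * incoming[i])
    return head + smoothed
```

For a low-to-high switch, `outgoing` is the second data (low bit rate output) and `incoming` is the first data (high bit rate output); for a high-to-low switch the arguments are reversed.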
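2. The frame lengths of low bit rate encoding processing and high bit rate encoding processing differ
(1) The decoding mode is the low bit rate decoding mode
A. The frame length of low bit rate encoding processing is greater than the frame length of high bit rate encoding processing
The audio receiving device performs low bit rate decoding processing on the low bit rate encoded code stream.
In a possible implementation, the audio receiving device may first determine whether low bit rate decoding processing supports the sampling rate corresponding to the low bit rate encoded code stream. When it does, the device may perform low bit rate decoding processing on the code stream directly; when it does not, the device may first perform low bit rate decoding processing on the code stream and then up-sample or down-sample the decoded data to obtain the target audio frame.
B. The frame length of low bit rate encoding processing is less than the frame length of high bit rate encoding processing
The audio receiving device performs low bit rate decoding processing on the low bit rate encoded code stream to obtain second data. Because the frame length of low bit rate decoding processing is less than that of high bit rate decoding processing, the audio frames produced by the two kinds of decoding processing must be aligned. To do so, after obtaining the second data, the audio receiving device may shift M samples out of the head of the second data queue corresponding to low bit rate decoding processing, put the second data into the second data queue in first-in-first-out (first input first output, FIFO) manner, and then extract M samples from the head of the second data queue to obtain the target audio frame. The second data queue follows the first-in-first-out principle. M is associated with the frame length of high bit rate decoding processing; for example, M = processing frame length × number of channels, where the processing frame length is the frame length of high bit rate decoding processing.
Likewise, the audio receiving device may first determine, as described above, whether low bit rate decoding processing supports the sampling rate; details are not repeated here.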
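One possible reading of the queue operations is a fixed-length delay line: shift M samples out of the head, append the newly decoded data at the tail, then read the first M samples. The sketch below follows that reading under several assumptions that the text does not state: the final "extract" step reads without removing, each call appends M samples, and the queue is pre-filled with zeros. The class name `AlignQueue` is hypothetical.

```python
from collections import deque
from itertools import islice


class AlignQueue:
    """FIFO used to align the output of one decoder with the other."""

    def __init__(self, m, prefill):
        self.m = m                          # M = processing frame length x channels
        self.queue = deque([0] * prefill)   # assumed all-zero initial content

    def process(self, decoded):
        # 1. Shift M samples (or fewer, if the queue is shorter) out of the head.
        for _ in range(min(self.m, len(self.queue))):
            self.queue.popleft()
        # 2. Put the newly decoded data at the tail (first in, first out).
        self.queue.extend(decoded)
        # 3. Read the first M samples as the target audio frame (not removed).
        return list(islice(self.queue, self.m))
```

With `m=4` and a pre-fill of 6 zeros, the output lags the input by 2 samples: feeding `[1, 2, 3, 4]` and then `[5, 6, 7, 8]` yields `[0, 0, 1, 2]` and `[3, 4, 5, 6]`. The lag equals the pre-fill length minus M, which is how such a queue could absorb a delay difference between the two decoders.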
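(2) The decoding mode is the high bit rate decoding mode
A. The frame length of low bit rate encoding processing is greater than the frame length of high bit rate encoding processing
The audio receiving device performs high bit rate decoding processing on the high bit rate encoded code stream to obtain first data. Because the frame length of low bit rate encoding processing is greater than that of high bit rate encoding processing, the audio frames produced by the two kinds of decoding processing must be aligned. To do so, after obtaining the first data, the audio receiving device may shift M samples out of the head of the first data queue corresponding to high bit rate decoding processing, put the first data at the tail of the first data queue, and then extract M samples from the head of the first data queue to obtain the target audio frame. The first data queue follows the first-in-first-out principle. M is associated with the frame length of low bit rate decoding processing.
Likewise, the audio receiving device may first determine, as described above, whether high bit rate decoding processing supports the sampling rate; details are not repeated here.
B. The frame length of low bit rate encoding processing is less than the frame length of high bit rate encoding processing
The audio receiving device performs high bit rate decoding processing on the high bit rate encoded code stream.
In a possible implementation, the audio receiving device may first determine whether high bit rate decoding processing supports the sampling rate corresponding to the high bit rate encoded code stream. When it does, the device may perform high bit rate decoding processing on the code stream directly; when it does not, the device may first perform high bit rate decoding processing on the code stream and then up-sample or down-sample the decoded data to obtain the target audio frame.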
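(3) The decoding mode is the low-to-high bit rate switching decoding mode
A. The frame length of low bit rate encoding processing is greater than the frame length of high bit rate encoding processing
When the decoding mode of the previous audio frame is not the low-to-high bit rate switching decoding mode: set the first data queue corresponding to high bit rate decoding processing to all 0s, where the first data queue follows the first-in-first-out principle; perform low bit rate decoding processing on the low bit rate encoded code stream to obtain second data; perform high bit rate decoding processing on the high bit rate encoded code stream to obtain first data; shift M samples out of the head of the first data queue and put the first data at the tail of the first data queue, where M is associated with the frame length of low bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; take a weighted average of the last N samples of the second data and the first N samples of the third data to obtain N smoothed samples; and obtain the target audio frame from the second data other than its last N samples and the N smoothed samples.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.
B. The frame length of low bit rate encoding processing is less than the frame length of high bit rate encoding processing
When the decoding mode of the previous audio frame is not the low-to-high bit rate switching decoding mode: perform low bit rate decoding processing on the low bit rate encoded code stream to obtain second data; shift M samples out of the head of the second data queue corresponding to low bit rate decoding processing and put the second data at the tail of the second data queue, where the second data queue follows the first-in-first-out principle and M is associated with the frame length of high bit rate decoding processing; extract M samples from the head of the second data queue to obtain fourth data; perform high bit rate decoding processing on the high bit rate encoded code stream to obtain first data; take a weighted average of the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the fourth data other than its last N samples and the N smoothed samples. Alternatively, when the decoding mode of the previous audio frame is the low-to-high bit rate switching decoding mode: shift M samples out of the head of the second data queue; extract M samples from the head of the second data queue to obtain fourth data; perform high bit rate decoding processing on the high bit rate encoded code stream to obtain first data; take a weighted average of the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the fourth data other than its last N samples and the N smoothed samples.
The audio sending device may first determine, as described above, whether the two kinds of encoding processing support the sampling rate of the current audio frame; details are not repeated here.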
(4)解码方式为高码率解码向低码率解码切换解码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
当上一帧音频帧的解码方式不是高码率解码向低码率解码切换解码方式时,对高码率编码码流进行高码率解码处理以获得第一数据;将高码率解码处理对应的第一数据队列从队头溢出M个样点数据,并将第一数据放入第一数据队列的队尾,第一数据队列遵循先入先出的原则,M与低码率解码处理的帧长相关联;从第一数据队列的队头提取M个样点数据以获得第三数据;对低码率编码码流进行低码率解码处理以获得第二数据;将第三数据的后N个样点数据与第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第三数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。或者,当上一帧音频帧的解码方式是高码率解码向低码率解码切换解码方式时,将第一数据队列从队头溢出M个样点数据;从第一数据队列的队头提取M个样点数据以获得第三数据;对低码率编码码流进行低码率解码处理以获得第二数据;将第三数据的后N个样点数据与第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第三数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。
音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
B、低码率编码处理的帧长小于高码率编码处理的帧长
当上一帧音频帧的解码方式不是高码率解码向低码率解码切换解码方式时,将低码率解码处理对应的第二数据队列置为全0,第二数据队列遵循先入先出的原则;对高码率编码码流进行高码率解码处理以获得第一数据;对低码率编码码流进行低码率解码处理以获得第二数据;将第二数据队列从队头溢出M个样点数据,并将第二数据放入第二数据队列的队尾,M与高码率解码处理的帧长相关联;从第二数据队列的队头提取M个样点数据以获得第四数据;将第一数据的后N个样点数据与第四数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第一数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。
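As described above, the audio receiving device may first determine whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate of the corresponding encoded bitstream; details are not repeated here.
In a possible implementation, the audio sending device may determine, from the set bit rate of the previous audio frame and the set bit rate of the current audio frame, a first bit rate for the low-bit-rate encoding and a second bit rate for the high-bit-rate encoding, where the sum of the first bit rate and the second bit rate is the set bit rate of the current audio frame.
(1) When the final encoding mode of the current audio frame is the high-to-low switching encoding mode, the allocation may be as follows:
A. If the set bit rate brp of the previous audio frame satisfies 600 kbps < brp ≤ 990 kbps (stereo) or 300 kbps < brp ≤ 495 kbps (mono), and the set bit rate brf of the current audio frame satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is brf and the second bit rate is brp - brf.
B. If brp satisfies 364 kbps < brp ≤ 600 kbps (stereo) or 182 kbps < brp ≤ 300 kbps (mono), and brf satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is brp - 300 kbps (stereo) or brp - 150 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
C. If brp satisfies 300 kbps ≤ brp ≤ 364 kbps (stereo) or 150 kbps ≤ brp ≤ 182 kbps (mono), and brf satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is 64 kbps (stereo) or 32 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
(2) When the final encoding mode of the current audio frame is the low-to-high switching encoding mode, the allocation may be as follows:
A. If brp satisfies 64 kbps ≤ brp < 300 kbps (stereo) or 32 kbps ≤ brp < 150 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate is brp and the second bit rate is brf - brp.
B. If the set bit rate brf of the current audio frame satisfies 364 kbps < brf ≤ 600 kbps (stereo) or 182 kbps < brf ≤ 300 kbps (mono), then the first bit rate is brf - 300 kbps (stereo) or brf - 150 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
C. If the set bit rate brf of the current audio frame satisfies 300 kbps ≤ brf ≤ 364 kbps (stereo) or 150 kbps ≤ brf ≤ 182 kbps (mono), then the first bit rate is 64 kbps (stereo) or 32 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
D. If brp satisfies 600 kbps < brp ≤ 990 kbps (stereo) or 300 kbps < brp ≤ 495 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate and the second bit rate are the same as those allocated for the previous audio frame.
E. If brp satisfies 364 kbps ≤ brp ≤ 600 kbps (stereo) or 182 kbps ≤ brp ≤ 300 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate is 299 kbps (stereo) or 149 kbps (mono), and the second bit rate is brf - 299 kbps (stereo) or brf - 149 kbps (mono).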
本申请基于设定码率和上一帧音频帧的最终编码方式确定当前音频帧的最终编码方式,并且据此设定发生高低码率切换的音频帧上的低码率编码处理和高码率编码处理各自对应的码率,编码端将码流信息发送给解码端;解码端解析码流信息获取解码方式,进而对码流数据进行解码,尤其是在发生高低码率切换的音频帧上对高低码率解码后的数据进行平滑处理,以实现低码率编解码处理和高码率编解码处理的无缝融合,使得在符合蓝牙信道对数据传输大小限制的前提下,最大化保证音频的音质,提高蓝牙信道的抗干扰能力,带给用户更优化的音频体验。
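In a second aspect, this application provides an audio encoding apparatus, including: an obtaining module, configured to obtain the set bit rate of a current audio frame to be encoded and the final encoding mode of the previous audio frame, where the final encoding mode is a first-bit-rate encoding mode, a second-bit-rate encoding mode, a first-to-second-bit-rate switching encoding mode, or a second-to-first-bit-rate switching encoding mode, and the first bit rate is lower than the second bit rate; a determining module, configured to determine the final encoding mode of the current audio frame based on the set bit rate and the final encoding mode of the previous audio frame; and an encoding module, configured to encode the current audio frame according to the final encoding mode of the current audio frame.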
在一种可能的实现方式中,当第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述确定模块,具体用于当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;其中,所述设定阈值的取值与音频帧的声道数相关联。
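In a possible implementation, when the frame length of the first-bit-rate encoding differs from the frame length of the second-bit-rate encoding, the determining module is specifically configured to: when the set bit rate is less than a set threshold and the final encoding mode of the previous audio frame is the first-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the first-bit-rate encoding mode; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the second-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the second-to-first-bit-rate switching encoding mode and start a first counter, where the initial value of the first counter is a first set value and the first counter terminates when its value is 0; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the second-bit-rate encoding mode; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the first-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the first-to-second-bit-rate switching encoding mode and start a second counter, where the initial value of the second counter is the first set value and the second counter terminates when its value is 0; or, when the final encoding mode of the previous audio frame is the second-to-first-bit-rate switching encoding mode and the value of the first counter is greater than 0, decrement the first counter by 1, and then, if the value of the first counter is still greater than 0, determine that the final encoding mode of the current audio frame is the second-to-first-bit-rate switching encoding mode, or, if the value of the first counter is 0, determine that the final encoding mode of the current audio frame is the first-bit-rate encoding mode; or, when the final encoding mode of the previous audio frame is the first-to-second-bit-rate switching encoding mode and the value of the second counter is greater than 0, decrement the second counter by 1, and then, if the value of the second counter is still greater than 0, determine that the final encoding mode of the current audio frame is the first-to-second-bit-rate switching encoding mode, or, if the value of the second counter is 0, determine that the final encoding mode of the current audio frame is the second-bit-rate encoding mode; where the value of the set threshold is associated with the number of channels of the audio frame.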
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述编码模块,具体用于对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第一计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值小于所述第一设定值时,对所述当前音频帧进行第一码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第二计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值等于1时,对所述当前音频帧进行第二码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第一计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值等于1时,对所述当前音频帧进行第一码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第二计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值小于所述第一设定值时,对所述当前音频帧进行第二码率编码处理。
在一种可能的实现方式中,所述编码模块,具体用于当所述第一码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第一码率编码处理;或者,当所述第一码率编码处理不支持所述当前音频帧的采样率时,对所述当前音频帧进行下采样或上采样处理以获得下采样或上采样后的当前音频帧,对所述下采样或上采样后的当前音频帧进行所述第一码率编码处理,所述第一码率编码处理支持所述下采样或上采样后的当前音频帧的采样率。
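In a possible implementation, the encoding module is specifically configured to: when the second-bit-rate encoding supports the sampling rate of the current audio frame, perform the second-bit-rate encoding on the current audio frame; or, when the second-bit-rate encoding does not support the sampling rate of the current audio frame, downsample or upsample the current audio frame to obtain a resampled current audio frame, and perform the second-bit-rate encoding on the resampled current audio frame, where the second-bit-rate encoding supports the sampling rate of the resampled current audio frame.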
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述确定模块,还用于根据所述上一帧音频帧的设定码率和所述当前音频帧的设定码率确定第一码率编码处理对应的第一码率和第二码率编码处理对应的第二码率,所述第一码率和所述第二码率之和为所述当前音频帧的设定码率;所述编码模块,具体用于以所述第一码率对所述当前音频帧进行所述第一码率编码处理;以所述第二码率对所述当前音频帧进行所述第二码率编码处理。
在一种可能的实现方式中,编码后的当前音频帧对应的码流信息包括包头信息、第一码率编码码流和/或第二码率编码码流,其中,所述包头信息包括所述当前音频帧的最终编码方式、采样率、声道数、帧长和所述第一码率编码码流的长度。
第三方面,本申请提供一种音频解码装置,包括:获得模块,用于获得码流信息;解析模块,用于解析所述码流信息以获得解码方式和编码码流,所述编码码流包括第一码率编码码流和/或第二码率编码码流,所述解码方式包括第一码率解码方式、第二码率解码方式、第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式,当所述解码方式为第一码率解码方式时,所述编码码流包括第一码率编码码流,当所述解码方式为第二码率解码方式时,所述编码码流包括第二码率编码码流,当所述解码方式为第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式时,所述编码码流包括第一码率编码码流和第二码率编码码流;解码模块,用于根据所述解码方式对所述编码码流进行解码以获得目标音频帧。
在一种可能的实现方式中,当所述解码方式为第一码率解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,或者,当所述解码方式为第一码率解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得所述目标音频帧。
在一种可能的实现方式中,当所述解码方式为第一码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一码率解码处理对应的第二数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第二数据放入所述第二数据队列,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得所述目标音频帧。
在一种可能的实现方式中,当所述解码方式为第二码率解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,或者,当所述解码方式为第二码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得所述目标音频帧。
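In a possible implementation, when the decoding mode is the second-bit-rate decoding mode and the frame length of the first-bit-rate decoding is greater than the frame length of the second-bit-rate decoding, the decoding module is specifically configured to: decode the second-bit-rate encoded bitstream according to the second-bit-rate decoding mode to obtain first data; overflow M samples from the head of the first data queue corresponding to the second-bit-rate decoding, and place the first data into the first data queue in first-in-first-out (FIFO) order, where M is associated with the frame length of the first-bit-rate decoding; and extract M samples from the head of the first data queue to obtain the target audio frame.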
在一种可能的实现方式中,当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述解码模块,具体用于根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第二数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;根据所述第二数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
在一种可能的实现方式中,当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述解码模块,具体用于当上一帧音频帧的解码方式不是所述第一码率解码向第二码率解码切换解码方式时,将所述第二码率解码处理对应的第一数据队列置为全0,所述第一数据队列遵循先入先出的原则;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第一数据队列从队头溢出M个样点数据,并将所述第一数据放入所述第一数据队列的队尾,M与所述第一码率解码处理的帧长相关联;从所述第一数据队列的队头提取M个样点数据以获得第三数据;将所述第二数据的后N个样点数据与所述第三数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第二数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
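In a possible implementation, when the decoding mode is the first-to-second-bit-rate switching decoding mode and the frame length of the first-bit-rate decoding is less than the frame length of the second-bit-rate decoding, the decoding module is specifically configured to: when the decoding mode of the previous audio frame is not the first-to-second-bit-rate switching decoding mode, decode the first-bit-rate encoded bitstream according to the first-bit-rate decoding mode to obtain second data; overflow M samples from the head of the second data queue corresponding to the first-bit-rate decoding, and place the second data into the second data queue in first-in-first-out (FIFO) order, where M is associated with the frame length of the second-bit-rate decoding; extract M samples from the head of the second data queue to obtain fourth data; decode the second-bit-rate encoded bitstream according to the second-bit-rate decoding mode to obtain first data; compute a weighted average of the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the N smoothed samples and the fourth data excluding its last N samples; or, when the decoding mode of the previous audio frame is the first-to-second-bit-rate switching decoding mode, overflow M samples from the head of the second data queue; extract M samples from the head of the second data queue to obtain fourth data; decode the second-bit-rate encoded bitstream according to the second-bit-rate decoding mode to obtain first data; compute a weighted average of the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the N smoothed samples and the fourth data excluding its last N samples.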
在一种可能的实现方式中,当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述解码模块,具体用于根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
在一种可能的实现方式中,当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述解码模块,具体用于当所述上一帧音频帧的解码方式不是所述第二码率解码向第一码率解码切换解码方式时,根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第二码率解码处理对应的第一数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第一数据放入所述第一数据队列,M与所述第一码率解码处理的帧长相关联;从所述第一数据队列的队头提取M个样点数据以获得第三数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第三数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第三数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;或者,当所述上一帧音频帧的解码方式是所述第二码率解码向第一码率解码切换解码方式时,将所述第一数据队列从队头溢出M个样点数据;从所述第一数据队列的队头提取M个样点数据以获得第三数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第三数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第三数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
在一种可能的实现方式中,当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于当上一帧音频帧的解码方式不是所述第二码率解码向第一码率解码切换解码方式时,将所述第一码率解码处理对应的第二数据队列置为全0,所述第二数据队列遵循先入先出的原则;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第二数据队列从队头溢出M个样点数据,并将所述第二数据放入所述第二数据队列的队尾,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得第四数据;将所述第一数据的后N个样点数据与所述第四数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
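In a possible implementation, the decoding module is specifically configured to: determine whether the first-bit-rate decoding supports the sampling rate corresponding to the first-bit-rate encoded bitstream; if it does, perform the first-bit-rate decoding on the first-bit-rate encoded bitstream; or, if it does not, perform the first-bit-rate decoding on the first-bit-rate encoded bitstream to obtain fifth data, and upsample or downsample the fifth data.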
在一种可能的实现方式中,所述解码模块,具体用于判断所述第二码率解码处理是否支持所述第二码率编码码流对应的采样率;若所述第二码率解码处理支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理;或者,若所述第二码率解码处理不支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理以获得第六数据,对所述第六数据进行上采样或下采样处理。
第四方面,本申请提供一种音频编码设备,包括:一个或多个处理器;存储器,用于存储一个或多个程序;当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如上述第一方面中由音频发送设备执行的任一项所述的方法。
第五方面,本申请提供一种音频解码设备,包括:一个或多个处理器;存储器,用于存储一个或多个程序;当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如上述第一方面中由音频接收设备执行的任一项所述的方法。
第六方面,本申请提供一种计算机可读存储介质,包括计算机程序,所述计算机程序在计算机上被执行时,使得所述计算机执行上述第一方面中任一项所述的方法。
第七方面,本申请提供一种计算机可读存储介质,包括根据如上述第一方面中由音频发送设备执行的任一项所述的音频编码方法获得的码流信息。
第八方面,本申请提供一种计算机可读存储介质,包括根据如上述第一方面中由音频接收设备执行的任一项所述的音频解码方法获得的音频帧。
附图说明
图1为本申请音频播放系统的一个示例性的结构图;
图2为本申请音频译码系统10的一个示例性的结构框图;
图3是本申请音频编解码方法的一个示例性的流程图;
图4为本申请码流信息的一个示例性的格式图;
图5a为本申请数据平滑处理的一个示例性的示意图;
图5b为本申请数据平滑处理的一个示例性的示意图;
图6为本申请数据队列的一个示例性的示意图;
图7为本申请音频帧的编解码处理的一个示例性的示意图;
图8为本申请音频编码装置实施例的结构示意图;
图9为本申请音频解码装置实施例的结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合本申请中的附图,对本申请中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
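The terms "first", "second", and the like in the description, embodiments, claims, and drawings of this application are used only to distinguish between descriptions; they neither indicate or imply relative importance nor indicate or imply an order. In addition, the terms "include" and "have", and any variants thereof, are intended to cover a non-exclusive inclusion, for example, the inclusion of a series of steps or units. A method, system, product, or device is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.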
应当理解,在本申请中,“至少一个(项)”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,用于描述关联对象的关联关系,表示可以存在三种关系,例如,“A和/或B”可以表示:只存在A,只存在B以及同时存在A和B三种情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b或c中的至少一项(个),可以表示:a,b,c,“a和b”,“a和c”,“b和c”,或“a和b和c”,其中a,b,c可以是单个,也可以是多个。
本申请涉及到的相关名词解释:
音频帧:音频数据是流式的,在实际应用中,为了便于音频处理和传输,通常取一时长内的音频数据量作为一帧音频,该时长被称为“采样时间”,可以根据编解码器和具体应用的需求确定其值,例如该时长为2.0ms~60ms,ms为毫秒。
图1为本申请音频播放系统的一个示例性的结构图,如图1所示,该音频播放系统包括:音频发送设备和音频接收设备,其中,音频发送设备包括例如手机、电脑(笔记本电脑、台式电脑等)、平板(手持平板、车载平板等)等可以进行音频编码并发送音频码流的设备;音频接收设备包括例如TWS耳机、普通无线耳机、音响、智能手表、智能眼镜等可以接收音频码流、解码音频码流并播放的设备。
音频发送设备和音频接收设备之间可以建立蓝牙连接,二者之间可以支持语音和音乐的传输。音频发送设备和音频接收设备的较为广泛的示例是手机与TWS耳机、无线头戴式耳机或者无线颈圈式耳机之间,或者手机与其他终端设备(例如智能音箱、智能手表、智能眼镜和车载音箱等)之间。可选的,音频发送设备和音频接收设备的示例也可以是平板、笔记本电脑或者台式电脑与TWS耳机、无线头戴式耳机、无线颈圈式耳机或其他终端设备(例如智能音箱、智能手表、智能眼镜和车载音箱)之间。
需要说明的是,音频发送设备和音频接收设备之间除蓝牙连接外,还可以通过其他通信方式连接,例如WiFi连接、有线连接或其他无线连接等,本申请对此不做具体限定。
图2为本申请音频译码系统10的一个示例性的结构框图,如图2所示,音频译码系统10可包括源设备12和目的设备14,源设备12可以是图1的音频发送设备,目的设备14可以是图1的音频接收设备。源设备12产生经编码的码流信息,因此,源设备12可被称为音频编码装置。目的设备14可对由源设备12所产生的经编码的码流信息进行解码,因此,目的设备14可被称为音频解码装置。
源设备12包括编码器20,可选地,可包括输入接口16、音频预处理器18、通信接口22。
输入接口16用于输入音频脉冲编码调制(pulse code modulation,PCM)数据和设定码率。其中,音频PCM数据可以分为语音类型或音乐类型,设定码率可以由用户预先设定。
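The audio preprocessor 18 is configured to determine the encoding mode based on the set bit rate input through the input interface 16. The intent is that an audio frame is encoded with the low-bit-rate encoding when the set bit rate is below the threshold, and with the high-bit-rate encoding when the set bit rate is above the threshold. The final encoding mode of an audio frame therefore depends on the set bit rate of the current frame and the final encoding mode of the previous frame.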
编码器20用于根据音频预处理器18确定出的编码方式对音频帧进行编码得到码流信息。
源设备12中的通信接口22可用于接收码流信息并通过通信信道13向目的设备14发送该码流。
目的设备14包括解码器30,可选地,可包括通信接口28、音频后处理器32和播放设备34。
目的设备14中的通信接口28用于直接从源设备12接收码流,并将码流提供给解码器30。
通信接口22和通信接口28可用于通过源设备12与目的设备14之间的通信链路,例如蓝牙连接等,发送或接收码流。
例如,通信接口22可用于将码流封装为报文等合适的格式,和/或使用蓝牙的传输编码或处理来处理码流,以便在通信链路上进行传输。
通信接口28与通信接口22对应,例如,可用于接收码流,并使用对应传输解码或处理和/或解封装,得到码流。
通信接口22和通信接口28均可配置为如图2中从源设备12指向目的设备14的对应通信信道13的箭头所指示的单向通信接口,或双向通信接口,并且可用于发送和接收消息等,以建立连接,确认并交换与通信链路和/或编码音频数据等数据传输相关的任何其它信息,等等。
解码器30用于接收码流信息,并根据码流信息中的解码方式对码流信息中的码流解码得到音频数据。
音频后处理器32用于对解码的音频数据进行后处理,得到后处理后的音频数据。音频后处理器32执行的后处理可以包括例如修剪或重采样等。
播放设备34用于接收后处理后的音频数据,以向用户或收听者播放音频。播放设备34可以为或包括任意类型的用于播放重建后音频的播放器,例如,集成或外部扬声器。例如,扬声器可包括喇叭、音响等。
基于上述实施例的描述,本申请提供了一种音频编解码方法。
图3是本申请音频编解码方法的一个示例性的流程图。该过程300可由音频播放系统中的音频发送设备和音频接收设备执行,即由音频发送设备实现音频编码,然后将码流信息发送给音频接收设备,由音频接收设备对码流信息进行解码以获得目标音频帧。过程300描述为一系列的步骤或操作,应当理解的是,过程300可以以各种顺序执行和/或同时发生,不限于图3所示的执行顺序。如图3所示,该方法包括:
步骤301、音频发送设备获得待编码的当前音频帧的设定码率和上一帧音频帧的最终编码方式。
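The audio frame may be any frame of the audio that the audio sending device sends to the audio receiving device. In this application, each encoding operation works on one audio frame; that is, the audio encoding method provided in this application is applied frame by frame, and the method described below for determining the encoding mode applies to every frame of the audio. For distinction, the frame that the audio sending device is currently encoding is called the audio frame or the current audio frame, and the frame encoded immediately before it is called the previous audio frame. Optionally, an audio frame may be represented as PCM data.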
设定码率可以是用户根据当前的信道状态预先设定的目标编码码率。该设定码率例如可以是192kbps、256kbps、400kbps或者600kbps。
最终编码方式是指音频发送设备对音频帧进行编码时实际采用的编码方式,可以包括第一码率编码方式、第二码率编码方式、第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式,其中,第一码率低于第二码率,例如第一码率可以是64kbps,128kbps,192kbps,256kbps,400kbps,或600kbps等等,第二码率可以是128kbps,192kbps,256kbps,400kbps,或600kbps等等;需要说明的是,本发明实施例中并不限定第一码率和第二码率的具体数值,只要满足第一码率低于第二码率即可。下文中也可以将第一码率称作低码率,将第二码率称作高码率。
步骤302、音频发送设备根据设定码率和上一帧音频帧的最终编码方式确定当前音频帧的最终编码方式。
本申请中涉及两种编码处理,即低码率编码处理和高码率编码处理,其中,低码率编码处理例如可以包括AAC、下一代蓝牙默认的LC3等,高码率编码处理例如可以包括LHDC、LC3plus等。
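Table 1

Set bit rate | Previous frame: A | Previous frame: B | Previous frame: A→B | Previous frame: B→A
Below the set threshold | A | B→A | B→A | A
Above the set threshold | A→B | B | B | A→B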
表1示例性的示出了在低码率编码处理的帧长和高码率编码处理的帧长相同,且低码率编码处理的编解码的总时延与高码率编码处理的编解码的总时延相同的情况下,音频发送设备根据设定码率和上一帧音频帧的最终编码方式确定出的当前音频帧的最终编码方式。表1中A表示低码率编码方式,B表示高码率编码方式,A→B表示低码率编码向高码率编码切换编码方式,B→A表示高码率编码向低码率编码切换编码方式。因此可以通过以下方法确定当前音频帧的最终编码方式:
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为低码率编码方式时,确定当前音频帧的最终编码方式为低码率编码方式;或者,
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为高码率编码方式时,确定当前音频帧的最终编码方式为高码率编码向低码率编码切换编码方式;或者,
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为低码率编码向高码率编码切换编码方式时,确定当前音频帧的最终编码方式为高码率编码向低码率编码切换编码方式;或者,
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为高码率编码向低码率编码切换编码方式时,确定当前音频帧的最终编码方式为低码率编码方式;或者,
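When the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the low-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the low-to-high switching encoding mode; or,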
当设定码率大于设定阈值,且上一帧音频帧的最终编码方式为高码率编码方式时,确定当前音频帧的最终编码方式为高码率编码方式;或者,
当设定码率大于设定阈值,且上一帧音频帧的最终编码方式为低码率编码向高码率编码切换编码方式时,确定当前音频帧的最终编码方式为高码率编码方式;或者,
当设定码率大于设定阈值,且上一帧音频帧的最终编码方式为高码率编码向低码率编码切换编码方式时,确定当前音频帧的最终编码方式为低码率编码向高码率编码切换编码方式。
上述设定阈值的取值与音频帧的声道数相关联。例如,音频帧的声道数为单声道时,设定阈值可以是150kbps,音频帧的声道数为双声道时,设定阈值可以是300kbps。
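The eight rules above amount to a small lookup on two inputs. The following is a minimal sketch, assuming the example thresholds from the text (150 kbps mono, 300 kbps stereo); the names `Mode`, `THRESHOLD_KBPS`, and `next_mode` are illustrative and do not come from the application. The text covers only "less than" and "greater than", so treating a set rate equal to the threshold as high is an assumption.

```python
from enum import Enum

class Mode(Enum):
    A = "A"          # low-bit-rate encoding
    B = "B"          # high-bit-rate encoding
    A_TO_B = "A->B"  # low-to-high switching
    B_TO_A = "B->A"  # high-to-low switching

# Example thresholds from the text, in kbps, keyed by channel count.
THRESHOLD_KBPS = {1: 150, 2: 300}

def next_mode(set_rate_kbps, prev_mode, channels):
    """Return the final encoding mode of the current frame (equal frame lengths)."""
    # Assumption: a set rate equal to the threshold is treated as "high".
    want_low = set_rate_kbps < THRESHOLD_KBPS[channels]
    if want_low:
        # Already at A, or the previous frame was the B->A switch frame: use A.
        if prev_mode in (Mode.A, Mode.B_TO_A):
            return Mode.A
        return Mode.B_TO_A
    # Already at B, or the previous frame was the A->B switch frame: use B.
    if prev_mode in (Mode.B, Mode.A_TO_B):
        return Mode.B
    return Mode.A_TO_B
```

In this reading a switch lasts exactly one frame: a switching mode is always followed by the steady mode it leads to, unless the set rate has crossed the threshold again.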
Table 2 (image PCTCN2021125760-appb-000002 in the source; not reproduced)
表2示例性的示出了在低码率编码处理的帧长和高码率编码处理的帧长不相同的情况下,音频发送设备根据设定码率和上一帧音频帧的最终编码方式确定出的当前音频帧的最终编码方式。表2中A表示低码率编码方式,B表示高码率编码方式,A→B表示低码率编码向高码率编码切换编码方式,B→A表示高码率编码向低码率编码切换编码方式。
在表2所示的情况下,增加切换帧,即无论上一帧音频帧采用A或者B的编码方式,只要接下来的音频帧确定是采用A→B或者B→A的编码方式,那么从该帧开始连续D个音频帧均直接被确定为是切换帧,D的取值可以采用以下方法获得:
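(1) Number of switch frames for A→B
D = floor((max(total low-bit-rate codec delay, total high-bit-rate codec delay) + low-to-high overlap length + processing frame length - 1) / processing frame length)
(2) Number of switch frames for B→A
D = floor((max(total low-bit-rate codec delay, total high-bit-rate codec delay) + high-to-low overlap length + processing frame length - 1) / processing frame length)
Here, processing frame length = max(frame length of the low-bit-rate encoding, frame length of the high-bit-rate encoding); low-to-high overlap length = processing frame length - (total low-bit-rate codec delay % processing frame length); high-to-low overlap length = processing frame length - (total high-bit-rate codec delay % processing frame length); and % denotes the remainder operation.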
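The two formulas for D can be written as one function. This is a sketch only: the function and parameter names are illustrative, all quantities are taken to be in samples, and the rounding in the formula is read as rounding down (an assumption).

```python
def switch_frame_count(low_delay, high_delay, low_frame_len, high_frame_len,
                       low_to_high):
    """Number of switch frames D; all quantities are in samples."""
    frame = max(low_frame_len, high_frame_len)   # processing frame length
    # The overlap length uses the delay of the codec being switched away from.
    src_delay = low_delay if low_to_high else high_delay
    overlap = frame - src_delay % frame
    # Assumption: the rounding in the formula truncates toward zero, which
    # for these non-negative values is plain integer division.
    return (max(low_delay, high_delay) + overlap + frame - 1) // frame
```

With the figures given later in Embodiment 2 (delays of 2048 and 5 samples, frame lengths of 1024 and 256 samples), this returns 3 in both directions, which agrees with the three switch frames shown for Figure 7.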
因此可以通过以下方法确定当前音频帧的最终编码方式:
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为低码率编码方式时,确定当前音频帧的最终编码方式为低码率编码方式;或者,
当设定码率小于设定阈值,且上一帧音频帧的最终编码方式为高码率编码方式时,确定当前音频帧的最终编码方式为高码率编码向低码率编码切换编码方式,且启动第一计数器,第一计数器的初始值为第一设定值,第一计数器在值为0时终止。启动第一计数器的目的在于统计切换帧的处理情况,在处理第一个切换帧时启动该第一计数器,并且将第一计数器的初始值设定为计算获得的切换帧的个数(第一设定值)。每处理完一个切换帧,就将第一计数器减1,当第一计数器的值为0时表示切换帧已经全部编码完成,此时将第一计数器终止。第一计数器的值为第一设定值时表示当前处理的是第一个切换帧,第一计数器的值为1时表示当前处理的是最后一个切换帧,第一计数器的值小于第一设定值且大于1时表示当前处理的是中间的切换帧;或者,
当设定码率大于设定阈值,且上一帧音频帧的最终编码方式为高码率编码方式时,确定当前音频帧的最终编码方式为高码率编码方式;或者,
当设定码率大于设定阈值,且上一帧音频帧的最终编码方式为低码率编码方式时,确定当前音频帧的最终编码方式为低码率编码向高码率编码切换编码方式,且启动第二计数器,第二计数器的初始值为第一设定值,第二计数器在值为0时终止。同样的,启动第二计数器的目的在于统计切换帧的处理情况,在处理第一个切换帧时启动该第二计数器,并且将第二计数器的初始值设定为计算获得的切换帧的个数(第一设定值)。每处理完一个切换帧,就将第二计数器减1,当第二计数器的值为0时表示切换帧已经全部编码完成,此时将第二计数器终止。第二计数器的值为第一设定值时表示当前处理的是第一个切换帧,第二计数器的值为1时表示当前处理的是最后一个切换帧,第二计数器的值小于第一设定值且大于1时表示当前处理的是中间的切换帧;或者,
当上一帧音频帧的最终编码方式为高码率编码向低码率编码切换编码方式,且第一计数器处于启动状态(亦即第一计数器的值大于0)时,将第一计数器的值减1;若第一计数器仍处于启动状态,则确定当前音频帧的最终编码方式为高码率编码向低码率编码切换编码方式;或者,若第一计数器终止(亦即第一计数器的值为0),则确定当前音频帧的最终编码方式为低码率编码方式;或者,
当上一帧音频帧的最终编码方式为低码率编码向高码率编码切换编码方式,且第二计数器处于启动状态(亦即第二计数器的值大于0)时,将第二计数器的值减1;若第二计数器仍处于启动状态,则确定当前音频帧的最终编码方式为低码率编码向高码率编码切换编码方式;或者,若第二计数器终止(亦即第二计数器的值为0),则确定当前音频帧的最终编码方式为高码率编码方式;
同样的,上述设定阈值的取值与音频帧的声道数相关联。例如,音频帧的声道数为单声道时,设定阈值可以是165kbps,音频帧的声道数为双声道时,设定阈值可以是330kbps。
上述第一设定值即为切换帧的个数D。
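The counter rules above can be sketched as a small state machine. The class below is illustrative rather than the application's implementation: it uses a single counter in place of the first and second counters (only one of them is ever running at a time), it assumes the stream starts in mode A, and it leaves the mode unchanged when the set rate equals the threshold, a case the text does not cover.

```python
class ModeSelector:
    """Tracks the final encoding mode across frames when frame lengths differ."""

    def __init__(self, threshold_kbps, d):
        self.threshold = threshold_kbps  # e.g. 165 (mono) or 330 (stereo)
        self.d = d                       # number of switch frames (first set value)
        self.counter = 0                 # 0 means the counter has terminated
        self.prev = "A"                  # assumption: the stream starts in mode A

    def next(self, set_rate_kbps):
        if self.prev in ("A->B", "B->A") and self.counter > 0:
            # Inside a run of switch frames: the set rate is not consulted.
            self.counter -= 1
            if self.counter == 0:
                self.prev = "B" if self.prev == "A->B" else "A"
        elif self.prev == "A" and set_rate_kbps > self.threshold:
            self.prev, self.counter = "A->B", self.d
        elif self.prev == "B" and set_rate_kbps < self.threshold:
            self.prev, self.counter = "B->A", self.d
        return self.prev
```

Feeding three low set rates, six high ones, and four low ones with D = 3 yields the sequence A, A, A, A→B, A→B, A→B, B, B, B, B→A, B→A, B→A, A.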
步骤303、音频发送设备根据当前音频帧的最终编码方式对当前音频帧进行编码。
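Based on the final encoding mode of the current audio frame determined in step 302, the audio sending device encodes the current audio frame. The following cases are possible: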
1、低码率编码处理的帧长和高码率编码处理的帧长相同,且低码率编码处理的编解码的总时延与高码率编码处理的编解码的总时延相同
(1)当前音频帧的最终编码方式为低码率编码方式
音频发送设备对当前音频帧进行低码率编码处理。
在一种可能的实现方式中,音频发送设备可以先判断低码率编码处理是否支持当前音频帧的采样率,当低码率编码处理支持当前音频帧的采样率时,可以直接对当前音频帧进行低码率编码处理;或者,当低码率编码处理不支持当前音频帧的采样率时,可以先对当前音频帧进行下采样或上采样处理,然后再对下采样或上采样后的当前音频帧进行低码率编码处理。例如,低码率编码处理不支持88.2kHz和96kHz的采样率,音频发送设备可以采用QMF进行下采样处理,将88.2kHz对应的频带(0~44.1kHz)分成两个子带0~22.05kHz和22.05~44.1kHz,选取低子带0~22.05kHz进行低码率编码处理;将96kHz对应的频带(0~48kHz)分成两个子带0~24kHz和24~48kHz,选取低子带0~24kHz进行低码率编码处理。
(2)当前音频帧的最终编码方式为高码率编码方式
音频发送设备对当前音频帧进行高码率编码处理。
在一种可能的实现方式中,音频发送设备可以先判断高码率编码处理是否支持当前音频帧的采样率,当高码率编码处理支持当前音频帧的采样率时,可以直接对当前音频帧进行高码率编码处理;或者,当高码率编码处理不支持当前音频帧的采样率时,可以先对当前音频帧进行下采样或上采样处理,然后再对下采样或上采样后的当前音频帧进行高码率编码处理。
(3)当前音频帧的最终编码方式为低码率编码向高码率编码切换编码方式或者高码率编码向低码率编码切换编码方式
音频发送设备可以对当前音频帧进行低码率编码处理和高码率编码处理。同样的,音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
2、低码率编码处理和高码率编码处理各自的帧长不相同,或者低码率编码处理和高码率编码处理各自的帧长相同且编解码的总时延不相同
(1)当前音频帧的最终编码方式为低码率编码方式
音频发送设备可以先判断低码率编码处理是否支持当前音频帧的采样率,当低码率编码处理支持当前音频帧的采样率时,可以直接对当前音频帧进行低码率编码处理;或者,当低码率编码处理不支持当前音频帧的采样率时,可以先对当前音频帧进行下采样或上采样处理,然后再对下采样或上采样后的当前音频帧进行低码率编码处理。
(2)当前音频帧的最终编码方式为高码率编码方式
音频发送设备可以先判断高码率编码处理是否支持当前音频帧的采样率,当高码率编码处理支持当前音频帧的采样率时,可以直接对当前音频帧进行高码率编码处理;或者,当高码率编码处理不支持当前音频帧的采样率时,可以先对当前音频帧进行下采样或上采样处理,然后再对下采样或上采样后的当前音频帧进行高码率编码处理。
(3)当前音频帧的最终编码方式为低码率编码向高码率编码切换编码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
如上所述,第二计数器的作用是统计切换帧的处理情况,第二计数器处于启动状态表示当前处理的仍然是切换帧,此时当第二计数器的值大于1(表示当前处理的仍然是切换帧且不是切换帧中的最后一帧)时,对当前音频帧进行低码率编码处理;对当前音频帧进行高码率编码处理。或者,当第二计数器的值等于1(表示当前处理的是切换帧中的最后一帧)时,对当前音频帧进行高码率编码处理。
音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
B、低码率编码处理的帧长小于高码率编码处理的帧长
如上所述,第二计数器的作用是统计切换帧的处理情况,第二计数器处于启动状态表示当前处理的仍然是切换帧,此时第二计数器的值等于第一设定值(表示当前处理的是切换帧中的第一帧)时,对当前音频帧进行低码率编码处理;对当前音频帧进行高码率编码处理。或者,当第二计数器的值小于第一设定值(表示当前处理的仍然是切换帧且不是切换帧中的第一帧)时,对当前音频帧进行高码率编码处理。
音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
(4)当前音频帧的最终编码方式为高码率编码向低码率编码切换编码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
如上所述,第一计数器的作用是统计切换帧的处理情况,第一计数器处于启动状态表示当前处理的仍然是切换帧,此时当第一计数器的值等于第一设定值(表示当前处理的是切换帧中的第一帧)时,对当前音频帧进行低码率编码处理;对当前音频帧进行高码率编码处理。或者,当第一计数器的值小于第一设定值(表示当前处理的仍然是切换帧且不是切换帧中的第一帧)时,对当前音频帧进行低码率编码处理。
音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
B、低码率编码处理的帧长小于高码率编码处理的帧长
如上所述,第一计数器的作用是统计切换帧的处理情况,第一计数器处于启动状态表示当前处理的仍然是切换帧,此时当第一计数器的值大于1(表示当前处理的仍然是切换帧且不是切换帧中的最后一帧)时,对当前音频帧进行低码率编码处理;对当前音频帧进行高码率编码处理。或者,当第一计数器的值等于1(表示当前处理的是切换帧中的最后一帧)时,对当前音频帧进行低码率编码处理。
音频发送设备可以参照上述描述先判断高低码率两种编码处理是否支持当前音频帧的采样率,此处不再赘述。
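The four switch-frame cases in (3) and (4) differ only in which counter values keep both encoders running. A compact sketch follows; `encoders_to_run` and its parameters are hypothetical names, and `counter` stands for whichever of the two counters is active.

```python
def encoders_to_run(mode, counter, d, low_frame_longer):
    """Return the set of encoders ("low", "high") to run on a switch frame.

    mode: "A->B" (low-to-high) or "B->A" (high-to-low)
    counter: current value of the active switch counter, from d down to 1
    d: number of switch frames (the first set value)
    low_frame_longer: True if the low-bit-rate frame length is the larger one
    """
    target = "high" if mode == "A->B" else "low"
    # Per the four cases in the text: A->B with a longer low-rate frame and
    # B->A with a shorter low-rate frame run both encoders on every switch
    # frame except the last; the other two cases run both on the first only.
    if (mode == "A->B") == low_frame_longer:
        both = counter > 1
    else:
        both = counter == d
    return {"low", "high"} if both else {target}
```

For example, with D = 3 and a longer low-rate frame, an A→B run uses both encoders on the first two switch frames and only the high-rate encoder on the last one.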
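In a possible implementation, the audio sending device may determine, from the set bit rate of the previous audio frame and the set bit rate of the current audio frame, a first bit rate for the low-bit-rate encoding and a second bit rate for the high-bit-rate encoding, where the sum of the first bit rate and the second bit rate is the set bit rate.
(1) When the final encoding mode of the current audio frame is the high-to-low switching encoding mode, the allocation may be as follows:
A. If the set bit rate brp of the previous audio frame satisfies 600 kbps < brp ≤ 990 kbps (stereo) or 300 kbps < brp ≤ 495 kbps (mono), and the set bit rate brf of the current audio frame satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is brf and the second bit rate is brp - brf.
B. If brp satisfies 364 kbps < brp ≤ 600 kbps (stereo) or 182 kbps < brp ≤ 300 kbps (mono), and brf satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is brp - 300 kbps (stereo) or brp - 150 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
C. If brp satisfies 300 kbps ≤ brp ≤ 364 kbps (stereo) or 150 kbps ≤ brp ≤ 182 kbps (mono), and brf satisfies brf < 300 kbps (stereo) or brf < 150 kbps (mono), then the first bit rate is 64 kbps (stereo) or 32 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
(2) When the final encoding mode of the current audio frame is the low-to-high switching encoding mode, the allocation may be as follows:
A. If brp satisfies 64 kbps ≤ brp < 300 kbps (stereo) or 32 kbps ≤ brp < 150 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate is brp and the second bit rate is brf - brp.
B. If the set bit rate brf of the current audio frame satisfies 364 kbps < brf ≤ 600 kbps (stereo) or 182 kbps < brf ≤ 300 kbps (mono), then the first bit rate is brf - 300 kbps (stereo) or brf - 150 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
C. If the set bit rate brf of the current audio frame satisfies 300 kbps ≤ brf ≤ 364 kbps (stereo) or 150 kbps ≤ brf ≤ 182 kbps (mono), then the first bit rate is 64 kbps (stereo) or 32 kbps (mono), and the second bit rate is 300 kbps (stereo) or 150 kbps (mono).
D. If brp satisfies 600 kbps < brp ≤ 990 kbps (stereo) or 300 kbps < brp ≤ 495 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate and the second bit rate are the same as those allocated for the previous audio frame.
E. If brp satisfies 364 kbps ≤ brp ≤ 600 kbps (stereo) or 182 kbps ≤ brp ≤ 300 kbps (mono), and brf satisfies 600 kbps < brf ≤ 990 kbps (stereo) or 300 kbps < brf ≤ 495 kbps (mono), then the first bit rate is 299 kbps (stereo) or 149 kbps (mono), and the second bit rate is brf - 299 kbps (stereo) or brf - 149 kbps (mono).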
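As an illustration of how such an allocation table might be coded, the sketch below covers only the high-to-low cases (1) A to C, transcribed as listed. The function name is hypothetical, and the scaling factor relies on the observation that every mono limit in the text is half of the corresponding stereo limit.

```python
def allocate_high_to_low(brp, brf, channels):
    """Return (first_rate, second_rate) in kbps for a high-to-low switch frame.

    brp: set bit rate of the previous frame; brf: set bit rate of the
    current frame; channels: 1 (mono) or 2 (stereo).
    """
    k = channels / 2  # mono limits in the text are half the stereo ones
    if not brf < 300 * k:
        raise ValueError("current set rate must be below the low/high boundary")
    if 600 * k < brp <= 990 * k:      # case A
        return brf, brp - brf
    if 364 * k < brp <= 600 * k:      # case B
        return brp - 300 * k, 300 * k
    if 300 * k <= brp <= 364 * k:     # case C
        return 64 * k, 300 * k
    raise ValueError("previous set rate outside the ranges listed in the text")
```

For example, `allocate_high_to_low(800, 200, 2)` gives 200 kbps to the low-rate encoder and 600 kbps to the high-rate encoder.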
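The bitstream information for the encoded current audio frame includes header information and a low-bit-rate encoded bitstream and/or a high-bit-rate encoded bitstream. The header information includes the final encoding mode of the current audio frame, the sampling rate, the number of channels, the frame length, and the length of the low-bit-rate encoded bitstream. If the audio sending device applies only the low-bit-rate encoding to the current audio frame, the bitstream information contains only the low-bit-rate encoded bitstream; if it applies only the high-bit-rate encoding, the bitstream information contains only the high-bit-rate encoded bitstream; if it applies both, the bitstream information contains both. Figure 4 is an example format diagram of the bitstream information in this application. As shown in Figure 4, the bitstream information contains the header information, the low-bit-rate encoded bitstream, and the high-bit-rate encoded bitstream. The header contains the final encoding mode (2 bits), the sampling rate (2 bits), the number of channels (1 bit), the frame length (1 bit), and the length of the low-bit-rate encoded bitstream (10 bits), 16 bits in total, so the header is 2 bytes long. If the bitstream information contains a low-bit-rate encoded bitstream, the length field holds its actual length and the low-bit-rate bitstream field carries the actual data; otherwise, the length field is 0 and the low-bit-rate bitstream field is empty or holds default data. If the bitstream information contains a high-bit-rate encoded bitstream, the high-bit-rate bitstream field carries the actual data; otherwise, that field is empty or holds default data.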
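The 2-byte header lends itself to simple bit packing. The sketch below assumes the fields are laid out most-significant-bit first in the order they are listed and that each field is carried as a small integer code; the text fixes only the field widths, so the layout, the codes, and the function names here are assumptions.

```python
def pack_header(mode, sample_rate_idx, channels_idx, frame_len_idx, low_len):
    """Pack the five header fields into 2 bytes (2+2+1+1+10 = 16 bits)."""
    fields = ((mode, 2), (sample_rate_idx, 2), (channels_idx, 1),
              (frame_len_idx, 1), (low_len, 10))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field, MSB first
    return word.to_bytes(2, "big")

def unpack_header(data):
    """Inverse of pack_header; returns the five fields as a tuple."""
    word = int.from_bytes(data[:2], "big")
    low_len = word & 0x3FF
    frame_len_idx = (word >> 10) & 0x1
    channels_idx = (word >> 11) & 0x1
    sample_rate_idx = (word >> 12) & 0x3
    mode = (word >> 14) & 0x3
    return mode, sample_rate_idx, channels_idx, frame_len_idx, low_len
```

A decoder could read the first two bytes with `unpack_header`, use the length field to split off the low-bit-rate bitstream, and treat the remainder as the high-bit-rate bitstream.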
步骤304、音频发送设备将码流信息发送给音频接收设备。
音频发送设备可以通过蓝牙连接等通信方式将码流信息发送给音频接收设备。
步骤305、音频接收设备解析码流信息以获得解码方式和编码码流。
编码码流包括低码率编码码流和/或高码率编码码流,解码方式包括低码率解码方式、高码率解码方式、低码率解码向高码率解码切换解码方式或者高码率解码向低码率解码切换解码方式。如图4所示,编码码流的实际内容与音频发送设备对音频帧的最终编码方式相关联,因此音频接收设备在解析码流信息后,可以得到两个信息,一个是需要采用的解码方式,另一个是编码码流的内容。当解码方式为低码率解码方式时,编码码流只包含低码率编码码流;当解码方式为高码率解码方式时,编码码流只包含高码率编码码流;当解码方式为低码率解码向高码率解码切换解码方式或者高码率解码向低码率解码切换解码方式时,编码码流包含低码率编码码流和/或高码率编码码流。
步骤306、音频接收设备根据解码方式对编码码流进行解码以获得目标音频帧。
与编码端相对应,音频发送设备采用哪种编码方式对音频帧进行编码处理,在解码端就需要采用对应的解码方式对编码码流进行解码处理。因此有以下几种解码处理方式:
1、低码率编码处理的帧长和高码率编码处理的帧长相同,且低码率编码处理的编解码的总时延与高码率编码处理的编解码的总时延相同
(1)解码方式为低码率解码方式
音频接收设备对低码率编码码流进行低码率解码处理。
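In a possible implementation, the audio receiving device may first determine whether the low-bit-rate decoding supports the sampling rate corresponding to the low-bit-rate encoded bitstream. If it does, the device may decode the low-bit-rate encoded bitstream directly; if it does not, the device may first perform the low-bit-rate decoding and then upsample or downsample the decoded data to obtain the target audio frame. Note that the resampling performed at the encoder and at the decoder correspond to each other: if the encoder downsampled, the decoder may upsample, and if the encoder upsampled, the decoder may downsample. For example, in the description above, the audio sending device downsamples the audio frame and encodes only the low subband; correspondingly, after decoding the low-bit-rate encoded bitstream, the audio receiving device lacks the high-subband data and therefore upsamples by zero-filling to obtain the target audio frame.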
(2)解码方式为高码率解码方式
音频接收设备对高码率编码码流进行高码率解码处理。
在一种可能的实现方式中,音频接收设备可以先判断高码率解码处理是否支持高码率编码码流对应的采样率,当高码率解码处理支持该采样率时,可以直接对高码率编码码流进行高码率解码处理;或者,当高码率解码处理不支持该采样率时,可以先对高码率编码码流进行高码率解码处理,然后对解码后的数据进行上采样或下采样处理以获得目标音频帧。
(3)解码方式为低码率解码向高码率解码切换解码方式
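The audio receiving device performs low-bit-rate decoding on the low-bit-rate encoded bitstream to obtain second data, and high-bit-rate decoding on the high-bit-rate encoded bitstream to obtain first data. It may then smooth the tail of the second data against the head of the first data to ensure a smooth low-to-high transition. The smoothed span is N samples long: the last N samples of the second data and the first N samples of the first data are combined by weighted averaging to obtain N smoothed samples, and the target audio frame is obtained from the N smoothed samples and the second data excluding its last N samples. Figure 5a is an example schematic diagram of the data smoothing in this application. As shown in Figure 5a, the samples between the two dashed lines are the N samples to be smoothed; the sloped line across the N samples of the second data shows how the weight of the second data changes, and the sloped line across the N samples of the first data shows how the weight of the first data changes. Because this is the low-to-high switching decoding mode, the second data from the low-bit-rate decoding comes first and the first data from the high-bit-rate decoding comes after it, and the target audio frame consists of the leading part of the second data followed by the N smoothed samples. Let the last N samples of the second data be $a_i$, the first N samples of the first data be $b_i$, and the N smoothed samples be $c_i$, with $i = 1, 2, 3, \ldots, N$; the smoothed samples may be computed as follows:
[Equation image PCTCN2021125760-appb-000003: weighted-average expression for $c_i$; not reproduced]
$c_1 = a_1$
$c_N = b_N$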
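The general expression for the smoothed samples is an image in the source, so the exact weights are not known here. One weighting that satisfies both stated end conditions is a linear crossfade, $c_i = \frac{N-i}{N-1} a_i + \frac{i-1}{N-1} b_i$; the sketch below uses it as an assumption, not as the application's formula, and the function names are illustrative.

```python
def crossfade(a_tail, b_head):
    """Blend two equal-length sample lists with linearly changing weights."""
    n = len(a_tail)
    if n != len(b_head) or n < 2:
        raise ValueError("need two lists of the same length N >= 2")
    out = []
    for i in range(n):            # i = 0 .. N-1 (the text indexes 1 .. N)
        w_b = i / (n - 1)         # weight of b rises from 0 to 1
        out.append((1 - w_b) * a_tail[i] + w_b * b_head[i])
    return out

def smooth_switch_frame(outgoing, incoming, n):
    """Outgoing data without its last n samples, followed by n smoothed samples."""
    return outgoing[:-n] + crossfade(outgoing[-n:], incoming[:n])
```

For a low-to-high switch, `outgoing` would be the second data and `incoming` the first data; for a high-to-low switch the roles are reversed.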
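Likewise, the audio receiving device may first determine, as described above, whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate; details are not repeated here.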
(4)解码方式为高码率解码向低码率解码切换解码方式
音频接收设备对高码率编码码流进行高码率解码处理以获得第一数据,并对低码率编码码流进行低码率解码处理以获得第二数据。得到第一数据和第二数据之后,音频接收设备可以对第一数据的后端数据和第二数据的前端数据进行平滑处理,以确保高低码率之间的平滑切换,平滑的数据长度为N个样点数据,即将第一数据的后N个样点数据与第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据,根据第一数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。图5b为本申请数据平滑处理的一个示例性的示意图,如图5b所示,两条虚线之间的样点数据为需要进行平滑处理的N个样点数据,第一数据中的N个样点数据内的斜线表示第一数据的权重变化,第二数据中的N个样点数据内的斜线表示第二数据的权重变化。由于是高码率解码向低码率解码切换解码方式,因此高码率解码处理的第一数据在前,低码率解码处理的第二数据在后,并且目标音频帧的数据包含第一数据的前部分数据,以及平滑后的N个样点平滑数据。N个样点平滑数据的计算方法可以参照上述描述,此处不再赘述。
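Likewise, the audio receiving device may first determine, as described above, whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate; details are not repeated here.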
2、低码率编码处理和高码率编码处理各自的帧长不相同
(1)解码方式为低码率解码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
音频接收设备对低码率编码码流进行低码率解码处理。
在一种可能的实现方式中,音频接收设备可以先判断低码率解码处理是否支持低码率编码码流对应的采样率,当低码率解码处理支持该采样率时,可以直接对低码率编码码流进行低码率解码处理;或者,当低码率解码处理不支持该采样率时,可以先对低码率编码码流进行低码率解码处理,然后对解码后的数据进行上采样或下采样处理以获得目标音频帧。
B、低码率编码处理的帧长小于高码率编码处理的帧长
音频接收设备对低码率编码码流进行低码率解码处理以获得第二数据。由于低码率解码处理的帧长小于高码率解码处理的帧长,为了使低码率解码处理和高码率解码处理得到的音频帧对齐,音频接收设备可以在得到第二数据后,将低码率解码处理对应的第二数据队列从队头溢出M个样点数据,并将第二数据放入第二数据队列的队尾,然后从第二数据队列的队头提取M个样点数据以获得目标音频帧。该第二数据队列遵循先入先出的原则。M与高码率解码处理的帧长相关联,例如,M=处理帧长×声道数,处理帧长是高码率解码处理的帧长。图6为本申请数据队列的一个示例性的示意图,如图6所示,假设第二数据队列的长度为n+M,按照先入先出的原则,从队头溢出M个样点数据,这样第二数据队列的队尾空出M个数据的位置,然后将第二数据放入第二数据队列的队尾,然后从第二数据队列的队头提取M个样点数据即为目标音频帧。例如,高码率解码处理对应的帧长大于低码率解码处理对应的帧长,且前者是后者的四倍,高码率解码处理对应的第一数据队列的长度A=帧长×声道数,低码率解码处理对应的第二数据队列的长度B=帧长×声道数+(高码率的编解码的总时延-低码率的编解码的总时延)×声道数。
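Likewise, the audio receiving device may first determine, as described above, whether the low-bit-rate decoding supports the sampling rate; details are not repeated here.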
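The queue handling described for this case can be sketched with a double-ended queue. The class below is an illustration under stated assumptions: the queue starts filled with zeros, each call supplies exactly M decoded samples, and "extract" is read as copying the head samples without removing them (they are discarded by the overflow step of the next call). The class and method names are hypothetical.

```python
from collections import deque

class AlignmentQueue:
    """FIFO of delay + m samples used to time-align one decoder's output."""

    def __init__(self, delay, m):
        self.m = m
        self.samples = deque([0.0] * (delay + m))  # assumption: starts as zeros

    def process(self, decoded):
        """Overflow m samples, append m new ones, return the m at the head."""
        if len(decoded) != self.m:
            raise ValueError("expected exactly m decoded samples")
        for _ in range(self.m):
            self.samples.popleft()          # overflow from the head
        self.samples.extend(decoded)        # enqueue at the tail
        # Assumption: "extract" copies the head samples without removing them.
        return [self.samples[i] for i in range(self.m)]
```

Under this reading the queue behaves as a fixed delay line: each output frame is the input stream shifted later by `delay` samples, which is what lets the two decoders' outputs line up.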
(2)解码方式为高码率解码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
音频接收设备对高码率编码码流进行高码率解码处理以获得第一数据。由于低码率编码处理的帧长大于高码率编码处理的帧长,为了使低码率解码处理和高码率解码处理得到的音频帧对齐,音频接收设备可以在得到第一数据后,将高码率解码处理对应的第一数据队列从队头溢出M个样点数据,并将第一数据放入第一数据队列的队尾,然后从第一数据队列的队头提取M个样点数据以获得目标音频帧。该第一数据队列遵循先入先出的原则。M与低码率解码处理的帧长相关联。
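Likewise, the audio receiving device may first determine, as described above, whether the high-bit-rate decoding supports the sampling rate; details are not repeated here.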
B、低码率编码处理的帧长小于高码率编码处理的帧长
音频接收设备对高码率编码码流进行高码率解码处理。
在一种可能的实现方式中,音频接收设备可以先判断高码率解码处理是否支持高码率编码码流对应的采样率,当高码率解码处理支持该采样率时,可以直接对高码率编码码流进行高码率解码处理;或者,当高码率解码处理不支持该采样率时,可以先对高码率编码码流进行高码率解码处理,然后对解码后的数据进行上采样或下采样处理以获得目标音频帧。
(3)解码方式为低码率解码向高码率解码切换解码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
当上一帧音频帧的解码方式不是低码率解码向高码率解码切换解码方式时,将高码率解码处理对应的第一数据队列置为全0,第一数据队列遵循先入先出的原则;对低码率编码码流进行低码率解码处理以获得第二数据;对高码率编码码流进行高码率解码处理以获得第一数据;将第一数据队列从队头溢出M个样点数据,并将第一数据放入第一数据队列的队尾,M与低码率解码处理的帧长相关联;从第一数据队列的队头提取M个样点数据以获得第三数据;将第二数据的后N个样点数据与第三数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第二数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。
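As described above, the audio receiving device may first determine whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate of the corresponding encoded bitstream; details are not repeated here.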
B、低码率编码处理的帧长小于高码率编码处理的帧长
当上一帧音频帧的解码方式不是低码率解码向高码率解码切换解码方式时,对低码率编码码流进行低码率解码处理以获得第二数据;将低码率解码处理对应的第二数据队列从队头溢出M个样点数据,并将第二数据放入第二数据队列的队尾,第二数据队列遵循先入先出的原则,M与高码率解码处理的帧长相关联;从第二数据队列的队头提取M个样点数据以获得第四数据;对高码率编码码流进行高码率解码处理以获得第一数据;将第四数据的后N个样点数据与第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第四数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。或者,当上一帧音频帧的解码方式是低码率解码向高码率解码切换解码方式时,将第二数据队列从队头溢出M个样点数据;从第二数据队列的队头提取M个样点数据以获得第四数据;对高码率编码码流进行高码率解码处理以获得第一数据;将第四数据的后N个样点数据与第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第四数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。
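As described above, the audio receiving device may first determine whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate of the corresponding encoded bitstream; details are not repeated here.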
(4)解码方式为高码率解码向低码率解码切换解码方式
A、低码率编码处理的帧长大于高码率编码处理的帧长
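When the decoding mode of the previous audio frame is not the high-to-low switching decoding mode: perform high-bit-rate decoding on the high-bit-rate encoded bitstream to obtain first data; overflow M samples from the head of the first data queue corresponding to the high-bit-rate decoding and place the first data at the tail of the first data queue, where the first data queue is first-in-first-out and M is associated with the frame length of the low-bit-rate decoding; extract M samples from the head of the first data queue to obtain third data; perform low-bit-rate decoding on the low-bit-rate encoded bitstream to obtain second data; compute a weighted average of the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the N smoothed samples and the third data excluding its last N samples. Alternatively, when the decoding mode of the previous audio frame is the high-to-low switching decoding mode: overflow M samples from the head of the first data queue; extract M samples from the head of the first data queue to obtain third data; perform low-bit-rate decoding on the low-bit-rate encoded bitstream to obtain second data; compute a weighted average of the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the N smoothed samples and the third data excluding its last N samples.
As described above, the audio receiving device may first determine whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate of the corresponding encoded bitstream; details are not repeated here.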
B、低码率编码处理的帧长小于高码率编码处理的帧长
当上一帧音频帧的解码方式不是高码率解码向低码率解码切换解码方式时,将低码率解码处理对应的第二数据队列置为全0,第二数据队列遵循先入先出的原则;对高码率编码码流进行高码率解码处理以获得第一数据;对低码率编码码流进行低码率解码处理以获得第二数据;将第二数据队列从队头溢出M个样点数据,并将第二数据放入第二数据队列的队尾,M与高码率解码处理的帧长相关联;从第二数据队列的队头提取M个样点数据以获得第四数据;将第一数据的后N个样点数据与第四数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据第一数据除后N个样点数据外的其他数据和N个样点平滑数据获得目标音频帧。
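As described above, the audio receiving device may first determine whether the low-bit-rate decoding and the high-bit-rate decoding each support the sampling rate of the corresponding encoded bitstream; details are not repeated here.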
本申请基于设定码率和上一帧音频帧的最终编码方式确定当前音频帧的最终编码方式,并且据此设定发生高低码率切换的音频帧上的低码率编码处理和高码率编码处理各自对应的码率,编码端将码流信息发送给解码端;解码端解析码流信息获取解码方式,进而对码流数据进行解码,尤其是在发生高低码率切换的音频帧上对高低码率解码后的数据进行平滑处理,以实现低码率编解码处理和高码率编解码处理的无缝融合,使得在符合蓝牙信道对数据传输大小限制的前提下,最大化保证音频的音质,提高蓝牙信道的抗干扰能力,带给用户更优化的音频体验。
以下采用几个具体的实施例对本申请提供的音频编解码方法作进一步描述。
实施例一
低码率编解码处理支持64~300kbps,高码率编解码处理支持300~990kbps,二者支持的位深为16bit、24bit、32bit浮点或者32bit定点,均支持单声道和双声道的音频信号,低码率编解码处理支持的采样率包括44.1kHz和48kHz,不支持的采样率包括88.2kHz和96kHz。
低码率编解码处理和高码率编解码处理的帧长一致,编解码的总时延一致,相邻音频帧之间存在部分交叠(overlap),因此采取并行融合策略,即当进行低码率编码处理和高码率编码处理切换时,两种编码处理同时运行,来保证音频流的连续。为了叙述方便,设定低码率编解码处理为A,高码率编解码处理为B。
本实施例可以参照表1所示,确定当前音频帧的最终编码方式,此处不再赘述。而在A和B同时运行的音频帧,可以采用上述方法确定A和B的采样率,此处不再赘述。
A和/或B的编码处理和解码处理可以参照图3所示实施例,此处不再赘述。
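For the sampling rates 88.2 kHz and 96 kHz, which A does not support, the encoder may downsample using QMF. The band corresponding to 88.2 kHz (0 to 44.1 kHz) is split into two subbands, 0 to 22.05 kHz and 22.05 to 44.1 kHz, and the low subband (0 to 22.05 kHz) is passed to the low-bit-rate encoding. The band corresponding to 96 kHz (0 to 48 kHz) is split into two subbands, 0 to 24 kHz and 24 to 48 kHz, and the low subband (0 to 24 kHz) is passed to the low-bit-rate encoding. The decoder may use QMF to upsample by zero-filling the high subband of the decoded data (22.05 kHz to 44.1 kHz for 88.2 kHz, and 24 kHz to 48 kHz for 96 kHz), so that the upsampled data matches the sampling rate of the original audio.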
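The band arithmetic in this embodiment can be checked with a few lines. The sketch below does not implement QMF filtering; it only decides whether a frame needs resampling before the low-bit-rate encoder and reports the low-subband edges. The set of supported rates is taken from this embodiment, and the function names are hypothetical.

```python
# Sampling rates (Hz) that the low-bit-rate codec supports in this embodiment.
LOW_RATE_SUPPORTED_HZ = {44100, 48000}

def qmf_low_subband(sample_rate_hz):
    """Return (low_edge_hz, high_edge_hz) of the lower of two equal subbands."""
    nyquist = sample_rate_hz / 2   # the signal band is 0 .. fs/2
    return (0.0, nyquist / 2)      # lower half of that band

def plan_low_rate_encoding(sample_rate_hz):
    """Describe how a frame would be fed to the low-bit-rate encoder."""
    if sample_rate_hz in LOW_RATE_SUPPORTED_HZ:
        return {"resample": False, "encode_rate_hz": sample_rate_hz}
    # Keeping only the low subband halves the effective sampling rate.
    return {"resample": True,
            "encode_rate_hz": sample_rate_hz // 2,
            "band_hz": qmf_low_subband(sample_rate_hz)}
```

For 88.2 kHz input this reports a low subband of 0 to 22.05 kHz and an effective rate of 44.1 kHz, which is one of the rates the low-bit-rate codec supports.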
实施例二
低码率编解码处理支持64~300kbps,高码率编解码处理支持300~990kbps,二者支持的位深为16bit、24bit、32bit浮点或者32bit定点,均支持单声道和双声道的音频信号,低码率编解码处理和高码率编解码处理均支持的采样率包括44.1kHz、48kHz、88.2kHz和96kHz。
为了叙述方便,设定低码率编解码处理为A,高码率编解码处理为B。低码率编解码处理和高码率编解码处理的帧长不一致,例如,A的帧长为1024个样点数据,B的帧长为256个样点数据。A和B的编解码的总时延不一致,例如,A延迟2048个样点数据,B在采样率大于等于88.2kHz时为11个样点数据的延迟,在采样率低于88.2kHz时为5个样点数据的延迟。因此采取并行融合策略,即当进行低码率编码处理和高码率编码处理切换时,两种编码处理同时运行,来保证音频流的连续。
本实施例可以参照表2所示,确定当前音频帧的最终编码方式,此处不再赘述。而在A和B同时运行的音频帧,可以采用上述方法确定A和B的采样率,此处不再赘述。
图7为本申请音频帧的编解码处理的一个示例性的示意图,如图7所示,设定切换帧的个数为3,音频帧采用单声道,已确定帧序列中的各个音频帧的最终编码方式的顺序为
(A)(A)(A)(A→B)(A→B)(A→B)(B)(B)(B)(B→A)(B→A)(B→A)(A)
A处理一个音频帧(长度为1024个样点数据),B需要处理4个音频帧(长度为256个样点数据)拼成1024个样点数据。
在编码端,按照上述顺序,A连续编码三个音频帧。接下来三个切换帧(A→B),其中,第一个切换帧,A和B同时运行,且B只编码该音频帧的后一半数据;第二个切换帧,A和B仍然同时运行,A编码的数据全部置为0,B对整帧数据编码;第三个切换帧,A停止运行,B对整帧数据编码。接下来B连续编码三个音频帧。接下来又是三个切换帧(B→A),其中,第一个切换帧,A和B同时运行,且B只编码该音频帧的前一半数据;第二个和第三个切换帧,A对整帧数据编码,B停止运行。接下来A编码一个音频帧。
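At the decoder, following the order above, A decodes three consecutive audio frames. Next come three switch frames (A→B). On the first switch frame, the first data queue corresponding to B is first set to all zeros; A and B both run on this frame, and B decodes only the second half of the frame's data. 1024 samples are overflowed from the head of the first data queue, and the first data decoded by B is placed at the tail of the first data queue. 1024 samples are then extracted from the head of the first data queue and smoothed with the second data decoded by A to obtain the audio frame. On the second and third switch frames, A and B still both run, and both decode whole frames. On each of these two frames, 1024 samples are overflowed from the head of the first data queue, and the first data decoded by B is placed at the tail of the first data queue; 1024 samples are then extracted from the head of the first data queue and smoothed with the second data decoded by A to obtain the audio frame. Next, B decodes three consecutive audio frames. Then come another three switch frames (B→A). On the first of these, A and B both run, and B decodes only the first half of the frame's data. 1024 samples are overflowed from the head of the first data queue corresponding to B, and the first data decoded by B is placed at the tail of the first data queue; 1024 samples are then extracted from the head of the first data queue and smoothed with the second data decoded by A to obtain the audio frame. On the second and third switch frames, A decodes whole frames and B stops running. On each of these two frames, 1024 samples are overflowed from the head of the first data queue; 1024 samples are then extracted from the head of the first data queue and smoothed with the second data decoded by A to obtain the audio frame. Finally, A decodes one audio frame.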
图8为本申请音频编码装置实施例的结构示意图,如图8所示,该装置可以应用于上述实施例中的音频发送设备。本实施例的编码装置可以包括:获得模块801、确定模块802和编码模块803。其中,
获得模块801,用于获得待编码的当前音频帧的设定码率和上一帧音频帧的最终编码方式,所述最终编码方式包括第一码率编码方式、第二码率编码方式、第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式,其中,第一码率低于第二码率;确定模块802,用于根据所述设定码率和所述上一帧音频帧的最终编码方式确定所述当前音频帧的最终编码方式;编码模块803,用于根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码。
在一种可能的实现方式中,当第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述确定模块802,具体用于当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;其中,所述设定阈值的取值与音频帧的声道数相关联。
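In a possible implementation, when the frame length of the first-bit-rate encoding differs from the frame length of the second-bit-rate encoding, the determining module 802 is specifically configured to: when the set bit rate is less than a set threshold and the final encoding mode of the previous audio frame is the first-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the first-bit-rate encoding mode; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the second-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the second-to-first-bit-rate switching encoding mode and start a first counter, where the initial value of the first counter is a first set value and the first counter terminates when its value is 0; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the second-bit-rate encoding mode; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the first-bit-rate encoding mode, determine that the final encoding mode of the current audio frame is the first-to-second-bit-rate switching encoding mode and start a second counter, where the initial value of the second counter is the first set value and the second counter terminates when its value is 0; or, when the final encoding mode of the previous audio frame is the second-to-first-bit-rate switching encoding mode and the value of the first counter is greater than 0, decrement the first counter by 1, and then, if the value of the first counter is still greater than 0, determine that the final encoding mode of the current audio frame is the second-to-first-bit-rate switching encoding mode, or, if the value of the first counter is 0, determine that the final encoding mode of the current audio frame is the first-bit-rate encoding mode; or, when the final encoding mode of the previous audio frame is the first-to-second-bit-rate switching encoding mode and the value of the second counter is greater than 0, decrement the second counter by 1, and then, if the value of the second counter is still greater than 0, determine that the final encoding mode of the current audio frame is the first-to-second-bit-rate switching encoding mode, or, if the value of the second counter is 0, determine that the final encoding mode of the current audio frame is the second-bit-rate encoding mode; where the value of the set threshold is associated with the number of channels of the audio frame.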
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述编码模块803,具体用于对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块803,具体用于当所述第一计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值小于所述第一设定值时,对所述当前音频帧进行第一码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块803,具体用于当所述第二计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值等于1时,对所述当前音频帧进行第二码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块803,具体用于当所述第一计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值等于1时,对所述当前音频帧进行第一码率编码处理。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块803,具体用于当所述第二计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值小于所述第一设定值时,对所述当前音频帧进行第二码率编码处理。
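In a possible implementation, the encoding module 803 is specifically configured to: when the first-bit-rate encoding supports the sampling rate of the current audio frame, perform the first-bit-rate encoding on the current audio frame; or, when the first-bit-rate encoding does not support the sampling rate of the current audio frame, downsample or upsample the current audio frame to obtain a resampled current audio frame, and perform the first-bit-rate encoding on the resampled current audio frame, where the first-bit-rate encoding supports the sampling rate of the resampled current audio frame.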
在一种可能的实现方式中,所述编码模块803,具体用于当所述第二码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第二码率编码处理;或者,当所述第二码率编码处理不支持所述当前音频帧的采样率时,对所述当前音频帧进行下采样或上采样处理以获得下采样或上采样后的当前音频帧,对所述下采样或上采样后的当前音频帧进行所述第二码率编码处理,所述第二码率编码处理支持所述下采样或上采样后的当前音频帧的采样率。
在一种可能的实现方式中,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述确定模块802,还用于根据所述上一帧音频帧的设定码率和所述当前音频帧的设定码率确定第一码率编码处理对应的第一码率和第二码率编码处理对应的第二码率,所述第一码率和所述第二码率之和为所述当前音频帧的设定码率;所述编码模块803,具体用于以所述第一码率对所述当前音频帧进行所述第一码率编码处理;以所述第二码率对所述当前音频帧进行所述第二码率编码处理。
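In a possible implementation, the bitstream information corresponding to the encoded current audio frame includes packet header information and a first bit rate encoded bitstream and/or a second bit rate encoded bitstream, where the packet header information includes the final encoding mode of the current audio frame, the sampling rate, the number of channels, the frame length, and the length of the first bit rate encoded bitstream.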
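A packet of this shape can be illustrated with Python's `struct` module. The field widths, byte order and numeric mode codes below are assumptions chosen for the sketch; the application names the header fields but not their layout.

```python
import struct

# Hypothetical layout: mode (1 byte), sampling rate (4), channels (1),
# frame length (2), length of the first bit rate bitstream (2); big-endian.
HEADER = struct.Struct(">BIBHH")


def pack_frame(mode, sample_rate, channels, frame_len, first_stream=b"", second_stream=b""):
    header = HEADER.pack(mode, sample_rate, channels, frame_len, len(first_stream))
    return header + first_stream + second_stream


def unpack_frame(packet):
    mode, sample_rate, channels, frame_len, first_len = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    return {
        "mode": mode,
        "sample_rate": sample_rate,
        "channels": channels,
        "frame_len": frame_len,
        # The stored length is what lets the receiver split the two bitstreams.
        "first_stream": body[:first_len],
        "second_stream": body[first_len:],
    }
```

Carrying only the length of the first bit rate bitstream is sufficient, because the second bitstream is whatever remains of the payload.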
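The apparatus of this embodiment may be used to execute the technical solution of the method embodiment shown in FIG. 3. Its implementation principle and technical effects are similar and are not repeated here.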
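FIG. 9 is a schematic structural diagram of an embodiment of an audio decoding apparatus of this application. As shown in FIG. 9, the apparatus may be applied to the audio receiving device in the foregoing embodiments. The decoding apparatus of this embodiment may include an obtaining module 901, a parsing module 902, and a decoding module 903, where: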
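The obtaining module 901 is configured to obtain bitstream information. The parsing module 902 is configured to parse the bitstream information to obtain a decoding mode and an encoded bitstream, where the encoded bitstream includes a first bit rate encoded bitstream and/or a second bit rate encoded bitstream, and the decoding mode is the first bit rate decoding mode, the second bit rate decoding mode, the first-to-second bit rate switching decoding mode, or the second-to-first bit rate switching decoding mode. When the decoding mode is the first bit rate decoding mode, the encoded bitstream includes the first bit rate encoded bitstream; when the decoding mode is the second bit rate decoding mode, the encoded bitstream includes the second bit rate encoded bitstream; when the decoding mode is either switching decoding mode, the encoded bitstream includes both the first bit rate encoded bitstream and the second bit rate encoded bitstream. The decoding module 903 is configured to decode the encoded bitstream according to the decoding mode to obtain a target audio frame.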
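The relationship between the decoding mode and the bitstreams it requires can be sketched as a dispatch function. The string mode labels and the callables `decode_first`, `decode_second` and `blend` are hypothetical placeholders for the two decoders and for the smoothing step applied in the switching modes.

```python
def decode_packet(mode, first_stream, second_stream, decode_first, decode_second, blend):
    """Route the parsed bitstreams to the decoder(s) that the mode calls for."""
    needs_first = mode in ("first", "first_to_second", "second_to_first")
    needs_second = mode in ("second", "first_to_second", "second_to_first")
    if not (needs_first or needs_second):
        raise ValueError(f"unknown decoding mode: {mode!r}")
    if needs_first and first_stream is None:
        raise ValueError("first bit rate bitstream missing")
    if needs_second and second_stream is None:
        raise ValueError("second bit rate bitstream missing")

    if mode == "first":
        return decode_first(first_stream)
    if mode == "second":
        return decode_second(second_stream)
    if mode == "first_to_second":
        # Outgoing signal first, incoming signal second.
        return blend(decode_first(first_stream), decode_second(second_stream))
    return blend(decode_second(second_stream), decode_first(first_stream))
```

Passing the outgoing decoder's output as the first argument of `blend` keeps the smoothing step independent of the switching direction.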
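In a possible implementation, when the decoding mode is the first bit rate decoding mode, the frame length of first bit rate encoding processing is the same as the frame length of second bit rate encoding processing, and the total encoding and decoding delay of the first bit rate encoding processing is the same as that of the second bit rate encoding processing; or, when the decoding mode is the first bit rate decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain the target audio frame.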
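In a possible implementation, when the decoding mode is the first bit rate decoding mode and the frame length of first bit rate decoding processing is less than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; overflow M samples from the head of a second data queue corresponding to the first bit rate decoding processing, and place the second data into the second data queue in first-in-first-out (FIFO) order, where M is associated with the frame length of the second bit rate decoding processing; and extract M samples from the head of the second data queue to obtain the target audio frame.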
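In a possible implementation, when the decoding mode is the second bit rate decoding mode, the frame length of first bit rate encoding processing is the same as the frame length of second bit rate encoding processing, and the total encoding and decoding delay of the first bit rate encoding processing is the same as that of the second bit rate encoding processing; or, when the decoding mode is the second bit rate decoding mode and the frame length of first bit rate decoding processing is less than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain the target audio frame.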
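In a possible implementation, when the decoding mode is the second bit rate decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflow M samples from the head of a first data queue corresponding to the second bit rate decoding processing, and place the first data into the first data queue in FIFO order, where M is associated with the frame length of the first bit rate decoding processing; and extract M samples from the head of the first data queue to obtain the target audio frame.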
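The queue handling in the two FIFO cases above behaves like a delay line that aligns the output of the shorter-frame codec with the other codec. The sketch below rests on several assumptions that the text does not state: each decoded block carries exactly M samples, the queue holds `delay + M` samples and starts out zero-filled, and "extracting" M samples reads them without removing them. The class name `DelayQueue` is hypothetical.

```python
from collections import deque


class DelayQueue:
    """FIFO sample queue that delays decoded audio by a fixed number of samples."""

    def __init__(self, delay, m):
        self.m = m
        self.buf = deque([0.0] * (delay + m))  # assumption: zero-filled at start

    def push_and_read(self, decoded):
        if len(decoded) != self.m:
            raise ValueError("this sketch expects exactly M samples per block")
        for _ in range(self.m):
            self.buf.popleft()                 # overflow M samples from the head
        self.buf.extend(decoded)               # new data enters at the tail
        return [self.buf[i] for i in range(self.m)]  # read M samples from the head
```

Under these assumptions the returned frame equals the decoder output delayed by `delay` samples, which is what keeps the two decoding paths time-aligned.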
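In a possible implementation, when the decoding mode is the first-to-second bit rate switching decoding mode, the frame length of first bit rate encoding processing is the same as the frame length of second bit rate encoding processing, and the total encoding and decoding delay of the first bit rate encoding processing is the same as that of the second bit rate encoding processing, the decoding module 903 is specifically configured to: decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; perform weighted averaging on the last N samples of the second data and the first N samples of the first data to obtain N smoothed samples, where N is a second set value; and obtain the target audio frame from the data in the second data other than the last N samples together with the N smoothed samples.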
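The weighted averaging step can be sketched as a short crossfade. The linear weights are an assumption, since the text specifies a weighted average but not the weights, and the names `crossfade_switch`, `outgoing` and `incoming` are hypothetical; `outgoing` stands for the data whose last N samples are smoothed and `incoming` for the data that contributes its first N samples.

```python
def crossfade_switch(outgoing, incoming, n):
    """Replace the last n samples of `outgoing` by a blend with the first n of `incoming`."""
    if not 0 < n <= min(len(outgoing), len(incoming)):
        raise ValueError("n must be positive and no longer than either block")
    tail = outgoing[-n:]
    head = incoming[:n]
    smooth = []
    for i in range(n):
        w = (i + 1) / (n + 1)                  # assumed linear weight of the incoming signal
        smooth.append((1 - w) * tail[i] + w * head[i])
    # Target frame: untouched part of `outgoing` followed by the n smoothed samples.
    return outgoing[:-n] + smooth
```

The output has the same length as `outgoing`, so the switching frame occupies the same duration as an ordinary frame.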
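In a possible implementation, when the decoding mode is the first-to-second bit rate switching decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the first-to-second bit rate switching decoding mode, set a first data queue corresponding to the second bit rate decoding processing to all zeros, where the first data queue follows the FIFO principle; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflow M samples from the head of the first data queue and place the first data at the tail of the first data queue, where M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; perform weighted averaging on the last N samples of the second data and the first N samples of the third data to obtain N smoothed samples; and obtain the target audio frame from the data in the second data other than the last N samples together with the N smoothed samples.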
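In a possible implementation, when the decoding mode is the first-to-second bit rate switching decoding mode and the frame length of first bit rate decoding processing is less than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the first-to-second bit rate switching decoding mode, decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; overflow M samples from the head of a second data queue corresponding to the first bit rate decoding processing, and place the second data into the second data queue in FIFO order, where M is associated with the frame length of the second bit rate decoding processing; extract M samples from the head of the second data queue to obtain fourth data; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; perform weighted averaging on the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the data in the fourth data other than the last N samples together with the N smoothed samples. Alternatively, when the decoding mode of the previous audio frame is the first-to-second bit rate switching decoding mode, the decoding module 903 is configured to: overflow M samples from the head of the second data queue; extract M samples from the head of the second data queue to obtain fourth data; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; perform weighted averaging on the last N samples of the fourth data and the first N samples of the first data to obtain N smoothed samples; and obtain the target audio frame from the data in the fourth data other than the last N samples together with the N smoothed samples.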
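In a possible implementation, when the decoding mode is the second-to-first bit rate switching decoding mode, the frame length of first bit rate encoding processing is the same as the frame length of second bit rate encoding processing, and the total encoding and decoding delay of the first bit rate encoding processing is the same as that of the second bit rate encoding processing, the decoding module 903 is specifically configured to: decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; perform weighted averaging on the last N samples of the first data and the first N samples of the second data to obtain N smoothed samples, where N is a second set value; and obtain the target audio frame from the data in the first data other than the last N samples together with the N smoothed samples.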
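In a possible implementation, when the decoding mode is the second-to-first bit rate switching decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the second-to-first bit rate switching decoding mode, decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflow M samples from the head of a first data queue corresponding to the second bit rate decoding processing, and place the first data into the first data queue in FIFO order, where M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; perform weighted averaging on the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the data in the third data other than the last N samples together with the N smoothed samples. Alternatively, when the decoding mode of the previous audio frame is the second-to-first bit rate switching decoding mode, the decoding module 903 is configured to: overflow M samples from the head of the first data queue; extract M samples from the head of the first data queue to obtain third data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; perform weighted averaging on the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the data in the third data other than the last N samples together with the N smoothed samples.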
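In a possible implementation, when the decoding mode is the second-to-first bit rate switching decoding mode and the frame length of first bit rate decoding processing is less than the frame length of second bit rate decoding processing, the decoding module 903 is specifically configured to: when the decoding mode of the previous audio frame is not the second-to-first bit rate switching decoding mode, set a second data queue corresponding to the first bit rate decoding processing to all zeros, where the second data queue follows the FIFO principle; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; overflow M samples from the head of the second data queue and place the second data at the tail of the second data queue, where M is associated with the frame length of the second bit rate decoding processing; extract M samples from the head of the second data queue to obtain fourth data; perform weighted averaging on the last N samples of the first data and the first N samples of the fourth data to obtain N smoothed samples; and obtain the target audio frame from the data in the first data other than the last N samples together with the N smoothed samples.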
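In a possible implementation, the decoding module 903 is specifically configured to: determine whether the first bit rate decoding processing supports the sampling rate corresponding to the first bit rate encoded bitstream; if the first bit rate decoding processing supports the sampling rate, perform the first bit rate decoding processing on the first bit rate encoded bitstream; or, if the first bit rate decoding processing does not support the sampling rate, perform the first bit rate decoding processing on the first bit rate encoded bitstream to obtain fifth data, and perform upsampling or downsampling on the fifth data.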
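In a possible implementation, the decoding module 903 is specifically configured to: determine whether the second bit rate decoding processing supports the sampling rate corresponding to the second bit rate encoded bitstream; if the second bit rate decoding processing supports the sampling rate, perform the second bit rate decoding processing on the second bit rate encoded bitstream; or, if the second bit rate decoding processing does not support the sampling rate, perform the second bit rate decoding processing on the second bit rate encoded bitstream to obtain sixth data, and perform upsampling or downsampling on the sixth data.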
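The apparatus of this embodiment may be used to execute the technical solution of the method embodiment shown in FIG. 3. Its implementation principle and technical effects are similar and are not repeated here.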
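In an implementation process, the steps of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in a processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware encoding processor, or by a combination of hardware and software modules in an encoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in a memory; the processor reads information from the memory and completes the steps of the foregoing methods in combination with its hardware.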
The memory mentioned in the foregoing embodiments may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
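A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.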
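A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.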
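In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.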
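The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.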
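In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.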
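If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.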
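The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.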

Claims (25)

  1. 一种音频编码方法,其特征在于,包括:
    获得待编码的当前音频帧的设定码率和上一帧音频帧的最终编码方式,所述最终编码方式包括第一码率编码方式、第二码率编码方式、第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式,其中,第一码率低于第二码率;
    根据所述设定码率和所述上一帧音频帧的最终编码方式确定所述当前音频帧的最终编码方式;
    根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码。
  2. 根据权利要求1所述的方法,其特征在于:
    当第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述根据所述设定码率和所述上一帧音频帧的最终编码方式确定所述当前音频帧的最终编码方式,包括:
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;
    其中,所述设定阈值的取值与音频帧的声道数相关联;
    或者
    当第一码率编码处理的帧长和第二码率编码处理的帧长不相同时,所述根据所述设定码率和所述上一帧音频帧的最终编码方式确定所述当前音频帧的最终编码方式,包括:
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,
    当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且启动第一计数器,所述第一计数器的初始值为第一设定值,所述第一计数器在值为0时终止;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,
    当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且启动第二计数器,所述第二计数器的初始值为第一设定值,所述第二计数器在值为0时终止;或者,
    当所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一计数器的值大于0时,将所述第一计数器的值减1;若所述第一计数器的值仍大于0,则确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,若所述第一计数器的值为0,则确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,
    当所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第二计数器的值大于0时,将所述第二计数器的值减1;若所述第二计数器的值仍大于0,则确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;或者,若所述第二计数器的值为0,则确定所述当前音频帧的最终编码方式为第二码率编码方式;
    其中,所述设定阈值的取值与音频帧的声道数相关联;
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    对所述当前音频帧进行第一码率编码处理;和
    对所述当前音频帧进行第二码率编码处理。
  3. 根据权利要求2所述的方法,其特征在于,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    当所述第一计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;
    或者,
    当所述第一计数器的值小于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    当所述第二计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;
    或者,
    当所述第二计数器的值等于1时,对所述当前音频帧进行第二码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    当所述第一计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;
    或者,
    当所述第一计数器的值等于1时,对所述当前音频帧进行第一码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    当所述第二计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;
    或者,
    当所述第二计数器的值小于所述第一设定值时,对所述当前音频帧进行第二码率编码处理。
  4. 根据权利要求3所述的方法,其特征在于,所述对所述当前音频帧进行第一码率编码处理,包括:
    当所述第一码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第一码率编码处理;或者,
    当所述第一码率编码处理不支持所述当前音频帧的采样率时,对所述当前音频帧进行下采样或上采样处理以获得下采样或上采样后的当前音频帧,对所述下采样或上采样后的当前音频帧进行所述第一码率编码处理,所述第一码率编码处理支持所述下采样或上采样后的当前音频帧的采样率;
    或者
    所述对所述当前音频帧进行第二码率编码处理,包括:
    当所述第二码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第二码率编码处理;或者,
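    when the second bit rate encoding processing does not support the sampling rate of the current audio frame, performing downsampling or upsampling on the current audio frame to obtain a downsampled or upsampled current audio frame, and performing the second bit rate encoding processing on the downsampled or upsampled current audio frame, wherein the second bit rate encoding processing supports the sampling rate of the downsampled or upsampled current audio frame.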
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码之前,还包括:
    根据所述上一帧音频帧的设定码率和所述当前音频帧的设定码率确定第一码率编码处理对应的第一码率和第二码率编码处理对应的第二码率,所述第一码率和所述第二码率之和为所述当前音频帧的设定码率;
    所述根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码,包括:
    以所述第一码率对所述当前音频帧进行所述第一码率编码处理;
    以所述第二码率对所述当前音频帧进行所述第二码率编码处理。
  6. 根据权利要求1-5中任一项所述的方法,其特征在于,编码后的当前音频帧对应的码流信息包括包头信息、第一码率编码码流和/或第二码率编码码流,其中,所述包头信息包括所述当前音频帧的最终编码方式、采样率、声道数、帧长和所述第一码率编码码流的长度。
  7. 一种音频解码方法,其特征在于,包括:
    获得码流信息;
    解析所述码流信息以获得解码方式和编码码流,所述编码码流包括第一码率编码码流和/或第二码率编码码流,所述解码方式包括第一码率解码方式、第二码率解码方式、第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式,当所述解码方式为第一码率解码方式时,所述编码码流包括第一码率编码码流,当所述解码方式为第二码率解码方式时,所述编码码流包括第二码率编码码流,当所述解码方式为第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式时,所述编码码流包括第一码率编码码流和第二码率编码码流;
    根据所述解码方式对所述编码码流进行解码以获得目标音频帧。
  8. 根据权利要求7所述的方法,其特征在于,当所述解码方式为第一码率解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,或者,当所述解码方式为第一码率解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得所述目标音频帧;
    或者
    当所述解码方式为第一码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;
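    overflowing M samples from the head of a second data queue corresponding to the first bit rate decoding processing, and placing the second data into the second data queue in first-in-first-out (FIFO) order, wherein M is associated with the frame length of the second bit rate decoding processing;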
    从所述第二数据队列的队头提取M个样点数据以获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,或者,当所述解码方式为第二码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;
    将所述第二码率解码处理对应的第一数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第一数据放入所述第一数据队列,M与所述第一码率解码处理的帧长相关联;
    从所述第一数据队列的队头提取M个样点数据以获得所述目标音频帧;
    或者
    当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;
    根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;
    将所述第二数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;
    根据所述第二数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
    当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    当上一帧音频帧的解码方式不是所述第一码率解码向第二码率解码切换解码方式时,将所述第二码率解码处理对应的第一数据队列置为全0,所述第一数据队列遵循先入先出的原则;
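    decoding the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; decoding the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflowing M samples from the head of the first data queue and placing the first data at the tail of the first data queue, wherein M is associated with the frame length of the first bit rate decoding processing; extracting M samples from the head of the first data queue to obtain third data; performing weighted averaging on the last N samples of the second data and the first N samples of the third data to obtain N smoothed samples; and obtaining the target audio frame from the data in the second data other than the last N samples together with the N smoothed samples;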
    或者
    当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    当所述上一帧音频帧的解码方式不是所述第一码率解码向第二码率解码切换解码方式时,根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一码率解码处理对应的第二数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第二数据放入所述第二数据队列,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得第四数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第四数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第四数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者,
    当所述上一帧音频帧的解码方式是所述第一码率解码向第二码率解码切换解码方式时,将所述第二数据队列从队头溢出M个样点数据;从所述第二数据队列的队头提取M个样点数据以获得第四数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第四数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第四数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;
    根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;
    将所述第一数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;
    根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
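    when the decoding mode is the second-to-first bit rate switching decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding the encoded bitstream according to the decoding mode to obtain a target audio frame comprises: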
    当所述上一帧音频帧的解码方式不是所述第二码率解码向第一码率解码切换解码方式时,根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第二码率解码处理对应的第一数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第一数据放入所述第一数据队列,M与所述第一码率解码处理的帧长相关联;从所述第一数据队列的队头提取M个样点数据以获得第三数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第三数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第三数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者,
    当所述上一帧音频帧的解码方式是所述第二码率解码向第一码率解码切换解码方式时,将所述第一数据队列从队头溢出M个样点数据;从所述第一数据队列的队头提取M个样点数据以获得第三数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第三数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第三数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述根据所述解码方式对所述编码码流进行解码以获得目标音频帧,包括:
    当上一帧音频帧的解码方式不是所述第二码率解码向第一码率解码切换解码方式时,将所述第一码率解码处理对应的第二数据队列置为全0,所述第二数据队列遵循先入先出的原则;
    根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第二数据队列从队头溢出M个样点数据,并将所述第二数据放入所述第二数据队列的队尾,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得第四数据;将所述第一数据的后N个样点数据与所述第四数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
  9. 根据权利要求8所述的方法,其特征在于,所述对所述第一码率编码码流进行第一码率解码处理,包括:
    判断所述第一码率解码处理是否支持所述第一码率编码码流对应的采样率;
    若所述第一码率解码处理支持所述采样率,则对所述第一码率编码码流进行所述第一码率解码处理;或者,
    若所述第一码率解码处理不支持所述采样率,则对所述第一码率编码码流进行所述第一码率解码处理以获得第五数据,对所述第五数据进行上采样或下采样处理。
  10. The method according to claim 8, wherein the performing second bit rate decoding processing on the second bit rate encoded bitstream comprises:
    判断所述第二码率解码处理是否支持所述第二码率编码码流对应的采样率;
    若所述第二码率解码处理支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理;或者,
    若所述第二码率解码处理不支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理以获得第六数据,对所述第六数据进行上采样或下采样处理。
  11. 一种音频编码装置,其特征在于,包括:
    获得模块,用于获得待编码的当前音频帧的设定码率和上一帧音频帧的最终编码方式,所述最终编码方式包括第一码率编码方式、第二码率编码方式、第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式,其中,第一码率低于第二码率;
    确定模块,用于根据所述设定码率和所述上一帧音频帧的最终编码方式确定所述当前音频帧的最终编码方式;
    编码模块,用于根据所述当前音频帧的最终编码方式对所述当前音频帧进行编码。
  12. 根据权利要求11所述的装置,其特征在于,当第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述确定模块,具体用于当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式;或者,当所述设定码率小于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第二码率编码方式;或者,当所述设定码率大于设定阈值,且所述上一帧音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式时,确定所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式;其中,所述设定阈值的取值与音频帧的声道数相关联;
    或者
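    when the frame length of first bit rate encoding processing differs from the frame length of second bit rate encoding processing, the determining module is specifically configured to: when the set bit rate is less than a set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the set bit rate is less than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the second-to-first bit rate switching encoding mode, and start a first counter, wherein the initial value of the first counter is a first set value and the first counter stops when its value is 0; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the second bit rate encoding mode, determine that the final encoding mode of the current audio frame is the second bit rate encoding mode; or, when the set bit rate is greater than the set threshold and the final encoding mode of the previous audio frame is the first bit rate encoding mode, determine that the final encoding mode of the current audio frame is the first-to-second bit rate switching encoding mode, and start a second counter, wherein the initial value of the second counter is the first set value and the second counter stops when its value is 0; or, when the final encoding mode of the previous audio frame is the second-to-first bit rate switching encoding mode and the value of the first counter is greater than 0, decrement the first counter by 1, and then, if the value of the first counter is still greater than 0, determine that the final encoding mode of the current audio frame is the second-to-first bit rate switching encoding mode, or, if the value of the first counter is 0, determine that the final encoding mode of the current audio frame is the first bit rate encoding mode; or, when the final encoding mode of the previous audio frame is the first-to-second bit rate switching encoding mode and the value of the second counter is greater than 0, decrement the second counter by 1, and then, if the value of the second counter is still greater than 0, determine that the final encoding mode of the current audio frame is the first-to-second bit rate switching encoding mode, or, if the value of the second counter is 0, determine that the final encoding mode of the current audio frame is the second bit rate encoding mode; wherein the value of the set threshold is associated with the number of channels of the audio frame;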
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述编码模块,具体用于对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理。
  13. 根据权利要求12所述的装置,其特征在于,当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第一计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值小于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长大于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第二计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值等于1时,对所述当前音频帧进行第二码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第二码率编码向第一码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第一计数器的值大于1时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第一计数器的值等于1时,对所述当前音频帧进行第一码率编码处理;
    或者
    当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式,且所述第一码率编码处理的帧长小于所述第二码率编码处理的帧长时,所述编码模块,具体用于当所述第二计数器的值等于所述第一设定值时,对所述当前音频帧进行第一码率编码处理;对所述当前音频帧进行第二码率编码处理;或者,当所述第二计数器的值小于所述第一设定值时,对所述当前音频帧进行第二码率编码处理。
  14. 根据权利要求13所述的装置,其特征在于,所述编码模块,具体用于当所述第一码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第一码率编码处理;或者,当所述第一码率编码处理不支持所述当前音频帧的采样率时,对所述当前音频帧进行下采样或上采样处理以获得下采样或上采样后的当前音频帧,对所述下采样或上采样后的当前音频帧进行所述第一码率编码处理,所述第一码率编码处理支持所述下采样或上采样后的当前音频帧的采样率;
    或者
    所述编码模块,具体用于当所述第二码率编码处理支持所述当前音频帧的采样率时,对所述当前音频帧进行所述第二码率编码处理;或者,当所述第二码率编码处理不支持所述当前音频帧的采样率时,对所述当前音频帧进行下采样或上采样处理以获得下采样或上采样后的当前音频帧,对所述下采样或上采样后的当前音频帧进行所述第二码率编码处理,所述第二码率编码处理支持所述下采样或上采样后的当前音频帧的采样率。
  15. 根据权利要求11-14中任一项所述的装置,其特征在于,当所述当前音频帧的最终编码方式为第一码率编码向第二码率编码切换编码方式或者第二码率编码向第一码率编码切换编码方式时,所述确定模块,还用于根据所述上一帧音频帧的设定码率和所述当前音频帧的设定码率确定第一码率编码处理对应的第一码率和第二码率编码处理对应的第二码率,所述第一码率和所述第二码率之和为所述当前音频帧的设定码率;
    所述编码模块,具体用于以所述第一码率对所述当前音频帧进行所述第一码率编码处理;以所述第二码率对所述当前音频帧进行所述第二码率编码处理。
  16. 根据权利要求11-15中任一项所述的装置,其特征在于,编码后的当前音频帧对应的码流信息包括包头信息、第一码率编码码流和/或第二码率编码码流,其中,所述包头信息包括所述当前音频帧的最终编码方式、采样率、声道数、帧长和所述第一码率编码码流的长度。
  17. 一种音频解码装置,其特征在于,包括:
    获得模块,用于获得码流信息;
    解析模块,用于解析所述码流信息以获得解码方式和编码码流,所述编码码流包括第一码率编码码流和/或第二码率编码码流,所述解码方式包括第一码率解码方式、第二码率解码方式、第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式,当所述解码方式为第一码率解码方式时,所述编码码流包括第一码率编码码流,当所述解码方式为第二码率解码方式时,所述编码码流包括第二码率编码码流,当所述解码方式为第一码率解码向第二码率解码切换解码方式或者第二码率解码向第一码率解码切换解码方式时,所述编码码流包括第一码率编码码流和第二码率编码码流;
    解码模块,用于根据所述解码方式对所述编码码流进行解码以获得目标音频帧。
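  18. The apparatus according to claim 17, wherein, when the decoding mode is the first bit rate decoding mode, the frame length of first bit rate encoding processing is the same as the frame length of second bit rate encoding processing, and the total encoding and decoding delay of the first bit rate encoding processing is the same as that of the second bit rate encoding processing; or, when the decoding mode is the first bit rate decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module is specifically configured to decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain the target audio frame;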
    或者
    当所述解码方式为第一码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一码率解码处理对应的第二数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第二数据放入所述第二数据队列,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,或者,当所述解码方式为第二码率解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码方式,并且第一码率解码处理的帧长大于第二码率解码处理的帧长时,所述解码模块,具体用于根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第二码率解码处理对应的第一数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第一数据放入所述第一数据队列,M与所述第一码率解码处理的帧长相关联;从所述第一数据队列的队头提取M个样点数据以获得所述目标音频帧;
    或者
    当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述解码模块,具体用于根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第二数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;根据所述第二数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
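    when the decoding mode is the first-to-second bit rate switching decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module is specifically configured to: when the decoding mode of the previous audio frame is not the first-to-second bit rate switching decoding mode, set a first data queue corresponding to the second bit rate decoding processing to all zeros, wherein the first data queue follows the first-in-first-out principle; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflow M samples from the head of the first data queue and place the first data at the tail of the first data queue, wherein M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; perform weighted averaging on the last N samples of the second data and the first N samples of the third data to obtain N smoothed samples; and obtain the target audio frame from the data in the second data other than the last N samples together with the N smoothed samples;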
    或者
    当所述解码方式为第一码率解码向第二码率解码切换解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于当所述上一帧音频帧的解码方式不是所述第一码率解码向第二码率解码切换解码方式时,根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一码率解码处理对应的第二数据队列从队头溢出M个样点数据,按照先入先出FIFO方式将所述第二数据放入所述第二数据队列,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得第四数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第四数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第四数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;或者,当所述上一帧音频帧的解码方式是所述第一码率解码向第二码率解码切换解码方式时,将所述第二数据队列从队头溢出M个样点数据;从所述第二数据队列的队头提取M个样点数据以获得第四数据;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;将所述第四数据的后N个样点数据与所述第一数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第四数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
    当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率编码处理的帧长和第二码率编码处理的帧长相同,且所述第一码率编码处理的编解码的总时延与所述第二码率编码处理的编解码的总时延相同时,所述解码模块,具体用于根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第一数据的后N个样点数据与所述第二数据的前N个样点数据进行加权平均以获得N个样点平滑数据,N为第二设定值;根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧;
    或者
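    when the decoding mode is the second-to-first bit rate switching decoding mode and the frame length of first bit rate decoding processing is greater than the frame length of second bit rate decoding processing, the decoding module is specifically configured to: when the decoding mode of the previous audio frame is not the second-to-first bit rate switching decoding mode, decode the second bit rate encoded bitstream according to the second bit rate decoding mode to obtain first data; overflow M samples from the head of a first data queue corresponding to the second bit rate decoding processing, and place the first data into the first data queue in first-in-first-out (FIFO) order, wherein M is associated with the frame length of the first bit rate decoding processing; extract M samples from the head of the first data queue to obtain third data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; perform weighted averaging on the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the data in the third data other than the last N samples together with the N smoothed samples; or, when the decoding mode of the previous audio frame is the second-to-first bit rate switching decoding mode, overflow M samples from the head of the first data queue; extract M samples from the head of the first data queue to obtain third data; decode the first bit rate encoded bitstream according to the first bit rate decoding mode to obtain second data; perform weighted averaging on the last N samples of the third data and the first N samples of the second data to obtain N smoothed samples; and obtain the target audio frame from the data in the third data other than the last N samples together with the N smoothed samples;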
    或者
    当所述解码方式为第二码率解码向第一码率解码切换解码方式,并且第一码率解码处理的帧长小于第二码率解码处理的帧长时,所述解码模块,具体用于当上一帧音频帧的解码方式不是所述第二码率解码向第一码率解码切换解码方式时,将所述第一码率解码处理对应的第二数据队列置为全0,所述第二数据队列遵循先入先出的原则;根据所述第二码率解码方式对所述第二码率编码码流进行解码处理以获得第一数据;根据所述第一码率解码方式对所述第一码率编码码流进行解码处理以获得第二数据;将所述第二数据队列从队头溢出M个样点数据,并将所述第二数据放入所述第二数据队列的队尾,M与所述第二码率解码处理的帧长相关联;从所述第二数据队列的队头提取M个样点数据以获得第四数据;将所述第一数据的后N个样点数据与所述第四数据的前N个样点数据进行加权平均以获得N个样点平滑数据;根据所述第一数据除所述后N个样点数据外的其他数据和所述N个样点平滑数据获得所述目标音频帧。
  19. 根据权利要求18所述的装置,其特征在于,所述解码模块,具体用于判断所述第一码率解码处理是否支持所述第一码率编码码流对应的采样率;若所述第一码率解码处理支持所述采样率,则对所述第一码率编码码流进行所述第一码率解码处理;或者,若所述第一码率解码处理不支持所述采样率,则对所述第一码率编码码流进行所述第一码率解码处理以获得第五数据,对所述第五数据进行上采样或下采样处理。
  20. 根据权利要求18所述的装置,其特征在于,所述解码模块,具体用于判断所述第二码率解码处理是否支持所述第二码率编码码流对应的采样率;若所述第二码率解码处理支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理;或者,若所述第二码率解码处理不支持所述采样率,则对所述第二码率编码码流进行所述第二码率解码处理以获得第六数据,对所述第六数据进行上采样或下采样处理。
  21. 一种音频编码设备,其特征在于,包括:
    一个或多个处理器;
    存储器,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1-6中任一项所述的方法。
  22. 一种音频解码设备,其特征在于,包括:
    一个或多个处理器;
    存储器,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求7-10中任一项所述的方法。
  23. 一种计算机可读存储介质,其特征在于,包括计算机程序,所述计算机程序在计算机上被执行时,使得所述计算机执行权利要求1-10中任一项所述的方法。
  24. 一种计算机可读存储介质,其特征在于,包括根据如权利要求1-6中任一项所述的音频编码方法获得的码流信息。
  25. 一种计算机可读存储介质,其特征在于,包括根据如权利要求7-10中任一项所述的音频解码方法获得的音频帧。
PCT/CN2021/125760 2020-11-11 2021-10-22 音频编解码方法和装置 WO2022100414A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011258196.6A CN114495951A (zh) 2020-11-11 2020-11-11 音频编解码方法和装置
CN202011258196.6 2020-11-11

Publications (1)

Publication Number Publication Date
WO2022100414A1 true WO2022100414A1 (zh) 2022-05-19

Family

ID=81489793

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/125760 WO2022100414A1 (zh) 2020-11-11 2021-10-22 音频编解码方法和装置

Country Status (2)

Country Link
CN (1) CN114495951A (zh)
WO (1) WO2022100414A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174538A (zh) * 2022-06-30 2022-10-11 Oppo广东移动通信有限公司 数据传输方法、装置、电子设备及计算机可读介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114666763B (zh) * 2022-05-24 2022-08-26 东莞市云仕电子有限公司 车载无线耳机系统、控制方法及车载无线系统
CN115223577A (zh) * 2022-07-01 2022-10-21 哲库科技(上海)有限公司 音频处理方法、芯片、装置、设备和计算机可读存储介质

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1121374A (zh) * 1994-02-17 1996-04-24 摩托罗拉公司 减缓通信系统的音频质量下降的方法和装置
EP1020997A2 (en) * 1999-01-12 2000-07-19 Deutsche Thomson-Brandt Gmbh Method for processing and apparatus for encoding audio or video frame data
CN103915100A (zh) * 2013-01-07 2014-07-09 中兴通讯股份有限公司 一种编码模式切换方法和装置、解码模式切换方法和装置
CN104517612A (zh) * 2013-09-30 2015-04-15 上海爱聊信息科技有限公司 基于amr-nb语音信号的可变码率编码器和解码器及其编码和解码方法
CN105225668A (zh) * 2013-05-30 2016-01-06 华为技术有限公司 信号编码方法及设备
CN107342090A (zh) * 2016-04-29 2017-11-10 华为技术有限公司 一种音频信号编码、解码方法及音频信号编码器、解码器
CN107888992A (zh) * 2017-11-17 2018-04-06 北京松果电子有限公司 视频数据传输方法、接收方法、装置、存储介质及设备
CN107948628A (zh) * 2016-10-12 2018-04-20 阿里巴巴集团控股有限公司 一种多维视频数据的编码、解码方法和装置
CN109389987A (zh) * 2017-08-10 2019-02-26 华为技术有限公司 音频编解码模式确定方法和相关产品
CN109600610A (zh) * 2018-11-12 2019-04-09 深圳市景阳科技股份有限公司 一种数据编码方法及终端
CN111816197A (zh) * 2020-06-15 2020-10-23 北京达佳互联信息技术有限公司 音频编码方法、装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN114495951A (zh) 2022-05-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21890941

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21890941

Country of ref document: EP

Kind code of ref document: A1