WO2024021732A1 - Audio encoding/decoding method, apparatus, storage medium and computer program product - Google Patents

Audio encoding/decoding method, apparatus, storage medium and computer program product

Info

Publication number
WO2024021732A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
subbands
group
spectrum
codebooks
Prior art date
Application number
PCT/CN2023/092051
Other languages
English (en)
French (fr)
Inventor
王卓
冯斌
杜春晖
范泛
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2024021732A1

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Definitions

  • the present application relates to the field of audio coding and decoding, and in particular to an audio coding and decoding method, device, storage medium and computer program product.
  • This application provides an audio encoding and decoding method, device, storage medium and computer program product, which can reduce the power consumption of the decoding end.
  • the technical solutions are as follows:
  • an audio decoding method includes:
  • the channel decoding mode is obtained.
  • the first condition includes that the audio signal is a two-channel signal, the encoding bit rate of the audio signal is not less than a bit rate threshold, and the sampling rate of the audio signal is not less than a sampling rate threshold; if the channel decoding mode is the left channel decoding mode, the left channel bit stream in the code stream is decoded to obtain the left channel data of the audio signal, and the left channel data is copied to the right channel; if the channel decoding mode is the right channel decoding mode, the right channel bit stream in the code stream is decoded to obtain the right channel data of the audio signal, and the right channel data is copied to the left channel.
  • the method further includes: if the channel decoding mode is neither the left channel decoding mode nor the right channel decoding mode, decoding both the left channel bit stream and the right channel bit stream to obtain the left channel data and the right channel data.
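The three decoding branches described above can be sketched as follows. This is a minimal illustration only; the function names (`decode_frame`, `decode_channel`) and mode labels are assumptions, not the patent's actual interface.

```python
def decode_frame(mode, left_bitstream, right_bitstream, decode_channel):
    """Decode only the channels the channel decoding mode requires."""
    if mode == "left":
        left = decode_channel(left_bitstream)
        right = list(left)                      # copy left channel data to the right channel
    elif mode == "right":
        right = decode_channel(right_bitstream)
        left = list(right)                      # copy right channel data to the left channel
    else:
        left = decode_channel(left_bitstream)   # decode both channel bit streams
        right = decode_channel(right_bitstream)
    return left, right
```

Skipping one of the two channel bit streams in the first two branches is what saves decoder-side power.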
  • the method further includes: if the audio signal does not meet the first condition and the audio signal is a two-channel signal, decoding the two-channel interleaved bit stream in the code stream to obtain the left channel data and the right channel data.
  • the method further includes: if the audio signal does not meet the first condition and the audio signal is a mono signal, decoding the mono bit stream in the code stream to obtain the mono data of the audio signal.
  • the method further includes: obtaining the total data amount of the code stream; decoding the header of the code stream to obtain the number of channels, the sampling rate and the frame length of the audio signal; and determining, based on the total data amount, the number of channels, the sampling rate and the frame length, whether the audio signal satisfies the first condition.
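As a rough sketch of this determination, the encoding bit rate can be derived from the total data amount of one frame and the frame length. The threshold values below are placeholders borrowed from the low-power-mode example later in the text (300 kbps, 88.2 kHz); the claim itself does not fix them.

```python
def meets_first_condition(total_bytes, num_channels, sample_rate_hz, frame_ms,
                          rate_threshold_bps=300_000, sr_threshold_hz=88_200):
    """Check the first condition from decoded header fields.

    The encoding bit rate is derived from the total data amount of one
    frame and the frame length; thresholds are illustrative only.
    """
    if num_channels != 2:                              # must be a two-channel signal
        return False
    bitrate_bps = total_bytes * 8 * 1000 / frame_ms    # bits per second
    return bitrate_bps >= rate_threshold_bps and sample_rate_hz >= sr_threshold_hz
```

For example, a 500-byte frame at 10 ms corresponds to 400 kbps, which clears the illustrative 300 kbps threshold.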
  • the method further includes: decoding a side information bit stream in the code stream to obtain the side information, where the side information includes an encoding codebook identifier; and determining, based on the encoding codebook identifier, the target decoding codebook required for decoding from among multiple decoding codebooks.
  • the multiple decoding codebooks are as follows:
  • an audio coding method includes:
  • the left channel data of the audio signal is encoded into the code stream, and the right channel data of the audio signal is encoded into the code stream.
  • the first condition includes that the audio signal is a two-channel signal, the encoding bit rate of the audio signal is not less than a bit rate threshold, and the sampling rate of the audio signal is not less than a sampling rate threshold.
  • the method further includes: if the audio signal does not meet the first condition and the audio signal is a two-channel signal, encoding the left channel data and the right channel data of the audio signal into the code stream by two-channel interleaved coding.
  • the method further includes: if the audio signal does not meet the first condition and the audio signal is a mono signal, encoding the mono data of the audio signal into the code stream.
  • encoding the left channel data of the audio signal into a code stream includes:
  • the quantization level measurement factor represents the average number of bits required to encode each spectrum value in the corresponding subband; the plurality of subbands refers to the multiple subbands into which the quantized spectrum data of the left channel data is divided;
  • the multiple sub-bands are divided into multiple groups of sub-bands, and the quantization level measurement factors of the same group of sub-bands are the same;
  • based on the quantization level measurement factor of each group of subbands, the target encoding codebook corresponding to each group of subbands is determined from multiple encoding codebooks, and the bit stream of the spectrum values in each group of subbands is determined; the target encoding codebook refers to the encoding codebook used to encode the spectrum values within the corresponding group of subbands;
  • the identifier of the target encoding codebook corresponding to each group of subbands is encoded into the code stream as side information of the left channel data.
  • determining the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks, and determining the bit stream of the spectrum values within each group of subbands, includes:
  • for any group of subbands in the plurality of groups of subbands, if the quantization level measurement factor of the group is a first value, a plurality of first encoding codebooks among the plurality of encoding codebooks are used to respectively encode the spectrum values in the group, obtaining a plurality of first candidate spectrum bit streams in one-to-one correspondence with the plurality of first encoding codebooks;
  • the first encoding codebook corresponding to the selected first candidate spectrum bit stream is determined as the target encoding codebook corresponding to the group of subbands.
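The claim does not spell out how the target codebook is chosen among the candidates; a natural reading, and a common codec practice, is to keep whichever candidate bit stream is shortest. A sketch under that assumption (the `encode` callback and codebook representation are hypothetical):

```python
def select_codebook(codebooks, encode, subband_values):
    """Encode the subband group with every candidate codebook and keep
    the one whose candidate spectrum bit stream is shortest."""
    best_id, best_stream = None, None
    for cb_id, cb in codebooks.items():
        stream = encode(cb, subband_values)   # one candidate bit stream per codebook
        if best_stream is None or len(stream) < len(best_stream):
            best_id, best_stream = cb_id, stream
    return best_id, best_stream
```

The winning codebook's identifier is what gets written into the side information so the decoder can pick the matching decoding codebook.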
  • the first value is 1;
  • the use of a plurality of first encoding codebooks in the plurality of encoding codebooks to respectively encode the spectrum values in any group of subbands includes:
  • the plurality of first encoding codebooks are as follows:
  • determining the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks, and determining the bit stream of the spectrum values within each group of subbands, includes:
  • for any group of subbands in the plurality of groups of subbands, if the quantization level measurement factor of the group is a second value, a plurality of second encoding codebooks among the plurality of encoding codebooks are used to respectively encode the spectrum values in the group, obtaining a plurality of second candidate spectrum bit streams in one-to-one correspondence with the plurality of second encoding codebooks;
  • the second encoding codebook corresponding to the selected second candidate spectrum bit stream is determined as the target encoding codebook corresponding to the group of subbands.
  • the second value is 2;
  • the respectively encoding the spectrum values in any group of subbands using a plurality of second encoding codebooks among the plurality of encoding codebooks includes:
  • the plurality of second encoding codebooks are as follows:
  • determining the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks, and determining the bit stream of the spectrum values within each group of subbands, includes:
  • for any group of subbands, a plurality of third encoding codebooks among the plurality of encoding codebooks are used to respectively encode the spectrum values within the group, obtaining a plurality of third candidate spectrum bit streams in one-to-one correspondence with the plurality of third encoding codebooks;
  • the third encoding codebook corresponding to the selected third candidate spectrum bit stream is determined as the target encoding codebook corresponding to the group of subbands.
  • determining the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks, and determining the bit stream of the spectrum values within each group of subbands, includes:
  • the plurality of third encoding codebooks are used to respectively encode the first-part bits of each spectrum value within the group of subbands, obtaining a plurality of first-part candidate bit streams in one-to-one correspondence with the plurality of third encoding codebooks;
  • the third encoding codebook corresponding to the selected first-part candidate bit stream is determined as the target encoding codebook corresponding to the group of subbands;
  • uniform quantization encoding is performed on the second-part bits, that is, the bits other than the first-part bits in each spectrum value in the group of subbands, to obtain a bit stream of the second-part bits.
  • the first-part bits refer to the N high-order bits of the spectrum value;
  • the second-part bits refer to the M low-order bits of the spectrum value;
  • M is equal to the quantization level measurement factor of the group of subbands minus the third value.
  • the third value is 3;
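The split described above can be illustrated as follows, assuming the quantized spectrum value is a non-negative integer: with the third value fixed at 3, a group whose quantization level measurement factor is q has M = q − 3 low-order bits coded by uniform quantization, and the remaining high-order bits are entropy coded with the chosen third codebook.

```python
def split_spectrum_value(value, q_factor, third_value=3):
    """Split a quantized spectrum value into high-order bits (entropy
    coded) and M low-order bits (uniformly coded), M = q_factor - third_value."""
    m = q_factor - third_value          # number of low-order (second-part) bits
    low = value & ((1 << m) - 1)        # second part: M low-order bits
    high = value >> m                   # first part: remaining high-order bits
    return high, low

def join_spectrum_value(high, low, q_factor, third_value=3):
    """Inverse of split_spectrum_value, as the decoder would apply it."""
    m = q_factor - third_value
    return (high << m) | low
```

The two halves round-trip losslessly, which is why the low-order bits can be written with plain uniform coding while only the high-order bits go through the codebook.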
  • the use of a plurality of third encoding codebooks among the plurality of encoding codebooks to respectively encode the spectrum values in any group of subbands includes:
  • the plurality of third encoding codebooks are as follows:
  • In the audio encoding and decoding method provided by this application, when the audio signal is a two-channel signal, even though the code stream includes both a left channel bit stream and a right channel bit stream, the decoder can, based on the channel decoding mode, decode the left channel bit stream without decoding the right channel bit stream, or decode the right channel bit stream without decoding the left channel bit stream, thereby reducing the power consumption of the decoding end when its resources are limited.
  • the encoding end can also sequentially encode the left channel data and the right channel data according to the conditions met by the audio signal, instead of having to encode according to the two-channel interleaved encoding method or the two-channel deinterleaved encoding method. It can be seen that the coding method of this scheme is more flexible.
  • an audio decoding device has the function of implementing the audio decoding method in the first aspect.
  • the audio decoding device includes one or more modules, and the one or more modules are used to implement the audio decoding method provided in the first aspect.
  • an audio encoding device has the function of implementing the audio encoding method in the second aspect.
  • the audio encoding device includes one or more modules, and the one or more modules are used to implement the audio encoding method provided in the second aspect.
  • In a fifth aspect, an audio decoding device is provided, including a processor and a memory.
  • the memory is used to store a program for executing the audio decoding method provided in the first aspect, and to store data involved in implementing the audio decoding method.
  • the processor is configured to execute a program stored in the memory.
  • the audio decoding device may further include a communication bus used to establish a connection between the processor and the memory.
  • In a sixth aspect, an audio encoding device is provided, including a processor and a memory.
  • the memory is used to store a program for executing the audio encoding method provided in the second aspect, and to store data involved in implementing the audio encoding method.
  • the processor is configured to execute a program stored in the memory.
  • the audio encoding device may further include a communication bus used to establish a connection between the processor and the memory.
  • In a seventh aspect, a computer-readable storage medium is provided, with instructions stored therein. When the instructions are run on a computer, the computer is caused to execute the audio decoding method described in the first aspect or the audio encoding method described in the second aspect.
  • In an eighth aspect, a computer program product containing instructions is provided. When the instructions are run on a computer, the computer is caused to execute the audio decoding method described in the first aspect or the audio encoding method described in the second aspect.
  • Figure 1 is a schematic diagram of a Bluetooth interconnection scenario provided by an embodiment of the present application.
  • Figure 2 is a system framework diagram involved in the audio encoding and decoding method provided by the embodiment of the present application;
  • Figure 3 is an overall framework diagram of an audio codec provided by an embodiment of the present application.
  • Figure 4 is a schematic structural diagram of a coding and decoding device provided by an embodiment of the present application.
  • Figure 5 is a flow chart of an audio encoding method provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a code stream structure provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another code stream format provided by an embodiment of the present application.
  • Figure 8 is a flow chart of another audio encoding method provided by an embodiment of the present application.
  • Figure 9 is a flow chart of an audio decoding method provided by an embodiment of the present application.
  • Figure 10 is a flow chart of another audio decoding method provided by an embodiment of the present application.
  • Figure 11 is a schematic structural diagram of an audio decoding device provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of an audio encoding device provided by an embodiment of the present application.
  • Bluetooth devices such as true wireless stereo (TWS) headphones, smart speakers, and smart watches are widely used in people's daily lives.
  • Bluetooth signals are prone to interference.
  • the Bluetooth interconnection scenario due to the limitation of data transmission size by the Bluetooth channel connecting the audio sending device and the audio receiving device, the audio signal must be compressed by the audio encoder in the audio sending device and then transmitted to the audio receiving device.
  • the audio decoder in the audio receiving device must decode the compressed audio signal before it can be played. The popularity of wireless Bluetooth devices has thus driven the vigorous development of various Bluetooth audio codecs.
  • Bluetooth audio codecs include sub-band coding (SBC), advanced audio coding (AAC), the aptX series, the low-latency high-definition audio codec (LHDC), and the low-power low-latency LC3 and LC3plus codecs, among others.
  • audio encoding and decoding method provided by the embodiment of the present application can be applied to audio sending devices (ie, encoding end) and audio receiving devices (ie, decoding end) in Bluetooth interconnection scenarios.
  • FIG 1 is a schematic diagram of a Bluetooth interconnection scenario provided by an embodiment of the present application.
  • the audio sending device in the Bluetooth interconnection scenario can be a mobile phone, computer, tablet, etc.
  • the computer can be a laptop computer, a desktop computer, etc.
  • the tablet can be a handheld tablet, a vehicle-mounted tablet, etc.
  • Audio receiving devices in Bluetooth interconnection scenarios can be TWS headsets, smart speakers, wireless headsets, wireless neckband headphones, smart watches, smart glasses, smart vehicle equipment, etc.
  • the audio receiving device in the Bluetooth interconnection scenario can also be a mobile phone, computer, tablet, etc.
  • the audio encoding and decoding methods provided by the embodiments of the present application can also be applied to other device interconnection scenarios.
  • the system architecture and business scenarios described in the embodiments of this application are to more clearly explain the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided by the embodiments of this application.
  • Those of ordinary skill in the art will appreciate that, with the evolution of system architectures and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
  • Figure 2 is a system framework diagram involved in the audio encoding and decoding method provided by the embodiment of the present application.
  • the system includes an encoding end and a decoding end.
  • the encoding end includes input module, encoding module and sending module.
  • the decoding end includes a receiving module, an input module, a decoding module and a playback module.
  • the user selects one of two encoding modes based on the usage scenario: a low-latency encoding mode and a high-quality encoding mode.
  • the encoding frame lengths of these two encoding modes are 5ms and 10ms respectively. For example, if the usage scenario is playing games, live streaming, making phone calls, etc., the user can choose the low-latency encoding mode; if the usage scenario is listening to music through headphones or speakers, the user can choose the high-quality encoding mode.
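The 5 ms and 10 ms frame lengths translate directly into per-frame sample counts at a given sampling rate; this is a simple derivation for reference, not part of the described scheme (note that at 44.1 kHz a 5 ms frame is not an integer number of samples, so a real codec must round or fix the count).

```python
def frame_samples(sample_rate_hz, frame_ms):
    """PCM samples per channel in one encoding frame (integer division)."""
    return sample_rate_hz * frame_ms // 1000
```

So the low-latency mode at 48 kHz processes 240 samples per frame, while the high-quality mode at 96 kHz processes 960.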
  • the user also needs to provide the audio signal to be encoded (pulse code modulation (PCM) data as shown in Figure 2) to the encoding end.
  • the user also needs to set the target bit rate of the code stream obtained by encoding, that is, the encoding bit rate of the audio signal.
  • the higher the target bit rate, the better the sound quality, but the worse the anti-interference performance of the code stream during short-range transmission;
  • the lower the target bit rate, the worse the sound quality, but the better the anti-interference performance of the code stream during short-range transmission.
  • the input module on the encoding side obtains the encoding frame length, encoding bit rate, and audio signal to be encoded submitted by the user.
  • the input module on the encoding side inputs the data submitted by the user into the frequency domain encoder of the encoding module.
  • the frequency domain encoder of the encoding module encodes the received data to obtain the code stream.
  • the frequency domain encoding end analyzes the audio signal to be encoded to obtain the signal characteristics (including mono/dual channel, stationary/non-stationary, full bandwidth/narrow bandwidth signal, subjective/objective, etc.).
  • the sending module at the encoding end sends the code stream to the decoding end.
  • the sending module is a short-distance sending module as shown in Figure 2 or other types of sending modules, which is not limited in this embodiment of the present application.
  • After the receiving module of the decoding end receives the code stream, it sends the code stream to the frequency domain decoder of the decoding module and notifies the input module of the decoding end to obtain the configured bit depth and channel decoding mode.
  • the receiving module is the short-range receiving module shown in Figure 2 or another type of receiving module, which is not limited in the embodiments of the present application.
  • the input module at the decoding end inputs the acquired information such as bit depth and channel decoding mode into the frequency domain decoder of the decoding module.
  • the frequency domain decoder of the decoding module decodes the code stream based on the bit depth, channel decoding mode, etc. to obtain the required audio data (PCM data as shown in Figure 2), and sends the obtained audio data to the playback module for audio playback.
  • the channel decoding mode indicates the channel to be decoded.
  • Figure 3 is an overall framework diagram of an audio codec provided by an embodiment of the present application.
  • the encoding process on the encoding side includes the following steps:
  • the PCM data is monophonic data or dual-channel data.
  • the bit depth can be 16-bit, 24-bit, 32-bit floating point, or 32-bit fixed point.
  • the PCM input module converts the input PCM data to a uniform bit depth (such as 24-bit), deinterleaves the PCM data, and arranges it by left channel and right channel.
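The deinterleaving step performed by this module can be sketched in a few lines; the sample layout `[L0, R0, L1, R1, ...]` is the usual interleaved-stereo convention and is assumed here.

```python
def deinterleave(pcm):
    """Split interleaved stereo PCM [L0, R0, L1, R1, ...] into
    separate left- and right-channel sequences."""
    return pcm[0::2], pcm[1::2]   # even indices -> left, odd -> right
```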
  • the MDCT domain signal analysis module takes effect in full bit rate scenarios, and the adaptive bandwidth detection module is activated at low bit rates (e.g., bit rate ≤ 150 kbps per channel).
  • bandwidth detection is performed based on the spectrum data in the MDCT domain obtained in step (2) above to obtain the cutoff frequency or effective bandwidth.
  • signal analysis is performed on the spectrum data within the effective bandwidth, that is, whether the frequency point distribution is concentrated or uniform is analyzed to obtain the energy concentration degree.
  • a flag indicating whether the audio signal to be encoded is an objective signal or a subjective signal is obtained (the flag of an objective signal is 1, and the flag of a subjective signal is 0).
  • the frequency domain noise shaping (SNS) processing of the scaling factor and the smoothing of the MDCT spectrum are not performed at low bit rates, because doing so would reduce the coding effect for objective signals. It is then determined whether to perform the sub-band cutoff operation in the MDCT domain based on the bandwidth detection result and the subjective/objective signal flag.
  • If the audio signal is an objective signal, no sub-band cutoff operation is performed; if the audio signal is a subjective signal and the bandwidth detection result flag is 0 (full bandwidth), the sub-band cutoff operation is determined by the bit rate; if the audio signal is a subjective signal and the bandwidth detection result flag is non-zero (that is, the bandwidth is a limited bandwidth less than half the sampling rate), the sub-band cutoff operation is determined by the bandwidth detection result.
  • the best sub-band division method is selected from multiple sub-band division methods, and the total number of sub-bands required to encode the audio signal is obtained.
  • the envelope of the spectrum is calculated, that is, the scaling factor corresponding to the selected sub-band dividing method is calculated.
  • a joint coding judgment is performed based on the scaling factors calculated in the above step (4), that is, it is determined whether to perform MS (mid/side) channel transformation on the left and right channel data.
  • the spectrum smoothing module performs MDCT spectrum smoothing under the low bit rate setting (e.g., bit rate ≤ 150 kbps per channel).
  • the frequency domain noise shaping module performs frequency domain noise shaping on the spectrally smoothed data based on the scaling factor to obtain the adjustment factor.
  • the adjustment factor is used to quantize the spectral values of the audio signal.
  • the low bit rate setting is controlled by the low bit rate identification module.
  • Differential encoding or entropy encoding is performed on the scaling factors of multiple subbands according to the distribution of the scaling factors.
  • the encoding is controlled to a constant bit rate (CBR) mode through coarse-estimation and fine-estimation bit allocation strategies, and the MDCT spectrum values are quantized and entropy coded.
  • the uncoded subbands are further sorted by importance, and the bits are preferentially allocated to the encoding of the MDCT spectrum values of the important subbands.
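The importance-ordered residual allocation can be sketched greedily as below. The text does not define the importance metric or the allocation granularity, so any comparable score and a simple take-it-or-skip-it rule are assumed here.

```python
def allocate_residual_bits(subbands, bits_left):
    """Greedy residual allocation: visit uncoded subbands in descending
    importance and grant each its requested bits while the budget lasts.

    `subbands` is a list of (importance, bits_needed) pairs; the
    importance metric itself is an assumption, not specified by the text.
    """
    allocation = []
    for importance, need in sorted(subbands, key=lambda s: -s[0]):
        if need <= bits_left:
            allocation.append((importance, need))
            bits_left -= need
    return allocation, bits_left
```

With a 35-bit budget and subbands needing 10, 50, and 20 bits (importance 3, 1, 2), the two most important subbands are funded and 5 bits remain.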
  • the header information includes the audio sampling rate (such as 44.1kHz/48kHz/88.2kHz/96kHz), channel information (such as mono and dual channel), encoding frame length (such as 5ms and 10ms), encoding mode (such as time domain, frequency domain, time-domain-to-frequency-domain switching, or frequency-domain-to-time-domain switching), etc.
  • the code stream includes packet header, side information, payload, etc.
  • the packet header carries packet header information, and the packet header information is as described in step (10) above.
  • the side information includes the encoding code stream of the scaling factor, information on the selected sub-band division method, cutoff frequency information, low code rate flag, joint coding discrimination information (i.e. MS transform flag), quantization step size and other information.
  • the payload includes the coded code stream of the MDCT spectrum and the residual coded code stream.
  • the decoding process at the decoding end includes the following steps:
  • the header information includes the sampling rate of the audio signal, channel information, encoding frame length, encoding mode and other information.
  • the encoding bit rate is calculated based on the code stream size, sampling rate and encoding frame length. That is, the code rate gear information is obtained.
  • the side information is decoded from the code stream, including information about the selected sub-band division method, cutoff frequency information, low bit rate flag, joint coding discrimination information, quantization step size and other information, as well as the scaling factor of each sub-band.
  • frequency domain noise shaping needs to be performed based on the scaling factor to obtain an adjustment factor.
  • the adjustment factor is used to inverse quantize the code value of the spectrum value.
  • the low bit rate setting is controlled by the low bit rate discrimination module. When the low bit rate setting is not met, there is no need to perform frequency domain noise shaping.
  • the MDCT spectrum decoding module decodes the MDCT spectrum data in the code stream based on the sub-band division method information, the quantization step size information and the scaling factors obtained in the above step (2). Spectral hole filling is performed at low bit rates. If bits remain after decoding, the MDCT spectrum decoding module performs residual decoding to obtain the MDCT spectrum data of additional subbands and thereby the final MDCT spectrum data.
  • If it is determined from the joint coding discrimination information that the two-channel joint encoding mode is used, and the low-power decoding mode is not active (e.g., the encoding bit rate is greater than or equal to 300kbps and the sampling rate is greater than 88.2kHz), the MDCT spectrum data obtained in step (4) is subjected to LR channel transformation.
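The LR transform mentioned here inverts the MS (mid/side) transform applied at the encoder. The exact normalization is not given in the text, so the common pairing (L, R) = (M + S, M − S) with (M, S) = ((L + R)/2, (L − R)/2) is assumed in this sketch.

```python
def lr_to_ms(left, right):
    """Encoder-side MS transform (assumed 1/2 normalization)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_to_lr(mid, side):
    """Decoder-side LR transform: recover left/right spectra from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

The round trip is exact, which is why the joint mode costs nothing in fidelity while letting correlated channels share bits.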
  • the inverse MDCT transformation module performs an inverse MDCT transform on the obtained MDCT spectrum data to obtain the time-domain aliasing signal; the low-delay synthesis window module then applies a low-delay synthesis window to the time-domain aliasing signal; and the overlap-add module superimposes the time-domain aliasing buffer signals of the current frame and the previous frame to obtain the final PCM data.
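The overlap-add step can be sketched as follows, assuming a 50% overlap in which the second half of each frame's windowed output is buffered and added to the first half of the next frame.

```python
def overlap_add(current, previous_tail):
    """Combine the windowed time-domain aliasing signal of the current
    frame with the saved second half of the previous frame."""
    n = len(previous_tail)
    out = [c + p for c, p in zip(current[:n], previous_tail)]
    new_tail = current[n:]   # buffer for the next frame's overlap-add
    return out, new_tail
```

Each call emits one frame of PCM output and returns the tail to carry into the next frame, which is exactly the buffering behavior described above.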
  • the PCM data of the corresponding channel is output.
  • Figure 4 is a schematic structural diagram of a coding and decoding device according to an embodiment of the present application.
  • the codec device is any device shown in Figure 1 , and the codec device includes one or more processors 401, a communication bus 402, a memory 403, and one or more communication interfaces 404.
  • the processor 401 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits used to implement the solution of the present application, for example an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • Communication bus 402 is used to transfer information between the above-mentioned components.
  • the communication bus 402 is divided into an address bus, a data bus, a control bus, etc.
  • For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
  • the memory 403 is a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disc (including a compact disc read-only memory (CD-ROM), a compact disc, a laser disc, a digital versatile disc, a Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store the desired program code in the form of instructions or data structures and can be accessed by a computer, without limitation.
  • the memory 403 exists independently and is connected to the processor 401 through the communication bus 402, or the memory 403 and the processor 401 are integrated together.
  • the communication interface 404 uses any transceiver-type device to communicate with other devices or communication networks.
  • the communication interface 404 includes a wired communication interface and, optionally, a wireless communication interface.
  • the wired communication interface is such as an Ethernet interface.
  • the Ethernet interface is an optical interface, an electrical interface, or a combination thereof.
  • the wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface or a combination thereof.
  • the encoding and decoding device includes multiple processors, such as processor 401 and processor 405 as shown in FIG. 4 .
  • Each of these processors is a single-core processor, or a multi-core processor.
  • a processor here refers to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
  • the encoding and decoding device also includes an output device 406 and an input device 407.
  • Output device 406 communicates with processor 401 and can display information in a variety of ways.
  • the output device 406 is a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, etc.
  • the input device 407 communicates with the processor 401 and can receive user input in a variety of ways.
  • the input device 407 is a mouse, a keyboard, a touch screen device or a sensing device, or the like.
  • the memory 403 is used to store the program code 410 for executing the solution of the present application, and the processor 401 can execute the program code 410 stored in the memory 403.
  • the program code includes one or more software modules, and the encoding and decoding device can implement, through the processor 401 and the program code 410 in the memory 403, the audio encoding method provided in the embodiment of FIG. 5 below and/or the audio decoding method shown in FIG. 9.
  • Figure 5 is a flow chart of an audio encoding method provided by an embodiment of the present application; the method is applied to the encoding end. Referring to Figure 5, the method includes the following steps.
  • Step 501 If the audio signal to be encoded meets the first condition, encode the left channel data of the audio signal into the code stream.
  • the first condition includes that the audio signal is a two-channel signal, the encoding code rate of the audio signal is not less than the code rate threshold, and the sampling rate of the audio signal is not less than the sampling rate threshold.
  • In order to achieve flexible coding, the encoding end can sequentially encode the left channel data and the right channel data according to the conditions met by the audio signal to be encoded, instead of having to use the two-channel interleaved encoding method or the two-channel deinterleaved encoding method.
  • That is, if the audio signal to be encoded meets the first condition, the encoding end encodes the left channel data of the audio signal into the code stream, where the first condition includes that the audio signal is a two-channel signal, the encoding code rate of the audio signal is not less than the code rate threshold, and the sampling rate of the audio signal is not less than the sampling rate threshold.
  • the code rate threshold and the sampling rate threshold are both preset parameters; for example, the code rate threshold is 300 kbps and the sampling rate threshold is 88.2 kHz, although other values may also be used. In the embodiment of this application, a code rate threshold of 300 kbps and a sampling rate threshold of 88.2 kHz are taken as an example.
  • the encoding end can encode the left channel data according to the distribution characteristics of the quantized value of the spectrum of the audio signal to be encoded. This will be introduced next.
  • One implementation in which the encoding end encodes the left channel data of the audio signal into the code stream is as follows: the encoding end obtains the quantization level measurement factor of each subband in the multiple subbands, where the quantization level measurement factor represents the average number of bits required to encode each spectrum value within the corresponding subband.
  • the multiple subbands refer to the multiple subbands into which the encoding end divides the quantized spectrum data included in the left channel data.
  • the encoding end divides the multiple subbands into multiple groups of subbands based on the quantization level measurement factors of the multiple subbands, and the quantization level measurement factors of the same group of subbands are the same.
  • Based on the quantization level measurement factor of each group of subbands, the encoding end determines the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks and determines the bit stream of the spectrum values in each group of subbands; the target encoding codebook refers to the encoding codebook used to encode the spectrum values within the corresponding group of subbands.
  • the encoding end encodes the identification of the target encoding codebook corresponding to each group of subbands into the code stream as a kind of side information of the left channel data.
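The grouping described above can be sketched in Python; this is an illustrative sketch only, and the function name and data layout are assumptions rather than the patent's implementation:

```python
from collections import defaultdict

def group_subbands(quant_factors):
    """Group subband indices so that subbands sharing the same quantization
    level measurement factor (average bits per spectrum value) fall into
    the same group."""
    groups = defaultdict(list)
    for band_idx, factor in enumerate(quant_factors):
        groups[factor].append(band_idx)
    return dict(groups)
```

For example, subbands with factors `[1, 1, 2, 3, 2]` form three groups: `{1: [0, 1], 2: [2, 4], 3: [3]}`.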
  • the left channel data includes quantized spectrum data, which is divided into multiple subbands.
  • the encoding end selects a better encoding codebook (i.e., the target encoding codebook) from multiple encoding codebooks based on the quantization level measurement factor of each subband in the multiple subbands.
  • In addition, since different encoding codebooks correspond to different decoding codebooks, in order for the decoding end to know which decoding codebook corresponds to the target encoding codebook used by the encoding end, the encoding end also encodes the identification of the target encoding codebook into the code stream, wherein the left channel bit stream includes the bit streams of the spectrum values in the above multiple groups of subbands.
  • the encoding end determines the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks based on the quantization level measurement factor of each group of subbands, and determines the bit stream of the spectrum value in each group of subbands.
  • The implementation process includes: for any group of subbands among the plurality of groups of subbands, if the quantization level measurement factor of the group is the first value, the encoding end uses the plurality of first encoding codebooks among the plurality of encoding codebooks to separately encode the spectrum values in the group, obtaining a plurality of first candidate spectrum bit streams in one-to-one correspondence with the plurality of first encoding codebooks.
  • the encoding end determines the first candidate spectrum bit stream with the smallest total number of bits among the plurality of first candidate spectrum bit streams as the bit stream of the spectrum values in the group, and determines the first encoding codebook corresponding to that first candidate spectrum bit stream as the target encoding codebook corresponding to the group.
  • Simply put, for those subbands whose quantization level measurement factor is the first value, the encoding end selects a better encoding codebook from the multiple first encoding codebooks as the target encoding codebook, and the spectrum bit stream obtained with the target encoding codebook has the smallest amount of data, which means efficient encoding is achieved and facilitates fast decoding of the spectrum bit stream at the decoding end.
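The codebook selection by smallest total bit count can be sketched as follows; `encode_fn` and the codebook layout are placeholders for illustration, not the patent's actual Huffman tables:

```python
def select_target_codebook(spectrum_values, codebooks, encode_fn):
    """Encode the group's spectrum values with every candidate codebook and
    keep the codebook whose bit stream has the smallest total number of bits.

    encode_fn(values, codebook) -> bit string (e.g. '0010').
    Returns (codebook index, winning bit stream)."""
    candidates = [(cb_id, encode_fn(spectrum_values, cb))
                  for cb_id, cb in enumerate(codebooks)]
    return min(candidates, key=lambda c: len(c[1]))
```

With two toy codebooks `{0: '00', 1: '01'}` and `{0: '0', 1: '10'}` and values `[0, 0, 1]`, the second codebook wins with the 4-bit stream `0010`.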
  • In some embodiments, the first value is 1, meaning that the average number of bits required to encode each spectrum value in the corresponding subband is 1. The encoding end combines every four spectrum values in the group into one 4-bit binary number, and uses the plurality of first encoding codebooks to respectively encode the decimal number represented by that binary number, obtaining multiple bit streams for the four spectrum values. By analogy, after the encoding end encodes all the spectrum values in the group in this way, the first candidate spectrum bit stream corresponding to each first encoding codebook is obtained.
  • the four adjacent spectrum values within a certain subband in any group of subbands are 0, 0, 1, and 1 (binary numbers) respectively.
  • the encoding end combines these four spectrum values into a binary number 0011.
  • the decimal number represented by the binary number 0011 is 3, then the encoding end uses the plurality of first encoding codebooks to encode the decimal 3 respectively.
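The 0011 → 3 packing in the example above can be sketched as follows (hypothetical helper name; the Huffman lookup that follows it is omitted):

```python
def pack_four_values(bits):
    """Combine four adjacent 1-bit spectrum values into one decimal symbol,
    as done when the quantization level measurement factor is 1."""
    assert len(bits) == 4 and all(b in (0, 1) for b in bits)
    symbol = 0
    for b in bits:
        symbol = (symbol << 1) | b  # append the next bit on the right
    return symbol

# spectrum values 0, 0, 1, 1 form the binary number 0011, i.e. decimal 3
```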
  • the plurality of first coding codebooks are Huffman coding codebooks, and Huffman coding is a kind of entropy coding.
  • the plurality of first encoding codebooks are determined based on statistical characteristics of a large amount of quantized spectrum data or obtained through other methods.
  • the plurality of first encoding codebooks are as follows:
  • If the quantization level measurement factor of any group of subbands is the second value, the encoding end uses the plurality of second encoding codebooks among the plurality of encoding codebooks to separately encode the spectrum values in the group, obtaining a plurality of second candidate spectrum bit streams in one-to-one correspondence with the plurality of second encoding codebooks.
  • the encoding end determines the second candidate spectrum bit stream with the smallest total number of bits among the plurality of second candidate spectrum bit streams as the bit stream of the spectrum values in the group, and determines the second encoding codebook corresponding to that bit stream as the target encoding codebook corresponding to the group. Simply put, for those subbands whose quantization level measurement factor is the second value, the encoding end selects a better encoding codebook from the multiple second encoding codebooks as the target encoding codebook for these subbands.
  • In some embodiments, the second value is 2, meaning that the average number of bits required to encode each spectrum value in the corresponding subband is 2. The encoding end combines every two spectrum values in the group into one 4-bit binary number, and uses the plurality of second encoding codebooks to respectively encode the decimal number represented by that binary number, obtaining multiple bit streams for the two spectrum values. By analogy, after the encoding end encodes all the spectrum values in the group in this way, the second candidate spectrum bit stream corresponding to each second encoding codebook is obtained.
  • For example, suppose two adjacent spectrum values in a certain subband of the group are 01 and 11 (binary numbers). The encoding end combines these two spectrum values into the binary number 0111, whose decimal value is 7, and uses the plurality of second encoding codebooks to respectively encode the decimal number 7.
  • the plurality of second encoding codebooks are Huffman encoding codebooks.
  • the plurality of second encoding codebooks are determined based on statistical characteristics of a large amount of quantized spectrum data or obtained through other methods.
  • the plurality of second encoding codebooks are as follows:
  • If the quantization level measurement factor of any group of subbands is the third value, the encoding end uses the plurality of third encoding codebooks among the plurality of encoding codebooks to separately encode the spectrum values in the group, obtaining a plurality of third candidate spectrum bit streams in one-to-one correspondence with the plurality of third encoding codebooks.
  • the encoding end determines the third candidate spectrum bit stream with the smallest total number of bits among the plurality of third candidate spectrum bit streams as the bit stream of the spectrum values in the group, and determines the third encoding codebook corresponding to that bit stream as the target encoding codebook corresponding to the group. Simply put, for those subbands whose quantization level measurement factor is the third value, the encoding end selects a better encoding codebook from the multiple third encoding codebooks as the target encoding codebook for these subbands.
  • In some embodiments, the third value is 3, meaning that the average number of bits required to encode each spectrum value in the corresponding subband is 3. The encoding end directly uses the plurality of third encoding codebooks to respectively encode each spectrum value in the group, obtaining multiple bit streams for each spectrum value. By analogy, after the encoding end encodes all the spectrum values in the group in this way, the third candidate spectrum bit stream corresponding to each third encoding codebook is obtained.
  • the plurality of third encoding codebooks are Huffman encoding codebooks.
  • the plurality of third encoding codebooks are determined based on statistical characteristics of a large amount of quantized spectrum data or obtained through other methods.
  • the plurality of third encoding codebooks are as follows:
  • In another case, the encoding end uses the plurality of third encoding codebooks to respectively encode the first part of bits in each spectrum value within any group of subbands, obtaining a plurality of first-part candidate bit streams in one-to-one correspondence with the plurality of third encoding codebooks.
  • the encoding end determines the first-part candidate bit stream with the smallest total number of bits among the plurality of first-part candidate bit streams as the bit stream of the first part of bits, and determines the third encoding codebook corresponding to that candidate bit stream as the target encoding codebook corresponding to the group. The second part of bits, i.e., the bits other than the first part in each spectrum value in the group, is uniformly quantized and encoded to obtain the bit stream of the second part of bits.
  • Simply put, the encoding end splits each spectrum value in these subbands into a first part of bits and a second part of bits, uses the above-mentioned plurality of third encoding codebooks to separately encode the first part of bits, and uniformly quantizes the second part of bits.
  • the encoding end selects a better encoding codebook from the plurality of third encoding codebooks as the target encoding codebook for these subbands.
  • In some embodiments, the first part of bits refers to the N high-order bits in the spectrum value, and the second part of bits refers to the M low-order bits in the spectrum value, where M is equal to the quantization level measurement factor of the group of subbands minus the third value.
  • For example, the encoding end determines the high-order 3 bits of a spectrum value (i.e., the binary number 101) as the first part of bits of the spectrum value, and determines the low-order 2 bits of the spectrum value (i.e., the binary number 00) as the second part of bits. The encoding end then uses the above-mentioned plurality of third encoding codebooks to respectively encode the decimal number 5 represented by the binary number 101 to obtain the bit stream of the first part of the spectrum value, and performs uniform quantization encoding on the binary number 00 to obtain the bit stream of the second part of the spectrum value.
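The high/low split in this example can be sketched as follows (illustrative sketch; the names and parameterization are assumptions):

```python
def split_spectrum_value(value, total_bits, high_bits=3):
    """Split a quantized spectrum value into its N high-order bits (to be
    Huffman coded) and its M low-order bits (to be uniformly quantized),
    where M = total_bits - high_bits."""
    low_bits = total_bits - high_bits
    high = value >> low_bits             # first part: N most significant bits
    low = value & ((1 << low_bits) - 1)  # second part: M least significant bits
    return high, low, low_bits
```

For the example above, a 5-bit value `0b10100` splits into the high part `0b101` (decimal 5) and the low part `0b00`.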
  • In some embodiments, if the encoding end determines based on the encoding bit rate that there are still available encoding bits after the left channel data is encoded into the bit stream, the encoding end can also perform residual coding on the spectrum data included in the left channel data to obtain the residual bit stream in the left channel bit stream.
  • Step 502 Encode the right channel data of the audio signal into a code stream.
  • the encoding end also encodes the right channel data of the audio signal into the code stream.
  • the implementation process of encoding the right channel data at the encoding end is similar to the implementation process of encoding the left channel data.
  • the encoding end obtains the quantization level measurement factor of each subband in multiple subbands.
  • the quantization level measurement factor represents the average number of bits required to encode each spectrum value in the corresponding subband.
  • the multiple subbands refer to the multiple subbands into which the encoding end divides the quantized spectrum data included in the right channel data.
  • the encoding end divides the multiple subbands into multiple groups of subbands based on the quantization level measurement factors of the multiple subbands, and the quantization level measurement factors of the same group of subbands are the same.
  • Based on the quantization level measurement factor of each group of subbands, the encoding end determines the target encoding codebook corresponding to each group of subbands from multiple encoding codebooks and determines the bit stream of the spectrum values in each group of subbands; the target encoding codebook refers to the encoding codebook used to encode the spectrum values within the corresponding group of subbands.
  • the encoding end encodes the identification of the target encoding codebook corresponding to each group of subbands into the code stream as a kind of side information of the right channel data.
  • For specific implementation methods please refer to the relevant introduction to encoding left channel data above, and will not be described again here.
  • the encoding end may also encode the right channel data in a different manner from encoding the left channel data.
  • In some embodiments, the encoding end initializes into the mono coding logic and performs the following operations in sequence: separately encode the left channel data, pack the encoded left channel data into the code stream, separately encode the right channel data, and pack the encoded right channel data into the code stream.
  • In addition to encoding the left channel data and the right channel data, the encoding end also encodes the side information of the left channel and the side information of the right channel into the code stream, and encodes some parameters of the audio signal into the header of the code stream. The header and side information of the code stream are used for decoding at the decoding end.
  • FIG. 6 is a schematic diagram of a code stream structure provided by an embodiment of the present application.
  • the code stream structure shown in Figure 6 is the code stream structure of dual-channel deinterleaved coding.
  • the dual-channel deinterleaved coding method is a coding method in which the encoding end encodes left channel data and right channel data into the code stream respectively.
  • the code stream structure sequentially includes fields such as the header (also called the packet header), left channel side information, left channel payload, right channel side information, and right channel payload.
  • the header field includes subfields such as coding type (codec type, CT), sampling rate (sample rate, SR), channel number (CN), frame length (FL), etc.
  • codec type codec type, CT
  • sampling rate sample rate, SR
  • channel number CN
  • FL frame length
  • the number of bits occupied by these four subfields is 2, 2, 1, and 2 in order.
  • the left channel side information field includes subfields such as low bitrate flag (LBF), global coding control factor (DR), local coding control factor (DRQuater, DRQ), scale factor identification (SFID), subband number (BN), differential encoding flag (DEF), scaling factor (SF), and Huffman encoding tuple ID (HufTupID).
  • the low bit rate flag is 1, indicating that the audio signal meets the second condition.
  • the low bit rate flag is 0, indicating that the audio signal does not meet the second condition.
  • the second condition means that the encoding code rate of the audio signal is less than the code rate threshold, and the energy concentration of the audio signal is less than the concentration threshold.
  • the scaling factor indicates the scaling factor of each subband.
  • the scaling factor is used by the encoding end to shape the spectrum envelope and instructs the decoding end to perform inverse processing of the spectrum envelope.
  • the scaling factor is derived from the maximum spectral value within the corresponding subband.
  • the Huffman coding codebook identification is the identification of the target encoding codebook.
  • the left channel payload field includes the spectral bitstream and residual bitstream of the left channel.
  • In the embodiment of this application, the spectrum data being modified discrete cosine transform (MDCT) spectrum data is taken as an example; correspondingly, the spectrum bit stream is an MDCT quantization coding (MDCTQ) bit stream, and the residual bit stream is an MDCT residual (MDCTRES) coding bit stream.
  • the residual bitstream is optional.
  • the number of bits occupied by the left channel payload field is N, and N is greater than 0.
  • the right channel side information field includes subfields such as low bitrate flag (LBF), global coding control factor (DR), local coding control factor (DRQuater), scaling factor identification (SFID), subband number (BN), differential encoding flag (DEF), scaling factor (SF), and Huffman coding codebook identification (HufTupID).
  • the number of bits occupied by these eight subfields is, in order, 1, 5, 3, 3, 4, 1, (5 × number of subbands) or the Huffman coding length (HEncL), and 6.
  • the right channel payload field includes the spectrum bitstream and residual bitstream of the right channel.
  • the spectrum bit stream is an MDCT quantization coded bit stream
  • the residual bit stream is an MDCT residual coded bit stream.
  • the number of bits occupied by the right channel payload field is N, and N is greater than 0.
  • In some embodiments, when the audio signal is a two-channel signal, the encoding end encodes the left channel data and the right channel data into the code stream through two-channel interleaved coding; that is, the encoding end uses a joint stereo encoding method to encode the audio signal.
  • In this case, in addition to performing two-channel interleaved encoding on the left and right channel data, the encoding end also encodes the side information of the left and right channels into the code stream, and encodes some parameters of the audio signal into the header of the code stream; these parameters serve as common parameters for the two channels.
  • the encoding end encodes the identity of the common parameters of the two channels, encodes the side information of the left and right channels, and encodes the spectrum data of the left and right channels through binaural interleaving coding.
  • common parameters for dual-channel audio include encoding type, sampling rate, number of channels, frame length, etc.
  • Side information includes the low bitrate flag, mid/side stereo transform coding flag (MSF), global coding control factor, local coding control factor, scaling factor identification, number of subbands, differential coding flag, scaling factor, and Huffman coding codebook identification, etc.
  • the spectrum data of the left and right channels includes left channel data and right channel data.
  • MSFlag indicates whether the encoding end performs MS transformation on the spectrum data of the left and right channels and then encodes it into the code stream.
  • In some embodiments, the encoding end initializes into the two-channel interleaved encoding mode and performs the following operations in sequence: encode and package the identification of the common parameters of the two channels, encode and package the scaling factors of the left and right channels, encode and package the Huffman coding codebook identifications corresponding to the MDCT spectrum values, and use the Huffman coding codebooks to encode and package the MDCT spectrum values.
  • Figure 7 is a schematic diagram of another code stream structure provided by an embodiment of the present application.
  • the code stream structure shown in Figure 7 is the code stream structure of two-channel interleaved coding.
  • the code stream structure includes header, side information, and payload fields in sequence.
  • the header field includes subfields such as encoding type, sampling rate, number of channels, frame length, etc.
  • the number of bits occupied by these four subfields is 2, 2, 1, and 2 in order.
  • the side information field includes subfields such as low bitrate flag (LBF), mid/side stereo transform flag (MSF), global coding control factor (DR), local coding control factor (DRQ), scaling factor identification (SFID), number of subbands (BN), differential coding flag (DEF), scaling factor (SF), and Huffman coding codebook identification (HufTupID).
  • the payload field includes the left and right channel interleaved coded spectral bitstream and residual bitstream.
  • the spectrum bit stream is an MDCT quantization coded bit stream
  • the residual bit stream is an MDCT residual coded bit stream.
  • the residual bitstream is optional.
  • the number of bits occupied by the payload field is N, and N is greater than 0.
  • In some embodiments, the bit allocation strategy adopted by the encoding end for the left and right channels is an average allocation strategy; when the total number of bits cannot be evenly divided, the excess bits are allocated to the right channel first. In other embodiments, the excess bits may also be preferentially allocated to the left channel.
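The average allocation with a preferred channel for the excess bit can be sketched as follows (illustrative only; names are assumptions):

```python
def allocate_bits(total_bits, prefer_right=True):
    """Split the bit budget evenly between the left and right channels;
    when the total is odd, the excess bit goes to the preferred channel
    (the right channel by default, as described above)."""
    left = right = total_bits // 2
    if total_bits % 2:
        if prefer_right:
            right += 1
        else:
            left += 1
    return left, right
```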
  • In some embodiments, the encoding end encodes the mono data of the audio signal into the code stream. The code stream structure obtained by encoding the mono data is consistent with the code stream structure of the two-channel interleaved coding, except that the code stream corresponding to mono data only contains data related to a single channel.
  • the encoding process shown in Figure 8 is a process for encoding a two-channel signal.
  • the encoding process of a mono signal is similar to the left branch process in Figure 8.
  • the encoding end obtains input parameters, which include the encoded frame length, encoding bit rate, audio signal to be encoded (such as PCM data), etc.
  • the encoding end packages the header of the code stream based on the input parameters.
  • the encoding end selects the encoding mode (that is, selects the code stream packaging method).
  • When the dual-channel interleaved mode is selected, the encoding end enters the left branch process, sequentially encoding the identification of the common parameters of the two channels, the scaling factors corresponding to the left and right channels, the identifications of the encoding codebooks used for the left and right channels, and the spectrum data of the left and right channels. When the two-channel deinterleaved mode is selected, the encoding end enters the right branch process and performs left channel encoding and right channel encoding in sequence.
  • the encoding process of each channel includes encoding channel parameter identification, encoding channel scaling factor, encoding channel encoding codebook identification and encoding channel spectrum data. After all data is packaged, the encoding end ends the encoding process.
  • In the embodiment of the present application, the encoding end improves encoding efficiency by selecting an appropriate encoding codebook, which also facilitates efficient decoding at the decoding end. As can be seen above, a scaling factor is also encoded into the code stream; the encoding end can further improve the encoding effect and compression efficiency to a certain extent by appropriately encoding the scaling factor. Next, the implementation process of encoding scaling factors at the encoding end is introduced.
  • In some embodiments, the encoding end obtains the scaling factor of each subband in multiple subbands. If the absolute value of the difference between the scaling factors of any two adjacent subbands among the multiple subbands is greater than the difference threshold, the encoding end performs uniform quantization coding on the scaling factors of the multiple subbands; if there are no two adjacent subbands whose scaling factor difference has an absolute value greater than the difference threshold, the encoding end performs differential encoding on the scaling factors of the multiple subbands.
  • In some embodiments, the difference threshold is 6; in other embodiments, the difference threshold can also be other values. It should be noted that the difference threshold is determined based on the statistical characteristics of the scaling factors of multiple subbands or determined by other methods.
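The choice between the two scaling-factor coding modes can be sketched as follows; this is an illustrative sketch with assumed names, using the threshold of 6 mentioned above:

```python
def choose_sf_coding(scaling_factors, diff_threshold=6):
    """Return 'uniform' if any two adjacent subbands' scaling factors differ
    by more than the threshold (uniform quantization coding is used), and
    'differential' otherwise (differential coding is used)."""
    for b in range(1, len(scaling_factors)):
        if abs(scaling_factors[b] - scaling_factors[b - 1]) > diff_threshold:
            return "uniform"
    return "differential"
```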
  • When performing differential encoding, the encoding end determines the differential scale value of each subband in the plurality of subbands, wherein the differential scale value of the first subband among the multiple subbands is the scaling factor of the first subband plus a reference value, and the differential scale value of any other subband is the difference between the scaling factor of that subband and the scaling factor of the previous subband.
  • Then, the encoding end performs uniform quantization encoding on the differential scale value of the first subband, and performs entropy coding, such as Huffman coding, on the differential scale values of the other subbands. For example, the encoding end uses a Huffman coding codebook to encode the differential scale value of any subband among the plurality of subbands except the first subband.
  • the Huffman coding codebook is as follows:
  • HuffmanCB HUF_ENC_DIFF_SF[HUF_ENC_DIFF_SF_MAXIDX]
  • The above Huffman coding codebook includes multiple triples. In each triple, the first element represents the decimal number to be encoded; the second element represents the decimal value of the encoded binary number, that is, the code value; and the third element represents the number of bits occupied by the encoded binary number, that is, the code length.
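A lookup over such triples can be sketched as follows; the example table is invented for illustration and is not the patent's HUF_ENC_DIFF_SF table:

```python
def huffman_encode_triple(symbol, codebook):
    """Encode one symbol using a codebook of (symbol, code value, code length)
    triples, returning the code word as a bit string of exactly code-length bits."""
    for sym, code_value, code_len in codebook:
        if sym == symbol:
            return format(code_value, 'b').zfill(code_len)
    raise ValueError(f"symbol {symbol} not in codebook")

# made-up illustration of the triple layout described above
EXAMPLE_TRIPLES = [
    (0, 0b0, 1),    # frequent value -> short code
    (1, 0b10, 2),
    (2, 0b110, 3),  # rarer value -> longer code
]
```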
  • the above reference value is 8. In other embodiments, the above reference value may also be other numerical values.
  • the encoding end determines the differential scale value corresponding to each subband in the plurality of subbands according to formula (1).
  • b represents the subband index
  • bandsNum+1 represents the total number of subbands
  • sf(b) represents the scaling factor of subband b
  • sfDiff(b) represents the differential scaling value of subband b.
  • the encoding end uses the above Huffman coding codebook to encode, and the encoded binary number is encoded into the code stream.
  • if sfDiff(b) is negative, the encoding end sets the sign bit of sfDiff(b) to 1 and encodes the sign bit into the code stream; the sign bit occupies 1 bit.
  • a sfDiff(b) equal to 0 does not occupy a sign bit.
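Based on the description above (formula (1) itself is not reproduced in this extract), the encoder-side differential scale values can be sketched as follows; the reference value 8 is taken from the text, while the function name and list representation are assumptions for illustration.

```python
REF = 8  # reference value stated in the text

def differential_scale_values(sf):
    """Sketch of the encoder-side differential scale values as described:
    the first subband's differential value is its scaling factor plus the
    reference value; each later subband's is the difference from the
    previous subband's scaling factor."""
    diffs = [sf[0] + REF]
    diffs += [sf[b] - sf[b - 1] for b in range(1, len(sf))]
    return diffs
```

For example, scaling factors [3, 5, 2] would yield differential values [11, 2, -3].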
  • the above-mentioned Huffman coding codebook is obtained based on experience or big data training. Generally speaking, a value that appears frequently has a shorter code length, and a value that appears infrequently has a longer code length.
  • the encoding end can sequentially encode the left channel data and the right channel data according to the conditions met by the audio signal, instead of having to encode according to the two-channel interleaved coding method or the two-channel deinterleaved coding method. It can be seen that the coding method of this scheme is more flexible.
  • Figure 9 is a flow chart of an audio decoding method provided by an embodiment of the present application.
  • the decoding method is applied to the decoding end.
  • This decoding method matches the encoding method shown in Figure 5. Please refer to Figure 9.
  • the decoding method includes the following steps.
  • Step 901 If the audio signal to be decoded meets the first condition, obtain the channel decoding mode.
  • the first condition includes that the audio signal is a two-channel signal, the encoding code rate of the audio signal is not less than the code rate threshold, and the sampling rate of the audio signal is not less than the sampling rate threshold.
  • in order to achieve low power consumption at the decoding end so that electronic devices with limited resources can successfully decode the code stream, the decoding end can decode on demand according to the conditions met by the audio signal to be decoded and the channel decoding mode.
  • the channel decoding mode is a parameter configured on the decoding end.
  • the channel decoding mode is represented by numerical values such as 0, 1, 2, etc., and different numerical values represent different channel decoding modes. For example, 0 indicates the binaural decoding mode, 1 indicates the left channel decoding mode, and 2 indicates the right channel decoding mode.
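A minimal sketch of this mode mapping, with names assumed for illustration (the numeric values 0/1/2 are the example values given in the text):

```python
# Hypothetical mapping mirroring the example values in the text.
CHANNEL_DECODING_MODES = {
    0: "binaural",       # decode both channels
    1: "left_channel",   # decode left, copy to right
    2: "right_channel",  # decode right, copy to left
}

def channels_to_decode(mode):
    """Return which channel bit streams actually need decoding
    for a given channel decoding mode value."""
    return {
        0: ("left", "right"),
        1: ("left",),
        2: ("right",),
    }[mode]
```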
  • the decoding end obtains the channel decoding mode, and determines which channel bit stream to decode based on the channel decoding mode.
  • the code rate threshold is 300kbps or other values
  • the sampling rate threshold is 88.2kHz or other values.
  • the code rate threshold is 300kbps and the sampling rate threshold is 88.2kHz. It should be noted that the code rate threshold at the encoding end is the same as the code rate threshold at the decoding end, and the sampling rate threshold at the encoding end is the same as the sampling rate threshold at the decoding end.
  • the implementation process for the decoding end to determine whether the audio signal meets the first condition includes: the decoding end obtains the total data amount of the code stream, and decodes the header of the code stream to obtain the number of channels of the audio signal, Sampling rate and frame length.
  • the decoder determines whether the audio signal meets the first condition based on the total data amount, number of channels, sampling rate and frame length.
  • the decoder determines the encoding code rate of the audio signal based on the total data amount, sampling rate and frame length, and then determines whether the audio signal satisfies the first condition based on the encoding code rate, number of channels, and sampling rate of the audio signal.
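A hedged sketch of this check, assuming the total data amount is the size in bytes of one frame's code stream, so the encoding rate is bits per frame divided by the frame duration (frame length / sampling rate); the 300 kbps and 88.2 kHz thresholds are taken from the text, and the function name is an assumption.

```python
def meets_first_condition(total_bytes, channels, sample_rate, frame_len,
                          rate_threshold=300_000, sr_threshold=88_200):
    """Check the first condition: two-channel signal, encoding rate not
    below the rate threshold, sampling rate not below the SR threshold.
    Assumes total_bytes is one frame's code stream size."""
    bitrate = total_bytes * 8 * sample_rate / frame_len  # bits per second
    return (channels == 2
            and bitrate >= rate_threshold
            and sample_rate >= sr_threshold)
```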
  • Step 902 If the channel decoding mode is the left channel decoding mode, decode the left channel bit stream in the code stream to obtain the left channel data of the audio signal, and copy the left channel data to the right channel.
  • the left channel decoding mode means that there is no need to decode the right channel bit stream in the code stream.
  • the decoding end copies the left channel data to the right channel.
  • the decoding end performs inverse quantization on the quantized left channel spectrum data, and copies the inverse quantized left channel spectrum data to the right channel.
  • the decoder performs inverse quantization on the quantized left channel spectrum data, and performs an inverse time-frequency transform, such as the inverse modified discrete cosine transform (IMDCT), on the inverse quantized left channel data to obtain the left channel time domain overlapping signal; the left channel time domain overlapping signal is overlapped and added to reconstruct the left channel time domain signal, and the reconstructed left channel time domain signal is copied to the right channel.
  • the code stream contains the parameters required for decoding by the decoder.
  • the code stream contains side information of the left channel.
  • this side information includes the identifier of the encoding codebook used to encode the left channel data.
  • the decoder decodes the bit stream of the side information in the code stream to obtain the side information, and the side information includes the encoding codebook identifier.
  • the decoding end determines the target decoding codebook required for decoding from multiple decoding codebooks based on the encoding codebook identification.
  • the side information here refers to the side information of the left channel.
  • the multiple encoding codebooks correspond one-to-one to multiple decoding codebooks. The decoding end determines, from the correspondence between encoding codebook identifiers and decoding codebook identifiers, the decoding codebook identifier corresponding to the encoding codebook identifier included in the side information, and determines the decoding codebook indicated by the determined identifier as the target decoding codebook.
  • the plurality of encoding codebooks include the plurality of first encoding codebooks, the plurality of second encoding codebooks and the plurality of third encoding codebooks introduced above; correspondingly, the plurality of decoding codebooks include multiple first decoding codebooks, multiple second decoding codebooks and multiple third decoding codebooks.
  • the plurality of first decoding codebooks are in one-to-one correspondence with the plurality of first encoding codebooks
  • the plurality of second decoding codebooks are in one-to-one correspondence with the plurality of second encoding codebooks
  • the plurality of third decoding codebooks are in one-to-one correspondence with the plurality of third encoding codebooks.
  • the plurality of first decoding codebooks are as follows:
  • the plurality of second decoding codebooks are as follows:
  • the plurality of third decoding codebooks are as follows:
  • Step 903 If the channel decoding mode is the right channel decoding mode, decode the right channel bit stream in the code stream to obtain the right channel data of the audio signal, and copy the right channel data to the left channel.
  • the right channel decoding mode means that there is no need to decode the left channel bit stream in the code stream.
  • the decoding end copies the right channel data to the left channel.
  • the decoding end performs inverse quantization on the quantized right channel spectrum data, and copies the inverse quantized right channel spectrum data to the left channel.
  • the decoding end performs inverse quantization on the quantized right channel spectrum data, and performs an inverse transform (such as IMDCT) in the time and frequency domain on the inverse quantized right channel data to obtain the right channel time domain overlapping signal.
  • the time-domain overlap signals of the right channel are overlapped and added to reconstruct the time-domain signal of the right channel, and the reconstructed time-domain signal of the right channel is copied to the left channel.
  • the code stream contains the parameters required by the decoder for decoding.
  • the code stream contains side information of the right channel.
  • this side information includes the identifier of the encoding codebook used to encode the right channel data.
  • the decoder decodes the bit stream of the side information in the code stream to obtain the side information, and the side information includes the encoding codebook identifier.
  • the decoding end determines the target decoding codebook required for decoding from multiple decoding codebooks based on the encoding codebook identification.
  • the side information here refers to the side information of the right channel.
  • multiple decoding codebooks are the same as the multiple decoding codebooks introduced in step 902, and will not be described again here.
  • if the channel decoding mode is neither the left channel decoding mode nor the right channel decoding mode, the decoding end decodes the left channel bit stream and the right channel bit stream to obtain the left channel data and the right channel data. That is, the decoding end decodes all the data in the code stream.
  • if the audio signal does not meet the first condition and is a two-channel signal, the decoder decodes the two-channel interleaved bit stream in the code stream to obtain left channel data and right channel data. It should be understood that when the audio signal is a two-channel signal that does not meet the first condition, the encoding end encodes the left channel data and the right channel data according to the two-channel interleaved coding method; accordingly, the decoding end decodes the two-channel interleaved bit stream according to the two-channel interleaved decoding method to obtain the left channel data and the right channel data.
  • the left channel data and right channel data obtained by the decoding end include quantized spectrum data. If the channel decoding mode is the left channel decoding mode, the decoding end performs inverse quantization on the quantized spectrum data included in the left channel data to obtain the inverse quantized left channel spectrum data, and copies the inverse quantized left channel spectrum data to the right channel. If the channel decoding mode is the right channel decoding mode, the decoding end performs inverse quantization on the quantized spectrum data included in the right channel data to obtain the inverse quantized right channel spectrum data, and copies the inverse quantized right channel spectrum data to the left channel.
  • the decoding end performs inverse quantization on the quantized spectrum data included in the left channel data to obtain inverse quantized left channel spectrum data, and performs inverse quantization on the quantized spectrum data included in the right channel data to obtain inverse quantized right channel spectrum data.
  • the decoding end performs inverse quantization on the quantized spectrum data included in the left channel data to obtain the inverse quantized left channel spectrum data, performs an inverse time-frequency transform (such as IMDCT) on the inverse quantized left channel spectrum data to obtain the left channel time domain overlapping signal, overlaps and adds the left channel time domain overlapping signal to reconstruct the left channel time domain signal, and copies the reconstructed left channel time domain signal to the right channel.
  • the decoding end performs inverse quantization on the quantized spectrum data included in the right channel data to obtain the inverse quantized right channel spectrum data, performs an inverse time-frequency transform (such as IMDCT) on the inverse quantized right channel spectrum data to obtain the right channel time domain overlapping signal, overlaps and adds the right channel time domain overlapping signal to reconstruct the right channel time domain signal, and copies the reconstructed right channel time domain signal to the left channel.
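The overlap-add step used in the reconstructions above can be sketched generically, assuming 50%-overlapping frames of length 2N as in a typical MDCT scheme; windowing is omitted for brevity, and the function is an illustrative assumption, not the patent's implementation.

```python
def overlap_add(frames, n):
    """Reconstruct a time-domain signal from 50%-overlapping frames of
    length 2*n: the second half of each frame is added to the first
    half of the next frame. Windowing is omitted for brevity."""
    out = []
    carry = [0.0] * n  # second half of the previous frame
    for frame in frames:
        first, second = frame[:n], frame[n:]
        out.extend(c + f for c, f in zip(carry, first))
        carry = list(second)
    return out
```

For instance, frames [1, 2, 3, 4] and [5, 6, 7, 8] with n = 2 reconstruct [1, 2, 8, 10], the tail [7, 8] awaiting the next frame.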
  • if the channel decoding mode is a dual-channel decoding mode, the decoding end sequentially performs inverse quantization, inverse time-frequency transform, and overlap-add on the quantized spectrum data included in the left channel data to reconstruct the left channel time domain signal, and sequentially performs inverse quantization, inverse time-frequency transform, and overlap-add on the quantized spectrum data included in the right channel data to reconstruct the right channel time domain signal.
  • in some embodiments, after obtaining the left channel data and/or right channel data, the decoding end also needs to perform MS inverse transformation on the left channel data and/or right channel data to obtain the original left channel data and/or right channel data.
  • the decoding end decodes the mono bit stream in the code stream to obtain the mono data of the audio signal.
  • the decoder copies the mono data to the left and right channels.
  • the mono data obtained by decoding is quantized mono spectrum data. The decoding end performs inverse quantization on the quantized mono spectrum data, performs an inverse time-frequency transform (such as IMDCT) on the inverse quantized mono spectrum data to obtain the mono time domain overlapping signal, overlaps and adds the mono time domain overlapping signal to reconstruct the mono time domain signal, and copies the reconstructed mono time domain signal to the left and right channels.
  • playback devices such as headphones can obtain the signals of the corresponding channels for playback according to the configuration.
  • the decoding process shown in Figure 10 is a process for decoding a two-channel signal.
  • the decoding process of the mono signal is similar to the left branch process in Figure 10.
  • the decoder obtains input parameters, including the channel decoding mode, etc.
  • the decoding end decodes the packet header of the code stream.
  • the decoding end selects the decoding mode (ie, selects the code stream unpacking method) based on the packet header and the channel decoding mode.
  • when the dual-channel interleaved mode is selected, the decoding end enters the left branch process, and sequentially decodes the identifier of the common parameters of the two channels, the scaling factors of the left and right channels, the encoding codebook identifiers used for the left and right channels, and the spectrum data of the left and right channels.
  • when the two-channel deinterleaved mode is selected, the decoder enters the right branch process and performs left channel decoding and right channel decoding in sequence, where the decoding process of each channel includes sequentially decoding the channel parameter identifier, the channel scaling factors, the channel encoding codebook identifier, and the channel spectrum data.
  • after the decoding is completed, the decoding end sequentially performs inverse quantization and inverse MDCT on the parsed spectrum data of the corresponding channels according to the channels indicated by the configured channel decoding mode, so as to reconstruct the signals of the corresponding channels.
  • the above mainly introduces the implementation process by which the decoder parses the spectrum data in the code stream.
  • the code stream also carries the scaling factors of multiple subbands.
  • the scaling factors are used to shape the spectrum envelope of the audio signal.
  • after parsing the side information from the code stream to obtain the scaling factors, the decoder shapes the spectrum envelope of the audio signal based on the scaling factors, for example, shaping the spectrum envelopes of the left channel and/or right channel.
  • the implementation process by which the decoding end decodes the scaling factors is introduced below by way of example.
  • the decoding end first obtains the binary number of the first 5 bits in the bit stream of the scaling factors to obtain the differential scale value of the first subband among the multiple subbands, and subtracts the reference value (8 in this example) from the differential scale value of the first subband to obtain the scaling factor of the first subband. Then, the decoder reads the next 5-bit binary number from the code stream, uses the decimal number corresponding to that binary number as the code value, and obtains the decoded decimal number corresponding to the code value by searching the Huffman decoding codebook. This decimal number is the absolute value of the differential scale value of the next subband.
  • the decoding end then reads 1 bit in the code stream as the sign bit of the absolute value, thereby obtaining the differential scale value of the subband. If the absolute value is 0, the differential scale value of the subband is directly the absolute value, and no sign bit is read.
  • the decoding end obtains the differential scale value of each subband in the multiple subbands.
  • the decoding end recovers the scaling factor of each subband in the multiple subbands according to formula (2).
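The decoder-side recovery (formula (2), not reproduced in this extract) inverts the encoder's differential values; below is a sketch under the same assumptions as the encoder-side example, with the reference value 8 taken from the worked example and the function name assumed.

```python
REF = 8  # reference value used in the worked example

def recover_scaling_factors(diffs):
    """Sketch of the decoder-side recovery: the first scaling factor is
    the first differential value minus the reference value; later
    scaling factors accumulate the differential values."""
    sf = [diffs[0] - REF]
    for d in diffs[1:]:
        sf.append(sf[-1] + d)
    return sf
```

Applied to the encoder-side example, differential values [11, 2, -3] recover the scaling factors [3, 5, 2].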
  • the decoding end analyzes the scaling factors of the multiple subbands based on the number of subbands.
  • the Huffman decoding codebook used at the decoding end is as follows:
  • in this way, when the audio signal is a two-channel signal, even if the code stream includes a left channel bit stream and a right channel bit stream, during the decoding process the left channel bit stream can be decoded without decoding the right channel bit stream, or the right channel bit stream can be decoded without decoding the left channel bit stream, according to the channel decoding mode, thereby reducing the power consumption of the decoding end when its resources are limited.
  • FIG 11 is a schematic structural diagram of an audio decoding device 1100 provided by an embodiment of the present application.
  • the audio decoding device 1100 can be implemented as part or all of an audio codec device by software, hardware, or a combination of both.
  • the device 1100 includes: an acquisition module 1101 and a decoding module 1102.
  • the acquisition module 1101 is used to acquire the channel decoding mode if the audio signal to be decoded satisfies a first condition.
  • the first condition includes that the audio signal is a two-channel signal, the encoding code rate of the audio signal is not less than the code rate threshold, and the sampling rate of the audio signal is not less than the sampling rate threshold;
  • the decoding module 1102 is used to decode the left channel bit stream in the code stream to obtain the left channel data of the audio signal if the channel decoding mode is the left channel decoding mode, and copy the left channel data to the right channel;
  • the decoding module 1102 is also used to decode the right channel bit stream in the code stream to obtain the right channel data of the audio signal if the channel decoding mode is the right channel decoding mode, and copy the right channel data to the left channel.
  • the decoding module 1102 is also configured to decode the left channel bit stream and the right channel bit stream to obtain the left channel data and the right channel data if the channel decoding mode is neither the left channel decoding mode nor the right channel decoding mode.
  • the decoding module 1102 is also configured to decode the two-channel interleaved bit stream in the code stream to obtain the left channel data and the right channel data if the audio signal does not meet the first condition and the audio signal is a two-channel signal.
  • the decoding module 1102 is also configured to decode the mono bit stream in the code stream to obtain the mono data of the audio signal if the audio signal does not meet the first condition.
  • the device 1100 also includes:
  • the second acquisition module is used to obtain the total data volume of the code stream
  • the decoding module is also used to decode the packet header of the code stream to obtain the number of channels, sampling rate and frame length of the audio signal;
  • the first determination module is used to determine whether the audio signal meets the first condition based on the total data amount, number of channels, sampling rate and frame length.
  • the decoding module 1102 is also used to decode the bit stream of side information in the code stream to obtain the side information, where the side information includes the encoding codebook identifier;
  • the device 1100 further includes: a second determination module, configured to determine a target decoding codebook required for decoding from a plurality of decoding codebooks based on the encoding codebook identification.
  • the multiple decoding codebooks are as follows:
  • the audio signal when the audio signal is a two-channel signal, even if the code stream includes a left channel bit stream and a right channel bit stream, during the decoding process, it can be decoded according to the channel decoding mode.
  • the left channel bit stream is decoded without decoding the right channel bit stream, or the right channel bit stream is decoded without decoding the left channel bit stream, thereby reducing the power consumption of the decoding end when the decoding end resources are limited.
  • the encoding end can also sequentially encode the left channel data and the right channel data according to the conditions met by the audio signal, instead of having to encode according to the two-channel interleaved encoding method or the two-channel deinterleaved encoding method. It can be seen that the coding method of this scheme is more flexible.
  • when the audio decoding device provided in the above embodiments decodes an audio signal, the division of the above functional modules is only used as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the audio decoding device provided by the above embodiments and the audio decoding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • FIG 12 is a schematic structural diagram of an audio encoding device 1200 provided by an embodiment of the present application.
  • the audio encoding device 1200 can be implemented by software, hardware, or a combination of the two to become part or all of the audio coding and decoding device.
  • the device 1200 includes: an encoding module 1201.
  • Encoding module 1201 configured to encode the left channel data of the audio signal into the code stream and the right channel data of the audio signal into the code stream if the audio signal to be coded satisfies the first condition.
  • the first condition includes that the audio signal is a two-channel signal, the encoding code rate of the audio signal is not less than the code rate threshold, and the sampling rate of the audio signal is not less than the sampling rate threshold.
  • the encoding module 1201 is also used for:
  • if the audio signal does not meet the first condition and the audio signal is a two-channel signal, the left channel data and the right channel data are encoded into the code stream through two-channel interleaved coding.
  • the encoding module 1201 is also used for:
  • if the audio signal does not meet the first condition and the audio signal is a mono signal, the mono data of the audio signal is encoded into the code stream.
  • the encoding module 1201 includes:
  • the acquisition submodule is used to obtain the quantization level measurement factor of each subband in multiple subbands.
  • the quantization level measurement factor represents the average number of bits required to encode each spectrum value in the corresponding subband.
  • the multiple subbands refer to the multiple subbands into which the quantized spectrum data included in the left channel data is divided;
  • the sub-division module is used to divide multiple sub-bands into multiple groups of sub-bands based on the quantization level measurement factors of multiple sub-bands.
  • the quantization level measurement factors of the same group of sub-bands are the same;
  • the target coding codebook refers to the coding codebook used to encode the spectrum values in the corresponding group of subbands
  • the encoding submodule is used to encode the identification of the target encoding codebook corresponding to each group of subbands into the code stream as a kind of side information of the left channel data.
  • for any group of subbands among the plurality of groups of subbands, if the quantization level measurement factor of that group of subbands is the first value, multiple first encoding codebooks among the plurality of encoding codebooks are used to respectively encode the spectrum values within that group of subbands, to obtain multiple first candidate spectrum bit streams that correspond one-to-one to the multiple first encoding codebooks;
  • the first encoding codebook corresponding to the first candidate spectrum bit stream with the smallest total number of bits among the multiple first candidate spectrum bit streams is determined as the target encoding codebook corresponding to that group of subbands.
  • the first value is 1;
  • Multiple first coding codebooks in multiple coding codebooks are used to respectively encode the spectrum values in any group of subbands, including:
  • for any group of subbands among the plurality of groups of subbands, if the quantization level measurement factor of that group of subbands is the second value, multiple second encoding codebooks among the plurality of encoding codebooks are used to respectively encode the spectrum values within that group of subbands, to obtain multiple second candidate spectrum bit streams that correspond one-to-one to the multiple second encoding codebooks;
  • the second encoding codebook corresponding to the second candidate spectrum bit stream with the smallest total number of bits among the multiple second candidate spectrum bit streams is determined as the target encoding codebook corresponding to that group of subbands.
  • the second value is 2;
  • Multiple second encoding codebooks in multiple encoding codebooks are used to encode the spectrum values in any group of subbands, including:
  • multiple second encoding codebooks are as follows:
  • for any group of subbands among the multiple groups of subbands, if the quantization level measurement factor of that group of subbands is the third value, multiple third encoding codebooks among the multiple encoding codebooks are used to respectively encode the spectrum values within that group of subbands, to obtain multiple third candidate spectrum bit streams that correspond one-to-one to the multiple third encoding codebooks;
  • the third encoding codebook corresponding to the third candidate spectrum bit stream with the smallest total number of bits among the multiple third candidate spectrum bit streams is determined as the target encoding codebook corresponding to that group of subbands.
  • for any group of subbands among the multiple groups of subbands, if the quantization level measurement factor of that group of subbands is the fourth value, multiple third encoding codebooks are used to respectively encode the first part bits of each spectrum value in that group of subbands, to obtain multiple first part candidate bit streams corresponding one-to-one to the multiple third encoding codebooks;
  • the first part candidate bit stream with the smallest total number of bits among the plurality of first part candidate bit streams is determined as the bit stream of the first part bits, and the third encoding codebook corresponding to the first part candidate bit stream with the smallest total number of bits is determined as The target encoding codebook corresponding to any group of subbands;
  • the second part of the bits except the first part of the bits in each spectrum value in any group of subbands is uniformly quantized and encoded to obtain a bit stream of the second part of the bits.
  • the first part of bits refers to the N bits of the high bits in the spectrum value
  • the second part of the bits refers to the M bits of the low bits of the spectrum value.
  • M is equal to the quantization level measurement factor of that group of subbands minus the third value.
  • the third value is 3;
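The high/low bit split described above can be sketched as a plain bit shift, assuming each spectrum value occupies exactly as many bits as the quantization level measurement factor; the function name and this assumption are illustrative, not from the patent.

```python
def split_spectrum_value(value, quant_factor, third_value=3):
    """Split a spectrum value into N high bits (entropy coded with a
    third encoding codebook) and M low bits (uniformly quantized),
    where M = quantization level measurement factor - third value.
    Assumes `value` occupies `quant_factor` bits."""
    m_low = quant_factor - third_value       # M low bits
    high = value >> m_low                    # N high bits
    low = value & ((1 << m_low) - 1)         # M low bits
    return high, low
```

For example, with a quantization level measurement factor of 5, the value 0b10110 (22) splits into high bits 0b101 (5) and low bits 0b10 (2).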
  • Multiple third coding codebooks in multiple coding codebooks are used to respectively encode the spectrum values in any group of subbands, including:
  • the audio signal when the audio signal is a two-channel signal, even if the code stream includes a left channel bit stream and a right channel bit stream, during the decoding process, it can be decoded according to the channel decoding mode.
  • the left channel bit stream is decoded without decoding the right channel bit stream, or the right channel bit stream is decoded without decoding the left channel bit stream, thereby reducing the power consumption of the decoding end when the decoding end resources are limited.
  • the encoding end can also sequentially encode the left channel data and the right channel data according to the conditions met by the audio signal, instead of having to encode according to the two-channel interleaved encoding method or the two-channel deinterleaved encoding method. It can be seen that the coding method of this scheme is more flexible.
  • when the audio encoding device provided in the above embodiments encodes an audio signal, the division of the above functional modules is only used as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the audio coding device provided by the above embodiments and the audio coding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • the above embodiments it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media.
  • the available media may be magnetic media (such as floppy disks, hard disks, magnetic tapes), optical media (such as digital versatile discs (DVD)) or semiconductor media (such as solid state disks (SSD)) wait.
  • the computer-readable storage media mentioned in the embodiments of this application may be non-volatile storage media, in other words, may be non-transitory storage media.
  • the information including but not limited to user equipment information, user personal information, etc.
  • data including but not limited to data used for analysis, stored data, displayed data, etc.
  • Signals are all authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data need to comply with the relevant laws, regulations and standards of relevant countries and regions.

Abstract

This application discloses an audio encoding/decoding method, apparatus, storage medium, and computer program product, belonging to the field of audio coding and decoding. In this scheme, when the audio signal is a two-channel signal, even if the bitstream includes a left-channel bitstream and a right-channel bitstream, during decoding the left-channel bitstream can be decoded without decoding the right-channel bitstream, or the right-channel bitstream can be decoded without decoding the left-channel bitstream, according to the channel decoding mode, thereby reducing the power consumption of the decoding end when its resources are limited. Correspondingly, the encoding end can also encode the left-channel data and the right-channel data sequentially according to the conditions met by the audio signal, rather than having to use the two-channel interleaved or the two-channel deinterleaved encoding method. The encoding method of this scheme is therefore more flexible.

Description

Audio encoding/decoding method, apparatus, storage medium, and computer program product
This application claims priority to Chinese Patent Application No. 202210892837.6, filed on July 27, 2022 and entitled "Audio encoding/decoding method, apparatus, storage medium and computer program product", which is incorporated herein by reference in its entirety. This application also claims priority to Chinese Patent Application No. 202211139716.0, filed on September 19, 2022 and entitled "Audio encoding/decoding method, apparatus, storage medium and computer program product", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of audio coding and decoding, and in particular to an audio encoding/decoding method, apparatus, storage medium, and computer program product.
Background
With improving quality of life, the demand for high-quality audio keeps growing. To transmit audio signals over limited bandwidth, the audio signal usually needs to be compressed at the encoding end first to obtain a bitstream, which is then transmitted to the decoding end. The decoding end decodes the received bitstream to reconstruct the audio signal, and the reconstructed audio signal is used for playback. However, some decoding ends have limited resources, for example certain Bluetooth earphones and smart speakers. How to reduce the power consumption of the decoding end during decoding has become a pressing technical problem.
Summary
This application provides an audio encoding/decoding method, apparatus, storage medium, and computer program product, which can reduce the power consumption of the decoding end. The technical solutions are as follows:
In a first aspect, an audio decoding method is provided. The method includes:
if the audio signal to be decoded meets a first condition, obtaining a channel decoding mode, where the first condition includes the audio signal being a two-channel signal, the encoding bitrate of the audio signal being not less than a bitrate threshold, and the sampling rate of the audio signal being not less than a sampling rate threshold; if the channel decoding mode is a left-channel decoding mode, decoding the left-channel bitstream in the bitstream to obtain the left-channel data of the audio signal, and copying the left-channel data to the right channel; and if the channel decoding mode is a right-channel decoding mode, decoding the right-channel bitstream in the bitstream to obtain the right-channel data of the audio signal, and copying the right-channel data to the left channel.
Optionally, the method further includes: if the channel decoding mode is neither the left-channel decoding mode nor the right-channel decoding mode, decoding the left-channel bitstream and the right-channel bitstream to obtain the left-channel data and the right-channel data.
Optionally, the method further includes: if the audio signal does not meet the first condition and the audio signal is a two-channel signal, decoding the two-channel interleaved bitstream in the bitstream to obtain the left-channel data and the right-channel data.
Optionally, the method further includes: if the audio signal does not meet the first condition and the audio signal is a mono signal, decoding the mono bitstream in the bitstream to obtain the mono data of the audio signal.
Optionally, the method further includes: obtaining the total data amount of the bitstream; decoding the header of the bitstream to obtain the channel number, sampling rate, and frame length of the audio signal; and determining, based on the total data amount, the channel number, the sampling rate, and the frame length, whether the audio signal meets the first condition.
Optionally, the method further includes: decoding the bitstream of side information in the bitstream to obtain the side information, the side information including an encoding codebook identifier; and determining, based on the encoding codebook identifier, a target decoding codebook required for decoding from a plurality of decoding codebooks.
Optionally, the plurality of decoding codebooks are as follows:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{11,0,5},{15,1,5},{13,2,4},{13,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},
{9,8,4},{9,9,4},{5,10,4},{5,11,4},{14,12,4},{14,13,4},{7,14,4},{7,15,4},{6,16,4},{6,17,4},{3,18,4},{3,19,4},{10,20,4},{10,21,4},{12,22,4},{12,23,4},{2,24,4},{2,25,4},{1,26,4},{1,27,4},{8,28,4},{8,29,4},{4,30,4},{4,31,4}
};
{{2,0,4},{2,1,4},{2,2,4},{2,3,4},{1,4,4},{1,5,4},{1,6,4},{1,7,4},
{8,8,4},{8,9,4},{8,10,4},{8,11,4},{4,12,4},{4,13,4},{4,14,4},{4,15,4},{0,16,2},{0,17,2},{0,18,2},{0,19,2},{0,20,2},{0,21,2},{0,22,2},{0,23,2},
{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2},{11,32,6},{15,33,6},{14,34,5},{14,35,5},{7,36,5},{7,37,5},{13,38,5},
{13,39,5},{12,40,4},{12,41,4},{12,42,4},{12,43,4},{6,44,4},{6,45,4},
{6,46,4},{6,47,4},{3,48,4},{3,49,4},{3,50,4},{3,51,4},{5,52,4},{5,53,4},{5,54,4},{5,55,4},{10,56,4},{10,57,4},{10,58,4},{10,59,4},{9,60,4},{9,61,4},{9,62,4},{9,63,4}
};
{{11,0,7},{15,1,7},{14,2,6},{14,3,6},{7,4,6},{7,5,6},{13,6,6},{13,7,6},{12,8,5},{12,9,5},{12,10,5},{12,11,5},{6,12,5},{6,13,5},{6,14,5},{6,15,5},{3,16,5},{3,17,5},{3,18,5},{3,19,5},{5,20,5},{5,21,5},{5,22,5},{5,23,5},{10,24,5},{10,25,5},{10,26,5},{10,27,5},{9,28,5},{9,29,5},{9,30,5},{9,31,5},{2,32,4},{2,33,4},{2,34,4},{2,35,4},{2,36,4},{2,37,4},{2,38,4},{2,39,4},{1,40,4},{1,41,4},{1,42,4},{1,43,4},{1,44,4},{1,45,4},{1,46,4},{1,47,4},{8,48,4},{8,49,4},{8,50,4},{8,51,4},{8,52,4},{8,53,4},{8,54,4},{8,55,4},{4,56,4},{4,57,4},{4,58,4},{4,59,4},{4,60,4},{4,61,4},{4,62,4},{4,63,4},{0,64,1},{0,65,1},{0,66,1},{0,67,1},{0,68,1},{0,69,1},{0,70,1},{0,71,1},{0,72,1},{0,73,1},{0,74,1},{0,75,1},{0,76,1},{0,77,1},{0,78,1},{0,79,1},{0,80,1},{0,81,1},{0,82,1},{0,83,1},{0,84,1},{0,85,1},{0,86,1},{0, 87,1},{0,88,1},{0,89,1},{0,90,1},{0,91,1},{0,92,1},{0,93,1},{0,94,1},{0,95,1},{0,96,1},{0,97,1},{0,98,1},{0,99,1},{0,100,1},{0,101,1},{0,102,1},{0,103,1},{0,104,1},{0,105,1},{0,106,1},{0,107,1},{0,108,1},{0,109,1},{0,110,1},{0,111,1},{0,112,1},{0,113,1},{0,114,1},{0,115,1},{0,116,1},{0,117,1},{0,118,1},{0,119,1},{0,120,1},{0,121,1},{0,122,1},{0,123,1},{0,124,1},
{0,125,1},{0,126,1},{0,127,1}
};
{{11,0,5},{15,1,5},{14,2,4},{14,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{7,8,4},{7,9,4},{9,10,4},{9,11,4},{10,12,4},{10,13,4},{13,14,4},{13,15,4},{3,16,4},{3,17,4},{8,18,4},{8,19,4},{6,20,4},{6,21,4},{12,22,4},{12,23,4},{4,24,4},{4,25,4},{1,26,4},{1,27,4},{2,28,4},{2,29,4},{5,30,4},{5,31,4}
};
{{2,0,4},{2,1,4},{5,2,4},{5,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{10,8,5},{13,9,5},{7,10,5},{14,11,5},{12,12,4},{12,13,4},{3,14,4},{3,15,4},{8,16,4},{8,17,4},{9,18,4},{9,19,4},{6,20,4},{6,21,4},{11,22,5},{15,23,5},{4,24,3},{4,25,3},{4,26,3},{4,27,3},{1,28,3},{1,29,3},{1,30,3},{1,31,3}
};
{{4,0,4},{4,1,4},{1,2,4},{1,3,4},{2,4,4},{2,5,4},{5,6,4},{5,7,4},{0,8,2},{0,9,2},{0,10,2},{0,11,2},{0,12,2},{0,13,2},{0,14,2},{0,15,2},{10,16,5},{13,17,5},{7,18,5},{14,19,5},{12,20,4},{12,21,4},{3,22,4},{3,23,4},{8,24,4},{8,25,4},{9,26,4},{9,27,4},{6,28,4},{6,29,4},{11,30,5},{15,31,5},
};
{{10,0,6},{13,1,6},{7,2,6},{14,3,6},{12,4,5},{12,5,5},{3,6,5},{3,7,5},{8,8,5},{8,9,5},{9,10,5},{9,11,5},{6,12,5},{6,13,5},{11,14,6},{15,15,6},{4,16,4},{4,17,4},{4,18,4},{4,19,4},{1,20,4},{1,21,4},{1,22,4},{1,23,4},{2,24,4},{2,25,4},{2,26,4},{2,27,4},{5,28,4},{5,29,4},{5,30,4},{5,31,4},{0,32,1},{0,33,1},{0,34,1},{0,35,1},{0,36,1},{0,37,1},{0,38,1},{0,39,1},{0,40,1},{0,41,1},{0,42,1},{0,43,1},{0,44,1},{0,45,1},{0,46,1},{0,47,1},{0,48,1},{0,49,1},{0,50,1},{0,51,1},{0,52,1},{0,53,1},{0,54,1},{0,55,1},{0,56,1},{0,57,1},{0,58,1},{0,59,1},{0,60,1},{0,61,1},{0,62,1},{0,63,1}
};
{{4,0,4},{4,1,4},{4,2,4},{4,3,4},{6,4,6},{7,5,6},{5,6,5},{5,7,5},{3,8,3},{3,9,3},{3,10,3},{3,11,3},{3,12,3},{3,13,3},{3,14,3},{3,15,3},{2,16,2},{2,17,2},{2,18,2},{2,19,2},{2,20,2},{2,21,2},{2,22,2},{2,23,2},{2,24,2},{2,25,2},{2,26,2},{2,27,2},{2,28,2},{2,29,2},{2,30,2},{2,31,2},{1,32,2},{1,33,2},{1,34,2},{1,35,2},{1,36,2},{1,37,2},{1,38,2},{1,39,2},{1,40,2},{1,41,2},{1,42,2},{1,43,2},{1,44,2},{1,45,2},{1,46,2},{1,47,2},{0,48,2}, {0,49,2},{0,50,2},{0,51,2},{0,52,2},{0,53,2},{0,54,2},{0,55,2},{0,56,2},{0,57,2},{0,58,2},{0,59,2},{0,60,2},{0,61,2},{0,62,2},{0,63,2}
};
{{2,0,3},{2,1,3},{2,2,3},{2,3,3},{3,4,3},{3,5,3},{3,6,3},{3,7,3},{5,8,4},{5,9,4},{6,10,5},{7,11,5},{4,12,3},{4,13,3},{4,14,3},{4,15,3},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
};
{{0,0,1},{0,1,1},{0,2,1},{0,3,1},{0,4,1},{0,5,1},{0,6,1},{0,7,1},{0,8,1},{0,9,1},{0,10,1},{0,11,1},{0,12,1},{0,13,1},{0,14,1},{0,15,1},{0,16,1},{0,17,1},{0,18,1},{0,19,1},{0,20,1},{0,21,1},{0,22,1},{0,23,1},{0,24,1},{0,25,1},{0,26,1},{0,27,1},{0,28,1},{0,29,1},{0,30,1},{0,31,1},{5,32,5},{5,33,5},{6,34,6},{7,35,6},{4,36,4},{4,37,4},{4,38,4},{4,39,4},{1,40,3},{1,41,3},{1,42,3},{1,43,3},{1,44,3},{1,45,3},{1,46,3},{1,47,3},{2,48,3},{2,49,3},{2,50,3},{2,51,3},{2,52,3},{2,53,3},{2,54,3},{2,55,3},{3,56,3},{3,57,3},{3,58,3},{3,59,3},{3,60,3},{3,61,3},{3,62,3},{3,63,3}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
In a second aspect, an audio encoding method is provided. The method includes:
if the audio signal to be encoded meets a first condition, encoding the left-channel data of the audio signal into a bitstream, and encoding the right-channel data of the audio signal into the bitstream, where the first condition includes the audio signal being a two-channel signal, the encoding bitrate of the audio signal being not less than a bitrate threshold, and the sampling rate of the audio signal being not less than a sampling rate threshold.
Optionally, the method further includes: if the audio signal does not meet the first condition and the audio signal is a two-channel signal, encoding the left-channel data and the right-channel data into the bitstream by two-channel interleaved encoding.
Optionally, the method further includes: if the audio signal does not meet the first condition and the audio signal is a mono signal, encoding the mono data of the audio signal into the bitstream.
Optionally, encoding the left-channel data of the audio signal into the bitstream includes:
obtaining a quantization level measurement factor of each of a plurality of subbands, where the quantization level measurement factor represents the average number of bits required to encode each spectral value in the corresponding subband, and the plurality of subbands are the subbands into which the quantized spectral data included in the left-channel data is divided;
dividing the plurality of subbands into a plurality of groups of subbands based on their quantization level measurement factors, where subbands in the same group have the same quantization level measurement factor;
determining, based on the quantization level measurement factor of each group of subbands, a target encoding codebook corresponding to each group of subbands from a plurality of encoding codebooks, and determining the bitstream of the spectral values in each group of subbands, where the target encoding codebook is the encoding codebook used to encode the spectral values in the corresponding group of subbands; and
encoding the identifier of the target encoding codebook corresponding to each group of subbands into the bitstream as a kind of side information of the left-channel data.
Optionally, determining, based on the quantization level measurement factor of each group of subbands, the target encoding codebook corresponding to each group of subbands from the plurality of encoding codebooks, and determining the bitstream of the spectral values in each group of subbands, includes:
for any group of subbands among the plurality of groups, if the quantization level measurement factor of the group is a first value, encoding the spectral values in the group with a plurality of first encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of first candidate spectral bitstreams in one-to-one correspondence with the plurality of first encoding codebooks; and
determining the first candidate spectral bitstream with the fewest total bits among the plurality of first candidate spectral bitstreams as the bitstream of the spectral values in the group, and determining the first encoding codebook corresponding to that first candidate spectral bitstream as the target encoding codebook corresponding to the group.
Optionally, the first value is 1;
encoding the spectral values in the group with the plurality of first encoding codebooks among the plurality of encoding codebooks includes:
combining every four spectral values in the group into one binary number; and
encoding the decimal number represented by the binary number with each of the plurality of first encoding codebooks;
where the plurality of first encoding codebooks are as follows:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
{8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}
};
{{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
{8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}
};
{{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
{8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}
}。
Optionally, determining, based on the quantization level measurement factor of each group of subbands, the target encoding codebook corresponding to each group of subbands from the plurality of encoding codebooks, and determining the bitstream of the spectral values in each group of subbands, includes:
for any group of subbands among the plurality of groups, if the quantization level measurement factor of the group is a second value, encoding the spectral values in the group with a plurality of second encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of second candidate spectral bitstreams in one-to-one correspondence with the plurality of second encoding codebooks; and
determining the second candidate spectral bitstream with the fewest total bits among the plurality of second candidate spectral bitstreams as the bitstream of the spectral values in the group, and determining the second encoding codebook corresponding to that second candidate spectral bitstream as the target encoding codebook corresponding to the group.
Optionally, the second value is 2;
encoding the spectral values in the group with the plurality of second encoding codebooks among the plurality of encoding codebooks includes:
combining every two spectral values in the group into one binary number; and
encoding the decimal number represented by the binary number with each of the plurality of second encoding codebooks;
where the plurality of second encoding codebooks are as follows:
{{0,1,3},{1,13,4},{2,14,4},{3,8,4},{4,12,4},{5,15,4},{6,10,4},{7,4,4},{8,9,4},{9,5,4},{10,6,4},{11,0,5},{12,11,4},{13,7,4},{14,1,4},{15,1,5}
};
{{0,1,3},{1,7,3},{2,0,4},{3,7,4},{4,6,3},{5,1,4},{6,10,4},{7,10,5},
{8,8,4},{9,9,4},{10,8,5},{11,22,5},{12,6,4},{13,9,5},{14,11,5},{15,23,5}
};
{{0,1,2},{1,1,4},{2,2,4},{3,11,4},{4,0,4},{5,3,4},{6,14,4},{7,18,5},
{8,12,4},{9,13,4},{10,16,5},{11,30,5},{12,10,4},{13,17,5},{14,19,5},{15,31,5}
};
{{0,1,1},{1,5,4},{2,6,4},{3,3,5},{4,4,4},{5,7,4},{6,6,5},{7,2,6},
{8,4,5},{9,5,5},{10,0,6},{11,14,6},{12,2,5},{13,1,6},{14,3,6},{15,15,6}
}。
Optionally, determining, based on the quantization level measurement factor of each group of subbands, the target encoding codebook corresponding to each group of subbands from the plurality of encoding codebooks, and determining the bitstream of the spectral values in each group of subbands, includes:
for any group of subbands among the plurality of groups, if the quantization level measurement factor of the group is a third value, encoding the spectral values in the group with a plurality of third encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of third candidate spectral bitstreams in one-to-one correspondence with the plurality of third encoding codebooks; and
determining the third candidate spectral bitstream with the fewest total bits among the plurality of third candidate spectral bitstreams as the bitstream of the spectral values in the group, and determining the third encoding codebook corresponding to that third candidate spectral bitstream as the target encoding codebook corresponding to the group.
Optionally, determining, based on the quantization level measurement factor of each group of subbands, the target encoding codebook corresponding to each group of subbands from the plurality of encoding codebooks, and determining the bitstream of the spectral values in each group of subbands, includes:
for any group of subbands among the plurality of groups, if the quantization level measurement factor of the group is a fourth value, encoding a first part of bit positions of each spectral value in the group with the plurality of third encoding codebooks, to obtain a plurality of first-part candidate bitstreams in one-to-one correspondence with the plurality of third encoding codebooks;
determining the first-part candidate bitstream with the fewest total bits among the plurality of first-part candidate bitstreams as the bitstream of the first part of bit positions, and determining the third encoding codebook corresponding to that first-part candidate bitstream as the target encoding codebook corresponding to the group; and
performing uniform quantization encoding on a second part of bit positions, other than the first part, of each spectral value in the group, to obtain the bitstream of the second part of bit positions.
Optionally, the first part of bit positions refers to the N high-order bits of the spectral value, the second part of bit positions refers to the M low-order bits of the spectral value, and M equals the quantization level measurement factor of the group minus the third value.
Optionally, the third value is 3;
encoding the spectral values in the group with the plurality of third encoding codebooks among the plurality of encoding codebooks includes:
encoding each spectral value in the group with each of the plurality of third encoding codebooks;
where the plurality of third encoding codebooks are as follows:
{{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,3,5},{6,4,6},{7,5,6}
};
{{0,3,2},{1,2,2},{2,0,3},{3,1,3},{4,3,3},{5,4,4},{6,10,5},{7,11,5}
};
{{0,0,1},{1,5,3},{2,6,3},{3,7,3},{4,9,4},{5,16,5},{6,34,6},{7,35,6}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
With the audio encoding/decoding method provided by this application, when the audio signal is a two-channel signal, even if the bitstream includes a left-channel bitstream and a right-channel bitstream, during decoding the left-channel bitstream can be decoded without decoding the right-channel bitstream, or the right-channel bitstream can be decoded without decoding the left-channel bitstream, according to the channel decoding mode, thereby reducing the power consumption of the decoding end when its resources are limited. Correspondingly, the encoding end can also encode the left-channel data and the right-channel data sequentially according to the conditions met by the audio signal, rather than having to use the two-channel interleaved or the two-channel deinterleaved encoding method. The encoding method of this scheme is therefore more flexible.
In a third aspect, an audio decoding apparatus is provided. The audio decoding apparatus has the function of implementing the behavior of the audio decoding method in the first aspect. The audio decoding apparatus includes one or more modules, and the one or more modules are configured to implement the audio decoding method provided in the first aspect.
In a fourth aspect, an audio encoding apparatus is provided. The audio encoding apparatus has the function of implementing the behavior of the audio encoding method in the second aspect. The audio encoding apparatus includes one or more modules, and the one or more modules are configured to implement the audio encoding method provided in the second aspect.
In a fifth aspect, an audio decoding device is provided. The audio decoding device includes a processor and a memory. The memory is configured to store a program for executing the audio decoding method provided in the first aspect, and to store data involved in implementing the audio decoding method provided in the first aspect. The processor is configured to execute the program stored in the memory. The audio decoding device may further include a communication bus used to establish a connection between the processor and the memory.
In a sixth aspect, an audio encoding device is provided. The audio encoding device includes a processor and a memory. The memory is configured to store a program for executing the audio encoding method provided in the second aspect, and to store data involved in implementing the audio encoding method provided in the second aspect. The processor is configured to execute the program stored in the memory. The audio encoding device may further include a communication bus used to establish a connection between the processor and the memory.
In a seventh aspect, a computer-readable storage medium is provided. The storage medium stores instructions that, when run on a computer, cause the computer to execute the audio decoding method of the first aspect or the audio encoding method of the second aspect.
In an eighth aspect, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the audio decoding method of the first aspect or the audio encoding method of the second aspect.
The technical effects obtained in the second to eighth aspects are similar to those obtained by the corresponding technical means in the first aspect and are not repeated here.
Brief Description of the Drawings
Figure 1 is a schematic diagram of a Bluetooth interconnection scenario provided by an embodiment of this application;
Figure 2 is a system framework diagram involved in the audio encoding/decoding method provided by an embodiment of this application;
Figure 3 is an overall audio encoding/decoding framework diagram provided by an embodiment of this application;
Figure 4 is a schematic structural diagram of a codec device provided by an embodiment of this application;
Figure 5 is a flowchart of an audio encoding method provided by an embodiment of this application;
Figure 6 is a schematic diagram of a bitstream structure provided by an embodiment of this application;
Figure 7 is a schematic diagram of another bitstream format provided by an embodiment of this application;
Figure 8 is a flowchart of another audio encoding method provided by an embodiment of this application;
Figure 9 is a flowchart of an audio decoding method provided by an embodiment of this application;
Figure 10 is a flowchart of another audio decoding method provided by an embodiment of this application;
Figure 11 is a schematic structural diagram of an audio decoding apparatus provided by an embodiment of this application;
Figure 12 is a schematic structural diagram of an audio encoding apparatus provided by an embodiment of this application.
Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
The network architectures and service scenarios described in the embodiments of this application are intended to explain the technical solutions of the embodiments more clearly and do not constitute a limitation on them. Those of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
First, the implementation environment and background knowledge involved in the embodiments of this application are introduced.
With the widespread adoption and use of wireless Bluetooth devices such as true wireless stereo (TWS) earphones, smart speakers, and smart watches in daily life, the demand for a high-quality audio playback experience in various scenarios has become increasingly urgent, especially in environments where Bluetooth signals are prone to interference, such as subways, airports, and railway stations. In a Bluetooth interconnection scenario, because the Bluetooth channel connecting the audio sending device and the audio receiving device limits the size of transmitted data, the audio signal must be compressed by the audio encoder in the audio sending device before being transmitted to the audio receiving device, and must be decoded by the audio decoder in the audio receiving device before it can be played. The popularity of wireless Bluetooth devices has thus also driven the vigorous development of various Bluetooth audio codecs.
Current Bluetooth audio codecs include the sub-band codec (sub-band coding, SBC), advanced audio coding (AAC), the aptX family of codecs, the low-latency hi-definition audio codec (LHDC), the low-power low-latency LC3 audio codec, and LC3plus.
It should be understood that the audio encoding/decoding method provided in the embodiments of this application can be applied to the audio sending device (i.e., the encoding end) and the audio receiving device (i.e., the decoding end) in a Bluetooth interconnection scenario.
Figure 1 is a schematic diagram of a Bluetooth interconnection scenario provided by an embodiment of this application. Referring to Figure 1, the audio sending device in the Bluetooth interconnection scenario can be a mobile phone, a computer, a tablet, etc. The computer can be a laptop, a desktop computer, etc., and the tablet can be a handheld tablet, an in-vehicle tablet, etc. The audio receiving device in the Bluetooth interconnection scenario can be TWS earphones, a smart speaker, wireless over-ear headphones, wireless neckband earphones, a smart watch, smart glasses, a smart in-vehicle device, etc. In other embodiments, the audio receiving device in the Bluetooth interconnection scenario can also be a mobile phone, a computer, a tablet, etc.
It should be noted that the audio encoding/decoding method provided in the embodiments of this application can be applied not only to Bluetooth interconnection scenarios but also to other device interconnection scenarios. In other words, the system architectures and service scenarios described in the embodiments of this application are intended to explain the technical solutions more clearly and do not constitute a limitation on them; those of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
Figure 2 is a system framework diagram involved in the audio encoding/decoding method provided by an embodiment of this application. Referring to Figure 2, the system includes an encoding end and a decoding end. The encoding end includes an input module, an encoding module, and a sending module. The decoding end includes a receiving module, an input module, a decoding module, and a playback module.
At the encoding end, the user selects one of two encoding modes according to the usage scenario: a low-latency encoding mode and a high-quality encoding mode, whose encoding frame lengths are 5 ms and 10 ms respectively. For usage scenarios such as gaming, live streaming, and calls, the user can select the low-latency encoding mode; for enjoying music through earphones or speakers, the user can select the high-quality encoding mode. The user also needs to provide the audio signal to be encoded (pulse code modulation (PCM) data as shown in Figure 2) to the encoding end. In addition, the user needs to set the target bitrate of the bitstream obtained by encoding, i.e., the encoding bitrate of the audio signal. A higher target bitrate yields relatively better sound quality but weaker interference resistance of the bitstream in short-range transmission; a lower target bitrate yields relatively worse sound quality but stronger interference resistance in short-range transmission. In short, the input module of the encoding end obtains the encoding frame length, the encoding bitrate, and the audio signal to be encoded submitted by the user.
The input module of the encoding end feeds the user-submitted data into the frequency-domain encoder of the encoding module.
The frequency-domain encoder of the encoding module produces a bitstream by encoding based on the received data. Specifically, the frequency-domain encoder analyzes the audio signal to be encoded to obtain the signal characteristics (including mono/two-channel, stationary/non-stationary, full-bandwidth/narrow-bandwidth, subjective/objective, etc.), enters the corresponding encoding processing submodule according to the signal characteristics and the bitrate level (i.e., the encoding bitrate), encodes the audio signal through that submodule, and packs the header of the bitstream (including the sampling rate, channel number, encoding mode, frame length, etc.) to finally obtain the bitstream.
The sending module of the encoding end sends the bitstream to the decoding end. Optionally, the sending module is the short-range sending module shown in Figure 2 or another type of sending module, which is not limited in the embodiments of this application.
At the decoding end, after the receiving module receives the bitstream, it sends the bitstream to the frequency-domain decoder of the decoding module and notifies the input module of the decoding end to obtain the configured bit depth, channel decoding mode, etc. Optionally, the receiving module is the short-range receiving module shown in Figure 2 or another type of receiving module, which is not limited in the embodiments of this application.
The input module of the decoding end feeds the obtained bit depth, channel decoding mode, and other information into the frequency-domain decoder of the decoding module.
The frequency-domain decoder of the decoding module decodes the bitstream based on the bit depth, channel decoding mode, etc., to obtain the required audio data (PCM data as shown in Figure 2), and sends the obtained audio data to the playback module, which plays the audio. The channel decoding mode indicates which channels need to be decoded.
Figure 3 is an overall audio encoding/decoding framework diagram provided by an embodiment of this application. Referring to Figure 3, the encoding process at the encoding end includes the following steps:
(1) PCM input module
PCM data is input; the PCM data is mono or two-channel data, and the bit depth can be 16-bit, 24-bit, 32-bit floating point, or 32-bit fixed point. Optionally, the PCM input module converts the input PCM data to a common bit depth, for example 24 bits, de-interleaves the PCM data, and arranges it by left channel and right channel.
(2) Low-delay analysis windowing & modified discrete cosine transform (MDCT) module
The PCM data processed in step (1) is windowed with a low-delay analysis window and MDCT-transformed to obtain spectral data in the MDCT domain. Windowing prevents spectral leakage.
(3) MDCT-domain signal analysis & adaptive bandwidth detection module
The MDCT-domain signal analysis module operates at all bitrates; the adaptive bandwidth detection module is activated at low bitrates (e.g., bitrate < 150 kbps/channel). First, bandwidth detection is performed on the MDCT-domain spectral data obtained in step (2) to obtain the cutoff frequency, i.e., the effective bandwidth. Second, signal analysis is performed on the spectral data within the effective bandwidth — analyzing whether the distribution of frequency bins is concentrated or uniform — to obtain an energy concentration degree, based on which a flag is derived indicating whether the audio signal to be encoded is an objective signal or a subjective signal (1 for an objective signal, 0 for a subjective signal). For an objective signal, at low bitrates neither frequency-domain noise shaping (spectral noise shaping, SNS) of the scale factors nor smoothing of the MDCT spectrum is performed, because doing so would degrade the encoding quality of objective signals. Then, whether to perform the MDCT-domain subband cutoff operation is determined based on the bandwidth detection result and the subjective/objective signal flag. If the audio signal is an objective signal, no subband cutoff is performed; if it is a subjective signal and the bandwidth detection result flag is 0 (full bandwidth), the subband cutoff is determined by the bitrate; if it is a subjective signal and the flag is non-zero (i.e., a limited bandwidth smaller than half the sampling rate), the subband cutoff is determined by the bandwidth detection result.
(4) Subband division selection and scale factor calculation module
Based on the bitrate level as well as the subjective/objective signal flag and cutoff frequency obtained in step (3), the best subband division is selected from multiple subband division schemes, and the total number of subbands needed to encode the audio signal is obtained. Meanwhile, the spectral envelope is calculated, i.e., the scale factors corresponding to the selected subband division are computed.
(5) MS channel transform module
For two-channel PCM data, joint-coding discrimination is performed based on the scale factors computed in step (4), i.e., it is decided whether to apply the MS channel transform to the left and right channel data.
(6) Spectral smoothing module and scale-factor-based frequency-domain noise shaping module
The spectral smoothing module smooths the MDCT spectrum under the low-bitrate setting (e.g., bitrate < 150 kbps/channel); the frequency-domain noise shaping module performs noise shaping on the smoothed data based on the scale factors to obtain adjustment factors, which are used to quantize the spectral values of the audio signal. The low-bitrate setting is controlled by a low-bitrate discrimination module; when the low-bitrate setting is not met, neither spectral smoothing nor frequency-domain noise shaping is needed.
(7) Scale factor encoding module
Differential encoding or entropy encoding is applied to the scale factors of the multiple subbands according to their distribution.
(8) Bit allocation & MDCT spectrum quantization and entropy encoding module
Based on the scale factors from step (4) and the adjustment factors from step (6), a coarse-then-fine bit allocation strategy keeps the encoding in the constant bit rate (CBR) encoding mode, and the MDCT spectral values are quantized and entropy-encoded.
(9) Residual encoding module
If the bit consumption of step (8) has not yet reached the target bits, the unencoded subbands are further ranked by importance, and bits are preferentially allocated to encoding the MDCT spectral values of important subbands.
(10) Stream header information packing module
The header information includes the audio sampling rate (e.g., 44.1 kHz/48 kHz/88.2 kHz/96 kHz), channel information (e.g., mono or two-channel), encoding frame length (e.g., 5 ms or 10 ms), encoding mode (e.g., time-domain, frequency-domain, time-to-frequency switching, or frequency-to-time switching mode), etc.
(11) Bitstream sending module
The bitstream contains the header, side information, payload, etc. The header carries the header information as described in step (10). The side information includes the encoded bitstream of the scale factors, the information on the selected subband division, cutoff frequency information, the low-bitrate flag, the joint-coding discrimination information (i.e., the MS transform flag), the quantization step size, and other information. The payload includes the encoded bitstream of the MDCT spectrum and the residual encoded bitstream.
The decoding process at the decoding end includes the following steps:
(1) Stream header information parsing module
The header information is parsed from the received bitstream, including the sampling rate, channel information, encoding frame length, and encoding mode of the audio signal. The encoding bitrate, i.e., the bitrate level information, is calculated from the bitstream size, the sampling rate, and the encoding frame length.
(2) Scale factor decoding module
The side information is decoded from the bitstream, including the information on the selected subband division, cutoff frequency information, the low-bitrate flag, the joint-coding discrimination information, the quantization step size, and other information, as well as the scale factors of each subband.
(3) Scale-factor-based frequency-domain noise shaping module
At low bitrates (e.g., an encoding bitrate below 300 kbps, i.e., 150 kbps/channel), frequency-domain noise shaping based on the scale factors is also needed to obtain the adjustment factors, which are used to dequantize the code values of the spectral values. The low-bitrate setting is controlled by the low-bitrate discrimination module; when the low-bitrate setting is not met, no frequency-domain noise shaping is needed.
(4) MDCT spectrum decoding module and residual decoding module
The MDCT spectrum decoding module decodes the MDCT spectral data in the bitstream based on the subband division information, the quantization step size information, and the scale factors obtained in step (2). At low bitrate levels, hole filling is performed; if it is calculated that bits remain, residual decoding is performed to obtain the MDCT spectral data of the other subbands, and thus the final MDCT spectral data.
(5) LR channel transform module
Based on the side information obtained in step (2), if the joint-coding discrimination indicates the two-channel joint coding mode and the mode is not the low-power decoding mode (e.g., the encoding bitrate is greater than or equal to 300 kbps and the sampling rate is greater than 88.2 kHz), the LR channel transform is applied to the MDCT spectral data obtained in step (4).
(6) Inverse MDCT transform & low-delay synthesis windowing module and overlap-add module
On the basis of steps (4) and (5), the inverse MDCT transform module applies the inverse MDCT to the obtained MDCT spectral data to obtain a time-domain aliased signal; the low-delay synthesis windowing module then applies a low-delay synthesis window to the time-domain aliased signal; and the overlap-add module superimposes the current frame on the buffered time-domain aliased signal of the previous frame to obtain the PCM signal, i.e., the final PCM data is obtained by overlap-add.
(7) PCM output module
According to the configured bit depth and channel decoding mode, the PCM data of the corresponding channels is output.
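The overlap-add of step (6) can be sketched as follows. This is a minimal illustration with a toy frame length of 4 samples; the IMDCT and the low-delay synthesis window are omitted, and `overlap_add` and `FRAME` are hypothetical names, not identifiers from the codec.

```c
#define FRAME 4  /* toy frame length for illustration only */

/* cur holds 2*FRAME aliased output samples of the current frame. The
   first half is added to the saved tail of the previous frame to form
   FRAME PCM samples; the second half is saved as the new tail. */
static void overlap_add(const float *cur, float *prev_tail, float *pcm) {
    for (int i = 0; i < FRAME; i++) {
        pcm[i] = cur[i] + prev_tail[i];   /* reconstructed PCM sample */
        prev_tail[i] = cur[FRAME + i];    /* buffer tail for next frame */
    }
}
```

In a real decoder the windowed IMDCT output would be fed in per frame; the buffered tail is what makes consecutive frames join without time-domain aliasing.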
It should be noted that the audio codec framework shown in Figure 3 is merely an example for the embodiments of this application and is not intended to limit them; those skilled in the art can derive other codec frameworks on the basis of Figure 3.
Please refer to Figure 4, which is a schematic structural diagram of a codec device according to an embodiment of this application. Optionally, the codec device is any device shown in Figure 1, and the codec device includes one or more processors 401, a communication bus 402, a memory 403, and one or more communication interfaces 404.
The processor 401 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solutions of this application, for example an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. Optionally, the PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The communication bus 402 is used to transfer information between the above components. Optionally, the communication bus 402 is divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in the figure, but this does not mean there is only one bus or one type of bus.
Optionally, the memory 403 is a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disc (including a compact disc read-only memory (CD-ROM), a compact disc, a laser disc, a digital versatile disc, a Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited to these. The memory 403 exists independently and is connected to the processor 401 through the communication bus 402, or the memory 403 is integrated with the processor 401.
The communication interface 404 uses any transceiver-like apparatus for communicating with other devices or a communication network. The communication interface 404 includes a wired communication interface and, optionally, a wireless communication interface. The wired communication interface is, for example, an Ethernet interface. Optionally, the Ethernet interface is an optical interface, an electrical interface, or a combination thereof. The wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, a combination thereof, etc.
Optionally, in some embodiments, the codec device includes multiple processors, such as the processor 401 and the processor 405 shown in Figure 4. Each of these processors is a single-core processor or a multi-core processor. Optionally, a processor here refers to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
In a specific implementation, as an embodiment, the codec device further includes an output device 406 and an input device 407. The output device 406 communicates with the processor 401 and can display information in multiple ways. For example, the output device 406 is a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, etc. The input device 407 communicates with the processor 401 and can receive user input in multiple ways. For example, the input device 407 is a mouse, a keyboard, a touchscreen device, a sensing device, etc.
In some embodiments, the memory 403 is used to store the program code 410 for executing the solutions of this application, and the processor 401 can execute the program code 410 stored in the memory 403. The program code includes one or more software modules, and the codec device can implement the audio encoding method provided in the embodiment of Figure 5 below and/or the audio decoding method shown in Figure 9 through the processor 401 and the program code 410 in the memory 403.
Figure 5 is a flowchart of an audio encoding method provided by an embodiment of this application; the method is applied to the encoding end. Referring to Figure 5, the method includes the following steps.
Step 501: If the audio signal to be encoded meets a first condition, encode the left-channel data of the audio signal into the bitstream, where the first condition includes the audio signal being a two-channel signal, the encoding bitrate of the audio signal being not less than a bitrate threshold, and the sampling rate of the audio signal being not less than a sampling rate threshold.
In the embodiments of this application, to achieve flexible encoding at the encoding end, the encoding end can encode the left-channel data and the right-channel data sequentially according to the conditions met by the audio signal to be encoded, rather than having to use the two-channel interleaved or the two-channel deinterleaved encoding method.
Specifically, if the audio signal to be encoded meets the first condition, the encoding end encodes the left-channel data of the audio signal into the bitstream, where the first condition includes the audio signal being a two-channel signal, the encoding bitrate of the audio signal being not less than the bitrate threshold, and the sampling rate of the audio signal being not less than the sampling rate threshold.
Optionally, the bitrate threshold and the sampling rate threshold are both preset parameters; the bitrate threshold is 300 kbps or another value, and the sampling rate threshold is 88.2 kHz or another value. In the embodiments of this application, a bitrate threshold of 300 kbps and a sampling rate threshold of 88.2 kHz are used as an example.
For efficient encoding, in the embodiments of this application, the encoding end can encode the left-channel data according to the distribution characteristics of the quantized values of the spectrum of the audio signal to be encoded. This is introduced next.
One implementation in which the encoding end encodes the left-channel data of the audio signal into the bitstream is as follows: the encoding end obtains the quantization level measurement factor of each of multiple subbands, where the quantization level measurement factor represents the average number of bits required to encode each spectral value in the corresponding subband, and the multiple subbands are the subbands into which the encoding end divides the quantized spectral data included in the left-channel data. Based on the quantization level measurement factors of the multiple subbands, the encoding end divides them into multiple groups of subbands, subbands in the same group having the same quantization level measurement factor. Based on the quantization level measurement factor of each group, the encoding end determines from multiple encoding codebooks the target encoding codebook corresponding to each group and determines the bitstream of the spectral values in each group, the target encoding codebook being the codebook used to encode the spectral values in the corresponding group. The encoding end encodes the identifier of each group's target encoding codebook into the bitstream as a kind of side information of the left-channel data.
In short, the left-channel data includes quantized spectral data, which is divided into multiple subbands; based on the quantization level measurement factor of each subband, the encoding end selects the better encoding codebook (i.e., the target encoding codebook) from multiple encoding codebooks, thereby obtaining the left-channel bitstream encoded with that codebook. In addition, because different encoding codebooks correspond to different decoding codebooks, the encoding end also encodes the identifier of the target encoding codebook into the bitstream so that the decoding end can know which decoding codebook corresponds to the target encoding codebook used by the encoding end. The left-channel bitstream includes the bitstreams of the spectral values of the above groups of subbands.
It should be noted that there are many ways for the encoding end to obtain the quantization level measurement factors of the multiple subbands, which are not limited in the embodiments of this application.
Optionally, the implementation in which the encoding end determines each group's target encoding codebook and the bitstream of its spectral values based on each group's quantization level measurement factor includes: for any group of subbands among the multiple groups, if the group's quantization level measurement factor is a first value, the encoding end encodes the spectral values in the group with multiple first encoding codebooks among the multiple encoding codebooks, to obtain multiple first candidate spectral bitstreams in one-to-one correspondence with the multiple first encoding codebooks. The encoding end determines the first candidate spectral bitstream with the fewest total bits as the bitstream of the spectral values in the group, and the first encoding codebook corresponding to that bitstream as the group's target encoding codebook. In short, for the subbands whose quantization level measurement factor is the first value, the encoding end selects the better codebook from the multiple first encoding codebooks as their target encoding codebook; the spectral bitstream obtained with the target encoding codebook has the smallest data amount, achieving efficient encoding at the encoding end and facilitating fast decoding of the spectral bitstream at the decoding end.
Specifically, the first value is 1; the encoding end combines every four spectral values in the group into one binary number and encodes the decimal number represented by that binary number with each of the multiple first encoding codebooks. It should be understood that a first value of 1 means the average number of bits required to encode each spectral value in the corresponding subband is 1; the encoding end therefore combines every four spectral values in the group into one 4-bit binary number and encodes the decimal number represented by those 4 bits with each of the multiple first encoding codebooks, to obtain multiple bitstreams for those four spectral values. Proceeding in this way over all spectral values in the group yields the first candidate spectral bitstream corresponding to each first encoding codebook.
For example, four adjacent spectral values in a subband of the group are 0, 0, 1, 1 (binary). The encoding end combines these four spectral values into the binary number 0011, which represents the decimal number 3, and encodes decimal 3 with each of the multiple first encoding codebooks.
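The nibble-packing and codebook lookup just described can be sketched in C. This is an illustrative sketch, not the codec's implementation: `HuffRow`, `pack4`, and `huff_encode` are hypothetical names; the triple layout {value to be encoded, code value, code length} follows the convention the document states later for its Huffman codebooks; and `kFirstCodebook0` reproduces the first of the first encoding codebooks listed below (an identity mapping to 4-bit codes).

```c
/* One row of a Huffman encoding codebook:
   {value to be encoded, code value, code length in bits}. */
typedef struct { int sym; int code; int bits; } HuffRow;

/* First of the first encoding codebooks listed in the document:
   the 16 possible nibbles map identically to 4-bit codes. */
static const HuffRow kFirstCodebook0[16] = {
    {0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
    {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};

/* Pack four 1-bit quantized spectral values (first value in the MSB)
   into one 4-bit symbol, e.g. 0,0,1,1 -> binary 0011 -> decimal 3. */
static int pack4(const int q[4]) {
    return (q[0] << 3) | (q[1] << 2) | (q[2] << 1) | q[3];
}

/* Look up the code for a symbol; returns the code length in bits and
   stores the code value, or returns 0 if the symbol is not in the table. */
static int huff_encode(const HuffRow *cb, int n, int sym, int *code) {
    for (int i = 0; i < n; i++)
        if (cb[i].sym == sym) { *code = cb[i].code; return cb[i].bits; }
    return 0;
}
```

Encoding would repeat this for every group of four values and for each candidate first codebook, keeping the codebook whose accumulated bit count is smallest.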
The multiple first encoding codebooks are Huffman codebooks; Huffman coding is a form of entropy coding. The multiple first encoding codebooks are determined from the statistical characteristics of a large amount of quantized spectral data or obtained in other ways. The multiple first encoding codebooks are as follows:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
{8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}
};
{{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
{8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}
};
{{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
{8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}
}。
Optionally, for any group of subbands among the multiple groups, if the group's quantization level measurement factor is a second value, the encoding end encodes the spectral values in the group with multiple second encoding codebooks among the multiple encoding codebooks, to obtain multiple second candidate spectral bitstreams in one-to-one correspondence with the multiple second encoding codebooks. The encoding end determines the second candidate spectral bitstream with the fewest total bits as the bitstream of the spectral values in the group, and the second encoding codebook corresponding to that bitstream as the group's target encoding codebook. In short, for the subbands whose quantization level measurement factor is the second value, the encoding end selects the better codebook from the multiple second encoding codebooks as their target encoding codebook.
Specifically, the second value is 2; the encoding end combines every two spectral values in the group into one binary number and encodes the decimal number represented by that binary number with each of the multiple second encoding codebooks. It should be understood that a second value of 2 means the average number of bits required to encode each spectral value in the corresponding subband is 2; the encoding end therefore combines every two spectral values in the group into one 4-bit binary number and encodes the decimal number represented by those 4 bits with each of the multiple second encoding codebooks, to obtain multiple bitstreams for those two spectral values. Proceeding in this way over all spectral values in the group yields the second candidate spectral bitstream corresponding to each second encoding codebook.
For example, two adjacent spectral values in a subband of the group are 01 and 11 (binary). The encoding end combines these two spectral values into the binary number 0111, which represents the decimal number 7, and encodes decimal 7 with each of the multiple second encoding codebooks.
The multiple second encoding codebooks are Huffman codebooks, determined from the statistical characteristics of a large amount of quantized spectral data or obtained in other ways. The multiple second encoding codebooks are as follows:
{{0,1,3},{1,13,4},{2,14,4},{3,8,4},{4,12,4},{5,15,4},{6,10,4},{7,4,4},{8,9,4},{9,5,4},{10,6,4},{11,0,5},{12,11,4},{13,7,4},{14,1,4},{15,1,5}
};
{{0,1,3},{1,7,3},{2,0,4},{3,7,4},{4,6,3},{5,1,4},{6,10,4},{7,10,5},
{8,8,4},{9,9,4},{10,8,5},{11,22,5},{12,6,4},{13,9,5},{14,11,5},{15,23,5}
};
{{0,1,2},{1,1,4},{2,2,4},{3,11,4},{4,0,4},{5,3,4},{6,14,4},{7,18,5},
{8,12,4},{9,13,4},{10,16,5},{11,30,5},{12,10,4},{13,17,5},{14,19,5},{15,31,5}
};
{{0,1,1},{1,5,4},{2,6,4},{3,3,5},{4,4,4},{5,7,4},{6,6,5},{7,2,6},
{8,4,5},{9,5,5},{10,0,6},{11,14,6},{12,2,5},{13,1,6},{14,3,6},{15,15,6}
}。
Optionally, for any group of subbands among the multiple groups, if the group's quantization level measurement factor is a third value, the encoding end encodes the spectral values in the group with multiple third encoding codebooks among the multiple encoding codebooks, to obtain multiple third candidate spectral bitstreams in one-to-one correspondence with the multiple third encoding codebooks. The encoding end determines the third candidate spectral bitstream with the fewest total bits as the bitstream of the spectral values in the group, and the third encoding codebook corresponding to that bitstream as the group's target encoding codebook. In short, for the subbands whose quantization level measurement factor is the third value, the encoding end selects the better codebook from the multiple third encoding codebooks as their target encoding codebook.
Specifically, the third value is 3; the encoding end encodes each spectral value in the group with each of the multiple third encoding codebooks. It should be understood that a third value of 3 means the average number of bits required to encode each spectral value in the corresponding subband is 3; the encoding end therefore directly encodes each spectral value in the group with each of the multiple third encoding codebooks, to obtain multiple bitstreams for each spectral value. Proceeding in this way over all spectral values in the group yields the third candidate spectral bitstream corresponding to each third encoding codebook.
The multiple third encoding codebooks are Huffman codebooks, determined from the statistical characteristics of a large amount of quantized spectral data or obtained in other ways. The multiple third encoding codebooks are as follows:
{{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,3,5},{6,4,6},{7,5,6}
};
{{0,3,2},{1,2,2},{2,0,3},{3,1,3},{4,3,3},{5,4,4},{6,10,5},{7,11,5}
};
{{0,0,1},{1,5,3},{2,6,3},{3,7,3},{4,9,4},{5,16,5},{6,34,6},{7,35,6}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
Optionally, for any group of subbands among the multiple groups, if the group's quantization level measurement factor is a fourth value, the encoding end encodes a first part of bit positions of each spectral value in the group with the multiple third encoding codebooks, to obtain multiple first-part candidate bitstreams in one-to-one correspondence with the multiple third encoding codebooks. The encoding end determines the first-part candidate bitstream with the fewest total bits as the bitstream of the first part of bit positions, and the third encoding codebook corresponding to that bitstream as the group's target encoding codebook. It performs uniform quantization encoding on a second part of bit positions, other than the first part, of each spectral value in the group, to obtain the bitstream of the second part of bit positions.
In short, for the subbands whose quantization level measurement factor is the fourth value, the encoding end splits each spectral value in these subbands into a first part of bit positions and a second part of bit positions, encodes the first part with each of the multiple third encoding codebooks, and uniformly quantization-encodes the second part. The encoding end selects the better codebook from the multiple third encoding codebooks as the target encoding codebook for these subbands.
Optionally, the first part of bit positions refers to the N high-order bits of the spectral value, and the second part of bit positions refers to the M low-order bits, where M equals the group's quantization level measurement factor minus the third value.
Taking a third value of 3 and a fourth value of 5 as an example, M = 5 − 3 = 2 and N = 3. For a spectral value of 10100 (binary), the encoding end takes the 3 high-order bits (binary 101) as the first part and the 2 low-order bits (binary 00) as the second part. It encodes the decimal number 5 represented by binary 101 with each of the multiple third encoding codebooks to obtain the bitstream of the first part of that spectral value, and uniformly quantization-encodes binary 00 to obtain the bitstream of the second part.
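The split in the example above can be sketched as follows; `split_value` is a hypothetical helper name, and the constant 3 is the third value from the text.

```c
/* Split a quantized spectral value into the high N = 3 bits (to be
   Huffman-coded with a third codebook) and the low M = level - 3 bits
   (to be uniformly coded), where level is the group's quantization
   level measurement factor. */
static void split_value(int value, int level, int *high, int *low, int *m) {
    *m = level - 3;                    /* e.g. level 5 -> M = 2       */
    *high = value >> *m;               /* 10100b -> 101b (decimal 5)  */
    *low = value & ((1 << *m) - 1);    /* 10100b -> 00b  (decimal 0)  */
}
```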
Optionally, if, based on the encoding bitrate, the encoding end determines that usable encoding bits remain after the left-channel data has been encoded into the bitstream, it can additionally perform residual encoding on the spectral data included in the left-channel data, to obtain the residual bitstream within the left-channel bitstream.
Step 502: Encode the right-channel data of the audio signal into the bitstream.
That is, if the audio signal meets the first condition, the encoding end also encodes the right-channel data of the audio signal into the bitstream.
Optionally, the implementation of encoding the right-channel data is similar to that of encoding the left-channel data. For example, the encoding end obtains the quantization level measurement factor of each of multiple subbands, where the quantization level measurement factor represents the average number of bits required to encode each spectral value in the corresponding subband, and the multiple subbands are the subbands into which the encoding end divides the quantized spectral data included in the right-channel data. Based on the quantization level measurement factors, the encoding end divides the subbands into groups, subbands in the same group having the same factor. Based on each group's factor, the encoding end determines each group's target encoding codebook from the multiple encoding codebooks and determines the bitstream of each group's spectral values, the target encoding codebook being the codebook used to encode the spectral values in the corresponding group. The encoding end encodes the identifier of each group's target encoding codebook into the bitstream as a kind of side information of the right-channel data. For the specific implementation, refer to the introduction above on encoding the left-channel data, which is not repeated here.
Of course, in other embodiments, the encoding end may also encode the right-channel data in a manner different from that used for the left-channel data.
As an example, if the audio signal to be encoded is a two-channel signal, the sampling rate is 88.2 kHz or 96 kHz, and the encoding bitrate is not less than 300 kbps, the encoding end initializes into mono encoding logic and performs the following operations in sequence: encoding the left-channel data alone, packing the encoded left-channel data into the bitstream, encoding the right-channel data alone, and packing the encoded right-channel data into the bitstream.
It should be noted that, besides encoding the left-channel data and the right-channel data, the encoding end also encodes the left channel's side information and the right channel's side information into the bitstream. In addition, the encoding end encodes some parameters of the audio signal into the header of the bitstream. The header and the side information are used for decoding at the decoding end.
Figure 6 is a schematic diagram of a bitstream structure provided by an embodiment of this application. The structure shown in Figure 6 is the bitstream structure of two-channel deinterleaved encoding, i.e., the encoding method in which the encoding end encodes the left-channel data and the right-channel data into the bitstream separately. Referring to Figure 6, the bitstream structure includes, in order, fields such as the header, left channel side information, left channel payload, right channel side information, and right channel payload.
The header field includes subfields such as the codec type (CT), sample rate (SR), channel number (CN), and frame length (FL). These four subfields occupy 2, 2, 1, and 2 bits respectively.
The left channel side information field includes subfields such as the low bitrate flag (LBF), global encoding control factor (DR), local encoding control factor (DRQuater, DRQ), scale factor identity (SFID), band number (BN), differential encoding flag (DEF), scale factor (SF), and Huffman encoding tuple ID (HufTupID). These eight subfields occupy 1, 5, 3, 3, 4, 1, (5*BN) or the Huffman encoding length (HEncL), and 6 bits respectively.
A low bitrate flag of 1 indicates that the audio signal meets a second condition; a low bitrate flag of 0 indicates that it does not. The second condition is that the encoding bitrate of the audio signal is less than the bitrate threshold and the energy concentration degree of the audio signal is less than a concentration threshold.
The scale factor subfield indicates the scale factor of each subband. Scale factors are used by the encoding end to shape the spectral envelope and instruct the decoding end to perform the inverse processing of the spectral envelope. A scale factor is derived from the maximum spectral value in the corresponding subband.
The Huffman encoding tuple ID is the identifier of the target encoding codebook.
The left channel payload field includes the left channel's spectral bitstream and residual bitstream. Taking the spectral data as modified discrete cosine transform (MDCT) spectral data as an example, the spectral bitstream is the MDCT quantization encoding (MDCTQ) bitstream and the residual bitstream is the MDCT residual (MDCTRES) encoding bitstream; the residual bitstream is optional. The left channel payload field occupies N bits, where N is greater than 0.
Similarly, the right channel side information field includes subfields such as the low bitrate flag (LBF), global encoding control factor (DR), local encoding control factor (DRQuater), scale factor identity (SFID), band number (BN), differential encoding flag (DEF), scale factor (SF), and Huffman encoding codebook identifier (HufTupID). These eight subfields occupy 1, 5, 3, 3, 4, 1, (5*band number) or the Huffman encoding length (HEncL), and 6 bits respectively.
The right channel payload field includes the right channel's spectral bitstream and residual bitstream. Taking the spectral data as MDCT spectral data as an example, the spectral bitstream is the MDCT quantization encoding bitstream and the residual bitstream is the MDCT residual encoding bitstream. The right channel payload field occupies N bits, where N is greater than 0.
The above describes the implementation in which, when the audio signal meets the first condition, the encoding end encodes the left-channel data and the right-channel data separately in the two-channel deinterleaved manner. Next, the encoding process at the encoding end in other cases is introduced.
In the embodiments of this application, if the audio signal does not meet the first condition and the audio signal is a two-channel signal, the encoding end encodes the left-channel data and the right-channel data into the bitstream by two-channel interleaved encoding; that is, the encoding end uses joint stereo encoding to encode the audio signal.
It should be noted that, besides performing two-channel interleaved encoding of the left and right channel data, the encoding end also encodes the side information of the left and right channels into the bitstream. In addition, the encoding end encodes some parameters of the audio signal into the header of the bitstream as parameters shared by the two channels.
Optionally, by two-channel interleaved encoding, the encoding end encodes the identifiers of the two-channel shared parameters, encodes the side information of the left and right channels, and encodes the spectral data of the left and right channels. The two-channel shared parameters include the codec type, sampling rate, channel number, frame length, etc. The side information includes the low bitrate flag, the mid/side-stereo transform coding flag (MSF), the global encoding control factor, the local encoding control factor, the scale factor identity, the band number, the differential encoding flag, the scale factors, the Huffman encoding codebook, etc. The spectral data of the left and right channels includes the left-channel data and the right-channel data. The MSFlag indicates whether the encoding end applied the MS transform to the left and right channel spectral data before encoding it into the bitstream.
As an example, if the audio signal to be encoded is a two-channel signal and the sampling rate is 44.1 kHz and/or the encoding bitrate is less than 300 kbps, the encoding end initializes into the two-channel interleaved encoding mode and performs the following operations in sequence: encoding and packing the identifiers of the two-channel shared parameters, encoding and packing the scale factors of the left and right channels, encoding and packing the Huffman codebook identifiers corresponding to the MDCT spectral values, and encoding and packing the MDCT spectral values with those Huffman codebooks.
Figure 7 is a schematic diagram of another bitstream structure provided by an embodiment of this application. The structure shown in Figure 7 is the bitstream structure of two-channel interleaved encoding. Referring to Figure 7, the bitstream structure includes, in order, fields such as the header, side information, and payload.
The header field includes subfields such as the codec type, sampling rate, channel number, and frame length. These four subfields occupy 2, 2, 1, and 2 bits respectively.
The side information field includes subfields such as the low bitrate flag (LBF), mid/side-stereo transform flag (MSF), global encoding control factor (DR), local encoding control factor (DRQ), scale factor identity (SFID), band number (BN), differential encoding flag (DEF), scale factor (SF), and Huffman encoding codebook identifier (HufTupID). These nine subfields occupy 1, 1, 5, 3, 3, 4, 1, (5*BN*CN) or the Huffman encoding length (HEncL), and 6*CN bits respectively.
The payload field includes the interleaved spectral bitstream and residual bitstream of the left and right channels. Taking the spectral data as MDCT spectral data as an example, the spectral bitstream is the MDCT quantization encoding bitstream and the residual bitstream is the MDCT residual encoding bitstream; the residual bitstream is optional. The payload field occupies N bits, where N is greater than 0.
Optionally, when the audio signal is a two-channel signal, the bit allocation strategy adopted by the encoding end for the left and right channels is an equal-split strategy. In the embodiments of this application, if the allocatable bits cannot be split evenly — i.e., at bitrate levels that are not exactly divisible — the extra bit is preferentially allocated to the right channel. Of course, in other embodiments, the extra bit may be preferentially allocated to the left channel instead.
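The allocation rule just described can be sketched in a few lines; `allocate_bits` is a hypothetical helper name.

```c
/* Equal split of the allocatable bits between the two channels; when the
   total is odd, the extra bit goes to the right channel, as described. */
static void allocate_bits(int total, int *left, int *right) {
    *left = total / 2;        /* integer division rounds down  */
    *right = total - *left;   /* right picks up any remainder  */
}
```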
In the embodiments of this application, if the audio signal does not meet the first condition and the audio signal is a mono signal, the encoding end encodes the mono data of the audio signal into the bitstream.
It should be noted that the implementation of encoding mono data is similar to that of encoding left-channel or right-channel data and is not elaborated here.
Optionally, the structure of the bitstream obtained by encoding mono data is the same as the two-channel interleaved bitstream structure, except that the bitstream corresponding to mono data contains only the data related to a single channel.
Next, referring to Figure 8, the encoding flow in the embodiments of this application is explained again. The flow shown in Figure 8 is for encoding a two-channel signal; the flow for a mono signal is similar to the left branch in Figure 8. Referring to Figure 8, the encoding end obtains input parameters, including the encoding frame length, the encoding bitrate, the audio signal to be encoded (e.g., PCM data), etc. Based on the input parameters, the encoding end packs the header of the bitstream. Then the encoding end selects the encoding mode (i.e., selects the bitstream packing method). If the two-channel interleaved mode is selected, the encoding end enters the left branch and, in sequence, encodes the identifiers of the two-channel shared parameters, the scale factors of the left and right channels, the identifiers of the encoding codebooks used by the left and right channels, and the spectral data of the left and right channels. If the two-channel deinterleaved mode is selected, the encoding end enters the right branch and performs left-channel encoding and then right-channel encoding, where the encoding of each channel includes, in sequence, encoding the channel parameter identifiers, the channel scale factors, the channel encoding codebook identifiers, and the channel spectral data. After all data has been packed, the encoding end ends the encoding flow.
As can be seen above, in encoding a channel's spectral data, the encoding end improves encoding efficiency by selecting a suitable encoding codebook, which also facilitates efficient decoding at the decoding end. It can also be seen that scale factors are encoded into the bitstream; in the embodiments of this application, by encoding the scale factors appropriately, the encoding end can also improve the encoding effect and compression efficiency to a certain extent. The implementation of scale factor encoding at the encoding end is introduced next.
In the embodiments of this application, the encoding end obtains the scale factor of each of multiple subbands. If there exist two adjacent subbands among the multiple subbands whose scale factor difference has an absolute value greater than a difference threshold, the encoding end performs uniform quantization encoding on the scale factors of the multiple subbands. If no such pair of adjacent subbands exists, the encoding end performs differential encoding on the scale factors of the multiple subbands.
Optionally, in the embodiments of this application, the difference threshold is 6; in other embodiments, it may be another value. It should be noted that the difference threshold is determined from the statistical characteristics of the scale factors of many subbands or in other ways.
Here, one implementation of differential encoding of the scale factors of the multiple subbands is introduced.
The encoding end determines the differential scale value of each of the multiple subbands. The differential scale value of the first subband is the scale factor of that subband plus a reference value. The differential scale value of any subband other than the first is the difference between its scale factor and that of the preceding subband. The encoding end performs uniform quantization encoding on the differential scale value of the first subband, and entropy encoding (such as Huffman encoding) on that of every other subband.
Optionally, the encoding end uses a Huffman codebook to encode each subband other than the first. The Huffman codebook is as follows:
HuffmanCB HUF_ENC_DIFF_SF[HUF_ENC_DIFF_SF_MAXIDX]={
{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,2,5},{6,3,5}
}。
The above Huffman codebook includes multiple triples. In each triple, the first element represents the decimal number to be encoded, the second element represents the decimal number corresponding to the encoded binary number, i.e., the code value, and the third element represents the number of bits occupied by the encoded binary number, i.e., the code length.
Optionally, in the embodiments of this application, the reference value is 8; in other embodiments, it may be another value.
For example, the encoding end determines the differential scale value of each of the multiple subbands according to formula (1):
    sfDiff(b) = sf(0) + 8,          when b = 0
    sfDiff(b) = sf(b) − sf(b−1),    when 0 < b ≤ bandsNum        (1)
In formula (1), b denotes the subband index, bandsNum+1 denotes the total number of subbands, sf(b) denotes the scale factor of subband b, and sfDiff(b) denotes the differential scale value of subband b.
When b = 0, the encoding end performs uniform quantization encoding on sfDiff(0), i.e., uniformly encodes sfDiff(0) into one binary number occupying 5 bits, and packs these 5 bits into the bitstream.
When b > 0, the encoding end takes the absolute value of sfDiff(b), encodes it with the above Huffman codebook, and writes the encoded binary number into the bitstream. For a positive sfDiff(b), the encoding end sets its sign bit to 1 and encodes the sign bit, which occupies 1 bit, into the bitstream; a negative sfDiff(b) occupies no sign bit.
It should be noted that the above Huffman codebook is obtained from experience or big-data training and generally follows the rule that frequently occurring values are assigned shorter code lengths and rarely occurring values longer ones.
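Putting the rules of this section together, the following sketch counts the bits consumed by differential scale-factor encoding: 5 uniform bits for sfDiff(0), then for each later subband the Huffman code length of |sfDiff(b)| from the HUF_ENC_DIFF_SF codebook above, plus one sign bit when the difference is positive. `diff_sf_bits` is a hypothetical helper; the sketch assumes every adjacent difference has magnitude at most 6, which is exactly the condition under which differential encoding is selected.

```c
#include <stdlib.h>

typedef struct { int sym; int code; int bits; } HuffRow;

/* HUF_ENC_DIFF_SF from the document: {value, code value, code length}.
   Each symbol equals its index, so the table can be indexed directly. */
static const HuffRow kDiffSfCb[7] = {
    {0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,2,5},{6,3,5}
};

/* Total bits needed to encode the scale factors sf[0..n-1] differentially. */
static int diff_sf_bits(const int *sf, int n) {
    int bits = 5;                       /* sfDiff(0) = sf[0] + 8, 5 uniform bits */
    for (int b = 1; b < n; b++) {
        int d = sf[b] - sf[b - 1];      /* |d| <= 6 by the selection rule */
        bits += kDiffSfCb[abs(d)].bits; /* Huffman code length of |d|     */
        if (d > 0) bits += 1;           /* positive difference adds a sign bit */
    }
    return bits;
}
```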
In summary, in the embodiments of this application, the encoding end can encode the left-channel data and the right-channel data sequentially according to the conditions met by the audio signal, rather than having to use the two-channel interleaved or the two-channel deinterleaved encoding method. The encoding method of this scheme is therefore more flexible.
Figure 9 is a flowchart of an audio decoding method provided by an embodiment of this application; the decoding method is applied to the decoding end and matches the encoding method shown in Figure 5. Referring to Figure 9, the decoding method includes the following steps.
Step 901: If the audio signal to be decoded meets a first condition, obtain the channel decoding mode, where the first condition includes the audio signal being a two-channel signal, the encoding bitrate of the audio signal being not less than a bitrate threshold, and the sampling rate of the audio signal being not less than a sampling rate threshold.
In the embodiments of this application, to achieve low power consumption at the decoding end so that even resource-limited electronic devices can successfully decode the bitstream, the decoding end can decode on demand according to the conditions met by the audio signal to be decoded and the channel decoding mode.
The channel decoding mode is a parameter configured at the decoding end. Optionally, the channel decoding mode is represented by values such as 0, 1, and 2, with different values representing different channel decoding modes. For example, 0 represents the two-channel decoding mode, 1 represents the left-channel decoding mode, and 2 represents the right-channel decoding mode.
In the embodiments of this application, when the audio signal is a two-channel signal, the encoding bitrate is not less than the bitrate threshold, and the sampling rate is not less than the sampling rate threshold, the bitstream includes a left-channel bitstream and a right-channel bitstream; the decoding end obtains the channel decoding mode and, based on it, determines which channels' bitstreams to decode.
Optionally, the bitrate threshold is 300 kbps or another value, and the sampling rate threshold is 88.2 kHz or another value. In the embodiments of this application, a bitrate threshold of 300 kbps and a sampling rate threshold of 88.2 kHz are used as an example. It should be noted that the bitrate threshold of the encoding end is the same as that of the decoding end, and the sampling rate threshold of the encoding end is the same as that of the decoding end.
In the embodiments of this application, the implementation in which the decoding end determines whether the audio signal meets the first condition includes: the decoding end obtains the total data amount of the bitstream and decodes the header of the bitstream to obtain the channel number, sampling rate, and frame length of the audio signal; the decoding end then determines, based on the total data amount, the channel number, the sampling rate, and the frame length, whether the audio signal meets the first condition.
Specifically, the decoding end determines the encoding bitrate of the audio signal based on the total data amount, the sampling rate, and the frame length, and then determines whether the audio signal meets the first condition based on the encoding bitrate, the channel number, and the sampling rate.
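The check can be sketched as follows, assuming the frame length is given in milliseconds so that the bit count of one frame divided by the frame length yields the bitrate in kbps. The 300 kbps and 88.2 kHz thresholds are the example values from the text, and `meets_first_condition` is a hypothetical name.

```c
/* Derive the encoding bitrate from one frame's total byte count and the
   frame length in ms, then evaluate the first condition: two channels,
   bitrate >= 300 kbps, sampling rate >= 88.2 kHz. */
static int meets_first_condition(int frame_bytes, int frame_ms,
                                 int channels, double sample_rate_khz) {
    int bitrate_kbps = frame_bytes * 8 / frame_ms; /* bits per ms == kbps */
    return channels == 2 && bitrate_kbps >= 300 && sample_rate_khz >= 88.2;
}
```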
Step 902: If the channel decoding mode is the left-channel decoding mode, decode the left-channel bitstream in the bitstream to obtain the left-channel data of the audio signal, and copy the left-channel data to the right channel.
The left-channel decoding mode means that there is no need to decode the right-channel bitstream in the bitstream. After decoding the left-channel data, the decoding end simply copies it to the right channel. It should be noted that if the left-channel data includes quantized left-channel spectral data, the decoding end dequantizes the quantized left-channel spectral data and copies the dequantized left-channel spectral data to the right channel. Alternatively, the decoding end dequantizes the quantized left-channel spectral data, applies the inverse time-frequency transform (such as the inverse modified discrete cosine transform (IMDCT)) to the dequantized left-channel data to obtain the left-channel time-domain aliased signal, performs overlap-add on the left-channel time-domain aliased signal to reconstruct the left-channel time-domain signal, and copies the reconstructed left-channel time-domain signal to the right channel.
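The mode-dependent branch of steps 901–902 can be sketched as follows. `decode_left` and `decode_right` are hypothetical stand-ins for the real bitstream decoders (here they just fill the buffer with a marker value), and the 0/1/2 mode numbering follows the example given in step 901.

```c
#include <string.h>

enum { MODE_STEREO = 0, MODE_LEFT = 1, MODE_RIGHT = 2 };

/* Stand-ins for decoding one channel's bitstream into PCM samples. */
static void decode_left(float *pcm, int n)  { for (int i = 0; i < n; i++) pcm[i] = 0.25f; }
static void decode_right(float *pcm, int n) { for (int i = 0; i < n; i++) pcm[i] = 0.75f; }

/* Decode only the channel the mode asks for and copy it to the other
   channel; in stereo mode, decode both channels. */
static void decode_frame(int mode, float *left, float *right, int n) {
    if (mode == MODE_LEFT) {
        decode_left(left, n);
        memcpy(right, left, (size_t)n * sizeof *left);
    } else if (mode == MODE_RIGHT) {
        decode_right(right, n);
        memcpy(left, right, (size_t)n * sizeof *right);
    } else {
        decode_left(left, n);
        decode_right(right, n);
    }
}
```

Skipping one channel's bitstream entirely, rather than decoding and discarding it, is what saves power on resource-limited decoders.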
As can be seen from the earlier introduction to the encoding flow, the bitstream contains the parameters the decoding end needs for decoding; for example, the bitstream contains the left channel's side information, which includes the identifier of the encoding codebook used to encode the left-channel data. On this basis, the decoding end decodes the bitstream of the side information in the bitstream to obtain the side information, which includes the encoding codebook identifier. Based on the encoding codebook identifier, the decoding end determines the target decoding codebook required for decoding from multiple decoding codebooks. The side information here refers to the left channel's side information.
Taking the encoding codebook identifier as the identifier of any one of multiple encoding codebooks as an example: the multiple encoding codebooks correspond one-to-one with multiple decoding codebooks; the decoding end determines, from the correspondence between encoding codebook identifiers and decoding codebook identifiers, the decoding codebook identifier corresponding to the encoding codebook identifier included in the side information, and determines the codebook so identified as the target decoding codebook.
For example, the multiple encoding codebooks include the multiple first encoding codebooks, multiple second encoding codebooks, and multiple third encoding codebooks introduced above, and the corresponding multiple decoding codebooks include multiple first decoding codebooks, multiple second decoding codebooks, and multiple third decoding codebooks. The multiple first decoding codebooks correspond one-to-one with the multiple first encoding codebooks, the multiple second decoding codebooks with the multiple second encoding codebooks, and the multiple third decoding codebooks with the multiple third encoding codebooks.
The multiple first decoding codebooks are as follows:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{11,0,5},{15,1,5},{13,2,4},{13,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},
{9,8,4},{9,9,4},{5,10,4},{5,11,4},{14,12,4},{14,13,4},{7,14,4},{7,15,4},{6,16,4},{6,17,4},{3,18,4},{3,19,4},{10,20,4},{10,21,4},{12,22,4},{12,23,4},{2,24,4},{2,25,4},{1,26,4},{1,27,4},{8,28,4},{8,29,4},{4,30,4},{4,31,4}
};
{{2,0,4},{2,1,4},{2,2,4},{2,3,4},{1,4,4},{1,5,4},{1,6,4},{1,7,4},
{8,8,4},{8,9,4},{8,10,4},{8,11,4},{4,12,4},{4,13,4},{4,14,4},{4,15,4},{0,16,2},{0,17,2},{0,18,2},{0,19,2},{0,20,2},{0,21,2},{0,22,2},{0,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2},{11,32,6},{15,33,6},{14,34,5},{14,35,5},{7,36,5},{7,37,5},{13,38,5},{13,39,5},{12,40,4},{12,41,4},{12,42,4},{12,43,4},{6,44,4},{6,45,4},{6,46,4},{6,47,4},{3,48,4},{3,49,4},{3,50,4},{3,51,4},{5,52,4},{5,53,4},{5,54,4},{5,55,4},{10,56,4},{10,57,4},{10,58,4},{10,59,4},{9,60,4},{9,61,4},{9,62,4},{9,63,4}
};
{{11,0,7},{15,1,7},{14,2,6},{14,3,6},{7,4,6},{7,5,6},{13,6,6},{13,7,6},{12,8,5},{12,9,5},{12,10,5},{12,11,5},{6,12,5},{6,13,5},{6,14,5},{6,15,5},{3,16,5},{3,17,5},{3,18,5},{3,19,5},{5,20,5},{5,21,5},{5,22,5},{5,23,5},{10,24,5},{10,25,5},{10,26,5},{10,27,5},{9,28,5},{9,29,5},{9,30,5},{9,31,5},{2,32,4},{2,33,4},{2,34,4},{2,35,4},{2,36,4},{2,37,4},{2,38,4},{2,39,4},{1,40,4},{1,41,4},{1,42,4},{1,43,4},{1,44,4},{1,45,4},{1,46,4},{1,47,4},{8,48,4},{8,49,4},{8,50,4},{8,51,4},{8,52,4},{8,53,4},{8,54,4},{8,55,4},{4,56,4},{4,57,4},{4,58,4},{4,59,4},{4,60,4},{4,61,4},{4,62,4},{4,63,4},{0,64,1},{0,65,1},{0,66,1},{0,67,1},{0,68,1},{0,69,1},{0,70,1},{0,71,1},{0,72,1},{0,73,1},{0,74,1},{0,75,1},{0,76,1},{0,77,1},{0,78,1},{0,79,1},{0,80,1},{0,81,1},{0,82,1},{0,83,1},{0,84,1},{0,85,1},{0,86,1},{0,87,1},{0,88,1},{0,89,1},{0,90,1},{0,91,1},{0,92,1},{0,93,1},{0,94,1},{0,95,1},{0,96,1},{0,97,1},{0,98,1},{0,99,1},{0,100,1},{0,101,1},{0,102,1},{0,103,1},{0,104,1},{0,105,1},{0,106,1},{0,107,1},{0,108,1},{0,109,1},{0,110,1},{0,111,1},{0,112,1},{0,113,1},{0,114,1},{0,115,1},{0,116,1},{0,117,1},{0,118,1},{0,119,1},{0,120,1},{0,121,1},{0,122,1},{0,123,1},{0,124,1},{0,125,1},{0,126,1},{0,127,1}
}。
该多个第二解码码本如下:
{{11,0,5},{15,1,5},{14,2,4},{14,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{7,8,4},{7,9,4},{9,10,4},{9,11,4},{10,12,4},{10,13,4},{13,14,4},{13,15,4},{3,16,4},{3,17,4},{8,18,4},{8,19,4},{6,20,4},{6,21,4},{12,22,4},{12,23,4},{4,24,4},{4,25,4},{1,26,4},{1,27,4},{2,28,4},{2,29,4},{5,30,4},{5,31,4}
};
{{2,0,4},{2,1,4},{5,2,4},{5,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{10,8,5},{13,9,5},{7,10,5},{14,11,5},{12,12,4},{12,13,4},{3,14,4},{3,15,4},{8,16,4},{8,17,4},{9,18,4},{9,19,4},{6,20,4},{6,21,4},{11,22,5},{15,23,5},{4,24,3},{4,25,3},{4,26,3},{4,27,3},{1,28,3},{1,29,3},{1,30,3},{1,31,3}
};
{{4,0,4},{4,1,4},{1,2,4},{1,3,4},{2,4,4},{2,5,4},{5,6,4},{5,7,4},{0,8,2},{0,9,2},{0,10,2},{0,11,2},{0,12,2},{0,13,2},{0,14,2},{0,15,2},{10,16,5},{13,17,5},{7,18,5},{14,19,5},{12,20,4},{12,21,4},{3,22,4},{3,23,4},{8,24,4},{8,25,4},{9,26,4},{9,27,4},{6,28,4},{6,29,4},{11,30,5},{15,31,5}
};
{{10,0,6},{13,1,6},{7,2,6},{14,3,6},{12,4,5},{12,5,5},{3,6,5},{3,7,5},{8,8,5},{8,9,5},{9,10,5},{9,11,5},{6,12,5},{6,13,5},{11,14,6},{15,15,6},{4,16,4},{4,17,4},{4,18,4},{4,19,4},{1,20,4},{1,21,4},{1,22,4},{1,23,4},{2,24,4},{2,25,4},{2,26,4},{2,27,4},{5,28,4},{5,29,4},{5,30,4},{5,31,4},{0,32,1},{0,33,1},{0,34,1},{0,35,1},{0,36,1},{0,37,1},{0,38,1},{0,39,1},{0,40,1},{0,41,1},{0,42,1},{0,43,1},{0,44,1},{0,45,1},{0,46,1},{0,47,1},{0,48,1},{0,49,1},{0,50,1},{0,51,1},{0,52,1},{0,53,1},{0,54,1},{0,55,1},{0,56,1},{0,57,1},{0,58,1},{0,59,1},{0,60,1},{0,61,1},{0,62,1},{0,63,1}
}。
该多个第三解码码本如下:
{{4,0,4},{4,1,4},{4,2,4},{4,3,4},{6,4,6},{7,5,6},{5,6,5},{5,7,5},{3,8,3},{3,9,3},{3,10,3},{3,11,3},{3,12,3},{3,13,3},{3,14,3},{3,15,3},{2,16,2},{2,17,2},{2,18,2},{2,19,2},{2,20,2},{2,21,2},{2,22,2},{2,23,2},{2,24,2},{2,25,2},{2,26,2},{2,27,2},{2,28,2},{2,29,2},{2,30,2},{2,31,2},{1,32,2},{1,33,2},{1,34,2},{1,35,2},{1,36,2},{1,37,2},{1,38,2},{1,39,2},{1,40,2},{1,41,2},{1,42,2},{1,43,2},{1,44,2},{1,45,2},{1,46,2},{1,47,2},{0,48,2},{0,49,2},{0,50,2},{0,51,2},{0,52,2},{0,53,2},{0,54,2},{0,55,2},{0,56,2},{0,57,2},{0,58,2},{0,59,2},{0,60,2},{0,61,2},{0,62,2},{0,63,2}
};
{{2,0,3},{2,1,3},{2,2,3},{2,3,3},{3,4,3},{3,5,3},{3,6,3},{3,7,3},{5,8,4},{5,9,4},{6,10,5},{7,11,5},{4,12,3},{4,13,3},{4,14,3},{4,15,3},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
};
{{0,0,1},{0,1,1},{0,2,1},{0,3,1},{0,4,1},{0,5,1},{0,6,1},{0,7,1},{0,8,1},{0,9,1},{0,10,1},{0,11,1},{0,12,1},{0,13,1},{0,14,1},{0,15,1},{0,16,1}, {0,17,1},{0,18,1},{0,19,1},{0,20,1},{0,21,1},{0,22,1},{0,23,1},{0,24,1},{0,25,1},{0,26,1},{0,27,1},{0,28,1},{0,29,1},{0,30,1},{0,31,1},{5,32,5},{5,33,5},{6,34,6},{7,35,6},{4,36,4},{4,37,4},{4,38,4},{4,39,4},{1,40,3},{1,41,3},{1,42,3},{1,43,3},{1,44,3},{1,45,3},{1,46,3},{1,47,3},{2,48,3},{2,49,3},{2,50,3},{2,51,3},{2,52,3},{2,53,3},{2,54,3},{2,55,3},{3,56,3},{3,57,3},{3,58,3},{3,59,3},{3,60,3},{3,61,3},{3,62,3},{3,63,3}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
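上述各解码码本的表项可按{解码值, 表索引, 码长}理解:设某一码本内的最大码长为K,则表长为2^K;解码时先从码流中“窥视”K个比特并以其作为表索引取出表项,再只消费表项记录的实际码长。下面给出按此理解的查表解码草图,其中比特读取器为演示假设的最简实现(按字节高位在前逐比特读取):

```c
#include <stdint.h>
#include <stddef.h>

/* 解码码本表项:{解码值, 表索引, 码长} */
typedef struct { int value; int index; int len; } HuffEntry;

typedef struct {
    const uint8_t *buf;  /* 码流缓冲区 */
    size_t bitpos;       /* 当前比特位置 */
} BitReader;

/* 从当前位置窥视n个比特,不移动读指针 */
unsigned peek_bits(const BitReader *br, int n)
{
    unsigned v = 0;
    for (int i = 0; i < n; i++) {
        size_t p = br->bitpos + i;
        v = (v << 1) | ((br->buf[p >> 3] >> (7 - (p & 7))) & 1u);
    }
    return v;
}

/* 查表解码一个符号:窥视max_bits个比特作为表索引,只消费实际码长 */
int huff_decode(BitReader *br, const HuffEntry *table, int max_bits)
{
    unsigned idx = peek_bits(br, max_bits);
    br->bitpos += (size_t)table[idx].len;
    return table[idx].value;
}
```

例如,对上文第三解码码本中最后一个8表项、最大码长为3的码本,每次窥视3比特即可直接得到解码值。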
步骤903:如果该声道解码模式为右声道解码模式,则解码码流中的右声道比特流,以得到该音频信号的右声道数据,将右声道数据复制到左声道。
其中,右声道解码模式表示无需解码码流中的左声道比特流。解码端在解码出右声道数据之后,将右声道数据复制到左声道即可。需要说明的是,如果右声道数据包括经量化的右声道频谱数据,则解码端对经量化的右声道频谱数据进行反量化,将反量化后的右声道频谱数据复制到左声道。或者,解码端对经量化的右声道频谱数据进行反量化,对反量化后的右声道频谱数据进行时频域的逆变换(如IMDCT),以得到右声道时域交叠信号,对右声道时域交叠信号进行交叠相加,以重建出右声道时域信号,将重建出的右声道时域信号复制到左声道。
由前文对编码流程的相关介绍可知,码流中有解码端进行解码所需的参数,例如,码流中有右声道的边信息,该边信息包括编码右声道数据所采用的编码码本的标识。基于此,解码端解码码流中边信息的比特流,以得到该边信息,该边信息包括编码码本标识。解码端基于该编码码本标识,从多个解码码本中确定解码所需的目标解码码本。此处的边信息是指右声道的边信息。
需要说明的是,该多个解码码本与步骤902中所介绍的多个解码码本相同,这里不再赘述。
可选地,如果声道解码模式不是左声道解码模式,且不是右声道解码模式,则解码端解码左声道比特流和右声道比特流,以得到左声道数据和右声道数据。也即是,解码端解码码流中的全部数据。
如果该音频信号不满足第一条件,则在该音频信号为双声道信号的情况下,解码端解码码流中的双声道交织比特流,以得到左声道数据和右声道数据。应当理解的是,该音频信号为双声道信号,但不满足第一条件,说明编码端是按照双声道交织编码的方式来编码左声道数据和右声道数据,那么,解码端按照双声道交织解码方式来解码双声道交织比特流,从而得到左声道数据和右声道数据。
其中,解码端所得到的左声道数据和右声道数据包括经量化的频谱数据,如果声道解码模式为左声道解码模式,则解码端对左声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的左声道频谱数据,将反量化后的左声道频谱数据复制到右声道。如果声道解码模式为右声道解码模式,则解码端对右声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的右声道频谱数据,将反量化后的右声道频谱数据复制到左声道。如果声道解码模式为双声道解码模式,则解码端对左声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的左声道频谱数据,以及对右声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的右声道频谱数据。
或者,如果声道解码模式为左声道解码模式,则解码端对左声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的左声道频谱数据,对反量化后的左声道频谱数据进行时频域的逆变换(如IMDCT),以得到左声道时域交叠信号,对左声道时域交叠信号进行交叠相加,以重建出左声道时域信号,将重建出的左声道时域信号复制到右声道。如果声道解码模式为右声道解码模式,则解码端对右声道数据包括的经量化的频谱数据进行反量化,以得到反量化后的右声道频谱数据,对反量化后的右声道频谱数据进行时频域的逆变换(如IMDCT),以得到右声道时域交叠信号,对右声道时域交叠信号进行交叠相加,以重建出右声道时域信号,将重建出的右声道时域信号复制到左声道。如果声道解码模式为双声道解码模式,则解码端对左声道数据包括的经量化的频谱数据依次进行反量化、时频域的逆变换和交叠相加,以重建出左声道时域信号,以及对右声道数据包括的经量化的频谱数据依次进行反量化、时频域的逆变换和交叠相加,以重建出右声道时域信号。
可选地,如果编码端对左右声道数据进行了MS变换,将MS变换后的左右声道数据编入了码流,则解码端在得到左声道数据和/或右声道数据之后,还需要对左声道数据和/或右声道数据进行MS反变换,以得到原始的左声道数据和/或右声道数据。
如果该音频信号不满足第一条件,则在音频信号为单声道信号的情况下,解码端解码码流中的单声道比特流,以得到该音频信号的单声道数据。可选地,解码端将该单声道数据复制到左声道和右声道。如果解码所得到的单声道数据为经量化的单声道频谱数据,则解码端对经量化的单声道频谱数据进行反量化,对反量化后的单声道频谱数据进行时频域的逆变换(如IMDCT),以得到单声道时域交叠信号,对单声道时域交叠信号进行交叠相加,以重建出单声道时域信号,将重建出的单声道时域信号复制到左声道和右声道。
在本申请实施例中,解码端将信号置入左声道和右声道之后,耳机等播放设备可根据配置获取相应声道的信号进行播放。
接下来请参照图10对本申请实施例提供的解码流程再次进行解释说明。图10所示的解码流程是对双声道信号进行解码的流程。单声道信号的解码流程与图10中左侧分支流程类似。参见图10,解码端获取输入参数,输入参数包括声道解码模式等。解码端解码码流的包头。然后,解码端基于包头以及声道解码模式来选择解码模式(即选择码流解包方式)。在选择双声道交织模式的情况下,解码端进入左侧分支流程,依次解码双声道公用参数的标识,解码左右声道的标度因子,解码左右声道所采用的编码码本标识,解码左右声道的频谱数据。在选择双声道解交织模式的情况下,解码端进入右侧分支流程,依次进行左声道解码和右声道解码。其中,每个声道的解码过程依次包括解码声道参数标识、解码声道标度因子、解码声道编码码本标识和解码声道频谱数据。解码完成后,解码端按照配置的声道解码模式所指示的声道,来对解析出的相应声道的频谱数据依次进行反量化和MDCT逆变换,以重建出相应声道的信号。
以上主要介绍了解码端解析码流中的频谱数据的实现过程,由前文对编码过程的相关介绍可知,码流中还编入有多个子带的标度因子,标度因子用于对音频信号的频谱包络进行整形。基于此,解码端从码流中解析出边信息,获得边信息中的标度因子之后,基于标度因子对音频信号的频谱包络进行整形,例如对左声道和/或右声道的频谱包络进行整形。接下来对解码端解码标度因子的实现过程进行示例性地介绍。
作为一种示例,解码端首先获取标度因子的比特流中前5个比特的二进制数,得到多个子带中首个子带的差分标度值,将首个子带的差分标度值减去参考值(本示例中参考值为8),以得到首个子带的标度因子。然后,解码端从码流中读取下一个5比特的二进制数,将该二进制数对应的十进制数作为码值,通过查找哈夫曼解码码本得到该码值对应的解码后的十进制数,该十进制数即下一个子带的差分标度值的绝对值。然后,如果该绝对值大于0,则解码端继续读取码流中的1比特,作为该绝对值的符号位,从而得到该子带的差分标度值。如果该绝对值为0,则该差分标度值的绝对值即为该子带的差分标度值。依次类推,解码端得到该多个子带中各个子带的差分标度值。解码端按照公式(2)恢复出该多个子带中各个子带的标度因子。
至此,解码端根据子带数量解析出该多个子带的标度因子。
其中,解码端所采用的哈夫曼解码码本如下:
HuffmanCB HUF_DEC_DIFF_SF[HUF_DEC_DIFF_SF_MAXIDX]={
{4,0,4},{4,1,4},{5,2,5},{6,3,5},{3,4,3},{3,5,3},{3,6,3},{3,7,3},
{2,8,2},{2,9,2},{2,10,2},{2,11,2},{2,12,2},{2,13,2},{2,14,2},{2,15,2},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
}。
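上述标度因子的解码流程可用如下草图示意:首个子带直接读5比特差分值并减去参考值8;后续子带读5比特查上述哈夫曼解码码本得到差分值的绝对值,多读的比特退回,绝对值大于0时再读1比特符号位。需要说明的是,符号位的极性(此处假设1表示负值)以及公式(2)的形式(此处假设为差分累加 sf[i] = sf[i-1] + diff[i])均为演示假设:

```c
#include <stdint.h>
#include <stddef.h>

/* 解码码本表项:{解码值, 表索引, 码长} */
typedef struct { int value; int index; int len; } HuffEntry;

static const HuffEntry HUF_DEC_DIFF_SF[32] = {
    {4,0,4},{4,1,4},{5,2,5},{6,3,5},{3,4,3},{3,5,3},{3,6,3},{3,7,3},
    {2,8,2},{2,9,2},{2,10,2},{2,11,2},{2,12,2},{2,13,2},{2,14,2},{2,15,2},
    {1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},
    {0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
};

typedef struct { const uint8_t *buf; size_t bitpos; } BitReader;

static unsigned read_bits(BitReader *br, int n)
{
    unsigned v = 0;
    while (n-- > 0) {
        v = (v << 1) | ((br->buf[br->bitpos >> 3] >> (7 - (br->bitpos & 7))) & 1u);
        br->bitpos++;
    }
    return v;
}

void decode_scale_factors(BitReader *br, int *sf, int num_bands)
{
    sf[0] = (int)read_bits(br, 5) - 8;          /* 首个子带:5比特差分值减参考值8 */
    for (int i = 1; i < num_bands; i++) {
        unsigned idx = read_bits(br, 5);        /* 读5比特作为解码码本索引 */
        const HuffEntry *e = &HUF_DEC_DIFF_SF[idx];
        br->bitpos -= (size_t)(5 - e->len);     /* 退回多读的比特 */
        int diff = e->value;                    /* 差分标度值的绝对值 */
        if (diff > 0 && read_bits(br, 1) != 0)  /* 绝对值>0时读1比特符号位(假设1为负) */
            diff = -diff;
        sf[i] = sf[i - 1] + diff;               /* 假设的公式(2):差分累加 */
    }
}
```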
综上所述,在本申请实施例中,在音频信号为双声道信号的情况下,即使码流包括左声道比特流和右声道比特流,在解码过程中,也能够依据声道解码模式,来解码左声道比特流且不解码右声道比特流,或者解码右声道比特流而不解码左声道比特流,从而在解码端资源有限的情况下,降低解码端的功耗。
图11是本申请实施例提供的一种音频解码装置1100的结构示意图,该音频解码装置1100可以由软件、硬件或者两者的结合实现成为音频编解码设备的部分或者全部。参见图11,该装置1100包括:获取模块1101、解码模块1102。
获取模块1101,用于如果待解码的音频信号满足第一条件,则获取声道解码模式,第一条件包括音频信号为双声道信号、音频信号的编码码率不小于码率阈值,且音频信号的采样率不小于采样率阈值;
解码模块1102,用于如果声道解码模式为左声道解码模式,则解码码流中的左声道比特流,以得到音频信号的左声道数据,将左声道数据复制到右声道;
解码模块1102,还用于如果声道解码模式为右声道解码模式,则解码码流中的右声道比特流,以得到音频信号的右声道数据,将右声道数据复制到左声道。
可选地,解码模块1102,还用于如果声道解码模式不是左声道解码模式,且不是右声道解码模式,则解码左声道比特流和右声道比特流,以得到左声道数据和右声道数据。
可选地,解码模块1102,还用于如果音频信号不满足第一条件,则在音频信号为双声道信号的情况下,解码码流中的双声道交织比特流,以得到左声道数据和右声道数据。
可选地,解码模块1102,还用于如果音频信号不满足第一条件,则在音频信号为单声道信号的情况下,解码码流中的单声道比特流,以得到音频信号的单声道数据。
可选地,该装置1100还包括:
第二获取模块,用于获取码流的总数据量;
解码模块,还用于解码码流的包头,以得到音频信号的声道数、采样率和帧长;
第一确定模块,用于基于总数据量、声道数、采样率和帧长,确定音频信号是否满足第一条件。
可选地,解码模块1102,还用于解码码流中边信息的比特流,以得到边信息,边信息包括编码码本标识;
该装置1100还包括:第二确定模块,用于基于编码码本标识,从多个解码码本中确定解码所需的目标解码码本。
可选地,该多个解码码本如下:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{11,0,5},{15,1,5},{13,2,4},{13,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},
{9,8,4},{9,9,4},{5,10,4},{5,11,4},{14,12,4},{14,13,4},{7,14,4},{7,15,4},{6,16,4},{6,17,4},{3,18,4},{3,19,4},{10,20,4},{10,21,4},{12,22,4},{12,23,4},{2,24,4},{2,25,4},{1,26,4},{1,27,4},{8,28,4},{8,29,4},{4,30,4},{4,31,4}
};
{{2,0,4},{2,1,4},{2,2,4},{2,3,4},{1,4,4},{1,5,4},{1,6,4},{1,7,4},
{8,8,4},{8,9,4},{8,10,4},{8,11,4},{4,12,4},{4,13,4},{4,14,4},{4,15,4},{0,16,2},{0,17,2},{0,18,2},{0,19,2},{0,20,2},{0,21,2},{0,22,2},{0,23,2},
{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2},{11,32,6},{15,33,6},{14,34,5},{14,35,5},{7,36,5},{7,37,5},{13,38,5},
{13,39,5},{12,40,4},{12,41,4},{12,42,4},{12,43,4},{6,44,4},{6,45,4},
{6,46,4},{6,47,4},{3,48,4},{3,49,4},{3,50,4},{3,51,4},{5,52,4},{5,53,4},{5,54,4},{5,55,4},{10,56,4},{10,57,4},{10,58,4},{10,59,4},{9,60,4},{9,61,4},{9,62,4},{9,63,4}
};
{{11,0,7},{15,1,7},{14,2,6},{14,3,6},{7,4,6},{7,5,6},{13,6,6},{13,7,6},{12,8,5},{12,9,5},{12,10,5},{12,11,5},{6,12,5},{6,13,5},{6,14,5},{6,15,5},{3,16,5},{3,17,5},{3,18,5},{3,19,5},{5,20,5},{5,21,5},{5,22,5},{5,23,5},{10,24,5},{10,25,5},{10,26,5},{10,27,5},{9,28,5},{9,29,5},{9,30,5},{9,31,5},{2,32,4},{2,33,4},{2,34,4},{2,35,4},{2,36,4},{2,37,4},{2,38,4},{2,39,4},{1,40,4},{1,41,4},{1,42,4},{1,43,4},{1,44,4},{1,45,4},{1,46,4},{1, 47,4},{8,48,4},{8,49,4},{8,50,4},{8,51,4},{8,52,4},{8,53,4},{8,54,4},{8,55,4},{4,56,4},{4,57,4},{4,58,4},{4,59,4},{4,60,4},{4,61,4},{4,62,4},{4,63,4},{0,64,1},{0,65,1},{0,66,1},{0,67,1},{0,68,1},{0,69,1},{0,70,1},{0,71,1},{0,72,1},{0,73,1},{0,74,1},{0,75,1},{0,76,1},{0,77,1},{0,78,1},{0,79,1},{0,80,1},{0,81,1},{0,82,1},{0,83,1},{0,84,1},{0,85,1},{0,86,1},{0,87,1},{0,88,1},{0,89,1},{0,90,1},{0,91,1},{0,92,1},{0,93,1},{0,94,1},{0,95,1},{0,96,1},{0,97,1},{0,98,1},{0,99,1},{0,100,1},{0,101,1},{0,102,1},{0,103,1},{0,104,1},{0,105,1},{0,106,1},{0,107,1},{0,108,1},{0,109,1},{0,110,1},{0,111,1},{0,112,1},{0,113,1},{0,114,1},{0,115,1},{0,116,1},{0,117,1},{0,118,1},{0,119,1},{0,120,1},{0,121,1},{0,122,1},{0,123,1},{0,124,1},
{0,125,1},{0,126,1},{0,127,1}
};
{{11,0,5},{15,1,5},{14,2,4},{14,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{7,8,4},{7,9,4},{9,10,4},{9,11,4},{10,12,4},{10,13,4},{13,14,4},{13,15,4},{3,16,4},{3,17,4},{8,18,4},{8,19,4},{6,20,4},{6,21,4},{12,22,4},{12,23,4},{4,24,4},{4,25,4},{1,26,4},{1,27,4},{2,28,4},{2,29,4},{5,30,4},{5,31,4}
};
{{2,0,4},{2,1,4},{5,2,4},{5,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{10,8,5},{13,9,5},{7,10,5},{14,11,5},{12,12,4},{12,13,4},{3,14,4},{3,15,4},{8,16,4},{8,17,4},{9,18,4},{9,19,4},{6,20,4},{6,21,4},{11,22,5},{15,23,5},{4,24,3},{4,25,3},{4,26,3},{4,27,3},{1,28,3},{1,29,3},{1,30,3},{1,31,3}
};
{{4,0,4},{4,1,4},{1,2,4},{1,3,4},{2,4,4},{2,5,4},{5,6,4},{5,7,4},{0,8,2},{0,9,2},{0,10,2},{0,11,2},{0,12,2},{0,13,2},{0,14,2},{0,15,2},{10,16,5},{13,17,5},{7,18,5},{14,19,5},{12,20,4},{12,21,4},{3,22,4},{3,23,4},{8,24,4},{8,25,4},{9,26,4},{9,27,4},{6,28,4},{6,29,4},{11,30,5},{15,31,5}
};
{{10,0,6},{13,1,6},{7,2,6},{14,3,6},{12,4,5},{12,5,5},{3,6,5},{3,7,5},{8,8,5},{8,9,5},{9,10,5},{9,11,5},{6,12,5},{6,13,5},{11,14,6},{15,15,6},{4,16,4},{4,17,4},{4,18,4},{4,19,4},{1,20,4},{1,21,4},{1,22,4},{1,23,4},{2,24,4},{2,25,4},{2,26,4},{2,27,4},{5,28,4},{5,29,4},{5,30,4},{5,31,4},{0,32,1},{0,33,1},{0,34,1},{0,35,1},{0,36,1},{0,37,1},{0,38,1},{0,39,1},{0,40,1},{0,41,1},{0,42,1},{0,43,1},{0,44,1},{0,45,1},{0,46,1},{0,47,1},{0,48,1},{0,49,1},{0,50,1},{0,51,1},{0,52,1},{0,53,1},{0,54,1},{0,55,1},{0,56,1},{0,57,1},{0,58,1},{0,59,1},{0,60,1},{0,61,1},{0,62,1},{0,63,1}
};
{{4,0,4},{4,1,4},{4,2,4},{4,3,4},{6,4,6},{7,5,6},{5,6,5},{5,7,5},{3,8, 3},{3,9,3},{3,10,3},{3,11,3},{3,12,3},{3,13,3},{3,14,3},{3,15,3},{2,16,2},{2,17,2},{2,18,2},{2,19,2},{2,20,2},{2,21,2},{2,22,2},{2,23,2},{2,24,2},{2,25,2},{2,26,2},{2,27,2},{2,28,2},{2,29,2},{2,30,2},{2,31,2},{1,32,2},{1,33,2},{1,34,2},{1,35,2},{1,36,2},{1,37,2},{1,38,2},{1,39,2},{1,40,2},{1,41,2},{1,42,2},{1,43,2},{1,44,2},{1,45,2},{1,46,2},{1,47,2},{0,48,2},{0,49,2},{0,50,2},{0,51,2},{0,52,2},{0,53,2},{0,54,2},{0,55,2},{0,56,2},{0,57,2},{0,58,2},{0,59,2},{0,60,2},{0,61,2},{0,62,2},{0,63,2}
};
{{2,0,3},{2,1,3},{2,2,3},{2,3,3},{3,4,3},{3,5,3},{3,6,3},{3,7,3},{5,8,4},{5,9,4},{6,10,5},{7,11,5},{4,12,3},{4,13,3},{4,14,3},{4,15,3},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
};
{{0,0,1},{0,1,1},{0,2,1},{0,3,1},{0,4,1},{0,5,1},{0,6,1},{0,7,1},{0,8,1},{0,9,1},{0,10,1},{0,11,1},{0,12,1},{0,13,1},{0,14,1},{0,15,1},{0,16,1},{0,17,1},{0,18,1},{0,19,1},{0,20,1},{0,21,1},{0,22,1},{0,23,1},{0,24,1},{0,25,1},{0,26,1},{0,27,1},{0,28,1},{0,29,1},{0,30,1},{0,31,1},{5,32,5},{5,33,5},{6,34,6},{7,35,6},{4,36,4},{4,37,4},{4,38,4},{4,39,4},{1,40,3},{1,41,3},{1,42,3},{1,43,3},{1,44,3},{1,45,3},{1,46,3},{1,47,3},{2,48,3},{2,49,3},{2,50,3},{2,51,3},{2,52,3},{2,53,3},{2,54,3},{2,55,3},{3,56,3},{3,57,3},{3,58,3},{3,59,3},{3,60,3},{3,61,3},{3,62,3},{3,63,3}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
在本申请实施例中,在音频信号为双声道信号的情况下,即使码流包括左声道比特流和右声道比特流,在解码过程中,也能够依据声道解码模式,来解码左声道比特流且不解码右声道比特流,或者解码右声道比特流而不解码左声道比特流,从而在解码端资源有限的情况下,降低解码端的功耗。相应地,编码端也能够依据音频信号所满足的条件,依次编码左声道数据和右声道数据,而非必须按照双声道交织编码方式或必须按照双声道解交织编码方式进行编码。可见本方案的编码方式更加灵活。
需要说明的是:上述实施例提供的音频解码装置在解码音频信号时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的音频解码装置与音频解码方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图12是本申请实施例提供的一种音频编码装置1200的结构示意图,该音频编码装置1200可以由软件、硬件或者两者的结合实现成为音频编解码设备的部分或者全部。参见图12,该装置1200包括:编码模块1201。
编码模块1201,用于如果待编码的音频信号满足第一条件,则将音频信号的左声道数据编入码流,以及将音频信号的右声道数据编入码流,第一条件包括音频信号为双声道信号、音频信号的编码码率不小于码率阈值,且音频信号的采样率不小于采样率阈值。
可选地,编码模块1201,还用于:
如果音频信号不满足第一条件,则在音频信号为双声道信号的情况下,通过双声道交织编码的方式,将左声道数据与右声道数据编入码流。
可选地,编码模块1201,还用于:
如果音频信号不满足第一条件,则在音频信号为单声道信号的情况下,将音频信号的单声道数据编入码流。
可选地,编码模块1201包括:
获取子模块,用于获取多个子带中各个子带的量化等级衡量因子,量化等级衡量因子表征编码相应子带内的各个频谱值所需的平均比特数,多个子带是指将左声道数据包括的经量化的频谱数据所划分到的多个子带;
划分子模块,用于基于多个子带的量化等级衡量因子,将多个子带划分为多组子带,同一组子带的量化等级衡量因子相同;
确定子模块,用于基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,目标编码码本是指对相应一组子带内的频谱值进行编码所采用的编码码本;
编码子模块,用于将每组子带对应的目标编码码本的标识作为左声道数据的一种边信息编入码流。
可选地,确定子模块用于:
对于多组子带中的任一组子带,如果任一组子带的量化等级衡量因子为第一值,则采用多个编码码本中的多个第一编码码本分别对任一组子带内的频谱值进行编码,以得到与多个第一编码码本一一对应的多个第一候选频谱比特流;
将多个第一候选频谱比特流中比特总数最少的第一候选频谱比特流,确定为任一组子带内的频谱值的比特流,以及将比特总数最少的第一候选频谱比特流所对应的第一编码码本确定为任一组子带对应的目标编码码本。
可选地,第一值为1;
采用多个编码码本中的多个第一编码码本分别对任一组子带内的频谱值进行编码,包括:
将任一组子带中的每四个频谱值组合成一个二进制数;
采用多个第一编码码本分别对二进制数所表示的十进制数进行编码;
其中,多个第一编码码本如下:
{{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
{8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
};
{{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
{8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}
};
{{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
{8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}
};
{{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
{8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}
}。
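上述“每四个频谱值组合成一个二进制数,再用多个第一编码码本分别试编码并选比特总数最少者”的过程可如下示意。编码码本表项格式按{待编码值, 码字, 码长}理解,码本内容取自上文;该理解方式为演示假设:

```c
#include <stdint.h>

/* 编码码本表项:{待编码值, 码字, 码长} */
typedef struct { int value; int code; int len; } HuffEnc;

/* 上文给出的多个第一编码码本 */
static const HuffEnc CB1[4][16] = {
    {{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
     {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}},
    {{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
     {8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}},
    {{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
     {8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}},
    {{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
     {8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}}
};

/* 将四个1比特频谱值组合成一个0~15的十进制数 */
int pack4(const int *spec)
{
    return (spec[0] << 3) | (spec[1] << 2) | (spec[2] << 1) | spec[3];
}

/* 对一组符号分别用4个第一编码码本试编码,返回比特总数最少的码本索引,
 * *best_bits 返回该码本编码这组符号所需的总比特数 */
int choose_codebook(const int *symbols, int n, int *best_bits)
{
    int best = 0;
    *best_bits = -1;
    for (int cb = 0; cb < 4; cb++) {
        int bits = 0;
        for (int i = 0; i < n; i++)
            bits += CB1[cb][symbols[i]].len;
        if (*best_bits < 0 || bits < *best_bits) {
            best = cb;
            *best_bits = bits;
        }
    }
    return best;
}
```

可以看出,符号多为0时,第四个码本(0的码长仅为1比特)会被选中,这正是按比特总数最少挑选目标编码码本的意义所在。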
可选地,确定子模块用于:
对于多组子带中的任一组子带,如果任一组子带的量化等级衡量因子为第二值,则采用多个编码码本中的多个第二编码码本分别对任一组子带内的频谱值进行编码,以得到与多个第二编码码本一一对应的多个第二候选频谱比特流;
将多个第二候选频谱比特流中比特总数最少的第二候选频谱比特流,确定为任一组子带内的频谱值的比特流,以及将比特总数最少的第二候选频谱比特流所对应的第二编码码本确定为任一组子带对应的目标编码码本。
可选地,第二值为2;
采用多个编码码本中的多个第二编码码本分别对任一组子带内的频谱值进行编码,包括:
将任一组子带内的每两个频谱值组合成一个二进制数;
采用多个第二编码码本分别对二进制数所表示的十进制数进行编码;
其中,多个第二编码码本如下:
{{0,1,3},{1,13,4},{2,14,4},{3,8,4},{4,12,4},{5,15,4},{6,10,4},{7,4,4},{8,9,4},{9,5,4},{10,6,4},{11,0,5},{12,11,4},{13,7,4},{14,1,4},{15,1,5}
};
{{0,1,3},{1,7,3},{2,0,4},{3,7,4},{4,6,3},{5,1,4},{6,10,4},{7,10,5},
{8,8,4},{9,9,4},{10,8,5},{11,22,5},{12,6,4},{13,9,5},{14,11,5},{15,23,5}
};
{{0,1,2},{1,1,4},{2,2,4},{3,11,4},{4,0,4},{5,3,4},{6,14,4},{7,18,5},
{8,12,4},{9,13,4},{10,16,5},{11,30,5},{12,10,4},{13,17,5},{14,19,5},{15,31,5}
};
{{0,1,1},{1,5,4},{2,6,4},{3,3,5},{4,4,4},{5,7,4},{6,6,5},{7,2,6},
{8,4,5},{9,5,5},{10,0,6},{11,14,6},{12,2,5},{13,1,6},{14,3,6},{15,15,6}
}。
可选地,确定子模块用于:
对于多组子带中的任一组子带,如果任一组子带的量化等级衡量因子为第三值,则采用多个编码码本中的多个第三编码码本分别对任一组子带内的频谱值进行编码,以得到与多个第三编码码本一一对应的多个第三候选频谱比特流;
将多个第三候选频谱比特流中比特总数最少的第三候选频谱比特流,确定为任一组子带内的频谱值的比特流,以及将比特总数最少的第三候选频谱比特流所对应的第三编码码本确定为任一组子带对应的目标编码码本。
可选地,确定子模块用于:
对于多组子带中的任一组子带,如果任一组子带的量化等级衡量因子为第四值,则采用多个第三编码码本分别对任一组子带内的各个频谱值中的第一部分比特位进行编码,以得到与多个第三编码码本一一对应的多个第一部分候选比特流;
将多个第一部分候选比特流中比特总数最少的第一部分候选比特流,确定为第一部分比特位的比特流,以及将比特总数最少的第一部分候选比特流所对应的第三编码码本确定为任一组子带对应的目标编码码本;
对任一组子带内的各个频谱值中的除第一部分比特位之外的第二部分比特位进行均匀量化编码,以得到第二部分比特位的比特流。
可选地,第一部分比特位是指频谱值中高位的N个比特位,第二部分比特位是指频谱值中低位的M个比特,M等于任一组子带的量化等级衡量因子减去第三值。
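上述高低位拆分可如下示意:按正文取第三值为3,即高位N=3个比特送第三编码码本做哈夫曼编码,低位M=量化等级衡量因子减3个比特做均匀量化编码(按原码写入)。具体拆分方式为演示假设:

```c
#include <stdint.h>

/* 示意性草图:将一个频谱值拆为高位N=3个比特与低位M个比特。
 * 高位部分作为哈夫曼编码的符号,低位部分直接均匀编码写入码流。 */
void split_spectrum_value(unsigned v, int m, unsigned *high_n, unsigned *low_m)
{
    *high_n = v >> m;               /* 高位N=3个比特:哈夫曼编码的符号 */
    *low_m  = v & ((1u << m) - 1u); /* 低位M个比特:均匀量化编码直接写入 */
}
```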
可选地,第三值为3;
采用多个编码码本中的多个第三编码码本分别对任一组子带内的频谱值进行编码,包括:
采用多个第三编码码本分别对任一组子带内的各个频谱值进行编码;
其中,多个第三编码码本如下:
{{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,3,5},{6,4,6},{7,5,6}
};
{{0,3,2},{1,2,2},{2,0,3},{3,1,3},{4,3,3},{5,4,4},{6,10,5},{7,11,5}
};
{{0,0,1},{1,5,3},{2,6,3},{3,7,3},{4,9,4},{5,16,5},{6,34,6},{7,35,6}
};
{{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
}。
在本申请实施例中,在音频信号为双声道信号的情况下,即使码流包括左声道比特流和右声道比特流,在解码过程中,也能够依据声道解码模式,来解码左声道比特流且不解码右声道比特流,或者解码右声道比特流而不解码左声道比特流,从而在解码端资源有限的情况下,降低解码端的功耗。相应地,编码端也能够依据音频信号所满足的条件,依次编码左声道数据和右声道数据,而非必须按照双声道交织编码方式或必须按照双声道解交织编码方式进行编码。可见本方案的编码方式更加灵活。
需要说明的是:上述实施例提供的音频编码装置在编码音频信号时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的音频编码装置与音频编码方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意结合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络或其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如:同轴电缆、光纤、数据用户线(digital subscriber line,DSL))或无线(例如:红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质,或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如:软盘、硬盘、磁带)、光介质(例如:数字通用光盘(digital versatile disc,DVD))或半导体介质(例如:固态硬盘(solid state disk,SSD))等。值得注意的是,本申请实施例提到的计算机可读存储介质可以为非易失性存储介质,换句话说,可以是非瞬时性存储介质。
应当理解的是,本文提及的“至少一个”是指一个或多个,“多个”是指两个或两个以上。在本申请实施例的描述中,除非另有说明,“/”表示或的意思,例如,A/B可以表示A或B;本文中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,为了便于清楚描述本申请实施例的技术方案,在本申请的实施例中,采用了“第一”、“第二”等字样对功能和作用基本相同的相同项或相似项进行区分。本领域技术人员可以理解“第一”、“第二”等字样并不对数量和执行次序进行限定,并且“第一”、“第二”等字样也并不限定一定不同。
需要说明的是,本申请实施例所涉及的信息(包括但不限于用户设备信息、用户个人信息等)、数据(包括但不限于用于分析的数据、存储的数据、展示的数据等)以及信号,均为经用户授权或者经过各方充分授权的,且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。
以上所述为本申请提供的实施例,并不用以限制本申请,凡在本申请的原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (42)

  1. 一种音频解码方法,其特征在于,所述方法包括:
    如果待解码的音频信号满足第一条件,则获取声道解码模式,所述第一条件包括所述音频信号为双声道信号、所述音频信号的编码码率不小于码率阈值,且所述音频信号的采样率不小于采样率阈值;
    如果所述声道解码模式为左声道解码模式,则解码码流中的左声道比特流,以得到所述音频信号的左声道数据,将所述左声道数据复制到右声道;
    如果所述声道解码模式为右声道解码模式,则解码所述码流中的右声道比特流,以得到所述音频信号的右声道数据,将所述右声道数据复制到左声道。
  2. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    如果所述声道解码模式不是所述左声道解码模式,且不是所述右声道解码模式,则解码所述左声道比特流和所述右声道比特流,以得到所述左声道数据和所述右声道数据。
  3. 如权利要求1或2所述的方法,其特征在于,所述方法还包括:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为双声道信号的情况下,解码所述码流中的双声道交织比特流,以得到所述左声道数据和所述右声道数据。
  4. 如权利要求1-3任一所述的方法,其特征在于,所述方法还包括:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为单声道信号的情况下,解码所述码流中的单声道比特流,以得到所述音频信号的单声道数据。
  5. 如权利要求1-4任一所述的方法,其特征在于,所述方法还包括:
    获取所述码流的总数据量;
    解码所述码流的包头,以得到所述音频信号的声道数、采样率和帧长;
    基于所述总数据量、所述声道数、所述采样率和所述帧长,确定所述音频信号是否满足所述第一条件。
  6. 如权利要求1-5任一所述的方法,其特征在于,所述方法还包括:
    解码所述码流中边信息的比特流,以得到所述边信息,所述边信息包括编码码本标识;
    基于所述编码码本标识,从多个解码码本中确定解码所需的目标解码码本。
  7. 如权利要求6所述的方法,其特征在于,所述多个解码码本如下:
    {{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
    {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
    };
    {{11,0,5},{15,1,5},{13,2,4},{13,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},
    {9,8,4},{9,9,4},{5,10,4},{5,11,4},{14,12,4},{14,13,4},{7,14,4},{7,15,4},{6,16,4},{6,17,4},{3,18,4},{3,19,4},{10,20,4},{10,21,4},{12,22,4},{12,23,4},{2,24,4},{2,25,4},{1,26,4},{1,27,4},{8,28,4},{8,29,4},{4,30,4},{4,31,4}
    };
    {{2,0,4},{2,1,4},{2,2,4},{2,3,4},{1,4,4},{1,5,4},{1,6,4},{1,7,4},
    {8,8,4},{8,9,4},{8,10,4},{8,11,4},{4,12,4},{4,13,4},{4,14,4},{4,15,4},{0,16,2},{0,17,2},{0,18,2},{0,19,2},{0,20,2},{0,21,2},{0,22,2},{0,23,2},
    {0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2},{11,32,6},{15,33,6},{14,34,5},{14,35,5},{7,36,5},{7,37,5},{13,38,5},
    {13,39,5},{12,40,4},{12,41,4},{12,42,4},{12,43,4},{6,44,4},{6,45,4},
    {6,46,4},{6,47,4},{3,48,4},{3,49,4},{3,50,4},{3,51,4},{5,52,4},{5,53,4},{5,54,4},{5,55,4},{10,56,4},{10,57,4},{10,58,4},{10,59,4},{9,60,4},{9,61,4},{9,62,4},{9,63,4}
    };
    {{11,0,7},{15,1,7},{14,2,6},{14,3,6},{7,4,6},{7,5,6},{13,6,6},{13,7,6},{12,8,5},{12,9,5},{12,10,5},{12,11,5},{6,12,5},{6,13,5},{6,14,5},{6,15,5},{3,16,5},{3,17,5},{3,18,5},{3,19,5},{5,20,5},{5,21,5},{5,22,5},{5,23,5},{10,24,5},{10,25,5},{10,26,5},{10,27,5},{9,28,5},{9,29,5},{9,30,5},{9,31,5},{2,32,4},{2,33,4},{2,34,4},{2,35,4},{2,36,4},{2,37,4},{2,38,4},{2,39,4},{1,40,4},{1,41,4},{1,42,4},{1,43,4},{1,44,4},{1,45,4},{1,46,4},{1,47,4},{8,48,4},{8,49,4},{8,50,4},{8,51,4},{8,52,4},{8,53,4},{8,54,4},{8,55,4},{4,56,4},{4,57,4},{4,58,4},{4,59,4},{4,60,4},{4,61,4},{4,62,4},{4,63,4},{0,64,1},{0,65,1},{0,66,1},{0,67,1},{0,68,1},{0,69,1},{0,70,1},{0,71,1},{0,72,1},{0,73,1},{0,74,1},{0,75,1},{0,76,1},{0,77,1},{0,78,1},{0,79,1},{0,80,1},{0,81,1},{0,82,1},{0,83,1},{0,84,1},{0,85,1},{0,86,1},{0,87,1},{0,88,1},{0,89,1},{0,90,1},{0,91,1},{0,92,1},{0,93,1},{0,94,1},{0,95,1},{0,96,1},{0,97,1},{0,98,1},{0,99,1},{0,100,1},{0,101,1},{0,102,1},{0,103,1},{0,104,1},{0,105,1},{0,106,1},{0,107,1},{0,108,1},{0,109,1},{0,110,1},{0,111,1},{0,112,1},{0,113,1},{0,114,1},{0,115,1},{0,116,1},{0,117,1},{0,118,1},{0,119,1},{0,120,1},{0,121,1},{0,122,1},{0,123,1},{0,124,1},
    {0,125,1},{0,126,1},{0,127,1}
    };
    {{11,0,5},{15,1,5},{14,2,4},{14,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{7,8,4},{7,9,4},{9,10,4},{9,11,4},{10,12,4},{10,13,4},{13,14,4},{13,15,4},{3,16,4},{3,17,4},{8,18,4},{8,19,4},{6,20,4},{6,21,4},{12,22,4},{12,23,4},{4,24,4},{4,25,4},{1,26,4},{1,27,4},{2,28,4},{2,29,4},{5,30,4},{5,31,4}
    };
    {{2,0,4},{2,1,4},{5,2,4},{5,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{10,8,5},{13,9,5},{7,10,5},{14,11,5},{12,12,4},{12,13,4},{3,14,4},{3,15,4},{8,16,4},{8,17,4},{9,18,4},{9,19,4},{6,20,4},{6,21,4},{11,22,5},{15,23,5},{4,24,3},{4,25,3},{4,26,3},{4,27,3},{1,28,3},{1,29,3},{1,30,3},{1,31,3}
    };
    {{4,0,4},{4,1,4},{1,2,4},{1,3,4},{2,4,4},{2,5,4},{5,6,4},{5,7,4},{0,8,2},{0,9,2},{0,10,2},{0,11,2},{0,12,2},{0,13,2},{0,14,2},{0,15,2},{10,16,5},{13,17,5},{7,18,5},{14,19,5},{12,20,4},{12,21,4},{3,22,4},{3,23,4},{8,24,4},{8,25,4},{9,26,4},{9,27,4},{6,28,4},{6,29,4},{11,30,5},{15,31,5}
    };
    {{10,0,6},{13,1,6},{7,2,6},{14,3,6},{12,4,5},{12,5,5},{3,6,5},{3,7,5},{8,8,5},{8,9,5},{9,10,5},{9,11,5},{6,12,5},{6,13,5},{11,14,6},{15,15,6},{4,16,4},{4,17,4},{4,18,4},{4,19,4},{1,20,4},{1,21,4},{1,22,4},{1,23,4},{2,24,4},{2,25,4},{2,26,4},{2,27,4},{5,28,4},{5,29,4},{5,30,4},{5,31,4},{0,32,1},{0,33,1},{0,34,1},{0,35,1},{0,36,1},{0,37,1},{0,38,1},{0,39,1},{0,40,1},{0,41,1},{0,42,1},{0,43,1},{0,44,1},{0,45,1},{0,46,1},{0,47,1},{0,48,1},{0,49,1},{0,50,1},{0,51,1},{0,52,1},{0,53,1},{0,54,1},{0,55,1},{0,56,1},{0,57,1},{0,58,1},{0,59,1},{0,60,1},{0,61,1},{0,62,1},{0,63,1}
    };
    {{4,0,4},{4,1,4},{4,2,4},{4,3,4},{6,4,6},{7,5,6},{5,6,5},{5,7,5},{3,8,3},{3,9,3},{3,10,3},{3,11,3},{3,12,3},{3,13,3},{3,14,3},{3,15,3},{2,16,2},{2,17,2},{2,18,2},{2,19,2},{2,20,2},{2,21,2},{2,22,2},{2,23,2},{2,24,2},{2,25,2},{2,26,2},{2,27,2},{2,28,2},{2,29,2},{2,30,2},{2,31,2},{1,32,2},{1,33,2},{1,34,2},{1,35,2},{1,36,2},{1,37,2},{1,38,2},{1,39,2},{1,40,2},{1,41,2},{1,42,2},{1,43,2},{1,44,2},{1,45,2},{1,46,2},{1,47,2},{0,48,2},{0,49,2},{0,50,2},{0,51,2},{0,52,2},{0,53,2},{0,54,2},{0,55,2},{0,56,2},{0,57,2},{0,58,2},{0,59,2},{0,60,2},{0,61,2},{0,62,2},{0,63,2}
    };
    {{2,0,3},{2,1,3},{2,2,3},{2,3,3},{3,4,3},{3,5,3},{3,6,3},{3,7,3},{5,8,4},{5,9,4},{6,10,5},{7,11,5},{4,12,3},{4,13,3},{4,14,3},{4,15,3},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
    };
    {{0,0,1},{0,1,1},{0,2,1},{0,3,1},{0,4,1},{0,5,1},{0,6,1},{0,7,1},{0,8,1},{0,9,1},{0,10,1},{0,11,1},{0,12,1},{0,13,1},{0,14,1},{0,15,1},{0,16,1},{0,17,1},{0,18,1},{0,19,1},{0,20,1},{0,21,1},{0,22,1},{0,23,1},{0,24,1},{0,25,1},{0,26,1},{0,27,1},{0,28,1},{0,29,1},{0,30,1},{0,31,1},{5,32,5}, {5,33,5},{6,34,6},{7,35,6},{4,36,4},{4,37,4},{4,38,4},{4,39,4},{1,40,3},{1,41,3},{1,42,3},{1,43,3},{1,44,3},{1,45,3},{1,46,3},{1,47,3},{2,48,3},{2,49,3},{2,50,3},{2,51,3},{2,52,3},{2,53,3},{2,54,3},{2,55,3},{3,56,3},{3,57,3},{3,58,3},{3,59,3},{3,60,3},{3,61,3},{3,62,3},{3,63,3}
    };
    {{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
    }。
  8. 一种音频编码方法,其特征在于,所述方法包括:
    如果待编码的音频信号满足第一条件,则将所述音频信号的左声道数据编入码流,以及将所述音频信号的右声道数据编入所述码流,所述第一条件包括所述音频信号为双声道信号、所述音频信号的编码码率不小于码率阈值,且所述音频信号的采样率不小于采样率阈值。
  9. 如权利要求8所述的方法,其特征在于,所述方法还包括:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为双声道信号的情况下,通过双声道交织编码的方式,将所述左声道数据与所述右声道数据编入所述码流。
  10. 如权利要求8或9所述的方法,其特征在于,所述方法还包括:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为单声道信号的情况下,将所述音频信号的单声道数据编入所述码流。
  11. 如权利要求8-10任一所述的方法,其特征在于,所述将所述音频信号的左声道数据编入码流,包括:
    获取多个子带中各个子带的量化等级衡量因子,所述量化等级衡量因子表征编码相应子带内的各个频谱值所需的平均比特数,所述多个子带是指将所述左声道数据包括的经量化的频谱数据所划分到的多个子带;
    基于所述多个子带的量化等级衡量因子,将所述多个子带划分为多组子带,同一组子带的量化等级衡量因子相同;
    基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,所述目标编码码本是指对相应一组子带内的频谱值进行编码所采用的编码码本;
    将每组子带对应的目标编码码本的标识作为所述左声道数据的一种边信息编入所述码流。
  12. 如权利要求11所述的方法,其特征在于,所述基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,包括:
    对于所述多组子带中的任一组子带,如果所述任一组子带的量化等级衡量因子为第一值,则采用所述多个编码码本中的多个第一编码码本分别对所述任一组子带内的频谱值进行编码,以得到与所述多个第一编码码本一一对应的多个第一候选频谱比特流;
    将所述多个第一候选频谱比特流中比特总数最少的第一候选频谱比特流,确定为所述任一组子带内的频谱值的比特流,以及将所述比特总数最少的第一候选频谱比特流所对应的第一编码码本确定为所述任一组子带对应的目标编码码本。
  13. 如权利要求12所述的方法,其特征在于,所述第一值为1;
    所述采用所述多个编码码本中的多个第一编码码本分别对所述任一组子带内的频谱值进行编码,包括:
    将所述任一组子带中的每四个频谱值组合成一个二进制数;
    采用所述多个第一编码码本分别对所述二进制数所表示的十进制数进行编码;
    其中,所述多个第一编码码本如下:
    {{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
    {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
    };
    {{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
    {8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}
    };
    {{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
    {8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}
    };
    {{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
    {8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}
    }。
  14. 如权利要求11-13任一所述的方法,其特征在于,所述基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,包括:
    对于所述多组子带中的任一组子带,如果所述任一组子带的量化等级衡量因子为第二值,则采用所述多个编码码本中的多个第二编码码本分别对所述任一组子带内的频谱值进行编码,以得到与所述多个第二编码码本一一对应的多个第二候选频谱比特流;
    将所述多个第二候选频谱比特流中比特总数最少的第二候选频谱比特流,确定为所述任一组子带内的频谱值的比特流,以及将所述比特总数最少的第二候选频谱比特流所对应的第二编码码本确定为所述任一组子带对应的目标编码码本。
  15. 如权利要求14所述的方法,其特征在于,所述第二值为2;
    所述采用所述多个编码码本中的多个第二编码码本分别对所述任一组子带内的频谱值进行编码,包括:
    将所述任一组子带内的每两个频谱值组合成一个二进制数;
    采用所述多个第二编码码本分别对所述二进制数所表示的十进制数进行编码;
    其中,所述多个第二编码码本如下:
    {{0,1,3},{1,13,4},{2,14,4},{3,8,4},{4,12,4},{5,15,4},{6,10,4},{7,4,4},{8,9,4},{9,5,4},{10,6,4},{11,0,5},{12,11,4},{13,7,4},{14,1,4},{15,1,5}
    };
    {{0,1,3},{1,7,3},{2,0,4},{3,7,4},{4,6,3},{5,1,4},{6,10,4},{7,10,5},
    {8,8,4},{9,9,4},{10,8,5},{11,22,5},{12,6,4},{13,9,5},{14,11,5},{15,23,5}
    };
    {{0,1,2},{1,1,4},{2,2,4},{3,11,4},{4,0,4},{5,3,4},{6,14,4},{7,18,5},
    {8,12,4},{9,13,4},{10,16,5},{11,30,5},{12,10,4},{13,17,5},{14,19,5},{15,31,5}
    };
    {{0,1,1},{1,5,4},{2,6,4},{3,3,5},{4,4,4},{5,7,4},{6,6,5},{7,2,6},
    {8,4,5},{9,5,5},{10,0,6},{11,14,6},{12,2,5},{13,1,6},{14,3,6},{15,15,6}
    }。
  16. 如权利要求11-15任一所述的方法,其特征在于,所述基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,包括:
    对于所述多组子带中的任一组子带,如果所述任一组子带的量化等级衡量因子为第三值,则采用所述多个编码码本中的多个第三编码码本分别对所述任一组子带内的频谱值进行编码,以得到与所述多个第三编码码本一一对应的多个第三候选频谱比特流;
    将所述多个第三候选频谱比特流中比特总数最少的第三候选频谱比特流,确定为所述任一组子带内的频谱值的比特流,以及将所述比特总数最少的第三候选频谱比特流所对应的第三编码码本确定为所述任一组子带对应的目标编码码本。
  17. 如权利要求16所述的方法,其特征在于,所述基于每组子带的量化等级衡量因子,从多个编码码本中确定每组子带对应的目标编码码本,以及确定每组子带内的频谱值的比特流,包括:
    对于所述多组子带中的任一组子带,如果所述任一组子带的量化等级衡量因子为第四值,则采用所述多个第三编码码本分别对所述任一组子带内的各个频谱值中的第一部分比特位进行编码,以得到与所述多个第三编码码本一一对应的多个第一部分候选比特流;
    将所述多个第一部分候选比特流中比特总数最少的第一部分候选比特流,确定为所述第一部分比特位的比特流,以及将所述比特总数最少的第一部分候选比特流所对应的第三编码码本确定为所述任一组子带对应的目标编码码本;
    对所述任一组子带内的各个频谱值中的除第一部分比特位之外的第二部分比特位进行均匀量化编码,以得到所述第二部分比特位的比特流。
  18. 如权利要求17所述的方法,其特征在于,所述第一部分比特位是指所述频谱值中高位的N个比特位,所述第二部分比特位是指所述频谱值中低位的M个比特,所述M等于所述任一组子带的量化等级衡量因子减去所述第三值。
  19. 如权利要求16-18任一所述的方法,其特征在于,所述第三值为3;
    所述采用所述多个编码码本中的多个第三编码码本分别对所述任一组子带内的频谱值进行编码,包括:
    采用所述多个第三编码码本分别对所述任一组子带内的各个频谱值进行编码;
    其中,所述多个第三编码码本如下:
    {{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,3,5},{6,4,6},{7,5,6}
    };
    {{0,3,2},{1,2,2},{2,0,3},{3,1,3},{4,3,3},{5,4,4},{6,10,5},{7,11,5}
    };
    {{0,0,1},{1,5,3},{2,6,3},{3,7,3},{4,9,4},{5,16,5},{6,34,6},{7,35,6}
    };
    {{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
    }。
  20. 一种音频解码装置,其特征在于,所述装置包括:
    获取模块,用于如果待解码的音频信号满足第一条件,则获取声道解码模式,所述第一条件包括所述音频信号为双声道信号、所述音频信号的编码码率不小于码率阈值,且所述音频信号的采样率不小于采样率阈值;
    解码模块,用于如果所述声道解码模式为左声道解码模式,则解码码流中的左声道比特流,以得到所述音频信号的左声道数据,将所述左声道数据复制到右声道;
    解码模块,还用于如果所述声道解码模式为右声道解码模式,则解码所述码流中的右声道比特流,以得到所述音频信号的右声道数据,将所述右声道数据复制到左声道。
  21. 如权利要求20所述的装置,其特征在于,所述解码模块,还用于:
    如果所述声道解码模式不是所述左声道解码模式,且不是所述右声道解码模式,则解码所述左声道比特流和所述右声道比特流,以得到所述左声道数据和所述右声道数据。
  22. 如权利要求20或21所述的装置,其特征在于,所述解码模块,还用于:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为双声道信号的情况下,解码所述码流中的双声道交织比特流,以得到所述左声道数据和所述右声道数据。
  23. 如权利要求20-22任一所述的装置,其特征在于,所述解码模块,还用于:
    如果所述音频信号不满足所述第一条件,则在所述音频信号为单声道信号的情况下,解码所述码流中的单声道比特流,以得到所述音频信号的单声道数据。
  24. 如权利要求20-23任一所述的装置,其特征在于,所述装置还包括:
    第二获取模块,用于获取所述码流的总数据量;
    解码模块,还用于解码所述码流的包头,以得到所述音频信号的声道数、采样率和帧长;
    第一确定模块,用于基于所述总数据量、所述声道数、所述采样率和所述帧长,确定所述音频信号是否满足所述第一条件。
  25. 如权利要求20-24任一所述的装置,其特征在于,
    解码模块,还用于解码所述码流中边信息的比特流,以得到所述边信息,所述边信息包括编码码本标识;
    所述装置还包括:第二确定模块,用于基于所述编码码本标识,从多个解码码本中确定解码所需的目标解码码本。
  26. 如权利要求25所述的装置,其特征在于,所述多个解码码本如下:
    {{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
    {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
    };
    {{11,0,5},{15,1,5},{13,2,4},{13,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},
    {9,8,4},{9,9,4},{5,10,4},{5,11,4},{14,12,4},{14,13,4},{7,14,4},{7,15,4},{6,16,4},{6,17,4},{3,18,4},{3,19,4},{10,20,4},{10,21,4},{12,22,4},{12,23,4},{2,24,4},{2,25,4},{1,26,4},{1,27,4},{8,28,4},{8,29,4},{4,30,4},{4,31,4}
    };
    {{2,0,4},{2,1,4},{2,2,4},{2,3,4},{1,4,4},{1,5,4},{1,6,4},{1,7,4},
    {8,8,4},{8,9,4},{8,10,4},{8,11,4},{4,12,4},{4,13,4},{4,14,4},{4,15,4},{0,16,2},{0,17,2},{0,18,2},{0,19,2},{0,20,2},{0,21,2},{0,22,2},{0,23,2},
    {0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2},{11,32,6},{15,33,6},{14,34,5},{14,35,5},{7,36,5},{7,37,5},{13,38,5},
    {13,39,5},{12,40,4},{12,41,4},{12,42,4},{12,43,4},{6,44,4},{6,45,4},
    {6,46,4},{6,47,4},{3,48,4},{3,49,4},{3,50,4},{3,51,4},{5,52,4},{5,53,4},{5,54,4},{5,55,4},{10,56,4},{10,57,4},{10,58,4},{10,59,4},{9,60,4},{9,61,4},{9,62,4},{9,63,4}
    };
    {{11,0,7},{15,1,7},{14,2,6},{14,3,6},{7,4,6},{7,5,6},{13,6,6},{13,7,6},{12,8,5},{12,9,5},{12,10,5},{12,11,5},{6,12,5},{6,13,5},{6,14,5},{6,15,5},{3,16,5},{3,17,5},{3,18,5},{3,19,5},{5,20,5},{5,21,5},{5,22,5},{5,23,5},{10,24,5},{10,25,5},{10,26,5},{10,27,5},{9,28,5},{9,29,5},{9,30,5},{9, 31,5},{2,32,4},{2,33,4},{2,34,4},{2,35,4},{2,36,4},{2,37,4},{2,38,4},{2,39,4},{1,40,4},{1,41,4},{1,42,4},{1,43,4},{1,44,4},{1,45,4},{1,46,4},{1,47,4},{8,48,4},{8,49,4},{8,50,4},{8,51,4},{8,52,4},{8,53,4},{8,54,4},{8,55,4},{4,56,4},{4,57,4},{4,58,4},{4,59,4},{4,60,4},{4,61,4},{4,62,4},{4,63,4},{0,64,1},{0,65,1},{0,66,1},{0,67,1},{0,68,1},{0,69,1},{0,70,1},{0,71,1},{0,72,1},{0,73,1},{0,74,1},{0,75,1},{0,76,1},{0,77,1},{0,78,1},{0,79,1},{0,80,1},{0,81,1},{0,82,1},{0,83,1},{0,84,1},{0,85,1},{0,86,1},{0,87,1},{0,88,1},{0,89,1},{0,90,1},{0,91,1},{0,92,1},{0,93,1},{0,94,1},{0,95,1},{0,96,1},{0,97,1},{0,98,1},{0,99,1},{0,100,1},{0,101,1},{0,102,1},{0,103,1},{0,104,1},{0,105,1},{0,106,1},{0,107,1},{0,108,1},{0,109,1},{0,110,1},{0,111,1},{0,112,1},{0,113,1},{0,114,1},{0,115,1},{0,116,1},{0,117,1},{0,118,1},{0,119,1},{0,120,1},{0,121,1},{0,122,1},{0,123,1},{0,124,1},
    {0,125,1},{0,126,1},{0,127,1}
    };
    {{11,0,5},{15,1,5},{14,2,4},{14,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{7,8,4},{7,9,4},{9,10,4},{9,11,4},{10,12,4},{10,13,4},{13,14,4},{13,15,4},{3,16,4},{3,17,4},{8,18,4},{8,19,4},{6,20,4},{6,21,4},{12,22,4},{12,23,4},{4,24,4},{4,25,4},{1,26,4},{1,27,4},{2,28,4},{2,29,4},{5,30,4},{5,31,4}
    };
    {{2,0,4},{2,1,4},{5,2,4},{5,3,4},{0,4,3},{0,5,3},{0,6,3},{0,7,3},{10,8,5},{13,9,5},{7,10,5},{14,11,5},{12,12,4},{12,13,4},{3,14,4},{3,15,4},{8,16,4},{8,17,4},{9,18,4},{9,19,4},{6,20,4},{6,21,4},{11,22,5},{15,23,5},{4,24,3},{4,25,3},{4,26,3},{4,27,3},{1,28,3},{1,29,3},{1,30,3},{1,31,3}
    };
    {{4,0,4},{4,1,4},{1,2,4},{1,3,4},{2,4,4},{2,5,4},{5,6,4},{5,7,4},{0,8,2},{0,9,2},{0,10,2},{0,11,2},{0,12,2},{0,13,2},{0,14,2},{0,15,2},{10,16,5},{13,17,5},{7,18,5},{14,19,5},{12,20,4},{12,21,4},{3,22,4},{3,23,4},{8,24,4},{8,25,4},{9,26,4},{9,27,4},{6,28,4},{6,29,4},{11,30,5},{15,31,5},
    };
    {{10,0,6},{13,1,6},{7,2,6},{14,3,6},{12,4,5},{12,5,5},{3,6,5},{3,7,5},{8,8,5},{8,9,5},{9,10,5},{9,11,5},{6,12,5},{6,13,5},{11,14,6},{15,15,6},{4,16,4},{4,17,4},{4,18,4},{4,19,4},{1,20,4},{1,21,4},{1,22,4},{1,23,4},{2,24,4},{2,25,4},{2,26,4},{2,27,4},{5,28,4},{5,29,4},{5,30,4},{5,31,4},{0,32,1},{0,33,1},{0,34,1},{0,35,1},{0,36,1},{0,37,1},{0,38,1},{0,39,1},{0,40,1},{0,41,1},{0,42,1},{0,43,1},{0,44,1},{0,45,1},{0,46,1},{0,47,1},{0,48,1},{0,49,1},{0,50,1},{0,51,1},{0,52,1},{0,53,1},{0,54,1},{0,55,1},{0,56,1},{0,57,1},{0,58,1},{0,59,1},{0,60,1},{0,61,1},{0,62,1},{0,63,1}
    };
    {{4,0,4},{4,1,4},{4,2,4},{4,3,4},{6,4,6},{7,5,6},{5,6,5},{5,7,5},{3,8,3},{3,9,3},{3,10,3},{3,11,3},{3,12,3},{3,13,3},{3,14,3},{3,15,3},{2,16,2},{2,17,2},{2,18,2},{2,19,2},{2,20,2},{2,21,2},{2,22,2},{2,23,2},{2,24,2},{2,25,2},{2,26,2},{2,27,2},{2,28,2},{2,29,2},{2,30,2},{2,31,2},{1,32,2},{1,33,2},{1,34,2},{1,35,2},{1,36,2},{1,37,2},{1,38,2},{1,39,2},{1,40,2},{1,41,2},{1,42,2},{1,43,2},{1,44,2},{1,45,2},{1,46,2},{1,47,2},{0,48,2},{0,49,2},{0,50,2},{0,51,2},{0,52,2},{0,53,2},{0,54,2},{0,55,2},{0,56,2},{0,57,2},{0,58,2},{0,59,2},{0,60,2},{0,61,2},{0,62,2},{0,63,2}
    };
    {{2,0,3},{2,1,3},{2,2,3},{2,3,3},{3,4,3},{3,5,3},{3,6,3},{3,7,3},{5,8,4},{5,9,4},{6,10,5},{7,11,5},{4,12,3},{4,13,3},{4,14,3},{4,15,3},{1,16,2},{1,17,2},{1,18,2},{1,19,2},{1,20,2},{1,21,2},{1,22,2},{1,23,2},{0,24,2},{0,25,2},{0,26,2},{0,27,2},{0,28,2},{0,29,2},{0,30,2},{0,31,2}
    };
    {{0,0,1},{0,1,1},{0,2,1},{0,3,1},{0,4,1},{0,5,1},{0,6,1},{0,7,1},{0,8,1},{0,9,1},{0,10,1},{0,11,1},{0,12,1},{0,13,1},{0,14,1},{0,15,1},{0,16,1},{0,17,1},{0,18,1},{0,19,1},{0,20,1},{0,21,1},{0,22,1},{0,23,1},{0,24,1},{0,25,1},{0,26,1},{0,27,1},{0,28,1},{0,29,1},{0,30,1},{0,31,1},{5,32,5},{5,33,5},{6,34,6},{7,35,6},{4,36,4},{4,37,4},{4,38,4},{4,39,4},{1,40,3},{1,41,3},{1,42,3},{1,43,3},{1,44,3},{1,45,3},{1,46,3},{1,47,3},{2,48,3},{2,49,3},{2,50,3},{2,51,3},{2,52,3},{2,53,3},{2,54,3},{2,55,3},{3,56,3},{3,57,3},{3,58,3},{3,59,3},{3,60,3},{3,61,3},{3,62,3},{3,63,3}
    };
    {{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
    }.
  27. An audio encoding apparatus, characterized in that the apparatus comprises:
    an encoding module, configured to: if an audio signal to be encoded satisfies a first condition, encode left-channel data of the audio signal into a bitstream and encode right-channel data of the audio signal into the bitstream, where the first condition comprises: the audio signal is a two-channel signal, an encoding bit rate of the audio signal is not less than a bit rate threshold, and a sampling rate of the audio signal is not less than a sampling rate threshold.
  28. The apparatus according to claim 27, characterized in that the encoding module is further configured to:
    if the audio signal does not satisfy the first condition and the audio signal is a two-channel signal, encode the left-channel data and the right-channel data into the bitstream by means of two-channel interleaved encoding.
  29. The apparatus according to claim 27 or 28, characterized in that the encoding module is further configured to:
    if the audio signal does not satisfy the first condition and the audio signal is a mono signal, encode mono data of the audio signal into the bitstream.
  30. The apparatus according to any one of claims 27-29, characterized in that the encoding module comprises:
    an obtaining submodule, configured to obtain a quantization-level measurement factor of each of a plurality of subbands, where the quantization-level measurement factor represents the average number of bits required to encode each spectrum value in the corresponding subband, and the plurality of subbands are the subbands into which the quantized spectrum data included in the left-channel data is divided;
    a division submodule, configured to divide the plurality of subbands into a plurality of groups of subbands based on the quantization-level measurement factors of the plurality of subbands, where subbands in a same group have the same quantization-level measurement factor;
    a determining submodule, configured to determine, based on the quantization-level measurement factor of each group of subbands, a target encoding codebook corresponding to each group of subbands from a plurality of encoding codebooks, and determine a bitstream of the spectrum values in each group of subbands, where the target encoding codebook is the encoding codebook used to encode the spectrum values in the corresponding group of subbands;
    an encoding submodule, configured to encode an identifier of the target encoding codebook corresponding to each group of subbands into the bitstream as a type of side information of the left-channel data.
  31. The apparatus according to claim 30, characterized in that the determining submodule is configured to:
    for any group of subbands among the plurality of groups of subbands, if the quantization-level measurement factor of the group of subbands is a first value, encode the spectrum values in the group of subbands using a plurality of first encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of first candidate spectrum bitstreams in one-to-one correspondence with the plurality of first encoding codebooks;
    determine the first candidate spectrum bitstream with the smallest total number of bits among the plurality of first candidate spectrum bitstreams as the bitstream of the spectrum values in the group of subbands, and determine the first encoding codebook corresponding to the first candidate spectrum bitstream with the smallest total number of bits as the target encoding codebook corresponding to the group of subbands.
  32. The apparatus according to claim 31, characterized in that the first value is 1;
    the encoding the spectrum values in the group of subbands using a plurality of first encoding codebooks among the plurality of encoding codebooks comprises:
    combining every four spectrum values in the group of subbands into one binary number;
    encoding the decimal number represented by the binary number using each of the plurality of first encoding codebooks;
    where the plurality of first encoding codebooks are as follows:
    {{0,0,4},{1,1,4},{2,2,4},{3,3,4},{4,4,4},{5,5,4},{6,6,4},{7,7,4},
    {8,8,4},{9,9,4},{10,10,4},{11,11,4},{12,12,4},{13,13,4},{14,14,4},{15,15,4}
    };
    {{0,1,3},{1,13,4},{2,12,4},{3,9,4},{4,15,4},{5,5,4},{6,8,4},{7,7,4},
    {8,14,4},{9,4,4},{10,10,4},{11,0,5},{12,11,4},{13,1,4},{14,6,4},{15,1,5}
    };
    {{0,1,2},{1,1,4},{2,0,4},{3,12,4},{4,3,4},{5,13,4},{6,11,4},{7,18,5},
    {8,2,4},{9,15,4},{10,14,4},{11,32,6},{12,10,4},{13,19,5},{14,17,5},{15,33,6}
    };
    {{0,1,1},{1,5,4},{2,4,4},{3,4,5},{4,7,4},{5,5,5},{6,3,5},{7,2,6},
    {8,6,4},{9,7,5},{10,6,5},{11,0,7},{12,2,5},{13,3,6},{14,1,6},{15,1,7}
    }.
  33. The apparatus according to any one of claims 30-32, characterized in that the determining submodule is configured to:
    for any group of subbands among the plurality of groups of subbands, if the quantization-level measurement factor of the group of subbands is a second value, encode the spectrum values in the group of subbands using a plurality of second encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of second candidate spectrum bitstreams in one-to-one correspondence with the plurality of second encoding codebooks;
    determine the second candidate spectrum bitstream with the smallest total number of bits among the plurality of second candidate spectrum bitstreams as the bitstream of the spectrum values in the group of subbands, and determine the second encoding codebook corresponding to the second candidate spectrum bitstream with the smallest total number of bits as the target encoding codebook corresponding to the group of subbands.
  34. The apparatus according to claim 33, characterized in that the second value is 2;
    the encoding the spectrum values in the group of subbands using a plurality of second encoding codebooks among the plurality of encoding codebooks comprises:
    combining every two spectrum values in the group of subbands into one binary number;
    encoding the decimal number represented by the binary number using each of the plurality of second encoding codebooks;
    where the plurality of second encoding codebooks are as follows:
    {{0,1,3},{1,13,4},{2,14,4},{3,8,4},{4,12,4},{5,15,4},{6,10,4},{7,4,4},{8,9,4},{9,5,4},{10,6,4},{11,0,5},{12,11,4},{13,7,4},{14,1,4},{15,1,5}};
    {{0,1,3},{1,7,3},{2,0,4},{3,7,4},{4,6,3},{5,1,4},{6,10,4},{7,10,5},
    {8,8,4},{9,9,4},{10,8,5},{11,22,5},{12,6,4},{13,9,5},{14,11,5},{15,23,5}
    };
    {{0,1,2},{1,1,4},{2,2,4},{3,11,4},{4,0,4},{5,3,4},{6,14,4},{7,18,5},
    {8,12,4},{9,13,4},{10,16,5},{11,30,5},{12,10,4},{13,17,5},{14,19,5},{15,31,5}
    };
    {{0,1,1},{1,5,4},{2,6,4},{3,3,5},{4,4,4},{5,7,4},{6,6,5},{7,2,6},
    {8,4,5},{9,5,5},{10,0,6},{11,14,6},{12,2,5},{13,1,6},{14,3,6},{15,15,6}
    }.
  35. The apparatus according to any one of claims 30-34, characterized in that the determining submodule is configured to:
    for any group of subbands among the plurality of groups of subbands, if the quantization-level measurement factor of the group of subbands is a third value, encode the spectrum values in the group of subbands using a plurality of third encoding codebooks among the plurality of encoding codebooks, to obtain a plurality of third candidate spectrum bitstreams in one-to-one correspondence with the plurality of third encoding codebooks;
    determine the third candidate spectrum bitstream with the smallest total number of bits among the plurality of third candidate spectrum bitstreams as the bitstream of the spectrum values in the group of subbands, and determine the third encoding codebook corresponding to the third candidate spectrum bitstream with the smallest total number of bits as the target encoding codebook corresponding to the group of subbands.
  36. The apparatus according to claim 35, characterized in that the determining submodule is configured to:
    for any group of subbands among the plurality of groups of subbands, if the quantization-level measurement factor of the group of subbands is a fourth value, encode a first part of bits of each spectrum value in the group of subbands using each of the plurality of third encoding codebooks, to obtain a plurality of first-part candidate bitstreams in one-to-one correspondence with the plurality of third encoding codebooks;
    determine the first-part candidate bitstream with the smallest total number of bits among the plurality of first-part candidate bitstreams as the bitstream of the first part of bits, and determine the third encoding codebook corresponding to the first-part candidate bitstream with the smallest total number of bits as the target encoding codebook corresponding to the group of subbands;
    perform uniform quantization encoding on a second part of bits, other than the first part of bits, of each spectrum value in the group of subbands, to obtain a bitstream of the second part of bits.
  37. The apparatus according to claim 36, characterized in that the first part of bits refers to the N high-order bits of the spectrum value, the second part of bits refers to the M low-order bits of the spectrum value, and M is equal to the quantization-level measurement factor of the group of subbands minus the third value.
  38. The apparatus according to any one of claims 35-37, characterized in that the third value is 3;
    the encoding the spectrum values in the group of subbands using a plurality of third encoding codebooks among the plurality of encoding codebooks comprises:
    encoding each spectrum value in the group of subbands using each of the plurality of third encoding codebooks;
    where the plurality of third encoding codebooks are as follows:
    {{0,3,2},{1,2,2},{2,1,2},{3,1,3},{4,0,4},{5,3,5},{6,4,6},{7,5,6}
    };
    {{0,3,2},{1,2,2},{2,0,3},{3,1,3},{4,3,3},{5,4,4},{6,10,5},{7,11,5}
    };
    {{0,0,1},{1,5,3},{2,6,3},{3,7,3},{4,9,4},{5,16,5},{6,34,6},{7,35,6}
    };
    {{0,0,3},{1,1,3},{2,2,3},{3,3,3},{4,4,3},{5,5,3},{6,6,3},{7,7,3}
    }.
  39. An audio decoding device, characterized in that the device comprises a memory and a processor;
    the memory is configured to store a computer program, the computer program comprising program instructions;
    the processor is configured to invoke the computer program to implement the audio decoding method according to any one of claims 1 to 7.
  40. An audio encoding device, characterized in that the device comprises a memory and a processor;
    the memory is configured to store a computer program, the computer program comprising program instructions;
    the processor is configured to invoke the computer program to implement the audio encoding method according to any one of claims 8-19.
  41. A computer-readable storage medium, characterized in that the storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the method according to any one of claims 1-19 are implemented.
  42. A computer program product, characterized in that the computer program product stores computer instructions, and when the computer instructions are executed by a processor, the steps of the method according to any one of claims 1-19 are implemented.
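The codebook-selection scheme of claims 30-32 can be illustrated with a short Python sketch. This is an illustrative reading only, not the claimed implementation: the function names (`pack4`, `encode_group`) are invented here, and the interpretation of each codebook triple as {symbol, code, code length} is an assumption. The two codebooks below are copied from the first two tables listed in claim 32; for a group whose quantization-level measurement factor is 1, every four 1-bit spectrum values are packed into one 4-bit symbol, each candidate codebook encodes the whole group, and the codebook producing the fewest total bits is chosen as the target codebook (its identifier would be signaled as side information).

```python
# Hypothetical sketch of codebook selection per claims 30-32.
# Each codebook maps a 4-bit symbol to a (code, code_length) pair,
# mirroring the {symbol, code, length} triples in claim 32.

FIRST_CODEBOOKS = [
    # codebook 0: fixed 4-bit code (identity mapping, first table in claim 32)
    {s: (s, 4) for s in range(16)},
    # codebook 1: second table in claim 32
    {0: (1, 3), 1: (13, 4), 2: (12, 4), 3: (9, 4), 4: (15, 4), 5: (5, 4),
     6: (8, 4), 7: (7, 4), 8: (14, 4), 9: (4, 4), 10: (10, 4), 11: (0, 5),
     12: (11, 4), 13: (1, 4), 14: (6, 4), 15: (1, 5)},
]

def pack4(bits):
    """Combine every four 1-bit spectrum values into one 4-bit symbol."""
    assert len(bits) % 4 == 0
    return [bits[i] << 3 | bits[i + 1] << 2 | bits[i + 2] << 1 | bits[i + 3]
            for i in range(0, len(bits), 4)]

def encode_group(bits):
    """Encode a group with every candidate codebook; keep the shortest.

    Returns (target codebook id, list of (code, length) pairs, total bits).
    """
    symbols = pack4(bits)
    best = None
    for cb_id, cb in enumerate(FIRST_CODEBOOKS):
        stream = [cb[s] for s in symbols]            # candidate bitstream
        total = sum(length for _, length in stream)  # total number of bits
        if best is None or total < best[2]:
            best = (cb_id, stream, total)
    return best
```

For example, a group of eight zero-valued spectrum values packs into symbols `[0, 0]`; codebook 0 spends 8 bits while codebook 1 spends 6, so codebook 1 would be selected. The same select-the-shortest logic applies to the second and third codebooks of claims 33-38.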
PCT/CN2023/092051 2022-07-27 2023-05-04 Audio encoding/decoding method and apparatus, storage medium, and computer program product WO2024021732A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210892837.6 2022-07-27
CN202210892837 2022-07-27
CN202211139716.0 2022-09-19
CN202211139716.0A CN117476016A (zh) Audio encoding/decoding method and apparatus, storage medium, and computer program product

Publications (1)

Publication Number Publication Date
WO2024021732A1 (zh)

Family

ID=89629888

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/092051 2022-07-27 2023-05-04 Audio encoding/decoding method and apparatus, storage medium, and computer program product WO2024021732A1 (zh)

Country Status (2)

Country Link
CN (1) CN117476016A (zh)
WO (1) WO2024021732A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1218334A (zh) * 1997-11-20 1999-06-02 三星电子株式会社 Scalable stereo audio encoding/decoding method and apparatus
CN102014262A (zh) * 2010-10-27 2011-04-13 杭州海康威视软件有限公司 Hard disk video recorder, and multimedia format conversion system and method
US8483699B1 (en) * 2012-08-29 2013-07-09 Sprint Spectrum L.P. Codec selection for wireless communication
CN112489666A (zh) * 2020-11-26 2021-03-12 北京百瑞互联技术有限公司 Bluetooth LE audio transmission data processing method, apparatus, and storage medium

Also Published As

Publication number Publication date
CN117476016A (zh) 2024-01-30

Similar Documents

Publication Publication Date Title
US7848931B2 (en) Audio encoder
US9799339B2 (en) Stereo audio signal encoder
WO2021208792A1 (zh) Audio signal encoding method, decoding method, encoding device, and decoding device
WO2019105575A1 (en) Determination of spatial audio parameter encoding and associated decoding
CN109215668B (zh) Method and apparatus for encoding an inter-channel phase difference parameter
US11393482B2 (en) Audio encoding and decoding method and related product
WO2024021732A1 (zh) Audio encoding/decoding method and apparatus, storage medium, and computer program product
CN114495951A (zh) Audio encoding/decoding method and apparatus
US20220238123A1 (en) Sound signal receiving and decoding method, sound signal decoding method, sound signal receiving side apparatus, decoding apparatus, program and storage medium
CN101800048A (zh) DRA-encoder-based multichannel digital audio encoding method and encoding system
WO2024021731A1 (zh) Audio encoding/decoding method and apparatus, storage medium, and computer program product
WO2024021733A1 (zh) Audio signal processing method and apparatus, storage medium, and computer program product
WO2024021730A1 (zh) Audio signal processing method and apparatus
US20220115024A1 (en) Apparatus, Methods, and Computer Programs for Encoding Spatial Metadata
WO2024021729A1 (zh) Quantization method, dequantization method, and apparatus therefor
WO2023173941A1 (zh) Multichannel signal encoding/decoding method, encoding/decoding device, and terminal device
TW202411983A (zh) Quantization method, dequantization method, and apparatus therefor
WO2022012677A1 (zh) Audio encoding/decoding method, related apparatus, and computer-readable storage medium
US10586546B2 (en) Inversely enumerated pyramid vector quantizers for efficient rate adaptation in audio coding
WO2021139757A1 (zh) Audio encoding/decoding method and audio encoding/decoding device
CN116798438A (zh) Multichannel signal encoding/decoding method, encoding/decoding device, and terminal device
GB2598773A (en) Quantizing spatial audio parameters
CN117935822A (zh) Audio encoding method, apparatus, medium, device, and program product
EP4162487A1 (en) Spatial audio parameter encoding and associated decoding
WO2020201619A1 (en) Spatial audio representation and associated rendering

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23844947

Country of ref document: EP

Kind code of ref document: A1