WO2022012554A1

WO2022012554A1 - Multi-channel audio signal encoding method and apparatus

Info

Publication number: WO2022012554A1
Application number: PCT/CN2021/106102
Authority: WO
Inventors: 王智; 丁建策; 王宾; 李海婷; 王喆
Original assignee: 华为技术有限公司
Priority date: 2020-07-17
Filing date: 2021-07-13
Publication date: 2022-01-20
Also published as: JP2023533367A; US20230154472A1; CN113948097A; EP4174853A1; BR112023000835A2; EP4174853A4

Abstract

A multi-channel audio signal encoding method and apparatus (700). The method may comprise: obtaining audio signals of P channels of a current frame of a multi-channel audio signal, the audio signals of the P channels comprising audio signals of K channel pairs (steps 101, 201, 501); determining the respective numbers of bits of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits (steps 102, 202); and encoding the audio signals of the P channels according to the respective numbers of bits of the K channel pairs to obtain an encoded code stream (step 103), thereby improving the coding quality.

Description

Multi-channel audio signal encoding method and device

This application claims priority to Chinese patent application No. 202010699775.8, entitled "Multi-channel audio signal coding method and device", filed with the Chinese Patent Office on July 17, 2020, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to audio coding and decoding technologies, and in particular, to a multi-channel audio signal coding method and device.

Background

With the continuous development of multimedia technology, audio has been widely used in multimedia communication, consumer electronics, virtual reality, human-computer interaction and other fields. Audio coding is one of the key technologies of multimedia technology. Audio coding compresses the amount of data by removing redundant information in the original audio signal to facilitate storage or transmission.

Multi-channel audio coding is the coding of audio with more than two channels; common layouts include 5.1, 7.1, 7.1.4 and 22.2 channels. The multiple original audio signals undergo multi-channel signal screening, channel pairing, stereo processing, multi-channel side information generation, quantization, entropy coding and code stream multiplexing to form a serial bit stream (the encoded code stream), which is convenient to transmit over a channel or to store on digital media. Because the energy differences between the channels can be relatively large, energy equalization is performed on the channels before stereo processing to increase the gain of the stereo processing and thereby improve coding efficiency.

For energy equalization, the energy of all channels is usually averaged. This approach affects the quality of the encoded audio signal. For example, when the energy differences between channels are large, this energy equalization method may leave channel frames with large energy/amplitude with too few coded bits, degrading their quality, while channel frames with small energy receive redundant coded bits, wasting resources. At low bit rates, the total number of available bits is tight, so the quality of channel frames with large energy/amplitude degrades significantly.

Summary of the invention

The present application provides a multi-channel audio signal encoding method and device that help improve the quality of the encoded audio signal.

In a first aspect, an embodiment of the present application provides a multi-channel audio signal encoding method. The method may include: acquiring audio signals of P channels of a current frame of a multi-channel audio signal, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer; obtaining the respective energy/amplitude of the audio signals of the P channels; determining the respective numbers of bits of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; and encoding the audio signals of the P channels according to the respective numbers of bits of the K channel pairs to obtain an encoded code stream.

The energy/amplitude of the audio signal of one of the P channels includes at least one of the following: the energy/amplitude of the audio signal of the channel in the time domain, the energy/amplitude of the audio signal of the channel after time-frequency transformation, the energy/amplitude of the audio signal of the channel after time-frequency transformation and whitening, the energy/amplitude of the audio signal of the channel after energy/amplitude equalization, or the energy/amplitude of the audio signal of the channel after stereo processing.

In this implementation, the number of bits of each of the K channel pairs is determined according to at least one of the energy/amplitude of the audio signals of the P channels in the time domain, the energy/amplitude after time-frequency transformation and whitening, the energy/amplitude after energy/amplitude equalization, or the energy/amplitude after stereo processing. The number of bits of each channel pair in multi-channel signal encoding is thereby allocated reasonably, which ensures the quality of the audio signal reconstructed at the decoding end.

In a possible design, the K channel pairs include a current channel pair, and the method may further include: performing energy/amplitude equalization on the audio signals of the two channels of the current channel pair in the K channel pairs, to obtain the energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization.

In this implementation, energy/amplitude equalization is performed on the audio signals of the two channels within a single channel pair, so that channel pairs with a large energy/amplitude difference still maintain a large energy/amplitude difference after energy/amplitude equalization. When bits are then allocated based on the energy/amplitude after energy/amplitude equalization, more bits can be allocated to the channel pairs with larger energy/amplitude, ensuring that the coded bits of those channel pairs meet their encoding requirements and thereby improving the quality of the audio signal reconstructed at the decoding end.
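The effect of equalizing within a pair rather than across all channels can be illustrated with a minimal sketch. The text here does not specify the equalization rule, so the rule below (scaling both channels of a pair to the pair's mean amplitude) and the names `amplitude` and `equalize_pair` are assumptions made only for illustration.

```python
import math

def amplitude(signal):
    """Amplitude of one channel, assumed here to be the root of the summed squares."""
    return math.sqrt(sum(s * s for s in signal))

def equalize_pair(ch1, ch2):
    """Scale both channels of one pair to the pair's mean amplitude (assumed rule)."""
    a1, a2 = amplitude(ch1), amplitude(ch2)
    target = (a1 + a2) / 2.0
    # A silent channel is left unscaled to avoid dividing by zero.
    g1 = target / a1 if a1 > 0 else 1.0
    g2 = target / a2 if a2 > 0 else 1.0
    return [g1 * s for s in ch1], [g2 * s for s in ch2]

# Hypothetical frame: one loud pair and one quiet pair.
loud = equalize_pair([3.0, 4.0], [0.6, 0.8])
quiet = equalize_pair([0.3, 0.4], [0.06, 0.08])
print([amplitude(c) for c in loud], [amplitude(c) for c in quiet])
```

Under this assumed rule the two channels inside each pair end up with equal amplitude, while the loud pair stays louder than the quiet pair, so an allocation based on the equalized values can still favor the loud pair.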

In a possible design, the K channel pairs include a current channel pair, and encoding the audio signals of the P channels according to the respective numbers of bits of the K channel pairs may include: determining the respective numbers of bits of the two channels in the current channel pair according to the number of bits of the current channel pair and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair; and encoding the audio signals of the two channels according to the respective numbers of bits of the two channels in the current channel pair.

In this implementation, after the respective numbers of bits of the K channel pairs are obtained, bits are allocated within each channel pair based on the number of bits of that channel pair, so that the number of bits of each channel in multi-channel signal encoding is allocated reasonably, ensuring the quality of the audio signal reconstructed at the decoding end.
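A minimal sketch of the allocation inside one channel pair follows. The text only states that the split depends on the pair's number of bits and on the stereo-processed energy/amplitude of the two channels; the proportional rule and the name `split_pair_bits` are assumptions.

```python
def split_pair_bits(pair_bits, e_post_ch1, e_post_ch2):
    """Split a pair's bits between its two channels.

    Assumed rule: each channel's share is proportional to its
    stereo-processed energy/amplitude.
    """
    total = e_post_ch1 + e_post_ch2
    if total <= 0:
        # Assumption: split evenly when both channels are silent.
        bits_ch1 = pair_bits // 2
    else:
        bits_ch1 = int(pair_bits * e_post_ch1 / total)
    # The second channel receives the remainder, so no bits are lost to rounding.
    return bits_ch1, pair_bits - bits_ch1

# Hypothetical pair with 750 bits and stereo-processed amplitudes 9.0 and 3.0.
print(split_pair_bits(750, 9.0, 3.0))
```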

In a possible design, determining the respective numbers of bits of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits may include: determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels; determining the respective bit coefficients of the K channel pairs according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; and determining the respective numbers of bits of the K channel pairs according to the respective bit coefficients of the K channel pairs and the number of available bits.
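The three steps of this design can be sketched as follows. The exact form of the bit coefficient is not given in the text here; the sketch assumes it is the pair's energy/amplitude divided by the frame sum, and the name `allocate_pair_bits` is hypothetical.

```python
def allocate_pair_bits(energies, pairs, available_bits):
    """Allocate available_bits to channel pairs.

    energies: per-channel energy/amplitude values (length P).
    pairs: list of (ch1, ch2) index tuples, one per channel pair (length K).
    """
    # Step 1: energy/amplitude sum of the current frame.
    frame_sum = sum(energies)
    if frame_sum <= 0:
        # Assumption: fall back to an even split for a silent frame.
        return [available_bits // len(pairs)] * len(pairs)
    pair_bits = []
    for ch1, ch2 in pairs:
        # Step 2: bit coefficient of the pair (assumed to be its share of the sum).
        coef = (energies[ch1] + energies[ch2]) / frame_sum
        # Step 3: number of bits of the pair, rounded down.
        pair_bits.append(int(coef * available_bits))
    return pair_bits

# Hypothetical frame with P = 4 channels grouped into K = 2 pairs.
print(allocate_pair_bits([8.0, 4.0, 3.0, 1.0], [(0, 1), (2, 3)], 1000))
```

Rounding down can leave a few bits unassigned; how such leftover bits are handled is outside this sketch.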

In a possible design, determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels may include: determining the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels.

In this implementation, energy/amplitude equalization can be performed on the two channels within a single channel pair, so that channel pairs with a large energy/amplitude difference still maintain a large energy/amplitude difference after energy/amplitude equalization. When bits are then allocated based on the energy/amplitude after energy/amplitude equalization, more bits can be allocated to the channel pairs with larger energy/amplitude, ensuring that the coded bits of those channel pairs meet their encoding requirements and thereby improving the quality of the audio signal reconstructed at the decoding end.

In a possible design, determining the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels may include: calculating the energy/amplitude sum sum_E_post of the current frame according to the formula

$$\mathrm{sum\_E}_{post} = \sum_{ch} E_{post}(ch),$$

where the sum runs over the P channels, and E_post(ch) is calculated from the N coefficients sampleCoef_post(ch, i) of the channel (the formula for E_post(ch) is missing from this text).

Here, ch represents the channel index, E_post(ch) represents the stereo-processed energy/amplitude of the audio signal of the channel whose channel index is ch, sampleCoef_post(ch, i) represents the i-th coefficient of the current frame of the ch-th channel after stereo processing, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1.
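A minimal sketch of this calculation is given below. The expression for E_post(ch) is not reproduced in the text here, so the sketch assumes an amplitude defined as the square root of the sum of the squared coefficients; that choice and the function names are assumptions, not the formula of the application.

```python
import math

def channel_amplitude(coefs):
    """E_post(ch) under the assumed form: sqrt of the sum of squared coefficients."""
    return math.sqrt(sum(c * c for c in coefs))

def frame_energy_sum(sample_coef_post):
    """sum_E_post: the per-channel values summed over all P channels.

    sample_coef_post[ch][i] is the i-th stereo-processed coefficient of channel ch.
    """
    return sum(channel_amplitude(coefs) for coefs in sample_coef_post)

# Hypothetical frame with P = 3 channels and N = 2 coefficients each.
print(frame_energy_sum([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]))
```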

In a possible design, determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels may include: determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization. The energy/amplitude of the audio signal of one of the P channels before energy/amplitude equalization includes the energy/amplitude of the audio signal of the channel in the time domain, the energy/amplitude of the audio signal of the channel after time-frequency transformation, or the energy/amplitude of the audio signal of the channel after time-frequency transformation and whitening.

In this implementation, the energy/amplitude sum of the current frame is determined from the energy/amplitude of the audio signals of the P channels of the current frame before energy/amplitude equalization, and bits are allocated based on that sum. Allocating bits using the energy/amplitude before energy/amplitude equalization makes it possible to allocate the number of bits of each channel in multi-channel signal encoding reasonably. This addresses the problem of insufficient coding bits for channel signals with large energy/amplitude and ensures the quality of the audio signal reconstructed at the decoding end.

Compared with allocating bits using the energy/amplitude after energy/amplitude equalization, allocating bits using the energy/amplitude before energy/amplitude equalization decouples bit allocation from energy/amplitude equalization; that is, bit allocation is not affected by the energy/amplitude equalization processing. For example, even if the energy/amplitude of all channels is averaged during energy/amplitude equalization, this implementation still allocates bits using the energy/amplitude before equalization, so that more coding bits are allocated to channel signals with large energy/amplitude, ensuring the quality of the audio signal reconstructed at the decoding end.

In a possible design, determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization may include: calculating the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E}_{pre} = \sum_{ch} E_{pre}(ch),$$

where the sum runs over the P channels, ch represents the channel index, and E_pre(ch) represents the energy/amplitude of the audio signal of the channel whose channel index is ch before energy/amplitude equalization.

In a possible design, determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels may include: determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization and the respective weighting coefficients of the P channels, where each weighting coefficient is less than or equal to 1.

In this implementation manner, the number of bits of each channel in the multi-channel signal encoding can be adjusted through the weighting coefficient, so as to achieve reasonable allocation of the number of bits of each channel in the multi-channel signal encoding.

In a possible design, determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization and the respective weighting coefficients of the P channels may include: calculating the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E}_{pre} = \sum_{ch} \alpha(ch)\, E_{pre}(ch),$$

where the sum runs over the P channels, ch represents the channel index, E_pre(ch) is the energy/amplitude of the audio signal of the ch-th channel before energy/amplitude equalization, and α(ch) is the weighting coefficient of the ch-th channel. The weighting coefficients of the two channels of one channel pair are the same, and the magnitude of the weighting coefficients of the two channels of a channel pair is inversely proportional to the normalized correlation value between the two channels of that channel pair.

In this implementation, the number of bits of each channel in multi-channel signal encoding is adjusted through the weighting coefficients. Because the magnitude of the weighting coefficients of the two channels of a channel pair is inversely proportional to the normalized correlation value between those two channels, the weighting coefficients can increase the number of bits of channel pairs with low correlation, thereby improving the encoding effect and ensuring the quality of the audio signal reconstructed at the decoding end.
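How such weights shift bits toward weakly correlated pairs can be illustrated with a minimal sketch. The mapping from correlation to weight in `pair_weight` (inversely proportional above a reference value `ref`, capped at 1) is hypothetical, as is the assumption that a pair's share is computed from the same weighted energies; the text only states that the weight is at most 1, shared by both channels of a pair, and inversely proportional to the normalized correlation.

```python
def pair_weight(norm_corr, ref=0.5):
    """Hypothetical weight: inversely proportional to the correlation, capped at 1."""
    if norm_corr <= ref:
        return 1.0
    return ref / norm_corr

def weighted_energy_sum(e_pre, alpha):
    """sum_E_pre as the sum of alpha(ch) * E_pre(ch) over all channels."""
    return sum(a * e for a, e in zip(alpha, e_pre))

# Hypothetical frame: pair (0, 1) is highly correlated, pair (2, 3) is not.
e_pre = [4.0, 4.0, 2.0, 2.0]
w_a, w_b = pair_weight(1.0), pair_weight(0.25)
alpha = [w_a, w_a, w_b, w_b]  # both channels of a pair share one weight
total = weighted_energy_sum(e_pre, alpha)
# Assumption: each pair's share uses the same weighted energies.
share_a = (alpha[0] * e_pre[0] + alpha[1] * e_pre[1]) / total
share_b = (alpha[2] * e_pre[2] + alpha[3] * e_pre[3]) / total
print(total, share_a, share_b)
```

Without weights the first pair would hold 8/12 of the frame sum; with these hypothetical weights both pairs hold one half, so the weakly correlated pair gains bits.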

In a possible design, the audio signals of the P channels further include audio signals of Q unpaired channels, where P=2×K+Q and Q is a positive integer. Determining the respective numbers of bits of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits may include: determining the respective numbers of bits of the K channel pairs and the respective numbers of bits of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits. Encoding the audio signals of the P channels according to the respective numbers of bits of the K channel pairs may include: encoding the audio signals of the K channel pairs according to the respective numbers of bits of the K channel pairs, and encoding the audio signals of the Q channels according to the respective numbers of bits of the Q channels. One of the Q channels may be a monophonic channel, or may be a channel obtained by downmixing.

In a possible design, determining the respective numbers of bits of the K channel pairs and the respective numbers of bits of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits may include: determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels; determining the respective bit coefficients of the K channel pairs according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; determining the respective bit coefficients of the Q channels according to the respective energy/amplitude of the audio signals of the Q channels and the energy/amplitude sum of the current frame; determining the respective numbers of bits of the K channel pairs according to the respective bit coefficients of the K channel pairs and the number of available bits; and determining the respective numbers of bits of the Q channels according to the respective bit coefficients of the Q channels and the number of available bits.
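For a layout with both paired and unpaired channels, the same procedure can be sketched as below. As before, the bit coefficient is assumed to be the share of the frame's energy/amplitude sum, and the name `allocate_bits` is hypothetical.

```python
def allocate_bits(energies, pairs, unpaired, available_bits):
    """Allocate bits to K channel pairs and Q unpaired channels (P = 2*K + Q).

    energies: per-channel energy/amplitude (length P).
    pairs: K tuples of channel indices; unpaired: Q channel indices.
    """
    frame_sum = sum(energies)
    if frame_sum <= 0:
        raise ValueError("frame has no energy; no proportional split is defined")
    # Bit coefficient (assumed: energy share) times available bits, rounded down.
    pair_bits = [
        int((energies[a] + energies[b]) / frame_sum * available_bits)
        for a, b in pairs
    ]
    mono_bits = [
        int(energies[c] / frame_sum * available_bits) for c in unpaired
    ]
    return pair_bits, mono_bits

# Hypothetical frame: P = 6 channels, K = 2 pairs, Q = 2 unpaired channels.
energies = [6.0, 2.0, 3.0, 1.0, 3.0, 1.0]
print(allocate_bits(energies, [(0, 1), (2, 3)], [4, 5], 1000))
```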

In a possible design, encoding the audio signals of the P channels according to the respective numbers of bits of the K channel pairs may include: encoding the energy/amplitude-equalized audio signals of the P channels according to the respective numbers of bits of the K channel pairs.

In this implementation, the energy/amplitude-equalized audio signals of the P channels are encoded, where the energy/amplitude-equalized audio signals of the P channels are obtained by performing energy/amplitude equalization on the audio signals of the P channels. The encoding may include stereo processing, entropy encoding and the like, which can improve encoding efficiency and the encoding effect.

In a second aspect, an embodiment of the present application provides a multi-channel audio signal encoding device. The multi-channel audio signal encoding device may be an audio encoder, a chip or a system-on-a-chip of an audio encoding device, or a functional module in an audio encoder for implementing the method of the first aspect or any possible design of the first aspect. The multi-channel audio signal encoding device can implement the functions performed in the first aspect or each possible design of the first aspect, and the functions can be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. For example, in a possible design, the multi-channel audio signal encoding device may include: an acquisition module configured to acquire the audio signals of P channels of a current frame of a multi-channel audio signal and the respective energy/amplitude of the audio signals of the P channels, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer; a bit allocation module configured to determine the respective numbers of bits of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; and an encoding module configured to encode the audio signals of the P channels according to the respective numbers of bits of the K channel pairs to obtain an encoded code stream.

In a possible design, the K channel pairs include a current channel pair, and the encoding module is configured to: determine the respective numbers of bits of the two channels in the current channel pair according to the number of bits of the current channel pair and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair; and encode the audio signals of the two channels according to the respective numbers of bits of the two channels in the current channel pair.

In a possible design, the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels; determine the respective bit coefficients of the K channel pairs according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; and determine the respective numbers of bits of the K channel pairs according to the respective bit coefficients of the K channel pairs and the number of available bits.

In a possible design, the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels.

In a possible design, the bit allocation module is configured to calculate the energy/amplitude sum sum_E_post of the current frame according to the formula

$$\mathrm{sum\_E}_{post} = \sum_{ch} E_{post}(ch),$$

where the sum runs over the P channels, and E_post(ch) is calculated from the N coefficients sampleCoef_post(ch, i) of the channel (the formula for E_post(ch) is missing from this text).

Here, ch represents the channel index, E_post(ch) represents the stereo-processed energy/amplitude of the audio signal of the channel whose channel index is ch, sampleCoef_post(ch, i) represents the i-th coefficient of the current frame of the ch-th channel after stereo processing, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1.

In a possible design, the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization, where the energy/amplitude of the audio signal of one of the P channels before energy/amplitude equalization includes the energy/amplitude of the audio signal of the channel in the time domain, the energy/amplitude of the audio signal of the channel after time-frequency transformation, or the energy/amplitude of the audio signal of the channel after time-frequency transformation and whitening.

In a possible design, the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization and the respective weighting coefficients of the P channels, where each weighting coefficient is less than or equal to 1.

In a possible design, the bit allocation module is configured to calculate the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E}_{pre} = \sum_{ch} \alpha(ch)\, E_{pre}(ch),$$

where the sum runs over the P channels, ch represents the channel index, E_pre(ch) is the energy/amplitude of the audio signal of the ch-th channel before energy/amplitude equalization, and α(ch) is the weighting coefficient of the ch-th channel. The weighting coefficients of the two channels of one channel pair are the same, and the magnitude of the weighting coefficients of the two channels of a channel pair is inversely proportional to the normalized correlation value between the two channels of that channel pair.

In a possible design, the audio signals of the P channels further include audio signals of Q unpaired channels, where P=2×K+Q, K is a positive integer, and Q is a positive integer. The bit allocation module is configured to: determine the respective numbers of bits of the K channel pairs and the respective numbers of bits of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits. The encoding module is configured to encode the audio signals of the K channel pairs according to the respective numbers of bits of the K channel pairs, and encode the audio signals of the Q channels according to the respective numbers of bits of the Q channels.

In a possible design, the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels; determine the respective bit coefficients of the K channel pairs according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; determine the respective bit coefficients of the Q channels according to the respective energy/amplitude of the audio signals of the Q channels and the energy/amplitude sum of the current frame; determine the respective numbers of bits of the K channel pairs according to the respective bit coefficients of the K channel pairs and the number of available bits; and determine the respective numbers of bits of the Q channels according to the respective bit coefficients of the Q channels and the number of available bits.

In a possible design, the encoding module is configured to encode the energy/amplitude-equalized audio signals of the P channels according to the respective numbers of bits of the K channel pairs.

In one embodiment, the apparatus may further include: an energy/amplitude equalization module. The energy/amplitude equalization module is configured to obtain the energy/amplitude equalized audio signals of the P channels according to the audio signals of the P channels.

In a third aspect, an embodiment of the present application provides a multi-channel audio signal encoding method. The method may include: acquiring audio signals of P channels of a current frame of a multi-channel audio signal, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer; performing energy/amplitude equalization on the audio signals of the two channels of a current channel pair in the K channel pairs according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair, to obtain the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization; determining the respective numbers of bits of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization and the number of available bits; and encoding the audio signals of the two channels according to the respective numbers of bits of the two channels of the current channel pair, to obtain an encoded code stream.

In a possible design, P=2×K, where K is a positive integer. Determining the respective numbers of bits of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization and the number of available bits may include: determining the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels after energy/amplitude equalization; and determining the respective numbers of bits of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization, and the number of available bits.

In a possible design, the audio signals of the P channels further include audio signals of Q unpaired channels, where P=2×K+Q, K is a positive integer, and Q is a positive integer. Determining the respective numbers of bits of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization and the number of available bits may include: determining the energy/amplitude sum of the current frame according to the energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs and the energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels; determining the respective numbers of bits of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude of the audio signals of the two channels of the current channel pair, and the number of available bits; and determining the respective numbers of bits of the Q channels according to the energy/amplitude sum of the current frame, the respective energy/amplitude of the audio signals of the Q channels after energy/amplitude equalization, and the number of available bits. Encoding the audio signals of the two channels according to the respective numbers of bits of the two channels of the current channel pair to obtain the encoded code stream may include: encoding the audio signals of the K channel pairs according to the respective numbers of bits of the K channel pairs, and encoding the audio signals of the Q channels according to the respective numbers of bits of the Q channels, to obtain the encoded code stream.

In a fourth aspect, an embodiment of the present application provides a multi-channel audio signal encoding device, and the multi-channel audio signal encoding device may be an audio encoder, or a chip or a system-on-chip of an audio encoding device, or an audio encoder. A functional module of a method for implementing the above third aspect or any possible design of the above third aspect. The multi-channel audio signal encoding apparatus can implement the functions executed in the above third aspect or each possible design of the above third aspect, and the functions can be implemented by executing corresponding software in hardware. The hardware or software includes one or more modules corresponding to the above functions. For example, in a possible design, the multi-channel audio signal encoding apparatus may include: an acquisition module configured to acquire the audio signals of P channels of the current frame of the multi-channel audio signal, where P is greater than 1 A positive integer of , the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer. The energy/amplitude equalization module is used for performing energy analysis on the audio signals of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair in the K channel pairs. /amplitude equalization, to obtain the respective energy/amplitude equalized energy/amplitude of the audio signals of the two channels of the current channel pair. A bit allocation module, configured to determine the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization, and the number of available bits, to determine the respective two channels of the current channel pair. number of bits. 
The encoding module is configured to encode the audio signals of the two channels according to the respective bit numbers of the two channels of the current channel pair, so as to obtain an encoded code stream.

In a possible design, P=2×K, K is a positive integer, and the bit allocation module is used to: determine the current energy/amplitude according to the respective energy/amplitude equalized energy/amplitude of the audio signals of the P channels. The energy/amplitude sum of the frame; according to the energy/amplitude sum of the current frame, the respective energy/amplitude equalized energy/amplitude of the audio signals of the two channels of the current channel pair, and the available number of bits, determine the The number of bits for each of the two channels of the current channel pair.

In a possible design, the audio signals of the P channels further include audio signals of Q unpaired channels, where P=2×K+Q, K is a positive integer, and Q is a positive integer. The bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs and the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels; determine the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair, and the number of available bits; and determine the respective bit numbers of the Q channels according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels, and the number of available bits. The encoding module is configured to: encode the audio signals of the K channel pairs according to the respective bit numbers of the K channel pairs, and encode the audio signals of the Q channels according to the respective bit numbers of the Q channels, to obtain the encoded code stream.

In a fifth aspect, an embodiment of the present application provides an audio signal encoding apparatus, comprising a non-volatile memory and a processor coupled to each other, where the processor calls program code stored in the memory to perform the method of the first aspect or any possible design thereof, or the method of the third aspect or any possible design thereof.

In a sixth aspect, an embodiment of the present application provides an audio signal encoding device, including an encoder, where the encoder is configured to perform the method of the first aspect or any possible design thereof, or the method of the third aspect or any possible design thereof.

In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, including a computer program that, when executed on a computer, causes the computer to perform the method of the first aspect or any possible design thereof, or the method of the third aspect or any possible design thereof.

In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium, including an encoded code stream obtained by the method of the first aspect or any possible design thereof, or by the method of the third aspect or any possible design thereof.

In a ninth aspect, the present application provides a computer program product, including a computer program that, when executed by a computer, is used to perform the method of the first aspect or any possible design thereof, or the method of the third aspect or any possible design thereof.

In a tenth aspect, the present application provides a chip, including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to perform the method of the first aspect or any possible design thereof, or the method of the third aspect or any possible design thereof.

In the multi-channel audio signal encoding method and apparatus of the embodiments of the present application, the audio signals of P channels of the current frame of a multi-channel audio signal are acquired, where the audio signals of the P channels include audio signals of K channel pairs; the respective bit numbers of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; and the audio signals of the P channels are encoded according to the respective bit numbers of the K channel pairs to obtain an encoded code stream. The energy/amplitude of the audio signal of one of the P channels includes at least one of the following: the energy/amplitude of the audio signal of the channel in the time domain, the energy/amplitude of the audio signal of the channel after time-frequency transformation, the energy/amplitude of the audio signal of the channel after time-frequency transformation and whitening, the energy/amplitude of the audio signal of the channel after energy/amplitude equalization, or the energy/amplitude of the audio signal of the channel after stereo processing. Performing bit allocation for the channel pairs according to at least one of these energy/amplitude values, and thereby determining the respective bit numbers of the K channel pairs, allocates bits reasonably among the channel pairs in multi-channel signal encoding and helps ensure the quality of the audio signal reconstructed at the decoding end. For example, when the energy/amplitude differs greatly between channel pairs, the method of the embodiments of the present application can solve the problem of insufficient coding bits for channel pairs with large energy/amplitude, so as to ensure the quality of the reconstructed audio signal at the decoding end.

Description of drawings

FIG. 1 is a schematic diagram of an example of an audio encoding and decoding system in an embodiment of the application;

FIG. 2 is a flowchart of a method for encoding a multi-channel audio signal according to an embodiment of the present application;

FIG. 3 is a flowchart of a multi-channel audio signal encoding method according to an embodiment of the application;

FIG. 4 is a flowchart of a method for allocating bits of a channel pair according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a processing process of an encoding end according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a processing process of a channel coding unit according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a processing process of a channel coding unit according to an embodiment of the present application;

FIG. 8 is a flowchart of another multi-channel audio signal encoding method according to an embodiment of the application;

FIG. 9 is a schematic structural diagram of an audio signal encoding apparatus according to an embodiment of the application;

FIG. 10 is a schematic structural diagram of an audio signal encoding device according to an embodiment of the present application.

detailed description

The terms "first", "second", etc. involved in the embodiments of the present application are only used for the purpose of distinguishing and describing, and cannot be understood as indicating or implying relative importance, nor can they be understood as indicating or implying order. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units not expressly listed or inherent to the process, method, product, or device.

It should be understood that, in this application, "at least one (item)" refers to one or more, and "a plurality" refers to two or more. "And/or" is used to describe the relationship between related objects, indicating that there can be three kinds of relationships. For example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c can be single or multiple.

The following describes the system architecture to which the embodiments of the present application are applied. Referring to FIG. 1, FIG. 1 is a schematic block diagram of an example audio encoding and decoding system 10 to which the embodiments of the present application are applied. As shown in FIG. 1, audio encoding and decoding system 10 may include source device 12 and destination device 14. Source device 12 produces encoded audio data, and thus may be referred to as an audio encoding device. Destination device 14 may decode the encoded audio data produced by source device 12, and thus may be referred to as an audio decoding device. Various implementations of source device 12, destination device 14, or both may include one or more processors and a memory coupled to the one or more processors. The memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein. Source device 12 and destination device 14 may include a variety of devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, speakers, digital media players, video game consoles, in-vehicle computers, wearable devices, virtual reality (VR) devices, servers providing VR services, augmented reality (AR) devices, servers providing AR services, wireless communication devices, or the like.

Although FIG. 1 depicts source device 12 and destination device 14 as separate devices, device embodiments may also include both source device 12 and destination device 14, or the functionality of both, that is, source device 12 or corresponding functionality and destination device 14 or corresponding functionality. In such embodiments, source device 12 or corresponding functionality and destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, or using separate hardware and/or software, or any combination thereof.

Source device 12 and destination device 14 may be communicatively connected via link 13, through which destination device 14 may receive encoded audio data from source device 12. Link 13 may include one or more media or devices capable of moving encoded audio data from source device 12 to destination device 14. In one example, link 13 may include one or more communication media that enable source device 12 to transmit encoded audio data directly to destination device 14 in real time. In this example, source device 12 may modulate the encoded audio data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated audio data to destination device 14. The one or more communication media may include wireless and/or wired communication media, such as the radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (for example, the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.

Source device 12 includes encoder 20, and optionally, source device 12 may also include audio source 16, preprocessor 18, and communication interface 22. In a specific implementation, the encoder 20, the audio source 16, the preprocessor 18, and the communication interface 22 may be hardware components in the source device 12 or software programs in the source device 12. They are described as follows:

Audio source 16 may include or be any type of sound capture device, for example one for capturing real-world sound, and/or any type of audio generation device. Audio source 16 may be a microphone for capturing sound or a memory for storing audio data; audio source 16 may also include any kind of (internal or external) interface for storing previously captured or generated audio data and/or for acquiring or receiving audio data. When the audio source 16 is a microphone, it may be, for example, a local microphone or a microphone integrated in the source device; when the audio source 16 is a memory, it may be, for example, a local memory or a memory integrated in the source device. When the audio source 16 includes an interface, the interface may be, for example, an external interface that receives audio data from an external audio source, such as an external sound capture device (for example, a microphone), an external memory, or an external audio generation device. The interface may be any kind of interface according to any proprietary or standardized interface protocol, for example a wired or wireless interface or an optical interface.

In this embodiment of the present application, the audio data transmitted from the audio source 16 to the preprocessor 18 may also be referred to as original audio data 17 .

The preprocessor 18 is used for receiving the original audio data 17 and performing preprocessing on the original audio data 17 to obtain the preprocessed audio 19 or the preprocessed audio data 19 . For example, the preprocessing performed by the preprocessor 18 may include filtering, or denoising, or the like.

The encoder 20 (also called the audio encoder 20) is configured to receive the preprocessed audio data 19 and to perform the embodiments described later, so as to implement the application of the audio signal encoding method described in this application.

The communication interface 22 may be used to receive the encoded audio data 21 and to transmit it via link 13 to destination device 14 or any other device (for example, a memory) for storage or direct reconstruction; the other device may be any device used for decoding or storage. The communication interface 22 may, for example, be used to encapsulate the encoded audio data 21 into a suitable format, such as data packets, for transmission over the link 13.

The destination device 14 includes a decoder 30, and optionally, the destination device 14 may also include a communication interface 28, an audio post-processor 32, and a speaker device 34. They are described as follows:

A communication interface 28 may be used to receive encoded audio data 21 from source device 12 or any other source, such as a storage device, such as an encoded audio data storage device. The communication interface 28 may be used to transmit or receive encoded audio data 21 via the link 13 between the source device 12 and the destination device 14, such as a direct wired or wireless connection, or via any kind of network. Classes of networks are, for example, wired or wireless networks or any combination thereof, or any classes of private and public networks, or any combination thereof. The communication interface 28 may, for example, be used to decapsulate data packets transmitted by the communication interface 22 to obtain encoded audio data 21 .

Both the communication interface 28 and the communication interface 22 may be configured as a one-way communication interface or a two-way communication interface, and may be used, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or to data transmission, such as the transmission of encoded audio data.

The decoder 30 (also called the audio decoder 30) is configured to receive the encoded audio data 21 and provide decoded audio data 31 or decoded audio 31.

The audio post-processor 32 is configured to perform post-processing on the decoded audio data 31 (also referred to as reconstructed audio data) to obtain post-processed audio data 33. The post-processing performed by the audio post-processor 32 may include, for example, rendering or any other processing, and the audio post-processor 32 may also be used to transmit the post-processed audio data 33 to the speaker device 34.

The speaker device 34 is configured to receive the post-processed audio data 33 and play the audio to, for example, a user or viewer. The speaker device 34 may be or include any type of speaker for presenting the reconstructed sound.

It will be apparent to those skilled in the art from the description that the existence and (exact) division of the functionality of the different units, or of the functionality of the source device 12 and/or the destination device 14 shown in FIG. 1, may vary depending on the actual device and application. Source device 12 and destination device 14 may include any of a variety of devices, including any kind of handheld or stationary device, such as notebook or laptop computers, mobile phones, smartphones, tablets or tablet computers, cameras, desktop computers, set-top boxes, televisions, in-vehicle equipment, stereos, digital media players, audio game consoles, audio streaming devices (such as content service servers or content distribution servers), broadcast receiver equipment, broadcast transmitter equipment, smart glasses, smart watches, and so on, and may use no operating system or any kind of operating system.

Both encoder 20 and decoder 30 may be implemented as any of a variety of suitable circuits, for example, one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented in part in software, an apparatus may store instructions for the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered one or more processors.

In some cases, the audio encoding and decoding system 10 shown in FIG. 1 is merely an example, and the techniques of this application may be applicable to audio coding settings (for example, audio encoding or audio decoding) that do not necessarily involve any data communication between the encoding device and the decoding device. In other examples, data may be retrieved from local storage, streamed over a network, and the like. An audio encoding device may encode data and store it to memory, and/or an audio decoding device may retrieve data from memory and decode it. In some examples, encoding and decoding are performed by devices that do not communicate with each other but merely encode data to memory and/or retrieve data from memory and decode it.

The above-mentioned encoder may be a multi-channel encoder, for example, a stereo encoder, a 5.1 channel encoder, or a 7.1 channel encoder, or the like.

The above audio data may also be referred to as an audio signal. The audio signal in the embodiments of the present application is the input signal of an audio encoding device, and the audio signal may include multiple frames. For example, the current frame may refer to any one frame of the audio signal. The embodiments of the present application take the encoding and decoding of the audio signal of the current frame as an example; the frame before or after the current frame can be encoded and decoded in the same way as the current frame, so the encoding and decoding of those frames is not described one by one. In addition, the audio signal in the embodiments of the present application may be a multi-channel signal, that is, an audio signal including P channels. The embodiments of the present application are used to implement multi-channel audio signal encoding.

It should be noted that “energy/amplitude” in the embodiments of the present application means energy or amplitude. In the actual processing of a given frame, the choice is kept consistent: if energy is processed at the start, energy is processed in all subsequent steps; if amplitude is processed at the start, amplitude is processed in all subsequent steps.

The above encoder may execute the multi-channel audio signal encoding method of the embodiments of the present application, so as to reasonably allocate the number of bits of each channel in the multi-channel signal encoding, so as to ensure the quality of the reconstructed audio signal at the decoding end and improve the encoding quality. The specific implementation can refer to the specific explanations of the following embodiments.

FIG. 2 is a flowchart of a multi-channel audio signal encoding method according to an embodiment of the present application. The execution body of the embodiment of the present application may be the above encoder. As shown in FIG. 2 , the method in this embodiment may include:

Step 101: Acquire the audio signals of P channels of the current frame of the multi-channel audio signal, where P is a positive integer greater than 1, and the audio signals of the P channels include audio signals of K channel pairs.

The audio signal of one channel pair includes the audio signals of two channels. One channel pair in this embodiment of the present application may be any one of the K channel pairs. The audio signals of two coupled channels together form the audio signal of one channel pair.

In some embodiments, P=2K. After the multi-channel signal undergoes screening, pairing, stereo processing, and multi-channel side information generation, the audio signals of the P channels, that is, the audio signals of the K channel pairs, are obtained.

In some embodiments, the audio signals of the P channels further include unpaired audio signals of the Q channels, where P=2×K+Q, where K is a positive integer, and Q is a positive integer.

After the multi-channel signal undergoes screening, pairing, stereo processing, and multi-channel side information generation, the audio signals of K channel pairs and the audio signals of Q channels that have not undergone stereo processing can be obtained. Take a 5.1-channel signal as an example. The 5.1 channels include a left (L) channel, a right (R) channel, a center (C) channel, a low frequency effects (LFE) channel, a left surround (LS) channel, and a right surround (RS) channel. The L channel signal and the R channel signal are paired to form the first channel pair, and stereo processing yields the mid channel M1 signal and the side channel S1 signal. The LS channel signal and the RS channel signal are paired to form the second channel pair, and stereo processing yields the mid channel M2 signal and the side channel S2 signal. The LFE channel signal and the C channel signal are unpaired audio signals. That is, P=6, K=2, and Q=2. The audio signals of the P channels include the audio signal of the first channel pair, the audio signal of the second channel pair, and the LFE channel signal and the C channel signal, which have not undergone stereo processing. The audio signal of the first channel pair includes the mid channel M1 signal and the side channel S1 signal, and the audio signal of the second channel pair includes the mid channel M2 signal and the side channel S2 signal. The mid channels M1 and M2 and the side channels S1 and S2 may be regarded as channels obtained by downmix processing, that is, downmix channels.
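The channel counts in the 5.1 example can be checked with a short sketch. The function name `split_channels` and the list-of-names representation are illustrative assumptions; the pairing itself (L with R, LS with RS) is taken from the example above, and how an encoder actually selects pairs is not modeled here.

```python
def split_channels(channels, pairs):
    """Return (P, K, unpaired) for a channel list and a list of channel pairs."""
    paired = {name for pair in pairs for name in pair}
    unpaired = [name for name in channels if name not in paired]
    return len(channels), len(pairs), unpaired

# 5.1 layout and pairing as described in the example.
channels = ["L", "R", "C", "LFE", "LS", "RS"]
pairs = [("L", "R"), ("LS", "RS")]

P, K, unpaired = split_channels(channels, pairs)
Q = len(unpaired)
print(P, K, Q, unpaired)  # 6 2 2 ['C', 'LFE']
```

The relation P=2×K+Q holds here because every channel is either a member of exactly one pair or unpaired.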

In some embodiments, the P channels do not include the LFE channel. In these embodiments, the LFE channel may be allocated a fixed number of bits regardless of whether the energy/amplitude of the LFE channel is high or low. For example, the fixed number may be a preset value; that is, the fixed number does not change no matter how many channels the multi-channel signal includes and no matter what the encoding bit rate of the multi-channel signal is, and is fixed at, for example, 80, 100, or 120. Alternatively, the fixed number may be determined according to at least one of the number of channels included in the multi-channel signal and the encoding bit rate of the multi-channel signal. Generally speaking, the greater the number of channels, the smaller the fixed number, and the higher the encoding bit rate, the larger the fixed number. For example, when the multi-channel signal is a 5.1-channel signal, that is, includes 6 channels, the fixed number may be 80 if the encoding bit rate is 192 kbps, that is, 80 bits are allocated to the LFE channel; and the fixed number may be 120 if the encoding bit rate is 256 kbps, that is, 120 bits are allocated to the LFE channel. As another example, at an encoding bit rate of 192 kbps, when the multi-channel signal is a 7.1-channel signal, that is, includes 8 channels, the fixed number may be 60, that is, 60 bits are allocated to the LFE channel.
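As a minimal sketch of the second option (a fixed number that depends on channel count and bit rate), the three example values can be held in a lookup table and subtracted from the frame budget before the remaining channels are allocated. The table `LFE_FIXED_BITS`, the function name, and the per-frame budget of 3840 bits are hypothetical; only the three table entries come from the text, and other configurations are deliberately left undefined.

```python
# (number of channels, encoding bit rate in kbps) -> fixed LFE bits.
# Only the three combinations given as examples in the text are listed.
LFE_FIXED_BITS = {
    (6, 192): 80,
    (6, 256): 120,
    (8, 192): 60,
}

def reserve_lfe_bits(frame_bits, num_channels, bitrate_kbps):
    """Return (lfe_bits, available_bits) with the fixed LFE share withheld."""
    lfe_bits = LFE_FIXED_BITS[(num_channels, bitrate_kbps)]
    return lfe_bits, frame_bits - lfe_bits

# Hypothetical frame budget; the text does not state a frame length.
lfe_bits, available_bits = reserve_lfe_bits(3840, 6, 192)
print(lfe_bits, available_bits)  # 80 3760
```

Withholding the fixed share first keeps the LFE allocation independent of the LFE channel's energy/amplitude, which is the behavior the paragraph describes.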

Step 102: Determine the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

The energy/amplitude of the audio signal of one of the P channels includes at least one of the following: the energy/amplitude of the audio signal of the channel in the time domain, the energy/amplitude of the audio signal of the channel after time-frequency transformation, the energy/amplitude of the audio signal of the channel after time-frequency transformation and whitening, the energy/amplitude of the audio signal of the channel after energy/amplitude equalization, or the energy/amplitude of the audio signal of the channel after stereo processing. The energy/amplitude in the time domain, the energy/amplitude after time-frequency transformation, and the energy/amplitude after time-frequency transformation and whitening are all energy/amplitude values before energy/amplitude equalization. In other words, any one or more of the above energy/amplitude values can be selected for bit allocation.
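This passage does not state how a channel's energy or amplitude is computed. Purely as an assumption for illustration, the sketch below takes energy as the sum of squared values and amplitude as the sum of absolute values over one frame; the same two functions could be applied to time-domain samples or to frequency-domain coefficients. The function names are hypothetical.

```python
def frame_energy(values):
    """Assumed energy measure: sum of squares over one frame."""
    return sum(v * v for v in values)

def frame_amplitude(values):
    """Assumed amplitude measure: sum of absolute values over one frame."""
    return sum(abs(v) for v in values)

frame = [1.0, -2.0, 2.0]
print(frame_energy(frame))     # 9.0
print(frame_amplitude(frame))  # 5.0
```

Whichever measure is chosen for a frame, the same measure would be used throughout the processing of that frame.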

Wherein, when the P channels do not include the LFE channel, the available bits do not include the fixed number of bits.

The energy/amplitude of the audio signal of one channel after time-frequency transformation and whitening refers to the energy/amplitude obtained after the audio signal of the channel undergoes time-frequency transformation and whitening. Whitening makes the frequency-domain coefficients of the audio signal of the channel flatter, which facilitates subsequent encoding.

A bit allocation is performed according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits. One bit allocation here refers to bit allocation to channel pairs, that is, to allocate corresponding bit numbers to different channel pairs.

For P=2K, the respective bit numbers of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits, and the number of bits is also referred to as the number of initially allocated bits. A channel pair can be used as a basic unit, and a bit allocation is performed on a basic unit according to the ratio of the energy/amplitude of a basic unit to the energy/amplitude of all basic units (K basic units). The energy/amplitude of any one basic unit can be determined according to the energy/amplitude of the audio signals of the two channels in the basic unit. For example, the energy/amplitude of a base unit may be the sum of the energy/amplitude of the audio signals of the two channels within the base unit. Through one bit allocation, bits can be allocated among different basic units to obtain the number of bits of each basic unit, which is also referred to as the number of initially allocated bits.
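The proportional rule for P=2K can be written compactly. The symbols are assumptions introduced here: $E_{k,1}$ and $E_{k,2}$ are the energy/amplitude values of the two channels of pair $k$, $B_{\text{avail}}$ is the number of available bits, and $B_k$ is the number of initially allocated bits of pair $k$. The sum in the first equation is the example given in the text; the floor is an assumed rounding choice.

```latex
E_k = E_{k,1} + E_{k,2}, \qquad
B_k = \left\lfloor B_{\text{avail}} \cdot \frac{E_k}{\sum_{j=1}^{K} E_j} \right\rfloor,
\qquad k = 1, \dots, K
```

With floor rounding, $\sum_k B_k \le B_{\text{avail}}$, so a few bits may be left over and would need to be assigned by some additional rule.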

For P=2×K+Q, the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels are determined according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits. A channel pair can be used as a basic unit, and an unpaired single channel can also be used as a basic unit. One bit allocation is performed on a basic unit according to the ratio of the energy/amplitude of that basic unit to the energy/amplitude of all basic units (K+Q basic units). For a basic unit that is a channel pair, the energy/amplitude of the basic unit may be determined according to the energy/amplitude of the audio signals of the two channels in the basic unit. For a basic unit that is an unpaired channel, the energy/amplitude of the basic unit may be determined according to the energy/amplitude of the audio signal of that channel. Through one bit allocation, bits can be allocated among the basic units (K+Q basic units) to obtain the number of bits of each basic unit. In other words, the number of bits for each of the K channel pairs and the number of bits for each of the Q channels are obtained. One of the Q channels may be a mono channel, or may be a channel obtained through downmix processing, that is, a downmix channel.
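The allocation among K+Q basic units can be sketched as follows. This is an illustration under stated assumptions rather than the method of the application: the function name is hypothetical, a pair's energy is taken as the sum of its two channels (the example given for the P=2K case), rounding is toward zero, leftover bits go to the strongest unit, and a silent frame falls back to an equal split.

```python
def allocate_basic_units(pair_energies, mono_energies, available_bits):
    """Split available_bits among K pairs and Q unpaired channels by energy share.

    pair_energies: list of (e1, e2), one tuple per channel pair.
    mono_energies: list of energies, one per unpaired channel.
    Returns (pair_bits, mono_bits).
    """
    unit_energies = [e1 + e2 for e1, e2 in pair_energies] + list(mono_energies)
    if not unit_energies:
        return [], []
    total = sum(unit_energies)
    if total > 0:
        bits = [int(available_bits * e / total) for e in unit_energies]
    else:
        # Assumed fallback for an all-zero frame: equal split.
        bits = [available_bits // len(unit_energies)] * len(unit_energies)
    # Assumed rounding rule: hand leftover bits to the highest-energy unit.
    strongest = max(range(len(bits)), key=lambda i: unit_energies[i])
    bits[strongest] += available_bits - sum(bits)
    k = len(pair_energies)
    return bits[:k], bits[k:]

# Two pairs and two unpaired channels, with made-up energies.
pair_bits, mono_bits = allocate_basic_units(
    [(6.0, 2.0), (3.0, 1.0)], [3.0, 1.0], 1600
)
print(pair_bits, mono_bits)  # [800, 400] [300, 100]
```

In the usage example the unit energies are 8, 4, 3, and 1 out of a total of 16, so the 1600 bits split exactly in that ratio.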

Whether P=2K or P=2×K+Q, in one possible implementation the respective bit numbers of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the K channel pairs in the time domain, or the energy/amplitude after time-frequency transformation, or the energy/amplitude after time-frequency transformation and whitening, together with the number of available bits. In this implementation, in order to improve coding efficiency and coding effect, energy/amplitude equalization may be performed on the audio signals of the K channel pairs before bit allocation. The energy/amplitude equalization may be performed jointly on the audio signals of all of the channel pairs, or on the audio signals of the channel pairs together with one or more unpaired channels. In this implementation, the energy/amplitude equalization may also be performed on the audio signals of the two channels within a single channel pair.

In another possible implementation, the respective bit numbers of the K channel pairs are determined according to either the energy/amplitude after energy/amplitude equalization or the energy/amplitude after stereo processing of the audio signals of the K channel pairs, together with the number of available bits. In this implementation, in order to improve coding efficiency and coding effect, energy/amplitude equalization may be performed on the audio signals of the K channel pairs before bit allocation. The energy/amplitude equalization may be performed on the audio signals of the two channels within a single channel pair. Here, the energy/amplitude after energy/amplitude equalization or after stereo processing of the audio signals of the K channel pairs is obtained after energy/amplitude equalization is performed on the audio signals of the two channels in a single channel pair.

Similar to the determination of the respective bit numbers of the K channel pairs, when P=2×K+Q, one possible way to determine the respective bit numbers of the Q channels is to use any one of the energy/amplitude of the audio signals of the Q channels in the time domain, the energy/amplitude after time-frequency transformation, or the energy/amplitude after time-frequency transformation and whitening, together with the number of available bits. Another possible way is to use either the energy/amplitude after energy/amplitude equalization or the energy/amplitude after stereo processing of the audio signals of the Q channels, together with the number of available bits. For the audio signals of the Q channels, the energy/amplitude after energy/amplitude equalization is equal to the energy/amplitude before energy/amplitude equalization, and the energy/amplitude after stereo processing is equal to the energy/amplitude before stereo processing.

In some embodiments, once the number of bits allocated to a single channel exceeds a certain threshold, allocating further bits no longer improves the encoding quality of that channel. A threshold can therefore be preset and taken into account during bit allocation, so that, regardless of the energy/amplitude of a single channel, the number of bits allocated to it does not exceed the threshold. The bits saved in this way can be allocated to other channels to improve their encoding quality without reducing the encoding quality of the capped channel, which improves the encoding quality of the signal as a whole.

Correspondingly, in some embodiments, the determining the respective bit numbers of the K channel pairs may further include the following steps:

Determine the Mth channel, among the P channels, whose number of initially allocated bits is greater than the threshold, where M is greater than or equal to 0 and less than P;

Obtain the number of redundant bits of the Mth channel, where the number of redundant bits of the Mth channel = the number of initially allocated bits of the Mth channel - the threshold;

If the Mth channel is the first channel of the P channels whose number of initially allocated bits is greater than the threshold, allocate the redundant bits to the P-1 channels of the P channels other than the Mth channel, to obtain the updated bit numbers of the P-1 channels; the updated bit number of the Mth channel is the threshold. If the Mth channel is not the first channel whose number of initially allocated bits is greater than the threshold, allocate the redundant bits to the channels of the P channels other than the Mth channel and the channels already determined to have a number of initially allocated bits greater than the threshold, to obtain the updated bit numbers of those other channels. For example, if the channel already determined to have a number of initially allocated bits greater than the threshold is the Nth channel, the other channels are the P-2 channels of the P channels other than the Mth channel and the Nth channel. It should be noted that, if the LFE channel is allocated a fixed number of bits, the P channels do not include the LFE channel.

Assume that the bit threshold for a single channel is frmBitMax. frmBitMax can be calculated from the saturated encoding bit rate, frame length, and encoding sampling rate of a single channel according to the following formula:

frmBitMax=rateMax×frameLen/fs,

where rateMax represents the saturated encoding bit rate of a single channel, frameLen represents the frame length, and fs represents the encoding sampling rate. Typical values of rateMax include 256000 bps, 240000 bps, 224000 bps and 192000 bps. The value of rateMax can be selected according to the coding efficiency of the encoder, or can be set according to experience, which is not limited here.
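As a numeric illustration of the formula above, the following minimal sketch computes frmBitMax. It assumes that frameLen is expressed in samples (so that frameLen/fs is the frame duration in seconds) and that the result is rounded down to an integer; neither point is stated in the text. The 960-sample frame and 48000 Hz sampling rate are illustrative values only.

```python
def frame_bit_max(rate_max_bps: int, frame_len: int, fs_hz: int) -> int:
    """Per-channel bit ceiling for one frame: rateMax * frameLen / fs.

    Assumptions (not stated in the text): frame_len is in samples and the
    result is rounded down to a whole number of bits.
    """
    return rate_max_bps * frame_len // fs_hz


# Illustrative values: 256000 bps saturation rate, 960-sample frame, 48 kHz.
frm_bit_max = frame_bit_max(256000, 960, 48000)
print(frm_bit_max)  # 5120
```

With these example values a single channel would never receive more than 5120 bits in a frame.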

Taking a 5.1-channel signal as an example of the multi-channel signal, the L channel and the R channel are paired and downmixed to obtain the M1 channel and the S1 channel, and the LS channel and the RS channel are paired and downmixed to obtain the M2 channel and the S2 channel. Bits(M1) represents the number of initially allocated bits of the M1 channel, Bits(S1) represents the number of initially allocated bits of the S1 channel, Bits(M2) represents the number of initially allocated bits of the M2 channel, Bits(S2) represents the number of initially allocated bits of the S2 channel, and the numbers of initially allocated bits of the channels that do not participate in pairing are Bits(C) and Bits(LFE). If a fixed number of bits is allocated to the LFE channel, the number of available bits=Bits(M1)+Bits(S1)+Bits(M2)+Bits(S2)+Bits(C); if a fixed number of bits is not allocated to the LFE channel, the number of available bits=Bits(M1)+Bits(S1)+Bits(M2)+Bits(S2)+Bits(C)+Bits(LFE).

The following description takes the example of assigning a fixed number of bits to the LFE channel:

The number of available bits is denoted totalBits and the threshold is denoted frmBitMax. Set allocFlag[5]={0,0,0,0,0}, where 5.1 channels are assumed to be sorted and M1=0, S1=1, C=2, M2=3, S2=4. Execute the following process:

Step 1. For channel i, where 0<=i<5: if Bits(i)<=frmBitMax, jump to step 5; in addition, if Bits(i)=frmBitMax, set allocFlag[i]=1;

Step 2. If Bits(i)>frmBitMax, set allocFlag[i]=1, calculate diffBits=Bits(i)-frmBitMax, and then perform steps 3-5;

Step 3. Calculate sumBits=∑Bits(j), 0<=j<5, where Bits(j) is not accumulated into sumBits when allocFlag[j]=1;

Step 4. Distribute diffBits among the channels with allocFlag[j]≠1, as follows:

Bits(j)=Bits(j)+diffBits×Bits(j)/sumBits

Step 5. If i=4, end the process; if i<4, increment i and return to step 1.

Wherein, in one embodiment, after performing step 4, the following steps can also be performed:

Determine whether Bits(j) is greater than or equal to frmBitMax; if Bits(j) is greater than or equal to frmBitMax, set the value of allocFlag[j] to 1.
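The capping procedure in steps 1 to 5, together with the optional check after step 4, can be sketched as follows. This is a minimal illustration rather than the encoder's implementation: the function name is hypothetical, the capped channel is explicitly set to frmBitMax (as the general description of the updated bit number implies), fractional bits are kept because the text does not specify rounding, and the guard for an empty set of receiving channels is an added assumption.

```python
def cap_and_redistribute(bits, frm_bit_max):
    """Single forward pass over the channels, following steps 1 to 5."""
    bits = list(bits)
    alloc_flag = [0] * len(bits)
    for i in range(len(bits)):
        if bits[i] <= frm_bit_max:
            # Step 1: nothing to take away; a channel exactly at the
            # threshold is flagged so that it receives no further bits.
            if bits[i] == frm_bit_max:
                alloc_flag[i] = 1
            continue
        # Step 2: flag the channel, compute its surplus, cap it.
        alloc_flag[i] = 1
        diff_bits = bits[i] - frm_bit_max
        bits[i] = frm_bit_max
        # Step 3: total bits of the channels still open for allocation.
        sum_bits = sum(b for b, f in zip(bits, alloc_flag) if f != 1)
        if sum_bits == 0:
            continue  # assumption: no open channel can take the surplus
        # Step 4: share the surplus in proportion to the current bits.
        for j in range(len(bits)):
            if alloc_flag[j] != 1:
                bits[j] += diff_bits * bits[j] / sum_bits
                # Optional check: close a channel that reached the threshold.
                if bits[j] >= frm_bit_max:
                    alloc_flag[j] = 1
    return bits


# Channels ordered M1, S1, C, M2, S2; illustrative numbers only.
print(cap_and_redistribute([130, 40, 40, 10, 10], 100))
# [100, 52.0, 52.0, 13.0, 13.0]
```

In this example the 30 surplus bits of M1 are shared among the other four channels in proportion to their initial bits, and the total of 230 bits is preserved.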

The following is another example, described for the case in which a fixed number of bits is not allocated to the LFE channel:

The number of available bits is denoted totalBits and the threshold is denoted frmBitMax. Set allocFlag[6]={0,0,0,0,0,0}, here it is assumed that 5.1 channels have been sorted and M1=0, S1=1, C=2, M2=3, S2=4 , LFE=5.

Step 1. For channel i, where 0<=i<6: if Bits(i)<=frmBitMax, jump to step 5; in addition, if Bits(i)=frmBitMax, set allocFlag[i]=1;

Step 2. If Bits(i)>frmBitMax, set allocFlag[i]=1, calculate diffBits=Bits(i)-frmBitMax, and then perform steps 3-5;

Step 3. Calculate sumBits=∑Bits(j), 0<=j<6, where Bits(j) is not accumulated into sumBits when allocFlag[j]=1;

Step 4. Distribute diffBits among the channels with allocFlag[j]≠1, as follows:

Bits(j)=Bits(j)+diffBits×Bits(j)/sumBits

Step 5. If i=5, end the process; if i<5, increment i and return to step 1.

Step 103: Encode the audio signals of the P channels according to the respective bit numbers of the K channel pairs to obtain an encoded code stream.

The number of bits may be the number of initially allocated bits or the number of updated bits.

Encoding the audio signals of the P channels may include performing quantization, entropy encoding, and code stream multiplexing on the audio signals of the P channels to obtain an encoded code stream.

For P=2K, the audio signals of the P channels are quantized, entropy encoded, and code stream multiplexed according to the respective bit numbers of the K channel pairs to obtain an encoded code stream.

For P=2×K+Q, the audio signals of the P channels are quantized, entropy encoded, and code stream multiplexed according to the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels to obtain an encoded code stream.

In this embodiment, the audio signals of the P channels of the current frame of the multi-channel audio signal are acquired, where the audio signals of the P channels include the audio signals of K channel pairs; the respective bit numbers of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; and the audio signals of the P channels are encoded according to the respective bit numbers of the K channel pairs to obtain an encoded code stream. The energy/amplitude of the audio signal of one channel of the P channels includes at least one of the energy/amplitude of the audio signal of that channel in the time domain, the energy/amplitude after time-frequency transformation, the energy/amplitude after time-frequency transformation and whitening, the energy/amplitude after energy/amplitude equalization, or the energy/amplitude after stereo processing. Bit allocation for the channel pairs is performed according to at least one of these energy/amplitude values of the audio signals of the P channels to determine the respective bit numbers of the K channel pairs, so that bits are reasonably allocated among the channel pairs in multi-channel signal encoding and the quality of the audio signal reconstructed at the decoding end is ensured. For example, when the energy/amplitude differs greatly between channel pairs, the method of the embodiments of the present application can solve the problem of insufficient coding bits for channel pairs with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

FIG. 3 is a flowchart of a multi-channel audio signal encoding method according to an embodiment of the present application. The execution body of the embodiment of the present application may be the above encoder. As shown in FIG. 3 , the method of the present embodiment may include:

Step 201: Acquire audio signals of P channels of a current frame of a multi-channel audio signal, where P is a positive integer greater than 1, and the audio signals of the P channels include audio signals of K channel pairs.

The specific explanation of step 201 may refer to step 101 of the embodiment shown in FIG. 2 , and details are not repeated here.

Step 202: Determine the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

A primary bit allocation is performed according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

For P=2×K, in the primary bit allocation, the method of the embodiment of the present application can determine the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

For P=2×K+Q, in the primary bit allocation, the method of the embodiment of the present application can determine the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

Whether P=2K or P=2×K+Q, for the determination in step 202 of the respective bit numbers of the K channel pairs and of the Q channels, refer to step 102 of the embodiment shown in FIG. 2; details are not repeated here.

Step 203: Determine the respective bit numbers of the two channels in the current channel pair according to the number of bits of the current channel pair among the K channel pairs and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair.

Taking the current channel pair among the K channel pairs as an example, a secondary bit allocation is performed within the current channel pair according to the number of bits of the current channel pair and the respective stereo-processed energy/amplitude of the audio signals of its two channels. The secondary bit allocation refers to bit allocation between the two channels of a channel pair, that is, allocating a corresponding number of bits to each of the two channels. In other words, for the basic unit corresponding to a pair of channels, the bits of the basic unit are allocated within the unit in proportion to the respective energy/amplitude of the audio signals of its two channels. The current channel pair may be any one of the K channel pairs.

Regardless of whether P=2K or P=2×K+Q, the method of step 203 above can be used to allocate bits in the channel pair to obtain the respective bit numbers of the two channels in the channel pair.
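The secondary bit allocation of step 203 can be sketched as a proportional split of the pair's bits between its two channels. The function name, the rounding rule (truncate the first share and give the remainder to the second channel) and the equal split for a silent pair are assumptions made for this illustration; the text only specifies that the split follows the energy/amplitude ratio of the two channels after stereo processing.

```python
def split_pair_bits(pair_bits, e_post_a, e_post_b):
    """Split the bits of one channel pair between its two channels.

    e_post_a and e_post_b are the stereo-processed energy/amplitude values
    of the two channels (for example the M and S channels of the pair).
    """
    total = e_post_a + e_post_b
    if total == 0:
        # Assumption: a silent pair is split evenly.
        bits_a = pair_bits // 2
    else:
        # Share of the first channel follows its energy/amplitude ratio.
        bits_a = int(pair_bits * e_post_a / total)
    # The second channel takes the remainder so no bit is lost to rounding.
    return bits_a, pair_bits - bits_a


print(split_pair_bits(1000, 3.0, 1.0))  # (750, 250)
```

Giving the remainder to the second channel keeps the two shares summing exactly to the bits assigned to the pair in the primary allocation.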

Step 204: Encode the audio signals of the two channels according to the respective bit numbers of the two channels in the current channel pair to obtain an encoded code stream.

Respectively encoding the audio signals of the two channels in the current channel pair may include quantization, entropy encoding, and code stream multiplexing respectively on the audio signals of the two channels in the current channel pair to obtain an encoded code stream.

For P=2K, the audio signals of the P channels are respectively quantized, entropy encoded, and code stream multiplexed according to the respective bit numbers of the channels in the K channel pairs to obtain an encoded code stream.

For P=2×K+Q, the audio signals of the K channel pairs are respectively quantized, entropy encoded, and code stream multiplexed according to the respective bit numbers of the channels in the K channel pairs, and the audio signals of the Q channels are quantized, entropy encoded, and code stream multiplexed according to the respective bit numbers of the Q channels, to obtain an encoded code stream.

In this embodiment, the audio signals of the P channels of the current frame of the multi-channel audio signal are acquired, where the audio signals of the P channels include the audio signals of K channel pairs; the respective bit numbers of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; the respective bit numbers of the two channels in the current channel pair are determined according to the number of bits of the current channel pair among the K channel pairs and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair; and the audio signals of the two channels are respectively encoded according to the respective bit numbers of the two channels in the current channel pair to obtain an encoded code stream. Bit allocation for the channel pairs is performed according to at least one of the energy/amplitude of the audio signals of the P channels in the time domain, the energy/amplitude after time-frequency transformation, the energy/amplitude after time-frequency transformation and whitening, the energy/amplitude after energy/amplitude equalization, or the energy/amplitude after stereo processing, to determine the respective bit numbers of the K channel pairs, and bit allocation within each channel pair is then performed based on the respective bit numbers of the K channel pairs, so that the number of bits of each channel in multi-channel signal encoding is reasonably allocated and the quality of the audio signal reconstructed at the decoding end is ensured. For example, when the energy/amplitude differs greatly between channel pairs, the method of the embodiment of the present application can solve the problem of insufficient coding bits for channel pairs with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

FIG. 4 is a flowchart of a method for allocating bits to channel pairs according to an embodiment of the present application. The execution body of this embodiment may be the foregoing encoder, and this embodiment is a specific implementation of step 102 of the embodiment shown in FIG. 2 above. As shown in FIG. 4, the method of this embodiment may include:

Step 1021: Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels.

For example, the respective energy/amplitude of the audio signals of the P channels includes at least one of the respective energy/amplitude of the audio signals of the P channels in the time domain, the energy/amplitude after time-frequency transformation, the energy/amplitude after time-frequency transformation and whitening, the energy/amplitude after energy/amplitude equalization, or the energy/amplitude after stereo processing.

The following explains how the energy/amplitude sum of the current frame is determined for different energy/amplitude types.

Manner 1: Determine the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels. The energy/amplitude sum of the current frame may be the energy/amplitude sum after stereo processing, sum_E_post.

Exemplarily, the energy/amplitude sum after stereo processing, sum_E_post, can be determined according to the following formulas (1) and (2).

where ch represents the channel index, E_post(ch) represents the energy/amplitude, after stereo processing, of the audio signal of the channel whose channel index is ch, sampleCoef_post(ch, i) represents the ith coefficient of the current frame of the ch-th channel after stereo processing, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1. The channel whose channel index is ch may be any one of the above P channels.

That is, the energy/amplitude sum of the current frame can be determined in Manner 1 above, and the primary bit allocation described above can then be completed through the following steps 1022 and 1023.
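Manner 1 can be illustrated with the short sketch below. It assumes that the energy of a channel is the sum of its squared stereo-processed coefficients and that sum_E_post is the sum of these per-channel values over the P channels; an amplitude-based variant (for example taking a square root) would follow the same structure. The function names and the dictionary layout are hypothetical.

```python
def channel_energy(coefs):
    """Assumed per-channel energy: sum of squared coefficients of the frame."""
    return sum(c * c for c in coefs)


def frame_energy_sum(frame):
    """Sum of the per-channel energies over all channels of the frame.

    `frame` maps a channel name to its list of coefficients
    (sampleCoef_post(ch, i) in the notation of the text).
    """
    return sum(channel_energy(coefs) for coefs in frame.values())


# Tiny illustrative frame with two stereo-processed channels.
frame = {"M1": [3.0, 4.0], "S1": [1.0, 0.0]}
print(frame_energy_sum(frame))  # 26.0
```

The same helper could be applied to the coefficients before energy/amplitude equalization to obtain an unweighted sum of pre-equalization energies.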

Manner 2: Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization. The energy/amplitude sum may be the energy/amplitude sum before energy/amplitude equalization, sum_E_pre.

Exemplarily, the energy/amplitude sum before energy/amplitude equalization, sum_E_pre, may be determined according to the following formulas (3) and (4).

where E_pre(ch) represents the energy/amplitude, before energy/amplitude equalization, of the audio signal of the channel whose channel index is ch, sampleCoef(ch, i) represents the ith coefficient of the current frame of the ch-th channel before energy/amplitude equalization, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1.

That is, the energy/amplitude sum of the current frame can be determined in Manner 2 above, and the primary bit allocation described above can then be completed through the following steps 1022 and 1023.

Manner 3: Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization and the respective weighting coefficients of the P channels. The weighting coefficient of any one of the P channels is less than or equal to 1. The energy/amplitude sum may be the energy/amplitude sum before energy/amplitude equalization, sum_E_pre.

Exemplarily, the energy/amplitude sum before energy/amplitude equalization, sum_E_pre, is determined according to the following formula (5).

where α(ch) is the weighting coefficient of the channel whose channel index is ch; the weighting coefficients of the two channels of a channel pair are the same, and are inversely proportional to the normalized correlation value between the two channels of the pair.

In one implementation, α(ch) is 1 when the channel whose channel index is ch does not participate in pairing. For channels that participate in pairing, take as an example the channels whose channel indexes are ch1, ch2, ch3 and ch4 (hereinafter referred to as ch1, ch2, ch3 and ch4), where ch1 is paired with ch2 and ch3 is paired with ch4: α(ch1) and α(ch2) are equal and both less than 1, and α(ch3) and α(ch4) are equal and both less than 1. α(ch1) and α(ch2) can be determined according to the normalized correlation value Corr_norm(ch1, ch2) of ch1 and ch2, and α(ch3) and α(ch4) can be determined according to the normalized correlation value Corr_norm(ch3, ch4). If Corr_norm(ch3, ch4) is larger than Corr_norm(ch1, ch2), the values of α(ch3) and α(ch4) are smaller than the values of α(ch1) and α(ch2). That is, α(ch1) and α(ch2) are inversely proportional to the normalized correlation value Corr_norm(ch1, ch2) of ch1 and ch2.

Exemplarily, when ch1 and ch2 are paired, α(ch1) and α(ch2) can be calculated by the following formula (6).

α(ch1, ch2)=C+(1-C)×(1-Corr_norm(ch1,ch2))/(1-threshold) (6)

where C is a constant, C∈[0,1]; threshold is the normalized pairing threshold of ch1 and ch2, threshold∈[0,1]; and Corr_norm(ch1,ch2) is the normalized correlation value of ch1 and ch2, Corr_norm(ch1,ch2)∈[0,1]. In some embodiments, C may be 0.707, and threshold may be 0.2, 0.25, 0.28, and so on.
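Formula (6) can be evaluated directly, as in the sketch below. The function name is hypothetical, and the default values C=0.707 and threshold=0.25 are simply two of the example values mentioned in the text. Note the two end points: a correlation equal to the threshold gives a weight of 1, and a correlation of 1 gives a weight of C.

```python
def pair_weight(corr_norm, c=0.707, threshold=0.25):
    """Weighting coefficient shared by both channels of a pair, formula (6)."""
    return c + (1.0 - c) * (1.0 - corr_norm) / (1.0 - threshold)


for corr in (0.25, 0.5, 0.9, 1.0):
    # Higher correlation between the paired channels gives a smaller weight.
    print(corr, round(pair_weight(corr), 4))
```

If channels are only paired when their normalized correlation is at least the threshold, which the term "pairing threshold" suggests but the text does not state here, the weight stays between C and 1.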

The two channel correlation values can be calculated by the following formula (7), taking ch1 and ch2 as examples.

where Corr_norm(ch1, ch2) is the normalized correlation value of ch1 and ch2, spec_ch1(i) is the time domain or frequency domain coefficient of ch1, spec_ch2(i) is the time domain or frequency domain coefficient of channel ch2, N is the number of coefficients for the current frame.
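A normalized correlation value with the stated range [0,1] can be sketched as follows. The exact form of formula (7) is not assumed here; this sketch uses the common definition in which the absolute value of the cross-correlation is divided by the square root of the product of the two channel energies, and returns 0 for a silent channel. Both choices are assumptions of this illustration.

```python
import math


def corr_norm(spec_ch1, spec_ch2):
    """Assumed normalized correlation of two coefficient sequences, in [0, 1]."""
    cross = abs(sum(a * b for a, b in zip(spec_ch1, spec_ch2)))
    energy1 = sum(a * a for a in spec_ch1)
    energy2 = sum(b * b for b in spec_ch2)
    denom = math.sqrt(energy1 * energy2)
    # Assumption: treat a silent channel as uncorrelated with anything.
    return cross / denom if denom > 0 else 0.0


print(corr_norm([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 1.0
print(corr_norm([1.0, 0.0], [0.0, 1.0]))            # 0.0
```

Scaled copies of the same signal give the maximum value 1, and orthogonal signals give 0.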

For example, the L and R channels are the first channel pair and the normalized correlation value is Corr_norm(L,R), the LS channel and the RS channel are the second channel pair and the normalized correlation value is Corr_norm (LS,RS).

The correlation values of the two channels of other channel pairs can also be calculated by using the formula (7), and the weighting coefficients of the channels of the channel pair can also be calculated by using the formula (6).

Stereo processing reduces the energy/amplitude sum of the two channels involved, and the degree of reduction is related to the similarity of the audio signals of the two channels: the higher the correlation between the audio signals of the two channels, the more the energy/amplitude sum of the two channels is reduced after stereo processing.

Therefore, when the energy/amplitude before stereo processing is used for the primary bit allocation, weighting coefficients are introduced into the primary bit allocation. The weighting coefficients of two channels with high correlation are smaller than those of two channels with low correlation. The weighting coefficients of unpaired channels are greater than those of paired channels. The weighting coefficients of the two channels of the same channel pair are the same. That is, the energy/amplitude sum can be determined in Manner 3 above, and the primary bit allocation described above can then be completed through the following steps 1022 and 1023.
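The weighted sum of Manner 3 can be sketched as below, assuming that formula (5) multiplies each channel's pre-equalization energy/amplitude E_pre(ch) by its weighting coefficient α(ch) and adds the results. The function name and the dictionary-based interface are hypothetical; channels without an entry in `alpha` are treated as unpaired and receive a weight of 1, as the text specifies for unpaired channels.

```python
def weighted_energy_sum(e_pre, alpha):
    """Sum of alpha(ch) * E_pre(ch) over all channels.

    e_pre maps channel name to its energy/amplitude before equalization;
    alpha maps paired channel names to their weighting coefficient.
    """
    return sum(alpha.get(ch, 1.0) * energy for ch, energy in e_pre.items())


# Illustrative values: L and R are paired and share the weight 0.8,
# C is unpaired and keeps the weight 1.
e_pre = {"L": 10.0, "R": 10.0, "C": 5.0}
alpha = {"L": 0.8, "R": 0.8}
print(weighted_energy_sum(e_pre, alpha))  # 21.0
```

Down-weighting the paired channels reflects the energy reduction that stereo processing is expected to produce for correlated channels.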

Step 1022: Determine the respective bit coefficients of the K channel pairs according to the energy/amplitude sum of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame.

After the energy/amplitude sum is determined in Manner 1, Manner 2 or Manner 3, for P=2K, the respective bit coefficients of the K channel pairs can be determined according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum determined in step 1021 above.

After the energy/amplitude sum is determined in Manner 1, Manner 2 or Manner 3, for P=2×K+Q, the respective bit coefficients of the K channel pairs can be determined according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum determined in step 1021 above, and the respective bit coefficients of the Q channels can be determined according to the respective energy/amplitude of the audio signals of the Q channels and the energy/amplitude sum determined in step 1021 above.

The bit coefficient of each of the K channel pairs may be the ratio of the energy/amplitude of that channel pair to the energy/amplitude sum determined in step 1021 above, where the energy/amplitude of a channel pair may be the sum of the energy/amplitude of the two channels in the channel pair. The bit coefficient of each of the Q unpaired channels is the ratio of the energy/amplitude of that channel to the energy/amplitude sum determined in step 1021 above.

Step 1023: Determine the respective bit numbers of the K channel pairs according to the respective bit coefficients and the available bit numbers of the K channel pairs.

For P=2K, the respective bit numbers of the K channel pairs can be determined according to the respective bit coefficients of the K channel pairs and the number of available bits.

For P=2×K+Q, the respective bit numbers of the K channel pairs can be determined according to the respective bit coefficients of the K channel pairs and the number of available bits, and the respective bit numbers of the Q channels can be determined according to the respective bit coefficients of the Q channels and the number of available bits.
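Steps 1022 and 1023 can be sketched together as follows. The bit coefficient of each unit (a channel pair or an unpaired channel) is its share of the frame's energy/amplitude sum, as described above. Multiplying that coefficient by the number of available bits and truncating to an integer is an assumption of this sketch, since the text does not give the exact mapping; the function name and the unit labels are hypothetical.

```python
def primary_allocation(unit_energy, energy_sum, available_bits):
    """Bits per unit from bit coefficients (steps 1022 and 1023).

    unit_energy maps each channel pair or unpaired channel to its
    energy/amplitude; for a pair this is the sum over its two channels.
    """
    bits = {}
    for name, energy in unit_energy.items():
        coefficient = energy / energy_sum                # step 1022
        bits[name] = int(available_bits * coefficient)   # step 1023 (assumed)
    return bits


# Illustrative 5.1 case with a fixed LFE budget handled elsewhere.
units = {"L+R": 60.0, "LS+RS": 30.0, "C": 10.0}
print(primary_allocation(units, sum(units.values()), 1000))
# {'L+R': 600, 'LS+RS': 300, 'C': 100}
```

A channel pair with twice the energy/amplitude of another pair therefore receives twice as many bits in the primary allocation.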

In this embodiment, the audio signals of the P channels of the current frame of the multi-channel audio signal are acquired, where the audio signals of the P channels include the audio signals of K channel pairs; the energy/amplitude sum of the current frame is determined according to the respective energy/amplitude of the audio signals of the P channels; the respective bit coefficients of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; the respective bit numbers of the K channel pairs are determined according to the respective bit coefficients of the K channel pairs and the number of available bits; and the audio signals of the P channels are encoded according to the respective bit numbers of the K channel pairs to obtain an encoded code stream. The energy/amplitude sum of the current frame is determined from at least one of the energy/amplitude of the audio signals of the P channels in the time domain, the energy/amplitude after time-frequency transformation, the energy/amplitude after time-frequency transformation and whitening, the energy/amplitude after energy/amplitude equalization, or the energy/amplitude after stereo processing, and bit allocation for the channel pairs is performed based on the proportion of the energy/amplitude of the audio signals of each channel pair in the energy/amplitude sum to determine the respective bit numbers of the K channel pairs, so that the number of bits of each channel pair in multi-channel signal encoding is reasonably allocated and the quality of the audio signal reconstructed at the decoding end is ensured. For example, when the energy/amplitude differs greatly between channel pairs, the method of the embodiments of the present application can solve the problem of insufficient coding bits for channel pairs with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

The following embodiments take a 5.1-channel signal as an example to schematically illustrate the multi-channel audio signal encoding method according to the embodiment of the present application.

FIG. 5 is a schematic diagram of a processing process of an encoding end according to an embodiment of the present application. As shown in FIG. 5 , the encoding end may include a multi-channel encoding processing unit 401 , a channel encoding unit 402 and a code stream multiplexing interface 403 . The encoding end may be an encoder as described above.

The multi-channel encoding processing unit 401 is used to perform multi-channel signal filtering, group pairing, stereo processing and multi-channel side information generation on the input signal. In this embodiment, the input signal is a 5.1 (L channel, R channel, C channel, LFE channel, LS channel, RS channel) signal.

In one example, the multi-channel encoding processing unit 401 pairs the L channel signal and the R channel signal to form a first channel pair, and obtains the mid channel M1 channel signal and the side channel S1 channel signal through stereo processing. The LS channel signal and the RS channel signal are paired to form a second channel pair, and the mid channel M2 channel signal and the side channel S2 channel signal are obtained through stereo processing.

Because the energy/amplitude may differ considerably between the channels of a multi-channel signal, performing energy/amplitude equalization on the channels before stereo processing increases the benefit of stereo processing, that is, the energy/amplitude is concentrated in the mid channel, which helps the channel encoding unit improve coding efficiency. In the embodiment of the present application, the equalization is performed between the two channels of each channel pair. It is assumed that the energy/amplitude of the current frame of each input channel before energy/amplitude equalization is energy_L, energy_R, energy_C, energy_LS, and energy_RS, respectively. energy_L is the energy/amplitude of the L channel signal before energy/amplitude equalization, energy_R is the energy/amplitude of the R channel signal before energy/amplitude equalization, energy_C is the energy/amplitude of the C channel signal before energy/amplitude equalization, energy_LS is the energy/amplitude of the LS channel signal before energy/amplitude equalization, and energy_RS is the energy/amplitude of the RS channel signal before energy/amplitude equalization.

After energy/amplitude equalization, the energy/amplitude of both the L channel and the R channel of the first channel pair is energy_avg_LR, which may be calculated using the following formula (8).

energy_avg_LR=avg(energy_L,energy_R) (8)

The energy/amplitude of the LS channel and the RS channel after energy/amplitude equalization of the second channel pair are both energy_avg_LSRS, and the calculation method of energy_avg_LSRS may use the following formula (9).

energy_avg_LSRS=avg(energy_LS,energy_RS) (9)

where the avg(a1, a2) function returns the mean value of its two input parameters a1 and a2. In formula (8), a1 is energy_L and a2 is energy_R; in formula (9), a1 is energy_LS and a2 is energy_RS.
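Formulas (8) and (9) can be illustrated with the sketch below, which assumes that avg is the arithmetic mean. The helper `equalize_pairs` and the dictionary layout are hypothetical; unpaired channels such as C are left unchanged, consistent with the statement that their energy/amplitude after equalization equals the value before equalization.

```python
def avg(a1, a2):
    """Mean of the two inputs (assumed to be the arithmetic mean)."""
    return (a1 + a2) / 2.0


def equalize_pairs(energy, pairs):
    """Return per-channel energy/amplitude after pairwise equalization."""
    equalized = dict(energy)  # unpaired channels keep their value
    for ch_a, ch_b in pairs:
        # Both channels of a pair take the mean of their original values.
        equalized[ch_a] = equalized[ch_b] = avg(energy[ch_a], energy[ch_b])
    return equalized


energy = {"L": 8.0, "R": 2.0, "C": 3.0, "LS": 1.0, "RS": 5.0}
print(equalize_pairs(energy, [("L", "R"), ("LS", "RS")]))
# {'L': 5.0, 'R': 5.0, 'C': 3.0, 'LS': 3.0, 'RS': 3.0}
```

Here energy_avg_LR is 5.0 and energy_avg_LSRS is 3.0, while energy_C stays at 3.0.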

The energy/amplitude energy(ch) (including energy_L, energy_R, energy_C, energy_LS, energy_RS) of each channel before energy/amplitude equalization is calculated as follows:

Among them, sampleCoef(ch, i) represents the i-th coefficient of the current frame of the channel whose channel index is ch, N represents the number of coefficients of the current frame, and different ch values can correspond to the above L channel, R channel channel, C channel, LFE channel, LS channel, RS channel.

In this embodiment of the present application, energy_L is equal to E _pre (L), energy_R is equal to E _pre (R), energy_LS is equal to E _pre (LS), energy_RS is equal to E _pre (RS), and energy_C is equal to E _pre (C). E _post (L)=E _post (R)=energy_avg_LR. E _post (LS)=E _post (RS)=energy_avg_LSRS.
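The pair-wise equalization of formulas (8) and (9) can be sketched as follows. This is a minimal illustration only: the energy values, the dictionary layout, and the function name equalize_within_pairs are hypothetical and are not taken from the application, and unpaired channels are assumed to keep their energy/amplitude unchanged.

```python
from statistics import fmean

# Hypothetical per-channel energy/amplitude of the current frame
# before equalization (illustrative values only).
energy_pre = {"L": 8.0, "R": 4.0, "C": 3.0, "LS": 2.0, "RS": 1.0}

# Channel pairs assumed for the 5.1 example: (L, R) and (LS, RS).
pairs = [("L", "R"), ("LS", "RS")]


def equalize_within_pairs(energy_pre, pairs):
    """Return per-channel energy after pair-wise equalization.

    Both channels of a pair receive the mean of their two energies,
    as in formulas (8) and (9). Unpaired channels are left unchanged
    (an assumption of this sketch).
    """
    energy_post = dict(energy_pre)
    for a, b in pairs:
        avg = fmean([energy_pre[a], energy_pre[b]])
        energy_post[a] = avg
        energy_post[b] = avg
    return energy_post


energy_eq = equalize_within_pairs(energy_pre, pairs)
```

With these values, the sum of the energies of each pair is the same before and after equalization, because each channel of the pair receives the mean of the two.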

The multi-channel encoding processing unit 401 outputs the stereo-processed M1 channel signal, S1 channel signal, M2 channel signal, and S2 channel signal, the LFE channel signal and the C channel signal that have not undergone stereo processing, and the multi-channel side information.

The channel encoding unit 402 is used to encode the stereo-processed M1 channel signal, S1 channel signal, M2 channel signal, and S2 channel signal, the LFE channel signal and the C channel signal that have not undergone stereo processing, and the multi-channel side information, and to output the encoded channels E1-E6. The channel encoding unit 402 may include a plurality of channel processing boxes that allocate more bits to channels with greater energy/amplitude than to channels with less energy/amplitude. After the channel encoding unit 402 performs quantization and entropy coding to remove redundancy at the encoding end, the encoded channels E1-E6 are sent to the code stream multiplexing interface 403.

The code stream multiplexing interface 403 multiplexes the six encoded channels E1-E6 to form a serial bit stream (bitStream), so that the multi-channel audio signal can be transmitted over a channel or stored in a digital medium.

FIG. 6 is a schematic diagram of a processing process of a channel encoding unit according to an embodiment of the present application. As shown in FIG. 6 , the channel encoding unit 402 may include a bit allocation unit 4021 and a quantization entropy encoding unit 4023 . This embodiment is an example of the above-mentioned first mode.

The bit allocation unit 4021 is used to perform the primary bit allocation and the secondary bit allocation in the above-mentioned embodiment, so as to obtain the number of bits of each channel.

Exemplarily, the bit allocation unit 4021 determines the energy/amplitude sum sum_E _post after stereo processing according to the above formulas (1) and (2). Then, the bit coefficients of each channel pair and the bit coefficients of the unpaired channels are determined by the following formulas (11) to (14). In this embodiment, the bit coefficient of the first channel pair is represented by Ratio(L,R), the bit coefficient of the second channel pair is represented by Ratio(LS,RS), the bit coefficient of the unpaired C channel is represented by Ratio(C), and the bit coefficient of the unpaired LFE channel is represented by Ratio(LFE).

Ratio(L,R)=(E _post (M1)+E _post (S1))/sum_E _post (11)

Ratio(LS,RS)=(E _post (M2)+E _post (S2))/sum_E _post (12)

Ratio(C)=E _post (C)/sum_E _post (13)

Ratio(LFE)=E _post (LFE)/sum_E _post (14)

The bit allocation unit calculates the number of bits of each channel based on Ratio(L,R), Ratio(LS,RS), Ratio(C), Ratio(LFE), the number of available bits bAvail, the channel pair indices pairIdx1 and pairIdx2, and the energy/amplitude E _post (ch) of each channel after stereo processing. The channel pair indices pairIdx1 and pairIdx2 may be output by the multi-channel encoding processing unit 401. The channel pair index pairIdx1 indicates that the L channel and the R channel form a pair, and the channel pair index pairIdx2 indicates that the LS channel and the RS channel form a pair.

Exemplarily, the number of bits of each channel can be determined by the following formulas (15) to (22).

Bit allocation for channel pairs:

Bits(M1,S1)=bAvail×Ratio(L,R) (15)

Bits(M2,S2)=bAvail×Ratio(LS,RS) (16)

Wherein, Bits(M1, S1) represents the number of bits of the first channel pair, and Bits(M2, S2) represents the number of bits of the second channel pair.

Bit allocation between the channels within a channel pair and bit allocation for the unpaired channels:

The bit allocation between the two channels within each channel pair is as follows:

Bits(M1)=Bits(M1,S1)×E _post (M1)/(E _post (M1)+E _post (S1)) (17)

Bits(S1)=Bits(M1,S1)×E _post (S1)/(E _post (M1)+E _post (S1)) (18)

Bits(M2)=Bits(M2,S2)×E _post (M2)/(E _post (M2)+E _post (S2)) (19)

Bits(S2)=Bits(M2,S2)×E _post (S2)/(E _post (M2)+E _post (S2)) (20)

Among them, Bits(M1) represents the number of bits of the M1 channel, Bits(S1) represents the number of bits of the S1 channel, Bits(M2) represents the number of bits of the M2 channel, and Bits(S2) represents the number of bits of the S2 channel.

The bit allocation for the unpaired channels is as follows:

Bits(C)=bAvail×Ratio(C) (21)

Bits(LFE)=bAvail×Ratio(LFE) (22)

Among them, Bits(C) represents the number of bits of the C channel, and Bits(LFE) represents the number of bits of the LFE channel.
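The two-stage allocation of formulas (11) to (22) can be sketched as follows. This is a hedged illustration, not the implementation of the application: the function name allocate_bits_two_stage and the sample values are hypothetical, sum_E _post is assumed to be the plain sum of the per-channel post-stereo energies, bit counts are left fractional because the text does not specify rounding, and each pair is assumed to have non-zero energy.

```python
def allocate_bits_two_stage(e_post, b_avail):
    """Allocate bits following formulas (11) to (22).

    e_post:  energy/amplitude after stereo processing, keyed by
             "M1", "S1", "M2", "S2", "C", "LFE".
    b_avail: number of available bits (bAvail).
    """
    # Assumption: sum_E_post is the sum over all channels.
    sum_e_post = sum(e_post.values())

    # Formulas (11)-(14): bit coefficients.
    e_pair1 = e_post["M1"] + e_post["S1"]
    e_pair2 = e_post["M2"] + e_post["S2"]
    ratio_lr = e_pair1 / sum_e_post
    ratio_lsrs = e_pair2 / sum_e_post
    ratio_c = e_post["C"] / sum_e_post
    ratio_lfe = e_post["LFE"] / sum_e_post

    # Formulas (15)-(16): bits of each channel pair.
    bits_pair1 = b_avail * ratio_lr
    bits_pair2 = b_avail * ratio_lsrs

    return {
        # Formulas (17)-(20): split each pair's bits by post-stereo energy.
        "M1": bits_pair1 * e_post["M1"] / e_pair1,
        "S1": bits_pair1 * e_post["S1"] / e_pair1,
        "M2": bits_pair2 * e_post["M2"] / e_pair2,
        "S2": bits_pair2 * e_post["S2"] / e_pair2,
        # Formulas (21)-(22): unpaired channels.
        "C": b_avail * ratio_c,
        "LFE": b_avail * ratio_lfe,
    }


# Hypothetical post-stereo energies and bit budget.
e_post = {"M1": 6.0, "S1": 2.0, "M2": 1.5, "S2": 0.5, "C": 1.5, "LFE": 0.5}
bits = allocate_bits_two_stage(e_post, b_avail=1200)
```

In this example the first pair carries two thirds of the total energy and therefore receives about two thirds of the bit budget, which is then split 3:1 between M1 and S1.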

The quantization entropy coding unit 4023 quantizes and entropy-encodes the stereo-processed M1 channel signal, S1 channel signal, M2 channel signal, and S2 channel signal, the C channel signal, the LFE channel signal, and the multi-channel side information according to the number of bits of each channel, to obtain the encoded channel signals E1-E6.

In this embodiment, energy/amplitude equalization is performed on the two channels of each channel pair, with the channel pair as the granularity. Because the energy/amplitude ratios between the channel pairs differ before stereo processing, they also differ after stereo processing. Bit allocation between the channel pairs is then performed according to the energy/amplitude ratio of each channel pair after stereo processing, and finally bit allocation within each channel pair is performed. In this way, the number of bits of each channel can be reasonably allocated in multi-channel signal encoding, so as to ensure the quality of the audio signal reconstructed at the decoding end. For example, when the energy/amplitude differs greatly between channel pairs, the method of this embodiment of the present application can solve the problem of insufficient coding bits for the channel pair with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

Similar to the specific implementation of the energy/amplitude equalization of the multi-channel encoding processing unit 401 in the embodiment shown in FIG. 5 , the embodiment of the present application further provides another energy/amplitude equalization manner. The above-mentioned 5.1-channel signal is taken as an example for further illustration.

The energy/amplitude of each channel after equalization is energy_avg. energy_avg can be determined by the following formula (23).

energy_avg=avg(energy_L,energy_R,energy_C,energy_LS,energy_RS) (23)

Here, the avg(a1, a2, ..., an) function returns the mean of its n input parameters a1, a2, ..., an.

FIG. 7 is a schematic diagram of a processing process of a channel encoding unit according to an embodiment of the present application. As shown in FIG. 7 , the channel encoding unit 402 may include a bit allocation unit 4021 , a quantization entropy encoding unit 4023 and a bit calculation unit 4022 . This embodiment is an example of the above-mentioned second manner.

The bit allocation unit 4021 is configured to perform the primary bit allocation and the secondary bit allocation in the above-mentioned embodiment, so as to obtain the number of bits of each channel.

Exemplarily, the bit calculation unit 4022 determines the energy/amplitude sum sum_E _pre before energy/amplitude equalization according to the above formulas (3) and (4). Then, the bit coefficients of each channel pair and the bit coefficients of the unpaired channels are determined by the following formulas (24) to (27). In this embodiment, the bit coefficient of the first channel pair is represented by Ratio(L,R), the bit coefficient of the second channel pair is represented by Ratio(LS,RS), the bit coefficient of the unpaired C channel is represented by Ratio(C), and the bit coefficient of the unpaired LFE channel is represented by Ratio(LFE).

Ratio(L,R)=(E _pre (L)+E _pre (R))/sum_E _pre (24)

Ratio(LS,RS)=(E _pre (LS)+E _pre (RS))/sum_E _pre (25)

Ratio(C)=E _pre (C)/sum_E _pre (26)

Ratio(LFE)=E _pre (LFE)/sum_E _pre (27)

The bit allocation unit 4021 calculates the number of bits of each channel based on Ratio(L,R), Ratio(LS,RS), Ratio(C), Ratio(LFE), the number of available bits bAvail, the channel pair indices pairIdx1 and pairIdx2, and the energy/amplitude E _post (ch) of each channel after stereo processing. The channel pair indices pairIdx1 and pairIdx2 may be output by the multi-channel encoding processing unit 401. The channel pair index pairIdx1 indicates that the L channel and the R channel form a pair, and the channel pair index pairIdx2 indicates that the LS channel and the RS channel form a pair.

Exemplarily, based on the bit coefficients determined by the above formulas (24) to (27), the number of bits of each channel can be determined by the above formulas (15) to (22).
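The second manner can be sketched as follows: the bit coefficients come from the energy/amplitude before equalization (formulas (24) to (27)), while the split inside each pair still uses the energy/amplitude after stereo processing (formulas (17) to (20)). The function name allocate_bits_pre_ratio and all sample values are hypothetical, and sum_E _pre is assumed to be the plain sum of the per-channel energies before equalization.

```python
def allocate_bits_pre_ratio(e_pre, e_post, b_avail):
    """Allocate bits with pre-equalization bit coefficients.

    e_pre:   energy/amplitude before equalization, keyed by
             "L", "R", "LS", "RS", "C", "LFE".
    e_post:  energy/amplitude after stereo processing, keyed by
             "M1", "S1", "M2", "S2".
    b_avail: number of available bits (bAvail).
    """
    # Assumption: sum_E_pre is the sum over all channels.
    sum_e_pre = sum(e_pre.values())

    # Formulas (24)-(25) combined with (15)-(16): bits per channel pair.
    bits_pair1 = b_avail * (e_pre["L"] + e_pre["R"]) / sum_e_pre
    bits_pair2 = b_avail * (e_pre["LS"] + e_pre["RS"]) / sum_e_pre

    e_pair1 = e_post["M1"] + e_post["S1"]
    e_pair2 = e_post["M2"] + e_post["S2"]
    return {
        # Formulas (17)-(20): split within each pair by post-stereo energy.
        "M1": bits_pair1 * e_post["M1"] / e_pair1,
        "S1": bits_pair1 * e_post["S1"] / e_pair1,
        "M2": bits_pair2 * e_post["M2"] / e_pair2,
        "S2": bits_pair2 * e_post["S2"] / e_pair2,
        # Formulas (26)-(27) combined with (21)-(22): unpaired channels.
        "C": b_avail * e_pre["C"] / sum_e_pre,
        "LFE": b_avail * e_pre["LFE"] / sum_e_pre,
    }


# Hypothetical energies: the pairs differ before equalization but look
# alike after all-channel equalization and stereo processing.
e_pre = {"L": 8.0, "R": 4.0, "LS": 2.0, "RS": 1.0, "C": 3.0, "LFE": 2.0}
e_post = {"M1": 3.0, "S1": 1.0, "M2": 3.0, "S2": 1.0}
bits2 = allocate_bits_pre_ratio(e_pre, e_post, b_avail=2000)
```

Even though both pairs have identical post-stereo energies in this example, the first pair receives four times as many bits as the second, because the inter-pair split follows the pre-equalization energies.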

In this embodiment, stereo processing is performed after energy/amplitude equalization has been performed on all channels. Although the energy/amplitude ratios of the channels after stereo processing are therefore similar, in this embodiment of the present application the bit allocation between channel pairs is guided by the energy/amplitude ratios of the channel pairs before stereo processing, and the bit allocation within each channel pair is then performed according to the energy/amplitude after stereo processing. Because the energy/amplitude ratios of the channel pairs before stereo processing differ, the bit allocation between the channel pairs differs accordingly, so that the number of bits of each channel can be reasonably allocated in multi-channel signal encoding, so as to ensure the quality of the audio signal reconstructed at the decoding end. For example, when the energy/amplitude differs greatly between channel pairs, the method of this embodiment of the present application can solve the problem of insufficient coding bits for the channel pair with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

In some embodiments, the channel encoding unit 402 may include a bit allocation unit 4021, a quantization entropy encoding unit 4023, and a bit calculation unit 4022, and may also be used to implement the functions of each step in the third mode.

Exemplarily, the bit allocation unit 4021 determines the energy/amplitude sum sum_E _pre before energy/amplitude equalization according to the above formulas (5) to (7). Then, the bit coefficients of each channel pair and the bit coefficients of the unpaired channels are determined by the following formulas (28) to (31). In this embodiment, the bit coefficient of the first channel pair is represented by Ratio(L,R), the bit coefficient of the second channel pair is represented by Ratio(LS,RS), the bit coefficient of the unpaired C channel is represented by Ratio(C), and the bit coefficient of the unpaired LFE channel is represented by Ratio(LFE).

Ratio(L,R)=(α(L)*E _pre (L)+α(R)*E _pre (R))/sum_E _pre (28)

Ratio(LS,RS)=(α(LS)*E _pre (LS)+α(RS)*E _pre (RS))/sum_E _pre (29)

Ratio(C)=α(C)*E _pre (C)/sum_E _pre (30)

Ratio(LFE)=α(LFE)*E _pre (LFE)/sum_E _pre (31)

where α(L) represents the weighting coefficient of the L channel, α(R) represents the weighting coefficient of the R channel, α(LS) represents the weighting coefficient of the LS channel, α(RS) represents the weighting coefficient of the RS channel, α(C) represents the weighting coefficient of the C channel, and α(LFE) represents the weighting coefficient of the LFE channel.

Exemplarily, based on the bit coefficients determined by the above formulas (28) to (31), the number of bits of each channel can be determined by the above formulas (15) to (22).
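The weighted bit coefficients of formulas (28) to (31) can be sketched as follows. Formulas (5) to (7) are not reproduced in this part of the text, so the sketch assumes that sum_E _pre is the weighted sum of α(ch)*E _pre (ch) over all channels, which makes the four coefficients sum to 1. The function name weighted_bit_coefficients, the weight values, and the energies are hypothetical.

```python
def weighted_bit_coefficients(e_pre, alpha):
    """Compute Ratio(L,R), Ratio(LS,RS), Ratio(C), Ratio(LFE).

    e_pre: energy/amplitude before equalization per channel.
    alpha: weighting coefficient per channel (each <= 1).
    """
    # Assumption of this sketch: sum_E_pre is the weighted energy sum.
    weighted = {ch: alpha[ch] * e_pre[ch] for ch in e_pre}
    sum_e_pre = sum(weighted.values())

    return {
        ("L", "R"): (weighted["L"] + weighted["R"]) / sum_e_pre,      # (28)
        ("LS", "RS"): (weighted["LS"] + weighted["RS"]) / sum_e_pre,  # (29)
        "C": weighted["C"] / sum_e_pre,                               # (30)
        "LFE": weighted["LFE"] / sum_e_pre,                           # (31)
    }


# Hypothetical energies and weights; both channels of a pair share a weight.
e_pre_w = {"L": 8.0, "R": 4.0, "LS": 2.0, "RS": 1.0, "C": 3.0, "LFE": 2.0}
alpha = {"L": 0.5, "R": 0.5, "LS": 1.0, "RS": 1.0, "C": 1.0, "LFE": 1.0}
ratios = weighted_bit_coefficients(e_pre_w, alpha)
```

Here the weight of 0.5 on the (L, R) pair lowers its coefficient from 12/20 to 6/14, shifting part of the bit budget to the other channels.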

The quantization entropy coding unit quantizes and entropy-encodes the stereo-processed M1 channel signal, S1 channel signal, M2 channel signal, and S2 channel signal, the C channel signal, the LFE channel signal, and the multi-channel side information according to the number of bits of each channel, to obtain the encoded channel signals E1-E6.

In this embodiment, adjusting the bit allocation by means of the weighting coefficients makes it possible to reasonably allocate the number of bits of each channel in multi-channel signal encoding, so as to ensure the quality of the audio signal reconstructed at the decoding end.

FIG. 8 is a flowchart of another multi-channel audio signal encoding method according to an embodiment of the present application. The execution body of the embodiment of the present application may be the foregoing encoder. As shown in FIG. 8 , the method in this embodiment may include:

Step 501: Acquire audio signals of P channels of the current frame of the multi-channel audio signal, where P is a positive integer greater than 1, and the audio signals of the P channels include audio signals of K channel pairs.

The audio signal of one channel pair includes audio signals of two channels.

One channel pair in this embodiment of the present application may be any one of the K channel pairs. The audio signals of two coupled channels constitute the audio signal of one channel pair.

The specific explanation of step 501 may refer to step 101 of the embodiment shown in FIG. 2 , and details are not repeated here.

Step 502: Perform energy/amplitude equalization on the audio signals of the two channels of the current channel pair among the K channel pairs according to the respective energy/amplitude of the audio signals of those two channels, to obtain the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization.

The embodiments of the present application perform energy/amplitude equalization per channel pair, that is, each channel pair performs energy/amplitude equalization within the channel pair. Taking the current channel pair among the K channel pairs as an example, energy/amplitude equalization is performed on the audio signals of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of those two channels, to obtain the energy/amplitude of the two channels of the current channel pair after energy/amplitude equalization.

Whether P=2×K or P=2×K+Q, energy/amplitude equalization can be performed within the channel pair in the manner of step 502 above, so as to obtain the energy/amplitude of each of the two channels of the current channel pair after energy/amplitude equalization.

Exemplarily, the above formula (8) may be used to determine the energy/amplitude after energy/amplitude equalization of the two channels of the current channel pair. That is, L and R in formula (8) are replaced by the two channels of the current channel pair.

Step 503: Determine the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization and the number of available bits.

Taking the current channel pair among the K channel pairs as an example, the respective bit numbers of the two channels of the current channel pair are determined according to the energy/amplitude of the two channels of the current channel pair after energy/amplitude equalization and the number of available bits. The current channel pair may be any one of the K channel pairs.

For P=2×K, the method of this embodiment of the present application may determine the energy/amplitude sum of the current frame according to the energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs. The respective bit numbers of the two channels of the current channel pair are then determined according to the energy/amplitude sum of the current frame, the energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization, and the number of available bits.

For example, the respective bit numbers of the two channels of the current channel pair are determined according to the ratio of the energy/amplitude of the audio signal of each of the two channels of the current channel pair after energy/amplitude equalization to the energy/amplitude sum, and the number of available bits.

For P=2×K+Q, the method of this embodiment of the present application may determine the energy/amplitude sum of the current frame according to the energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs and the energy/amplitude of the audio signals of the Q channels after energy/amplitude equalization. The respective bit numbers of the two channels of the current channel pair are determined according to the energy/amplitude sum, the respective energy/amplitude of the audio signals of the two channels of the current channel pair, and the number of available bits. The respective bit numbers of the Q channels are determined according to the energy/amplitude sum, the energy/amplitude of the audio signals of the Q channels after energy/amplitude equalization, and the number of available bits.

For example, the respective bit numbers of the two channels of the current channel pair are determined according to the ratio of the respective energy/amplitude of the audio signals of the two channels of the current channel pair to the energy/amplitude sum, and the number of available bits. The respective bit numbers of the Q channels are determined according to the ratio of the energy/amplitude of the audio signals of the Q channels after energy/amplitude equalization to the energy/amplitude sum, and the number of available bits.

The energy/amplitude of the audio signals of the Q channels after energy/amplitude equalization may be equal to their respective energy/amplitude before energy/amplitude equalization, and approximately equal to their respective energy/amplitude after stereo processing. The energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs may be approximately equal to the energy/amplitude of those audio signals after stereo processing.

Exemplarily, the above formula (1) can be used to determine the energy/amplitude sum, with the energy/amplitude after stereo processing in formula (1) replaced by the energy/amplitude of each channel after energy/amplitude equalization in this embodiment.

Step 504: Encode the audio signals of the two channels according to the respective bit numbers of the two channels of the current channel pair to obtain an encoded code stream.

In this embodiment, the audio signals of P channels of the current frame of the multi-channel audio signal are acquired, where the audio signals of the P channels include audio signals of K channel pairs. Energy/amplitude equalization is performed on the audio signals of the two channels of the current channel pair among the K channel pairs according to the respective energy/amplitude of those audio signals, to obtain the energy/amplitude of the two channels of the current channel pair after energy/amplitude equalization. The respective bit numbers of the two channels of the current channel pair are determined according to that equalized energy/amplitude and the number of available bits, and the audio signals of the two channels are encoded according to their respective bit numbers to obtain an encoded code stream. By performing energy/amplitude equalization within the channel pair and allocating bits based on the energy/amplitude after energy/amplitude equalization, the number of bits of each channel can be reasonably allocated in multi-channel signal encoding, so as to ensure the quality of the audio signal reconstructed at the decoding end. For example, when the energy/amplitude differs greatly between channel pairs, the method of this embodiment of the present application can solve the problem of insufficient coding bits for the channel pair with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

The embodiment shown in FIG. 8 is explained by taking the embodiment shown in FIG. 5 and FIG. 6 as an example.

The multi-channel encoding processing unit 401 in the embodiment shown in FIG. 5 may perform steps 501 and 502 in the embodiment shown in FIG. 8, and the channel encoding unit 402 may perform step 503 in the embodiment shown in FIG. 8. When the channel encoding unit 402 performs step 503 of the embodiment shown in FIG. 8, the difference from the embodiments shown in FIG. 5 and FIG. 6 is that the bit allocation unit 4021 can determine the number of bits of each channel in the following manner.

The bit allocation unit 4021 in this embodiment of the present application may perform bit allocation according to the respective energy/amplitude of the P channels after energy/amplitude equalization. Specifically, the number of bits of each channel can be determined using the following formulas (32) to (37).

Bits(M1)=bAvail×E _post (M1)/sum_E _post (32)

Bits(S1)=bAvail×E _post (S1)/sum_E _post (33)

Bits(M2)=bAvail×E _post (M2)/sum_E _post (34)

Bits(S2)=bAvail×E _post (S2)/sum_E _post (35)

Bits(C)=bAvail×E _post (C)/sum_E _post (36)

Bits(LFE)=bAvail×E _post (LFE)/sum_E _post (37)

When formulas (32) to (37) are used for bit allocation, the multi-channel encoding processing unit 401 needs to adopt the per-channel-pair energy/amplitude equalization method, that is, energy/amplitude equalization within the channel pair. Here, sum_E _post can be determined using the above formula (1).

Let E(L, R) denote the energy/amplitude sum of the L channel and the R channel before energy/amplitude equalization. After energy/amplitude equalization, the energy/amplitude sum of the L channel and the R channel is unchanged and is still E(L, R). After the L channel and the R channel are stereo processed, their energy/amplitude sum becomes E _post (M1, S1). Because stereo processing only slightly reduces the redundancy between the L channel and the R channel, E _post (M1, S1) ≈ E(L, R) is satisfied. That is to say, when the energy/amplitude sum E(L, R) is much greater than (>>) the energy/amplitude sum E(LS, RS) of the LS channel and the RS channel, the processing of the multi-channel encoding processing unit 401 and the bit allocation unit 4021 in this embodiment makes the number of bits Bits(M1)+Bits(S1) allocated for E(L, R) much larger than Bits(M2)+Bits(S2), so as to achieve the purpose of allocating bits between channel pairs according to energy/amplitude.

Bits(M1)+Bits(S1)=bAvail×E _post (M1)/sum_E _post +bAvail×E _post (S1)/sum_E _post

=bAvail×E _post (M1,S1)/sum_E _post

>>bAvail×E _post (M2,S2)/sum_E _post

=Bits(M2)+Bits(S2)
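The direct per-channel allocation of formulas (32) to (37) and the resulting relation between the pair totals can be checked with a short sketch. The function name allocate_bits_direct and the energy values are hypothetical, and sum_E _post is assumed to be the plain sum of the per-channel energies.

```python
def allocate_bits_direct(e_post, b_avail):
    """Allocate bits per channel following formulas (32) to (37).

    Each channel receives b_avail * E_post(ch) / sum_E_post bits.
    """
    # Assumption: sum_E_post is the sum over all channels.
    sum_e_post = sum(e_post.values())
    return {ch: b_avail * e / sum_e_post for ch, e in e_post.items()}


# Hypothetical case in which E(L, R) is much greater than E(LS, RS).
e_post_d = {"M1": 90.0, "S1": 10.0, "M2": 4.0, "S2": 1.0, "C": 4.0, "LFE": 1.0}
bits_d = allocate_bits_direct(e_post_d, b_avail=1100)

pair1_bits = bits_d["M1"] + bits_d["S1"]  # Bits(M1)+Bits(S1)
pair2_bits = bits_d["M2"] + bits_d["S2"]  # Bits(M2)+Bits(S2)
```

With these values the first pair holds 100 of the 110 energy units and the second pair holds 5, so the first pair receives twenty times as many bits as the second, which mirrors the inequality derived above.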

In this embodiment, energy/amplitude equalization is performed within the channel pair, and bit allocation is performed based on the energy/amplitude after energy/amplitude equalization, so that the number of bits of each channel can be reasonably allocated in multi-channel signal encoding, so as to ensure the quality of the audio signal reconstructed at the decoding end. For example, when the energy/amplitude differs greatly between channel pairs, the method of this embodiment of the present application can solve the problem of insufficient coding bits for the channel pair with large energy/amplitude, so as to ensure the quality of the audio signal reconstructed at the decoding end.

Based on the same inventive concept as the above method, an embodiment of the present application further provides an audio signal encoding apparatus, which can be applied to an audio encoder.

FIG. 9 is a schematic structural diagram of an audio signal encoding apparatus according to an embodiment of the present application. As shown in FIG. 9 , the audio signal encoding apparatus 700 includes an acquisition module 701 , a bit allocation module 702 , and an encoding module 703 .

The acquisition module 701 is used to acquire the audio signals of P channels of the current frame of the multi-channel audio signal and the respective energy/amplitude of the audio signals of the P channels, where P is a positive integer greater than 1, and the audio signals of the P channels include audio signals of K channel pairs, where K is a positive integer.

The bit allocation module 702 is configured to determine the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits.

The encoding module 703 is configured to encode the audio signals of the P channels according to the respective bit numbers of the K channels to obtain an encoded code stream.

In some embodiments, the encoding module 703 is configured to: determine the respective bit numbers of the two channels in the current channel pair according to the number of bits of the current channel pair among the K channel pairs and the respective energy/amplitude of the audio signals of the two channels in the current channel pair after stereo processing; and encode the audio signals of the two channels according to the respective bit numbers of the two channels in the current channel pair.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels. According to the sum of the energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame, the respective bit coefficients of the K channel pairs are determined. The respective bit numbers of the K channel pairs are determined according to the respective bit coefficients of the K channel pairs and the available number of bits.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels.

In some embodiments, the bit allocation module 702 is used to:

According to the formula

calculate the energy/amplitude sum sum_E _post of the current frame;

where

ch represents the channel index, E _post (ch) represents the energy/amplitude of the audio signal of the channel whose channel index is ch after stereo processing, sampleCoef _post (ch, i) represents the i-th coefficient of the current frame of the channel whose channel index is ch after stereo processing, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization.

In some embodiments, the bit allocation module 702 is used to: according to the formula

calculate the energy/amplitude sum sum_E _pre of the current frame, where ch represents the channel index, and E _pre (ch) represents the energy/amplitude of the audio signal of the channel whose channel index is ch before energy/amplitude equalization.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels before energy/amplitude equalization and the respective weighting coefficients of the P channels, where each weighting coefficient is less than or equal to 1.

In some embodiments, the bit allocation module 702 is used to:

According to the formula

calculate the energy/amplitude sum sum_E _pre of the current frame;

where α(ch) is the weighting coefficient of the channel whose channel index is ch, the weighting coefficients of the two channels of a channel pair are the same, and the weighting coefficient of the two channels of a channel pair is inversely proportional to the normalized correlation value between the two channels.

In some embodiments, the audio signals of the P channels further include audio signals of Q unpaired channels, where P=2×K+Q, K is a positive integer, and Q is a positive integer. The bit allocation module 702 is configured to: determine the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits. The encoding module 703 is configured to encode the audio signals of the K channel pairs according to the respective bit numbers of the K channels, and to encode the audio signals of the Q channels according to the respective bit numbers of the Q channels.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels. The respective bit coefficients of the K channel pairs are determined according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame. The respective bit coefficients of the Q channels are determined according to the respective energy/amplitude of the audio signals of the Q channels and the energy/amplitude sum of the current frame. The respective bit numbers of the K channel pairs are determined according to the respective bit coefficients of the K channel pairs and the number of available bits. The respective bit numbers of the Q channels are determined according to the respective bit coefficients of the Q channels and the number of available bits.
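For the general case P=2×K+Q, the behaviour described for the bit allocation module 702 can be sketched as follows. The function name allocate_pairs_and_unpaired, the list-based inputs, and the values are hypothetical; the energy/amplitude sum of the current frame is assumed to be the plain sum over all P channels, and the energy/amplitude of a channel pair is assumed to be the sum of the energies of its two channels.

```python
def allocate_pairs_and_unpaired(pair_energy, unpaired_energy, b_avail):
    """Return (pair_bits, unpaired_bits) for K pairs and Q unpaired channels.

    pair_energy:     list of K tuples (e_a, e_b), one per channel pair.
    unpaired_energy: list of Q per-channel energies.
    b_avail:         number of available bits.
    """
    # Energy/amplitude sum of the current frame over all P = 2*K + Q channels.
    sum_e = sum(a + b for a, b in pair_energy) + sum(unpaired_energy)

    # Bit coefficient of each pair and each unpaired channel.
    pair_coef = [(a + b) / sum_e for a, b in pair_energy]
    unpaired_coef = [e / sum_e for e in unpaired_energy]

    # Bit numbers: coefficient times the number of available bits.
    pair_bits = [b_avail * c for c in pair_coef]
    unpaired_bits = [b_avail * c for c in unpaired_coef]
    return pair_bits, unpaired_bits


# Hypothetical frame with K=2 pairs and Q=2 unpaired channels.
pair_bits, unpaired_bits = allocate_pairs_and_unpaired(
    pair_energy=[(6.0, 2.0), (1.5, 0.5)],
    unpaired_energy=[1.5, 0.5],
    b_avail=1200,
)
```

The bits of each pair can subsequently be divided between its two channels, for example as in formulas (17) to (20).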

In some embodiments, the apparatus may further include: an energy/amplitude equalization module 704 . The energy/amplitude equalization module 704 is configured to obtain the energy/amplitude equalized audio signals of the P channels according to the audio signals of the P channels. The energy/amplitude of the aforementioned audio signal of one channel after energy/amplitude equalization is obtained from the energy/amplitude equalized audio signal of the one channel.

The encoding module 703 is configured to encode the energy/amplitude equalized audio signals of the P channels according to the respective bit numbers of the K channels.

It should be noted that the acquisition module 701, the bit allocation module 702, and the encoding module 703 can be applied to the audio signal encoding process at the encoding end.

It should also be noted that, for the specific implementation process of the obtaining module 701, the bit allocation module 702, and the encoding module 703, reference may be made to the detailed description of the above method embodiments, which are not repeated here for brevity of the description.

An embodiment of the present application further provides another audio signal encoding apparatus. The audio signal encoding apparatus may adopt the schematic structural diagram shown in FIG. 9, and the audio signal encoding apparatus of this embodiment is used to execute the method of the embodiment shown in FIG. 8.

In some embodiments, the functions of each module in the embodiment shown in FIG. 9 are different. In this embodiment, the obtaining module 701 is configured to obtain the audio signals of P channels of the current frame of the multi-channel audio signal, where P is A positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer.

The energy/amplitude equalization module 704 is configured to perform energy/amplitude equalization on the audio signals of the two channels of the current channel pair among the K channel pairs, according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair, to obtain the respective energy/amplitude of the audio signals of the two channels of the current channel pair after energy/amplitude equalization.

The bit allocation module 702 is configured to determine the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair and the number of available bits.

The encoding module 703 is configured to encode the audio signals of the two channels according to the respective bit numbers of the two channels of the current channel pair to obtain an encoded code stream.

In some embodiments, the bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the P channels; and determine the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair, and the number of available bits.

The bit allocation module 702 is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs and the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels; determine the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair, and the number of available bits; and determine the respective bit numbers of the Q channels according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels, and the number of available bits.
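As a rough illustration of this per-channel variant, the following Python sketch derives one value per channel from the equalized coefficients and then splits the available bits in proportion to it. Two points are assumptions rather than statements of this application: the energy/amplitude of a channel is taken here as the square root of the sum of its squared coefficients, and each channel's bit number is its share of the frame sum multiplied by the available bits, rounded down. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def channel_amplitude(coefs: Sequence[float]) -> float:
    # Assumed measure: root of the summed squared coefficients of one channel.
    return math.sqrt(sum(c * c for c in coefs))


def allocate_channel_bits(equalized: List[float], available_bits: int) -> List[int]:
    """Give each channel a bit number from its equalized energy/amplitude."""
    if not equalized:
        return []
    # Energy/amplitude sum of the current frame over all channels.
    frame_sum = sum(equalized)
    if frame_sum <= 0.0:
        # Assumption: a silent frame is split evenly between the channels.
        return [available_bits // len(equalized)] * len(equalized)
    # Assumed rule: proportional share of the available bits, rounded down.
    return [int(available_bits * e / frame_sum) for e in equalized]
```

Here the first two entries of `equalized` could be the two channels of the current channel pair and the remaining entries the other channels of the frame; the order is only a convention of this sketch.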

The encoding module 703 is configured to encode the audio signals of the K channel pairs according to the respective bit numbers of the K channel pairs, and to encode the audio signals of the Q channels according to the respective bit numbers of the Q channels, to obtain the encoded code stream.

It should be noted that the acquisition module 701, the bit allocation module 702, the energy/amplitude equalization module 704, and the encoding module 703 can be applied to the audio signal encoding process at the encoding end.

It should also be noted that, for the specific implementation process of the acquisition module 701, the bit allocation module 702, the energy/amplitude equalization module 704, and the encoding module 703, reference may be made to the detailed description of the method embodiment shown in FIG. 8, which is not repeated here for brevity.

Based on the same inventive concept as the above method, an embodiment of the present application provides an audio signal encoder. The audio signal encoder is used to encode an audio signal and includes the audio signal encoding apparatus described in one or more of the above embodiments, wherein the audio signal encoding apparatus is used to encode and generate the corresponding code stream.

Based on the same inventive concept as the above method, an embodiment of the present application provides a device for encoding an audio signal, for example, an audio signal encoding device. As shown in FIG. 10, the audio signal encoding device 800 includes:

A processor 801, a memory 802, and a communication interface 803 (the number of processors 801 in the audio signal encoding device 800 may be one or more, and one processor is taken as an example in FIG. 10). In some embodiments of the present application, the processor 801, the memory 802, and the communication interface 803 may be connected by a bus or in other ways, and connection by a bus is taken as an example in FIG. 10.

The memory 802 may include read-only memory and random access memory, and provides instructions and data to the processor 801. A portion of the memory 802 may also include non-volatile random access memory (NVRAM). The memory 802 stores an operating system and operation instructions, executable modules or data structures, or a subset thereof, or an extended set thereof, wherein the operation instructions may include various operation instructions for implementing various operations. The operating system may include various system programs for implementing various basic services and handling hardware-based tasks.

The processor 801 controls the operation of the audio encoding device, and the processor 801 may also be referred to as a central processing unit (central processing unit, CPU). In a specific application, various components of the audio coding device are coupled together through a bus system, where the bus system may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus. However, for the sake of clarity, the various buses are referred to as bus systems in the figures.

The methods disclosed in the above embodiments of the present application may be applied to the processor 801 or implemented by the processor 801. The processor 801 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the processor 801 or by instructions in the form of software. The above-mentioned processor 801 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory 802, and the processor 801 reads the information in the memory 802 and completes the steps of the above method in combination with its hardware.

The communication interface 803 can be used to receive or transmit digital or character information; for example, it can be an input/output interface, a pin, a circuit, or the like. For example, the above-mentioned encoded code stream is sent through the communication interface 803.

Based on the same inventive concept as the above method, an embodiment of the present application provides an audio encoding device, including a non-volatile memory and a processor coupled to each other, where the processor calls program code stored in the memory to execute part or all of the steps of the multi-channel audio signal encoding method described in one or more of the above embodiments.

Based on the same inventive concept as the above method, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code, and the program code includes instructions for executing part or all of the steps of the multi-channel audio signal encoding method described in one or more of the above embodiments.

Based on the same inventive concept as the above method, an embodiment of the present application provides a computer program product. When the computer program product is run on a computer, the computer is caused to execute part or all of the steps of the multi-channel audio signal encoding method described in one or more of the above embodiments.

The processor mentioned in the above embodiments may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above method embodiments may be completed by a hardware integrated logic circuit in a processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in the embodiments of the present application may be directly embodied as executed by a hardware coding processor, or executed by a combination of hardware and software modules in the coding processor. The software modules may be located in a storage medium mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.

The memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but not be limited to, these and any other suitable types of memory.

Those of ordinary skill in the art can realize that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.

Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working process of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which will not be repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative. For example, the division of the units is only a logical function division, and there may be other division methods in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.

The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application should be covered within the protection scope of this application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims

A method for encoding a multi-channel audio signal, comprising:

obtaining the audio signals of P channels of the current frame of the multi-channel audio signal, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer;

obtaining the respective energy/amplitude of the audio signals of the P channels;

determining the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits;

encoding the audio signals of the P channels according to the respective bit numbers of the K channel pairs to obtain an encoded code stream;

Wherein the energy/amplitude of the audio signal of one channel of the P channels includes at least one of: the energy/amplitude of the audio signal of the one channel in the time domain, the energy/amplitude of the audio signal of the one channel after time-frequency transformation, the energy/amplitude of the audio signal of the one channel after time-frequency transformation and whitening, the energy/amplitude of the audio signal of the one channel after energy/amplitude equalization, or the energy/amplitude of the audio signal of the one channel after stereo processing.
The method according to claim 1, wherein the K channel pairs include a current channel pair, and the encoding the audio signals of the P channels according to the respective bit numbers of the K channel pairs comprises: encoding the audio signal of the current channel pair according to the number of bits of the current channel pair;

The encoding the audio signal of the current channel pair according to the number of bits of the current channel pair comprises:

determining the respective bit numbers of the two channels in the current channel pair according to the number of bits of the current channel pair and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair;

encoding the audio signals of the two channels respectively according to the respective bit numbers of the two channels in the current channel pair.
The method according to claim 1 or 2, wherein the determining the respective bit numbers of the K channel pairs according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits comprises:

Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels;

According to the energy/amplitude sum of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame, determine the respective bit coefficients of the K channel pairs;

The respective bit numbers of the K channel pairs are determined according to the respective bit coefficients of the K channel pairs and the available number of bits.
The method according to claim 3, wherein the determining the sum of the energy/amplitude of the current frame according to the respective energy/amplitude of the audio signals of the P channels comprises:

The energy/amplitude sum of the current frame is determined according to the stereo-processed energy/amplitude of the audio signals of the P channels.
The method according to claim 4, wherein the determining the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels comprises:

Calculating the energy/amplitude sum sum_E_post of the current frame according to the formula

$$\mathrm{sum\_E_{post}} = \sum_{ch=1}^{P} E_{post}(ch)$$

where ch represents the channel index, E_post(ch) represents the stereo-processed energy/amplitude of the audio signal of the channel whose channel index is ch and is computed from the coefficients sampleCoef_post(ch, i), i = 1, ..., N, sampleCoef_post(ch, i) represents the i-th coefficient of the current frame of the ch-th channel after stereo processing, N represents the number of coefficients of the current frame, and N is a positive integer greater than 1.
The method according to claim 3, wherein the determining the sum of the energy/amplitude of the current frame according to the respective energy/amplitude of the audio signals of the P channels comprises:

Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, before energy/amplitude equalization, of the audio signals of the P channels, where the energy/amplitude of the audio signal of one channel of the P channels before energy/amplitude equalization includes the energy/amplitude of the audio signal of the one channel in the time domain, or the energy/amplitude of the audio signal of the one channel after time-frequency transformation, or the energy/amplitude of the audio signal of the one channel after time-frequency transformation and whitening.
The method according to claim 6, wherein the determining the energy/amplitude sum of the current frame according to the energy/amplitude before equalization of the respective energy/amplitude of the audio signals of the P channels comprises:

Calculating the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E_{pre}} = \sum_{ch=1}^{P} E_{pre}(ch)$$

where ch represents the channel index, and E_pre(ch) represents the energy/amplitude of the audio signal of the channel whose channel index is ch before energy/amplitude equalization.
The method according to claim 3, wherein the determining the sum of the energy/amplitude of the current frame according to the respective energy/amplitude of the audio signals of the P channels comprises:

Determine the energy/amplitude sum of the current frame according to the energy/amplitude of the audio signals of the P channels before equalization and the weighting coefficients of the P channels, and the weighting coefficient is less than or equal to 1.
The method according to claim 8, wherein the determining the energy/amplitude sum of the current frame according to the respective energy/amplitude, before energy/amplitude equalization, of the audio signals of the P channels and the respective weighting coefficients of the P channels comprises:

Calculating the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E_{pre}} = \sum_{ch=1}^{P} \alpha(ch)\, E_{pre}(ch)$$

where ch represents the channel index, E_pre(ch) is the energy/amplitude of the audio signal of the ch-th channel before energy/amplitude equalization, α(ch) is the weighting coefficient of the ch-th channel, the weighting coefficients of the two channels of one channel pair are the same, and the magnitude of the weighting coefficients of the two channels of the one channel pair is inversely proportional to the normalized correlation value between the two channels of the one channel pair.
The method according to any one of claims 1 to 9, wherein the audio signals of the P channels further include the audio signals of unpaired Q channels, P=2×K+Q, Q is a positive integer;

Determining the respective bit numbers of the K channel pairs according to the respective energy/amplitude and the number of available bits of the audio signals of the P channels, including:

Determine the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the available number of bits;

The encoding the audio signals of the P channels according to the respective bit numbers of the K channel pairs comprises:

encoding the audio signals of the K channel pairs respectively according to the respective bit numbers of the K channel pairs, and encoding the audio signals of the Q channels respectively according to the respective bit numbers of the Q channels.
The method according to claim 10, wherein the determining the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits comprises:

Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels;

According to the energy/amplitude sum of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame, determine the respective bit coefficients of the K channel pairs;

According to the energy/amplitude sum of the audio signals of the Q channels and the energy/amplitude sum of the current frame, determine the respective bit coefficients of the Q channels;

determining the respective bit numbers of the K channel pairs according to the respective bit coefficients of the K channel pairs and the available number of bits;

The respective bit numbers of the Q channels are determined according to the respective bit coefficients of the Q channels and the available number of bits.
The method according to any one of claims 1 to 11, wherein the encoding the audio signals of the P channels according to the respective bit numbers of the K channel pairs comprises:

The energy/amplitude equalized audio signals of the P channels are encoded according to the respective bit numbers of the K channel pairs.
A multi-channel audio signal encoding device, wherein the device comprises:

An acquisition module, configured to acquire the audio signals of P channels of the current frame of the multi-channel audio signal and the respective energy/amplitude of the audio signals of the P channels, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer;

A bit allocation module, configured to determine the respective bit numbers of the K channel pairs according to the respective energy/amplitude and the number of available bits of the audio signals of the P channels;

An encoding module, configured to encode the audio signals of the P channels according to the respective bit numbers of the K channel pairs to obtain an encoded code stream;

Wherein the energy/amplitude of the audio signal of one channel of the P channels includes at least one of: the energy/amplitude of the audio signal of the one channel in the time domain, the energy/amplitude of the audio signal of the one channel after time-frequency transformation, the energy/amplitude of the audio signal of the one channel after time-frequency transformation and whitening, the energy/amplitude of the audio signal of the one channel after energy/amplitude equalization, or the energy/amplitude of the audio signal of the one channel after stereo processing.
The apparatus according to claim 13, wherein the K channel pairs include a current channel pair, and the encoding module is configured to: determine the respective bit numbers of the two channels in the current channel pair according to the number of bits of the current channel pair and the respective stereo-processed energy/amplitude of the audio signals of the two channels in the current channel pair; and encode the audio signals of the two channels respectively according to the respective bit numbers of the two channels in the current channel pair.
The apparatus according to claim 14, wherein the bit allocation module is configured to:

Determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels;

According to the energy/amplitude sum of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame, determine the respective bit coefficients of the K channel pairs;

The respective bit numbers of the K channel pairs are determined according to the respective bit coefficients of the K channel pairs and the available number of bits.
The device according to claim 15, wherein the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective stereo-processed energy/amplitude of the audio signals of the P channels.
The apparatus according to claim 16, wherein the bit allocation module is configured to:

Calculate the energy/amplitude sum sum_E_post of the current frame according to the formula

$$\mathrm{sum\_E_{post}} = \sum_{ch=1}^{P} E_{post}(ch)$$

where ch represents the channel index, E_post(ch) represents the stereo-processed energy/amplitude of the audio signal of the channel whose channel index is ch and is computed from the coefficients sampleCoef_post(ch, i), i = 1, ..., N, sampleCoef_post(ch, i) represents the i-th coefficient of the current frame of the ch-th channel after stereo processing, N represents the number of coefficients in the current frame, and N is a positive integer greater than 1.
The apparatus according to claim 15, wherein the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, before energy/amplitude equalization, of the audio signals of the P channels, where the energy/amplitude of the audio signal of one channel of the P channels before energy/amplitude equalization includes the energy/amplitude of the audio signal of the one channel in the time domain, or the energy/amplitude of the audio signal of the one channel after time-frequency transformation, or the energy/amplitude of the audio signal of the one channel after time-frequency transformation and whitening.
The apparatus according to claim 18, wherein the bit allocation module is configured to:

Calculate the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E_{pre}} = \sum_{ch=1}^{P} E_{pre}(ch)$$

where ch represents the channel index, and E_pre(ch) represents the energy/amplitude of the audio signal of the channel whose channel index is ch before energy/amplitude equalization.
The apparatus according to claim 15, wherein the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, before energy/amplitude equalization, of the audio signals of the P channels and the respective weighting coefficients of the P channels, where the weighting coefficient is less than or equal to 1.
The apparatus according to claim 20, wherein the bit allocation module is configured to:

Calculate the energy/amplitude sum sum_E_pre of the current frame according to the formula

$$\mathrm{sum\_E_{pre}} = \sum_{ch=1}^{P} \alpha(ch)\, E_{pre}(ch)$$

where ch represents the channel index, E_pre(ch) is the energy/amplitude of the audio signal of the ch-th channel before energy/amplitude equalization, α(ch) is the weighting coefficient of the ch-th channel, the weighting coefficients of the two channels of one channel pair are the same, and the magnitude of the weighting coefficients of the two channels of the one channel pair is inversely proportional to the normalized correlation value between the two channels of the one channel pair.
The device according to any one of claims 13 to 21, wherein the audio signals of the P channels further include audio signals of Q unpaired channels, P=2×K+Q, and Q is a positive integer; the bit allocation module is configured to: determine the respective bit numbers of the K channel pairs and the respective bit numbers of the Q channels according to the respective energy/amplitude of the audio signals of the P channels and the number of available bits; the encoding module is configured to encode the audio signals of the K channel pairs respectively according to the respective bit numbers of the K channel pairs, and to encode the audio signals of the Q channels respectively according to the respective bit numbers of the Q channels.
The apparatus according to claim 22, wherein the bit allocation module is configured to: determine the energy/amplitude sum of the current frame according to the respective energy/amplitude of the audio signals of the P channels; determine the respective bit coefficients of the K channel pairs according to the respective energy/amplitude of the audio signals of the K channel pairs and the energy/amplitude sum of the current frame; determine the respective bit coefficients of the Q channels according to the respective energy/amplitude of the audio signals of the Q channels and the energy/amplitude sum of the current frame; determine the respective bit numbers of the K channel pairs according to the respective bit coefficients of the K channel pairs and the number of available bits; and determine the respective bit numbers of the Q channels according to the respective bit coefficients of the Q channels and the number of available bits.
The device according to any one of claims 13 to 23, characterized in that:

The encoding module is configured to encode the energy/amplitude equalized audio signals of the P channels according to the respective bit numbers of the K channel pairs.
A method for encoding a multi-channel audio signal, comprising:

obtaining audio signals of P channels of a current frame of the multi-channel audio signal, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer;

performing energy/amplitude equalization on the audio signals of the two channels of a current channel pair in the K channel pairs according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair, to obtain the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair;

determining the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair and the number of available bits; and

encoding the audio signals of the two channels respectively according to the respective bit numbers of the two channels of the current channel pair, to obtain an encoded code stream.
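The equalization step of the preceding method can be sketched in Python as follows. The claim does not fix how equalization is carried out, so this sketch assumes one simple rule: measure each channel's RMS amplitude and scale both channels of the pair to the mean of the two. The names `rms` and `equalize_pair` and the choice of RMS as the amplitude measure are hypothetical, and the input lists are assumed to be non-empty.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of one channel's samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def equalize_pair(left, right):
    """Scale both channels of a pair to their mean RMS amplitude.

    Returns the two scaled signals and the amplitude of each after
    equalization, which is the quantity later used for bit allocation.
    """
    amp_l, amp_r = rms(left), rms(right)
    target = (amp_l + amp_r) / 2.0  # assumed equalization target

    # A silent channel is left unscaled to avoid dividing by zero.
    gain_l = target / amp_l if amp_l > 0 else 1.0
    gain_r = target / amp_r if amp_r > 0 else 1.0

    eq_left = [x * gain_l for x in left]
    eq_right = [x * gain_r for x in right]
    return eq_left, eq_right, rms(eq_left), rms(eq_right)


# Hypothetical pair: one loud channel and one quiet channel.
eq_l, eq_r, amp_eq_l, amp_eq_r = equalize_pair([4.0, -4.0], [2.0, -2.0])
```

Under this assumed rule both channels of a pair end up with the same amplitude, so differences in bit numbers arise between pairs rather than within a pair.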
The method according to claim 25, characterized in that P=2×K, K is a positive integer, and the determining of the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair and the number of available bits comprises:

determining the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the P channels; and

determining the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair, and the number of available bits.
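One way to read the two steps above is as a proportional split of the available bits. The formula below is a sketch under that assumption; the symbols are introduced here for illustration and are not taken from the claims: the energy/amplitude of channel i after equalization is written as \tilde{E}_i, the number of available bits as B, and the bit number of channel i as b_i. The floor operation is likewise an assumption.

```latex
\[
E_{\text{sum}} = \sum_{j=1}^{P} \tilde{E}_j ,
\qquad
b_i = \left\lfloor B \cdot \frac{\tilde{E}_i}{E_{\text{sum}}} \right\rfloor ,
\quad i = 1, \dots, P .
\]
```

For example, with P = 4 (K = 2), equalized energies (3, 3, 1, 1) and B = 800, the frame sum is 8 and the bit numbers are (300, 300, 100, 100).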
The method according to claim 25 or 26, wherein the audio signals of the P channels further include audio signals of Q unpaired channels, P=2×K+Q, K is a positive integer, and Q is a positive integer;

the determining of the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair and the number of available bits comprises:

determining the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs, and the energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels;

determining the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude of the audio signals of the two channels of the current channel pair, and the number of available bits; and

determining the respective bit numbers of the Q channels according to the energy/amplitude sum of the current frame, the energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels, and the number of available bits;

the encoding of the audio signals of the two channels according to the respective bit numbers of the two channels of the current channel pair to obtain an encoded code stream comprises:

encoding the audio signals of the K channel pairs respectively according to the respective bit numbers of the K channel pairs, and encoding the audio signals of the Q channels respectively according to the respective bit numbers of the Q channels, to obtain the encoded code stream.
An audio signal encoding device, comprising:

an acquisition module, configured to acquire audio signals of P channels of a current frame of a multi-channel audio signal, where P is a positive integer greater than 1, the audio signals of the P channels include audio signals of K channel pairs, and K is a positive integer;

an energy/amplitude equalization module, configured to perform energy/amplitude equalization on the audio signals of the two channels of a current channel pair in the K channel pairs according to the respective energy/amplitude of the audio signals of the two channels of the current channel pair, to obtain the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair;

a bit allocation module, configured to determine the respective bit numbers of the two channels of the current channel pair according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair and the number of available bits; and

an encoding module, configured to encode the audio signals of the two channels according to the respective bit numbers of the two channels of the current channel pair, to obtain an encoded code stream.
The device according to claim 28, wherein P=2×K, K is a positive integer, and the bit allocation module is configured to:

determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the P channels; and

determine the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of the current channel pair, and the number of available bits.
The device according to claim 28 or 29, wherein the audio signals of the P channels further include audio signals of Q unpaired channels, P=2×K+Q, K is a positive integer, and Q is a positive integer;

the bit allocation module is configured to:

determine the energy/amplitude sum of the current frame according to the respective energy/amplitude, after energy/amplitude equalization, of the audio signals of the two channels of each of the K channel pairs, and the energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels;

determine the respective bit numbers of the two channels of the current channel pair according to the energy/amplitude sum of the current frame, the respective energy/amplitude of the audio signals of the two channels of the current channel pair, and the number of available bits; and

determine the respective bit numbers of the Q channels according to the energy/amplitude sum of the current frame, the energy/amplitude, after energy/amplitude equalization, of the audio signals of the Q channels, and the number of available bits;

the encoding module is configured to:

encode the audio signals of the K channel pairs respectively according to the respective bit numbers of the K channel pairs, and encode the audio signals of the Q channels respectively according to the respective bit numbers of the Q channels, to obtain the encoded code stream.
An audio signal encoding device, comprising: a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to perform the method according to any one of claims 1 to 12, or to perform the method according to any one of claims 25 to 27.
An audio signal encoding device, characterized in that it comprises an encoder, the encoder being configured to perform the method according to any one of claims 1 to 12, or to perform the method according to any one of claims 25 to 27.
A computer-readable storage medium, characterized by comprising a computer program which, when executed on a computer, causes the computer to perform the method according to any one of claims 1 to 12, or causes the computer to perform the method according to any one of claims 25 to 27.
A computer-readable storage medium, characterized in that it comprises an encoded code stream obtained by the method according to any one of claims 1 to 12, or an encoded code stream obtained by the method according to any one of claims 25 to 27.