KR20150009476A - Method for encoding and decoding of multi channel audio signal, encoder and decoder - Google Patents

Method for encoding and decoding of multi channel audio signal, encoder and decoder

Publication number
KR20150009476A
KR20150009476A KR20140089722A KR20140089722A KR20150009476A KR 20150009476 A KR20150009476 A KR 20150009476A KR 20140089722 A KR20140089722 A KR 20140089722A KR 20140089722 A KR20140089722 A KR 20140089722A KR 20150009476 A KR20150009476 A KR 20150009476A
Authority
KR
South Korea
Prior art keywords
channel
audio signal
data
size
encoding
Application number
KR20140089722A
Other languages
Korean (ko)
Inventor
이용주
서정일
유재현
강경옥
김진웅
백승권
성종모
이태진
Original Assignee
Electronics and Telecommunications Research Institute (한국전자통신연구원)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-07-16
Filing date
2014-07-16
Publication date
2015-01-26
Application filed by Electronics and Telecommunications Research Institute (한국전자통신연구원)
Priority to US14/333,092 (published as US20150025894A1)
Publication of KR20150009476A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Disclosed are an encoding method and a decoding method for a multi-channel audio signal, and an encoder and a decoder for performing the same. The present invention can improve the encoding efficiency of a multi-channel audio signal by performing encoding in consideration of the size of data allocated to each channel according to the characteristics of the multi-channel audio signal.

Description

METHOD FOR ENCODING AND DECODING OF MULTI CHANNEL AUDIO SIGNAL, ENCODER AND DECODER

The present invention relates to an encoding method and a decoding method for a multi-channel audio signal, and to an encoder and a decoder for performing the methods. More particularly, the present invention relates to a method and apparatus for allocating a different bit rate to each audio frame of each channel.

Recently, as the quality of multimedia content has improved, content that includes multi-channel audio signals with more channels than 5.1, such as 7.1, 10.2, 13.2 and 22.2 channels, is being produced. For example, there are attempts to use 13.2-channel audio signals in the movie field and 10.2-channel and 22.2-channel audio signals in high-quality broadcasting such as UHDTV.

Because a multi-channel audio signal involves a large amount of data, it is important to encode it efficiently. Conventional audio encoding techniques assign the same bit rate to every channel, or encode every segment of each channel's audio signal at a substantially constant bit rate.

Other audio coding techniques encode audio with a variable bit rate (VBR). Such techniques may be efficient for an audio signal with a small number of channels, where the signals differ little from channel to channel. For a multi-channel audio signal such as a 10.2-channel or 22.2-channel signal, however, the signals may differ considerably from channel to channel, so the efficiency may not be good.

Therefore, there is a need for a method for efficiently encoding a multi-channel audio signal.

The present invention provides a method and apparatus for assigning different bit rates to audio frames for respective channels when encoding multi-channel audio signals.

The present invention provides a method and apparatus capable of providing a high-quality multi-channel audio signal even under the same bit rate environment.

According to an embodiment of the present invention, there is provided an encoding method comprising: extracting a characteristic of the audio signal of each channel in a multi-channel audio signal; setting a size of data to be allocated to each channel based on the extracted characteristic of each channel; and encoding the audio signal of each channel based on the size of data to be allocated to that channel.

The step of extracting the characteristic of each channel may include extracting energy for each channel for each of a plurality of frames constituting the audio signal, and the step of setting the size of the data may set the size of the data to be allocated to each channel in correspondence with the extracted energy of that channel.

The step of setting the size of the data may set a size of data proportional to the extracted energy of each channel.

The step of setting the size of the data may allocate the same or different data sizes across channels for frames of the same order, and may allocate the same or different data sizes to frames of different orders within the same channel.

The step of encoding the audio signal of each channel may encode a mono audio signal or a stereo audio signal.

The encoding method may further include generating a bitstream by multiplexing the encoded audio signal of each channel.

According to an embodiment of the present invention, there is provided a decoding method comprising: extracting an encoded audio signal for each channel from a bitstream; and decoding the audio signal for each channel based on the size of data allocated for each channel.

The decoding may restore a mono audio signal or a stereo audio signal from the encoded audio signal for each channel.

The size of the data allocated for each channel may be determined based on the energy per channel for each of a plurality of frames constituting the audio signal for each channel.

An encoder according to an embodiment of the present invention includes: a channel characteristic extraction unit for extracting a characteristic of the audio signal of each channel in a multi-channel audio signal; a data setting unit for setting a size of data to be allocated to each channel based on the extracted characteristic of each channel; and a plurality of encoding units for encoding the audio signal of each channel based on the size of data to be allocated to that channel.

The channel characteristic extraction unit may extract energy for each channel for each of a plurality of frames constituting the audio signal, and the data setting unit may set the size of data to be allocated to each channel in correspondence with the extracted energy of that channel.

The data setting unit may set a size of data proportional to the extracted energy of each channel.

The data setting unit may allocate the same or different data sizes across channels for frames of the same order, and may allocate the same or different data sizes to frames of different orders within the same channel.

Each of the plurality of encoding units may encode a mono audio signal or a stereo audio signal.

The encoder may further include a bitstream generation unit for generating a bitstream by multiplexing the encoded audio signals of the channels.

According to an embodiment of the present invention, there is provided a decoder including: a bitstream analyzer for extracting an encoded audio signal for each channel from a bitstream; and a plurality of decoding units for decoding the audio signal for each channel based on the size of data allocated for each channel.

Each of the plurality of decoding units may recover a mono audio signal or a stereo audio signal from the encoded audio signal of each channel.

The size of the data allocated for each channel may be determined based on the energy per channel for each of a plurality of frames constituting the audio signal for each channel.

According to an embodiment of the present invention, encoding efficiency of a multi-channel audio signal can be improved by assigning different bit rates to audio frames of each channel.

According to an embodiment of the present invention, a high-quality multi-channel audio signal can be provided even in the same bit rate environment.

According to an embodiment of the present invention, a high-quality audio signal can be reproduced in a reproducing terminal including a decoder by encoding a multi-channel audio signal more efficiently.

FIG. 1 is a diagram illustrating an encoder and a decoder in accordance with one embodiment.
FIG. 2 is a diagram showing a detailed configuration of an encoder according to an embodiment.
FIG. 3 is a diagram illustrating operation of an encoder in accordance with one embodiment.
FIG. 4 is a diagram illustrating a size of data for each channel according to an embodiment.
FIG. 5 is a diagram showing a detailed configuration of a decoder according to an embodiment.
FIG. 6 is a diagram illustrating an operation of a decoder according to an embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an encoder and a decoder in accordance with one embodiment.

Referring to Figure 1, an encoder 101 and a decoder 102 are shown. The encoder 101 may encode a multi-channel audio signal to generate a bitstream. Then, the decoder 102 can recover the multi-channel audio signal from the bitstream.

The encoder 101 may encode a multi-channel audio signal having a plurality of channels using a plurality of encoding units that encode independently of one another. The encoder 101 can allocate a different number of bits to each channel over time. Here, the allocated bits correspond to the size of the data. In other words, the encoder 101 may variably allocate bits to each frame included in the audio signal of each channel. Each of the plurality of encoding units can then perform encoding in consideration of the size of the data allocated according to the characteristics of each channel.

Here, the characteristic of each channel means a characteristic of each frame of that channel; for example, the characteristic may include the magnitude of the energy of a frame. The size of the data means the number of bits used for encoding. In other words, by variably allocating the size of the data according to the characteristics of each channel, the encoder 101 can encode the multi-channel audio signal efficiently while maintaining its quality.

The decoder 102 may decode a multi-channel audio signal having a plurality of channels from a bitstream using a plurality of decoding units that decode independently of one another. Each of the plurality of decoding units can decode based on the channel-specific characteristic determined by the encoder 101.

FIG. 2 is a diagram showing a detailed configuration of an encoder according to an embodiment.

Referring to FIG. 2, the encoder 101 may include a channel characteristic extraction unit 201, a data setting unit 202, a plurality of encoding units 203, and a bitstream generation unit 204.

The channel characteristic extraction unit 201 may extract the characteristics of the multi-channel audio signals corresponding to the plurality of channels according to time intervals. Specifically, the channel characteristic extracting unit 201 can extract, for each channel, characteristics of each frame of the multi-channel audio signal. That is, the channel characteristic extracting unit 201 can extract the characteristics of a plurality of frames included in the audio signal corresponding to each of the plurality of channels in the multi-channel audio signal.

Here, the frame may be divided according to the time interval of the audio signal. For example, the characteristic of the audio signal per channel may mean the amount of energy contained in the frame included in the audio signal corresponding to each channel. Specifically, when the audio signal corresponding to the channel 1 and the channel 2 includes N frames, the channel characteristic extraction unit 201 can determine the energy of each of the N frames on a channel-by-channel basis. The magnitude of the energy corresponding to each of the frames may be different for each channel.
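
The per-frame, per-channel energy extraction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not define how energy is computed, so the sum of squared samples is an assumption, and the names `frame_energies`, `channel_frame_energies` and `frame_len` are hypothetical.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Return the energy of each full frame of one channel.

    Energy is taken here as the sum of squared samples (an assumption;
    the source text only says "energy"). A trailing partial frame is dropped.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def channel_frame_energies(
    channels: Sequence[Sequence[float]], frame_len: int
) -> List[List[float]]:
    """Return energies[c][t]: the energy of frame t in channel c."""
    return [frame_energies(ch, frame_len) for ch in channels]
```

With two channels of four samples each and `frame_len=2`, the result is a 2-by-2 table of energies, one row per channel and one column per frame.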

The data setting unit 202 may set the size of data to be allocated to each channel based on the characteristic of the audio signal of that channel. Here, the size of the data may be the number of bits used to encode the frames included in the audio signal of each channel. The larger the size of the data, the more bits are used, so the bit rate also increases.

Specifically, the data setting unit 202 can determine the output bits of each channel based on the characteristic of the audio signal of each channel in each time interval. The output bit rate of each channel can be determined in units of one frame or of multiple frames. In addition, when the bits allocated to all channels are summed frame by frame, the totals are the same or similar from frame to frame.
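
One way to write this down, combining the proportional rule from the summary with the roughly constant per-frame total, is sketched below. The symbols are assumptions that the original text does not define: $C$ is the number of channels, $t$ the frame index, $E_{c,t}$ the energy of frame $t$ in channel $c$, $b_{c,t}$ the bits allocated to it, and $B$ a fixed per-frame bit budget.

```latex
b_{c,t} = B \cdot \frac{E_{c,t}}{\sum_{k=1}^{C} E_{k,t}},
\qquad
\sum_{c=1}^{C} b_{c,t} = B \quad \text{for every frame } t .
```

In practice each $b_{c,t}$ would have to be rounded to a whole number of bits, which is one possible reason the per-frame totals may be similar rather than exactly equal.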

For example, the data setting unit 202 may allocate different sizes of data for encoding a multi-channel audio signal to different channels, even for frames of the same order. For example, the data allocated to frame 1 of the audio signal of channel 1 and the data allocated to frame 1 of the audio signal of channel 2 may be different from each other. However, if the data allocated to all channels are summed for each frame of the multi-channel audio signal, the totals may be the same or similar from frame to frame. This will be described in more detail with reference to FIG. 4.

The data setting unit 202 may allocate data to each frame differently for each channel. For example, a channel with high energy may be allocated a large data size for encoding, and a channel with low energy or no audio signal may be allocated a small data size.
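
A minimal sketch of such an energy-proportional allocation for one frame is shown below. It assumes a fixed per-frame budget and uses largest-remainder rounding so that the per-channel bits sum exactly to that budget; both choices, and the names `allocate_bits` and `frame_budget`, are illustrative assumptions rather than details taken from the patent.

```python
from fractions import Fraction
from typing import List, Sequence


def allocate_bits(energies: Sequence[float], frame_budget: int) -> List[int]:
    """Split frame_budget bits across channels in proportion to energy.

    Exact rational arithmetic avoids floating-point drift; leftover bits
    after rounding down go to the channels with the largest fractional parts.
    """
    n = len(energies)
    if n == 0:
        return []
    if frame_budget < 0 or any(e < 0 for e in energies):
        raise ValueError("budget and energies must be non-negative")
    exact = [Fraction(e) for e in energies]
    total = sum(exact)
    if total == 0:
        # Silent frame: fall back to an even split (an assumption).
        shares = [Fraction(frame_budget, n)] * n
    else:
        shares = [frame_budget * e / total for e in exact]
    bits = [s.numerator // s.denominator for s in shares]  # round down
    leftover = frame_budget - sum(bits)
    by_fraction = sorted(range(n), key=lambda i: shares[i] - bits[i], reverse=True)
    for i in by_fraction[:leftover]:
        bits[i] += 1
    return bits
```

Calling this once per frame gives allocations that vary across channels and over time while the per-frame total stays at `frame_budget`.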

The plurality of encoding units 203 may encode audio signals corresponding to one channel (mono) or two channels (stereo) for multi-channel audio signals based on the size of data allocated for each channel. The plurality of encoding units 203 shown in FIG. 2 encode audio signals corresponding to two channels and downmix them into audio signals corresponding to one channel. The encoding units 203 corresponding to the respective channels can independently perform encoding.

The results encoded by the plurality of encoding units 203 for each channel may be multiplexed by the bitstream generation unit 204 into a single bitstream.
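
The multiplexing step can be illustrated with a small sketch. The patent does not specify a bitstream syntax, so the layout used here is purely hypothetical: each frame starts with a one-byte channel count, followed, for every channel, by a 16-bit big-endian payload length and the payload itself. Sizes are in bytes rather than bits to keep the example short, and `mux_frame` and `mux_stream` are illustrative names.

```python
import struct
from typing import List


def mux_frame(payloads: List[bytes]) -> bytes:
    """Pack one frame: channel count, then a length-prefixed payload per channel."""
    if len(payloads) > 0xFF:
        raise ValueError("at most 255 channels fit in the 1-byte count")
    out = bytearray(struct.pack(">B", len(payloads)))
    for payload in payloads:
        if len(payload) > 0xFFFF:
            raise ValueError("payload too large for the 16-bit length field")
        out += struct.pack(">H", len(payload))  # per-channel data size
        out += payload
    return bytes(out)


def mux_stream(frames: List[List[bytes]]) -> bytes:
    """Concatenate the packed frames into a single bitstream."""
    return b"".join(mux_frame(frame) for frame in frames)
```

Carrying the per-channel size next to each payload is one simple way to let a receiver recover the allocation without recomputing the energies.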

FIG. 3 is a diagram illustrating operation of an encoder in accordance with one embodiment.

In step 301, the encoder 101 can extract the characteristics of the audio signal for each channel. Here, the characteristic of the audio signal is a per-frame characteristic, where each frame corresponds to a time interval of the multi-channel audio signal, and it may be the same or different from channel to channel. The characteristic of the audio signal may mean the energy of the frames included in the audio signal of each channel.

In step 302, the encoder 101 may set the size of the data to be allocated to each channel based on the extracted characteristics. Specifically, the greater the energy of a frame, the larger the encoder 101 can make the size of the data used to encode that frame. Here, the size of the data means the number of bits used for encoding.

In step 303, the encoder 101 can encode an audio signal for each channel based on the size of data allocated for each channel. Here, the encoder 101 can independently encode audio signals for each channel using a plurality of encoding units. At this time, each of the plurality of encoding units may encode an audio signal corresponding to one channel of a mono type or two channels of a stereo type.

In step 304, the encoder 101 may generate a bitstream by multiplexing the encoded audio signal for each channel.

FIG. 4 is a diagram illustrating a size of data for each channel according to an embodiment.

In particular, referring to FIG. 4, the multi-channel audio signal is composed of 10 channels, channel 1 to channel 10. It is assumed that each of the plurality of encoding units encodes two channels together in a stereo format. The audio signal of each channel is composed of frames 1 to N. The size of the data for each channel within one frame may be the same or different. The size of the data for a channel in one frame may also be the same as or different from its size in the previous frame. This is because the frames may have the same or different energy depending on the channel.

For example, the size (bits) of data allocated to channels 1 and 2 in frame 1 and the size (bits) of data allocated to channels 3 and 4 may be different. On the other hand, even if the channels 1 and 2 are the same, the sizes of data allocated to the frames 1 and 2 may be different. Here, the size of data allocated for each channel is related to the energy determined for each channel for a frame classified according to the time interval of the multi-channel audio signal. Specifically, the size of data allocated when performing encoding may be related to the amount of energy determined in a particular frame. In Fig. 4, the size of the data corresponds to the length of each block.

That is, referring to FIG. 4, the size of data for each channel allocated to each frame may be determined according to characteristics of each channel. Then, the size of the data for each channel may be the same or different even if it is the same frame. In addition, the size of data allocated to each frame may be the same or different even if the same channel is used.

In the case of FIG. 4, since it is assumed that each of the plurality of encoding units couples and encodes two channels, the two channels that are encoded together are assigned the same data size. If the plurality of encoding units instead encode the audio signal of each channel in mono form, the sizes of data allocated to channels 1 and 2 may be different from each other. That is, for a single frame, a separate data size may be allocated to each of the 10 channels.
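
The difference between the stereo and mono cases can be sketched by grouping channels into coding units before sizes are assigned. The helper names below are hypothetical, channel indices are zero-based, and summing the energies of a coupled pair is an assumption; the text only indicates that coupled channels end up with the same data size.

```python
from typing import List, Sequence, Tuple


def coding_units(n_channels: int, stereo: bool) -> List[Tuple[int, ...]]:
    """Group channel indices into stereo pairs or single mono channels."""
    if stereo:
        if n_channels % 2:
            raise ValueError("stereo coupling needs an even number of channels")
        return [(c, c + 1) for c in range(0, n_channels, 2)]
    return [(c,) for c in range(n_channels)]


def unit_energies(
    channel_energies: Sequence[float], units: Sequence[Tuple[int, ...]]
) -> List[float]:
    """Combine per-channel energies of one frame into per-unit energies."""
    return [sum(channel_energies[c] for c in unit) for unit in units]


def per_channel_sizes(
    unit_sizes: Sequence[int], units: Sequence[Tuple[int, ...]]
) -> List[int]:
    """Report the size each channel sees; coupled channels share one size."""
    sizes = {}
    for size, unit in zip(unit_sizes, units):
        for c in unit:
            sizes[c] = size
    return [sizes[c] for c in sorted(sizes)]
```

For the 10-channel signal of FIG. 4, `coding_units(10, True)` yields five pairs, whereas `coding_units(10, False)` yields ten independent units, each of which may receive its own size.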

FIG. 5 is a diagram showing a detailed configuration of a decoder according to an embodiment.

Referring to FIG. 5, the decoder 102 may include a bitstream analyzer 501 and a plurality of decoding units 502.

The bitstream analyzer 501 analyzes the bitstream generated by the encoder 101 and extracts an object to be decoded. Specifically, the bitstream analyzer 501 can demultiplex the bitstream to extract the audio signal for each channel encoded from the bitstream and the size of the data allocated for each channel.

Each of the plurality of decoding units 502 may decode the encoded audio signal for each channel based on the size of data allocated for each channel. Then, the original multi-channel audio signal can be restored.

FIG. 6 is a diagram illustrating an operation of a decoder according to an embodiment.

In step 601, the decoder 102 may extract the encoded audio signal per channel from the bitstream. The decoder 102 may extract the size of data allocated for each channel used when encoding the audio signal for each channel from the bit stream.

In step 602, the decoder 102 may decode the encoded audio signal according to the size of the per-channel data using a plurality of decoding units. The original multi-channel audio signal can be restored according to the decoded result.
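
The extraction in step 601 can be sketched as a demultiplexer. Because the patent does not define a bitstream syntax, this example assumes a hypothetical layout: each frame begins with a one-byte channel count, followed, for every channel, by a 16-bit big-endian payload length and then the payload. The function names are illustrative, and the per-channel decoding of step 602 is left out.

```python
import struct
from typing import List, Tuple


def demux_frame(stream: bytes, offset: int = 0) -> Tuple[List[bytes], int]:
    """Read one frame at offset; return its per-channel payloads and the next offset."""
    (n_channels,) = struct.unpack_from(">B", stream, offset)
    offset += 1
    payloads = []
    for _ in range(n_channels):
        (size,) = struct.unpack_from(">H", stream, offset)  # allocated data size
        offset += 2
        if offset + size > len(stream):
            raise ValueError("truncated channel payload")
        payloads.append(stream[offset:offset + size])
        offset += size
    return payloads, offset


def demux_stream(stream: bytes) -> List[List[bytes]]:
    """Split a whole bitstream into frames of per-channel payloads."""
    frames = []
    offset = 0
    while offset < len(stream):
        payloads, offset = demux_frame(stream, offset)
        frames.append(payloads)
    return frames
```

Each payload, together with its recovered size, would then be handed to the decoding unit for that channel.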

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device may be described as being used singly, but those skilled in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations are also possible, such as a parallel processor.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or in a transmitted signal wave. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments or may be known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, flash memory, and the like. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. For example, suitable results may be achieved even if the described techniques are performed in a different order than the described methods, and/or if components of the described systems, structures, devices, or circuits are combined in a different form, or are replaced or supplemented by other components or their equivalents. Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

101: Encoder
102: Decoder

Claims (15)

1. An encoding method comprising:
extracting a channel-specific characteristic of an audio signal from a multi-channel audio signal;
setting a size of data to be allocated to each channel based on the extracted characteristic of each channel; and
encoding the audio signal for each channel based on the size of data to be allocated for each channel.

2. The encoding method of claim 1, wherein the extracting of the channel-specific characteristic comprises extracting energy for each channel for each of a plurality of frames constituting the audio signal, and
wherein the setting of the size of the data comprises setting the size of data to be allocated to each channel in correspondence with the extracted energy for each channel.

3. The encoding method of claim 2, wherein the setting of the size of the data comprises setting a size of data proportional to the extracted energy of each channel.

4. The encoding method of claim 1, wherein the setting of the size of the data comprises:
allocating the same or different data sizes to each channel for frames corresponding to the same order; and
allocating the same or different data sizes to frames of different orders of the same channel.

5. The encoding method of claim 1, wherein the encoding of the audio signal for each channel comprises encoding a mono audio signal or a stereo audio signal.

6. The encoding method of claim 1, further comprising generating a bitstream by multiplexing the encoded audio signal for each channel.

7. A decoding method comprising:
extracting an encoded audio signal per channel from a bitstream; and
decoding the audio signal for each channel based on the size of data allocated for each channel.

8. The decoding method of claim 7, wherein the decoding comprises recovering a mono audio signal or a stereo audio signal from the encoded audio signal per channel.

9. The decoding method of claim 7, wherein the size of the data allocated for each channel is determined based on energy per channel for each of a plurality of frames constituting the audio signal per channel.

10. An encoder comprising:
a channel characteristic extraction unit for extracting a characteristic of each channel of an audio signal from a multi-channel audio signal;
a data setting unit for setting a size of data to be allocated to each channel based on the extracted characteristic of each channel; and
a plurality of encoding units for encoding the audio signal for each channel based on the size of data to be allocated for each channel.

11. The encoder of claim 10, wherein the channel characteristic extraction unit extracts energy for each channel for each of a plurality of frames constituting the audio signal, and
wherein the data setting unit sets the size of data to be allocated to each channel in correspondence with the extracted energy for each channel.

12. The encoder of claim 11, wherein the data setting unit sets a size of data proportional to the extracted energy of each channel.

13. The encoder of claim 10, wherein the data setting unit allocates the same or different data sizes to each channel for frames corresponding to the same order, and allocates the same or different data sizes to frames of different orders of the same channel.

14. The encoder of claim 10, wherein each of the plurality of encoding units encodes a mono audio signal or a stereo audio signal.

15. The encoder of claim 10, further comprising a bitstream generation unit for generating a bitstream by multiplexing the encoded audio signals for each channel.

KR20140089722A 2013-07-16 2014-07-16 Method for encoding and decoding of multi channel audio signal, encoder and decoder KR20150009476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/333,092 US20150025894A1 (en) 2013-07-16 2014-07-16 Method for encoding and decoding of multi channel audio signal, encoder and decoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130083312 2013-07-16
KR20130083312 2013-07-16

Publications (1)

Publication Number Publication Date
KR20150009476A 2015-01-26

Family

ID=52572684

Family Applications (1)

Application Number Title Priority Date Filing Date
KR20140089722A KR20150009476A (en) 2013-07-16 2014-07-16 Method for encoding and decoding of multi channel audio signal, encoder and decoder

Country Status (1)

Country Link
KR (1) KR20150009476A (en)

Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination