CN103650036A - Method for encoding multichannel digital audio - Google Patents

Info

Publication number
CN103650036A
Authority
CN
China
Prior art keywords
frequency band
whole frequency
layer
data frame
primary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201280000959.4A
Other languages
Chinese (zh)
Other versions
CN103650036B (en)
Inventor
闫建新
王磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Guangsheng Research And Development Institute Co ltd
Original Assignee
Shenzhen Rising Source Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Rising Source Technology Co ltd
Publication of CN103650036A
Application granted
Publication of CN103650036B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Stereophonic System (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention provides a method for encoding multichannel digital audio, comprising: dividing multichannel audio into a base layer and at least one enhancement layer; configuring the number of bytes for the base layer and the at least one enhancement layer respectively; and encoding the base layer and the at least one enhancement layer respectively. According to the present invention, decrease of the encoding efficiency that is caused by fine layering is avoided to some extent, and at the same time, applications such as digital audio broadcast in some fields are satisfied. The implementation of the present invention is easy, optimal comprehensive sound quality is obtained by flexibly controlling the quality of a sound channel at each layer, channel encoding requirements are easy to be satisfied, and various limiting conditions during fine layering are not required, thereby ensuring compression at higher efficiency.

Description

Method for encoding multichannel digital audio
Technical field
The present invention relates to the field of audio coding, and in particular to a method for encoding multichannel digital audio.
Background
In the field of layered (scalable) coding of multichannel digital audio, there are lossy digital audio coding methods and lossless audio coding techniques that work by fine-grained layering. Examples are MPEG-4 BSAC (Bit-Sliced Arithmetic Coding) in ISO/IEC 14496-3, the coding method similar to MPEG-4 BSAC used in AVS (Audio Video coding Standard Workgroup of China), and the lossless enhancement-layer mode of MPEG-4 SLS (Scalable Lossless Coding). All of these layer the audio finely and encode each layer separately. Fine-grained layering has drawbacks, however: the layers are too fine, a large amount of side information is required, coding efficiency is low, and the structure and processing logic are complex.
The prior art also contains a coding scheme that is not finely layered: the scalable-sample-rate coding algorithm AAC-SSR (Advanced Audio Coding - Scalable Sampling Rate), provided in both MPEG-4 Part 3 and MPEG-2 Part 7. It was first proposed by Sony, and its coding architecture resembles Sony's proprietary ATRAC (Adaptive Transform Acoustic Coding). The scheme first splits the input digital audio signal into 4 frequency bands with a 4-band polyphase quadrature filter bank (PQF, Polyphase Quadrature Filter). Each of the 4 bands then undergoes either one 256-point MDCT (window length 512 samples) or eight 32-point MDCTs (window length 64 samples). The scheme can also lower the data rate by removing the upper PQF bands, so that bitstream layering is achieved by reducing the number of bands, yielding different bit rates and sample rates. Its advantage is that long-block or short-block MDCT can be selected independently in each band: high frequencies can be coded with short blocks for better time resolution, while low frequencies are coded with long blocks for high frequency resolution. However, because the 4 PQF bands alias into one another, the coding efficiency of the transform-domain coefficients near the band boundaries declines.
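The band arithmetic behind this kind of band-dropping scalability can be sketched as follows. This is only an illustration, assuming four equal-width PQF bands covering 0 to half the sample rate; the function names and the 48 kHz figure in the usage lines are hypothetical and are not taken from the patent or from the AAC-SSR specification.

```python
def pqf_band_edges(sample_rate_hz: int, bands: int = 4) -> list:
    """Return (low, high) edges in Hz of equal-width bands covering 0..fs/2.

    Equal-width bands are an assumption made for this sketch.
    """
    width = sample_rate_hz / 2 / bands
    return [(i * width, (i + 1) * width) for i in range(bands)]


def bandwidth_after_dropping(sample_rate_hz: int, kept_bands: int, bands: int = 4) -> float:
    """Audio bandwidth left when only the lowest `kept_bands` bands are kept."""
    if not 1 <= kept_bands <= bands:
        raise ValueError("kept_bands must be between 1 and bands")
    return sample_rate_hz / 2 * kept_bands / bands


# Coefficients per band per frame: one long block or eight short blocks.
LONG_BLOCK_COEFFS = 1 * 256
SHORT_BLOCK_COEFFS = 8 * 32

if __name__ == "__main__":
    print(pqf_band_edges(48000))
    print(bandwidth_after_dropping(48000, kept_bands=3))
```

Both block choices produce the same number of coefficients per band, which is why the long/short decision can be made independently in each band.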
Summary of the invention
To solve the above technical problems, the present invention proposes a method for encoding multichannel digital audio, comprising: dividing the multichannel audio into one base layer and at least one enhancement layer; configuring a number of bytes for the base layer and for the at least one enhancement layer respectively; and encoding the base layer and the at least one enhancement layer separately.
Preferably, the multichannel audio signal is divided into one base layer and one enhancement layer, wherein the base layer contains at least one full-band channel and the enhancement layer contains at least one full-band channel; the number of full-band channels in the base layer is not greater than the number of full-band channels in the enhancement layer.
Preferably, where the number of full-band channels in the base layer is less than the number in the enhancement layer, the method further comprises: configuring the base layer with (total data-frame bytes)/2, so that the number of bytes per base-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the base layer); and configuring the enhancement layer with (total data-frame bytes)/2, so that the number of bytes per enhancement-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the enhancement layer).
Preferably, where the number of full-band channels in the base layer equals the number in the enhancement layer, the method further comprises: configuring the base layer with more than (total data-frame bytes)/2; and configuring the enhancement layer with fewer than (total data-frame bytes)/2.
Preferably, the method further comprises: configuring every full-band channel with the same number of bytes, namely (total data-frame bytes)/(number of full-band channels in the base layer + number of full-band channels in the enhancement layer).
Preferably, the method further comprises: configuring each full-band channel in the base layer with (total data-frame bytes)/(number of full-band channels in the base layer) bytes, where (total data-frame bytes/2) > (total data-frame bytes/number of full-band channels in the base layer) > (total data-frame bytes/(number of full-band channels in the base layer + number of full-band channels in the enhancement layer)); configuring one channel of the enhancement layer with more than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes; and configuring at least one of the remaining channels with fewer than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes.
Preferably, the numbers of bytes for the base layer and the enhancement layer are configured according to the block size of the LDPC code in each transmission frame, the channel-coding conditions, and the characteristics of the base layer and/or the enhancement layer.
Preferably, the multichannel audio signal is divided into one base layer and multiple enhancement layers, wherein the base layer contains at least one full-band channel and each of the enhancement layers contains at least one full-band channel; the number of full-band channels in the base layer is not greater than the total number of full-band channels in the enhancement layers.
Preferably, the base layer is configured with (total data-frame bytes)/2, so that the number of bytes per base-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the base layer); the numbers of bytes configured for the at least one enhancement layer sum to (total data-frame bytes)/2, wherein the number of bytes of each full-band channel of the first enhancement layer is greater than (total data-frame bytes)/(2 × (number of full-band channels in the enhancement layers + number of full-band channels in the base layer)), and the number of bytes of each full-band channel of the remaining at least one enhancement layer is less than (total data-frame bytes)/(2 × (number of full-band channels in the enhancement layers + number of full-band channels in the base layer)).
Preferably, every full-band channel is configured with the same number of bytes, namely (total data-frame bytes)/(number of full-band channels in the base layer + total number of full-band channels in all enhancement layers).
Preferably, each full-band channel in the base layer is configured with (total data-frame bytes)/(number of full-band channels in the base layer) bytes, where (total data-frame bytes/2) > (total data-frame bytes/number of full-band channels in the base layer) > (total data-frame bytes/(number of full-band channels in the base layer + total number of full-band channels in all enhancement layers)); one channel of the first enhancement layer is configured with more than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(total number of full-band channels in all enhancement layers) bytes, and at least one of the remaining channels is configured with fewer than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(total number of full-band channels in all enhancement layers) bytes.
Preferably, the numbers of bytes for the base layer and the at least one enhancement layer are configured according to the block size of the LDPC code in each transmission frame, the channel-coding conditions, and the characteristics of the base layer and/or the enhancement layers.
Preferably, the method further comprises: encoding the base layer and the at least one enhancement layer each with the DRA coding algorithm.
Preferably, the method further comprises: applying bandwidth extension to the base layer and/or the at least one enhancement layer respectively.
The present invention also proposes a method for encoding multichannel digital audio, comprising: dividing the multichannel audio signal into one base layer and one enhancement layer, wherein the base layer contains at least one full-band channel and the enhancement layer contains at least one full-band channel, and the number of full-band channels in the base layer is not greater than the number in the enhancement layer; and configuring a number of bytes for the base layer and for the enhancement layer respectively; wherein each full-band channel in the base layer is configured with (total data-frame bytes)/(number of full-band channels in the base layer) bytes, where (total data-frame bytes/2) > (total data-frame bytes/number of full-band channels in the base layer) > (total data-frame bytes/(number of full-band channels in the base layer + number of full-band channels in the enhancement layer)); one channel of the enhancement layer is configured with more than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes, and at least one of the remaining channels is configured with fewer than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes; and the base layer and the enhancement layer are each encoded with the DRA coding algorithm.
The present invention also proposes a method for encoding multichannel digital audio, comprising: dividing the multichannel audio signal into one base layer and multiple enhancement layers, wherein the base layer contains at least one full-band channel and each of the enhancement layers contains at least one full-band channel, and the number of full-band channels in the base layer is not greater than the total number of full-band channels in all enhancement layers; and configuring a number of bytes for the base layer and for the at least one enhancement layer respectively; wherein each full-band channel in the base layer is configured with (total data-frame bytes)/(number of full-band channels in the base layer) bytes, where (total data-frame bytes/2) > (total data-frame bytes/number of full-band channels in the base layer) > (total data-frame bytes/(number of full-band channels in the base layer + total number of full-band channels in all enhancement layers)); one channel of the first enhancement layer is configured with more than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(total number of full-band channels in all enhancement layers) bytes, and at least one of the remaining channels is configured with fewer than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(total number of full-band channels in all enhancement layers) bytes; and the base layer and the at least one enhancement layer are each encoded with the DRA coding algorithm.
The present invention avoids, to some extent, the loss of coding efficiency caused by fine-grained layering, while still serving applications in certain fields such as digital audio broadcasting. The invention is simple to implement. By flexibly controlling the quality of the channels in each layer it obtains the best overall sound quality, it easily meets channel-coding requirements, and it is free of the various restrictions that fine-grained layering imposes, which ensures more efficient compression.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of the present invention;
Fig. 2 is a schematic diagram of the two-layer structure of multichannel digital audio in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the multi-layer structure of multichannel digital audio in an embodiment of the present invention;
Fig. 4 is a schematic diagram of a two-layer structure for left/right-channel stereo in an embodiment of the present invention;
Fig. 5 is a schematic diagram of a two-layer structure for sum/difference-channel stereo in an embodiment of the present invention;
Fig. 6 is a schematic diagram of a two-layer surround-sound structure in an embodiment of the present invention;
Fig. 7 is a schematic diagram of a two-layer surround-sound structure in an embodiment of the present invention;
Fig. 8 is a schematic diagram of a three-layer surround-sound structure in an embodiment of the present invention;
Fig. 9 is a schematic diagram of a three-layer surround-sound structure in an embodiment of the present invention;
Fig. 10 is a schematic diagram of a DRA & DRA+ layered surround-sound structure in an embodiment of the present invention.
Detailed description
To explain the technical content, structural features, objects, and effects of the present invention in detail, embodiments are described below with reference to the accompanying drawings.
Referring to the flowchart in Fig. 1, the method for encoding multichannel digital audio according to the first embodiment of the present invention comprises:
Step S1: dividing the multichannel audio into one base layer and at least one enhancement layer;
Step S2: configuring a number of bytes for the base layer and for the at least one enhancement layer respectively;
Step S3: encoding the base layer and the at least one enhancement layer separately.
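The three steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented encoder: `split_layers`, `encode_frame`, and `stub_encoder` are hypothetical names, channels are represented by labels only, and the stub merely emits a buffer of the configured size where a real implementation would run an audio codec.

```python
from typing import Callable, List, Sequence


def split_layers(channels: Sequence[str], sizes: Sequence[int]) -> List[List[str]]:
    """S1: cut the channel list into a base layer followed by enhancement layers."""
    if len(sizes) < 2 or any(s < 1 for s in sizes):
        raise ValueError("need a base layer and at least one enhancement layer, each non-empty")
    if sum(sizes) != len(channels):
        raise ValueError("layer sizes must cover every channel exactly once")
    layers, start = [], 0
    for size in sizes:
        layers.append(list(channels[start:start + size]))
        start += size
    return layers


def encode_frame(channels: Sequence[str], sizes: Sequence[int],
                 layer_bytes: Sequence[int],
                 encode_layer: Callable[[List[str], int], bytes]) -> List[bytes]:
    """Run S1 to S3 for one frame; `layer_bytes` is the S2 byte configuration."""
    layers = split_layers(channels, sizes)                      # S1
    if len(layer_bytes) != len(layers):                         # S2 sanity check
        raise ValueError("one byte budget is required per layer")
    return [encode_layer(layer, budget)                         # S3, layer by layer
            for layer, budget in zip(layers, layer_bytes)]


def stub_encoder(layer: List[str], budget: int) -> bytes:
    """Placeholder for a real codec: returns `budget` zero bytes."""
    return bytes(budget)


if __name__ == "__main__":
    out = encode_frame(["L", "R", "C", "Ls", "Rs"], [2, 3], [600, 600], stub_encoder)
    print([len(chunk) for chunk in out])
```

Keeping the byte budgets as an explicit input separates the configuration step from the codec, so different allocation rules can be tried without touching the encoder.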
Referring to the two-layer structure of multichannel digital audio shown in Fig. 2, the second embodiment of the present invention proposes a two-layer structure in which the multichannel audio signal is divided into one base layer and one enhancement layer, wherein the base layer contains at least one full-band channel and the enhancement layer contains at least one full-band channel; the number of full-band channels in the base layer is not greater than the number in the enhancement layer.
Let the base layer contain k full-band channels and the enhancement layer contain m full-band channels. The number of full-band channels in the base layer is configured to be not greater than that in the enhancement layer, i.e. k <= m, so that the base layer encodes relatively few channels and its quality is therefore higher.
As for how the payload, i.e. the number of bytes, is allocated among the layers, the present invention proposes several embodiments under the premise that the total payload (total number of bytes) is fixed.
The third embodiment is the base-layer-emphasis configuration. Because the base layer is more important and the enhancement layer contributes less to overall sound quality, the payload needs to be divided into two roughly equal parts. This applies in particular to application scenarios in which the enhancement layer has to be discarded or cannot be received correctly, for example because of channel conditions, so that base-layer quality is what matters.
Where the number of full-band channels in the base layer is less than the number in the enhancement layer, this embodiment configures the base layer with (total data-frame bytes)/2, so that the number of bytes per base-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the base layer); it configures the enhancement layer with (total data-frame bytes)/2, so that the number of bytes per enhancement-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the enhancement layer).
Let the total number of bytes in a data frame be D. When k < m, the base layer and the enhancement layer are each given D/2 bytes; the payload of each base-layer channel is D/(2k) bytes and that of each enhancement-layer channel is D/(2m) bytes.
Where the number of full-band channels in the base layer equals the number in the enhancement layer, this embodiment configures the base layer with more than (total data-frame bytes)/2 and the enhancement layer with fewer than (total data-frame bytes)/2.
When k = m, the base layer can be configured with more than D/2 bytes, for example 3D/5, in which case the enhancement layer is configured with 2D/5; other ratios may also be used.
In this way each base-layer channel is represented with more bytes than each enhancement-layer channel, which ensures better sound quality for each base-layer channel.
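The allocation rule of this embodiment can be written out as a short calculation. The sketch below is illustrative only: the function name and the `base_share` parameter are hypothetical, exact fractions are used in place of whole bytes, and rounding to integer byte counts is left out.

```python
from fractions import Fraction


def base_emphasis_two_layer(total_bytes: int, k: int, m: int,
                            base_share: Fraction = Fraction(3, 5)) -> dict:
    """Byte split for one base layer (k channels) and one enhancement layer (m channels).

    k < m : both layers get D/2.
    k == m: the base layer gets `base_share` of D, which must exceed 1/2
            (3/5 is the example ratio given in the text; others are allowed).
    """
    if not 1 <= k <= m:
        raise ValueError("requires 1 <= k <= m")
    if k < m:
        share = Fraction(1, 2)
    else:
        share = Fraction(base_share)
        if not Fraction(1, 2) < share < 1:
            raise ValueError("base_share must lie strictly between 1/2 and 1")
    base = share * total_bytes
    enh = total_bytes - base
    return {
        "base_layer": base,
        "enh_layer": enh,
        "base_per_channel": base / k,
        "enh_per_channel": enh / m,
    }


if __name__ == "__main__":
    print(base_emphasis_two_layer(1200, k=2, m=4))   # D/2 each, D/(2k) and D/(2m) per channel
    print(base_emphasis_two_layer(1000, k=2, m=2))   # 3D/5 and 2D/5
```

In both cases each base-layer channel ends up with more bytes than each enhancement-layer channel, which is the property the embodiment relies on.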
The fourth embodiment is the k:m configuration, or uniform configuration. The third embodiment above stresses the importance of the base layer. From the standpoint of the multichannel signal as a whole, however, it is more reasonable to give every full-band channel equal weight. With this configuration, when only the base layer can be decoded correctly, for example because of channel conditions, the resulting sound quality is slightly worse than with the configuration of the third embodiment; but when both the base layer and the enhancement layer can be decoded, the overall multichannel quality is better than in the third embodiment.
This embodiment configures every full-band channel with the same number of bytes, namely (total data-frame bytes)/(number of full-band channels in the base layer + number of full-band channels in the enhancement layer). Let the total number of bytes in an audio frame be D; the number of bytes of each full-band channel is then D/(k+m). Every full-band channel is coded with the same number of bytes, so every full-band channel has the same sound quality.
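The trade-off between the uniform configuration and an even D/2 split between the two layers can be made concrete with a few numbers. The following sketch is illustrative; the function names and the sample values D = 1200, k = 2, m = 4 are assumptions chosen so that the divisions come out exact.

```python
from fractions import Fraction


def uniform_per_channel(total_bytes: int, k: int, m: int) -> Fraction:
    """Uniform (k:m) configuration: every full-band channel gets D/(k+m)."""
    if k < 1 or m < 1:
        raise ValueError("each layer needs at least one full-band channel")
    return Fraction(total_bytes, k + m)


def half_split_per_channel(total_bytes: int, channels_in_layer: int) -> Fraction:
    """Per-channel bytes when a layer is given D/2, i.e. D/(2 * channels)."""
    if channels_in_layer < 1:
        raise ValueError("a layer needs at least one full-band channel")
    return Fraction(total_bytes, 2 * channels_in_layer)


if __name__ == "__main__":
    D, k, m = 1200, 2, 4
    print("uniform, any channel:      ", uniform_per_channel(D, k, m))
    print("half split, base channel:  ", half_split_per_channel(D, k))
    print("half split, enh. channel:  ", half_split_per_channel(D, m))
```

With these numbers a base-layer channel gets fewer bytes under the uniform configuration than under the half split, while an enhancement-layer channel gets more, which matches the qualitative comparison in the text.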
The fifth embodiment is the near-k:m configuration, which is neither the base-layer-emphasis configuration nor the uniform configuration. The third embodiment above stresses the importance of the base layer, but in the usual case of k < m the base-layer-emphasis configuration may overemphasize base-layer quality. The fourth embodiment, in turn, treats a base-layer full-band channel as just another ordinary full-band channel. The most reasonable configuration should therefore be chosen according to the specifics of the multichannel signal, i.e. a configuration close to k:m. This embodiment regards each full-band channel in the base layer as more important than an enhancement-layer full-band channel, so it should receive more bytes than under the uniform configuration but fewer than under the base-layer-emphasis configuration. The m full-band channels in the enhancement layer also need to be considered individually. In the typical case of 5.1 multichannel surround sound in particular, the center channel is usually assigned to dialogue in film sound systems and should be given more weight than the two surround channels. This configuration can provide better multichannel quality than the two preceding configurations.
This scheme configures each full-band channel in the base layer with (total data-frame bytes)/(number of full-band channels in the base layer) bytes, where (total data-frame bytes/2) > (total data-frame bytes/number of full-band channels in the base layer) > (total data-frame bytes/(number of full-band channels in the base layer + number of full-band channels in the enhancement layer)); one channel of the enhancement layer is configured with more than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes, and at least one of the remaining channels is configured with fewer than (total data-frame bytes) × (1 - 1/number of full-band channels in the base layer)/(number of full-band channels in the enhancement layer) bytes.
Let the total number of bytes in an audio frame be D. Each full-band channel in the base layer is configured with D/k bytes, where D/2 > D/k > D/(k+m). The enhancement layer is likewise configured appropriately according to the characteristics of each of its full-band channels. For 5.1 surround sound, for example, the center channel should be configured with more than D(1-1/k)/m bytes, and each channel of the left/right surround pair with fewer than D(1-1/k)/m bytes.
The sixth embodiment is the constrained configuration, which depends on the requirements of the channel-coding conditions. Channel codes such as LDPC (Low Density Parity Check) codes are block codes, and the two layers use different protection levels. Each layer of the layered coding therefore has to be arranged and configured in the most reasonable way according to the block size of the LDPC code in each transmission frame, taking into account the characteristics of the multichannel base layer and enhancement layer. In the constrained case, the byte allocation between the base layer and the enhancement layer is generally similar to that of the third embodiment, but it also takes into account the total capacity of each layer's LDPC code blocks in the transmission frame.
This scheme configures the numbers of bytes for the base layer and the enhancement layer according to the block size of the LDPC code in each transmission frame, the channel-coding conditions, and the characteristics of the base layer and/or the enhancement layer.
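One way to picture the constraint is to derive each layer's byte budget from the number of LDPC blocks it occupies in a transmission frame. The sketch below is purely hypothetical: the patent gives no block sizes or code rates, so the function name, the parameters, and the numbers in the usage lines are all assumptions made for illustration. A stronger protection level is modelled simply as fewer information bytes per block.

```python
def constrained_config(base_blocks: int, base_info_bytes: int,
                       enh_blocks: int, enh_info_bytes: int,
                       k: int, m: int) -> dict:
    """Layer budgets fixed by whole LDPC blocks (all inputs are hypothetical).

    base_blocks / enh_blocks         : LDPC blocks per transmission frame for each layer
    base_info_bytes / enh_info_bytes : information bytes carried by one block
    k / m                            : full-band channels in the base / enhancement layer
    """
    if min(base_blocks, base_info_bytes, enh_blocks, enh_info_bytes, k, m) < 1:
        raise ValueError("all arguments must be positive integers")
    base_capacity = base_blocks * base_info_bytes
    enh_capacity = enh_blocks * enh_info_bytes
    return {
        "base_layer": base_capacity,
        "enh_layer": enh_capacity,
        # Whole bytes per channel; any remainder is left unassigned in this sketch.
        "base_per_channel": base_capacity // k,
        "enh_per_channel": enh_capacity // m,
    }


if __name__ == "__main__":
    # Six strongly protected blocks versus four weakly protected ones
    # give two layers of roughly equal payload.
    print(constrained_config(6, 100, 4, 150, k=2, m=4))
```

The point of the sketch is only that the layer sizes come in block-sized steps, so an ideal split can usually be approximated rather than met exactly.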
Referring to the multi-layer structure of multichannel digital audio shown in Fig. 3, the present invention also proposes a multi-layer scheme. The multichannel audio signal is divided into one base layer and multiple enhancement layers, wherein the base layer contains at least one full-band channel and each of the enhancement layers contains at least one full-band channel; the number of full-band channels in the base layer is not greater than the total number of full-band channels in the enhancement layers.
Based on the multi-layer scheme, the present invention proposes a seventh embodiment, the base-layer-emphasis configuration, in which the base layer occupies half or more of the payload. The rationale and features of this scheme are similar to those of the third embodiment and are not repeated here. This scheme configures the base layer with (total data-frame bytes)/2, so that the number of bytes per base-layer channel is (total data-frame bytes)/(2 × number of full-band channels in the base layer); the numbers of bytes configured for the at least one enhancement layer sum to (total data-frame bytes)/2, wherein the number of bytes of each full-band channel of the first enhancement layer is greater than (total data-frame bytes)/(2 × (total number of full-band channels in all enhancement layers + number of full-band channels in the base layer)), and the number of bytes of each full-band channel of the remaining at least one enhancement layer is less than (total data-frame bytes)/(2 × (total number of full-band channels in all enhancement layers + number of full-band channels in the base layer)).
Take a three-layer structure with one base layer and two enhancement layers as an example. Let the total number of bytes in an audio frame be D, and let the base layer contain k full-band channels, the first enhancement layer m full-band channels, and the second enhancement layer n full-band channels. The base layer is then allocated D/2 bytes, and the payload of each base-layer channel is D/(2k) bytes. The two enhancement layers together also receive D/2 bytes, but the number of bytes of each full-band channel of the first enhancement layer is greater than D/(2(m+n)), and the number of bytes of each full-band channel of the second enhancement layer is less than D/(2(m+n)). In this way each base-layer channel is represented with more bytes than the channels of the two enhancement layers, which ensures better sound quality for each base-layer channel; at the same time the first enhancement layer is coded at higher quality than the second. If there are three or more enhancement layers, the number of bytes of each full-band channel of the first enhancement layer is greater than D/(2(m+n)), and the number of bytes of each full-band channel of the second, third, and up to the N-th enhancement layer is less than D/(2(m+n)).
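The three-layer example can be turned into a small calculation. This is a sketch under stated assumptions: the text only requires first-layer channels to sit above D/(2(m+n)) and second-layer channels below it, so the `boost` factor that decides how far above the average the first layer sits is a hypothetical parameter, as are the function name and the sample numbers.

```python
from fractions import Fraction


def base_emphasis_three_layer(total_bytes: int, k: int, m: int, n: int,
                              boost: Fraction = Fraction(5, 4)) -> dict:
    """Base layer gets D/2; the two enhancement layers share the other D/2.

    `boost` (hypothetical) scales the enhancement average D/(2(m+n)) to give the
    per-channel bytes of the first enhancement layer. It must satisfy
    1 < boost < (m+n)/m so that the second layer keeps a positive budget.
    """
    if min(k, m, n) < 1:
        raise ValueError("every layer needs at least one full-band channel")
    boost = Fraction(boost)
    if not 1 < boost < Fraction(m + n, m):
        raise ValueError("boost must lie strictly between 1 and (m+n)/m")
    half = Fraction(total_bytes, 2)
    average = half / (m + n)                 # D / (2(m+n))
    first = boost * average                  # above the average
    second = (half - first * m) / n          # whatever is left, below the average
    return {
        "base_per_channel": half / k,        # D / (2k)
        "enh_average": average,
        "first_per_channel": first,
        "second_per_channel": second,
    }


if __name__ == "__main__":
    print(base_emphasis_three_layer(1200, k=2, m=1, n=2))
```

Because the enhancement average is a weighted mean of the two per-channel values, pushing the first layer above it automatically places the second layer below it.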
The eighth embodiment is the k:m:n configuration, or uniform configuration. Its rationale and features are similar to those of the fourth embodiment and are not repeated here.
This scheme configures every full-band channel with the same number of bytes, namely (total data-frame bytes)/(number of full-band channels in the base layer + total number of full-band channels in all enhancement layers). Let the total number of bytes in an audio frame be D, and let the base layer contain k full-band channels, the first enhancement layer m full-band channels, and the second enhancement layer n full-band channels. The number of bytes of each full-band channel is then D/(k+m+n). Every full-band channel is represented (coded) with the same number of bytes, so every full-band channel has the same sound quality.
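The uniform rule generalizes directly to any number of layers. The sketch below is illustrative; the function name and the sample frame size are assumptions, and exact fractions stand in for whole bytes.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def uniform_layers(total_bytes: int,
                   channels_per_layer: Sequence[int]) -> Tuple[Fraction, List[Fraction]]:
    """Uniform configuration: D / (k + m + n + ...) bytes for every full-band channel.

    Returns the per-channel bytes and the resulting byte total of each layer,
    listed in the same order as `channels_per_layer` (base layer first).
    """
    if len(channels_per_layer) < 2 or any(c < 1 for c in channels_per_layer):
        raise ValueError("need a base layer and at least one enhancement layer, each non-empty")
    per_channel = Fraction(total_bytes, sum(channels_per_layer))
    return per_channel, [per_channel * c for c in channels_per_layer]


if __name__ == "__main__":
    per_channel, layers = uniform_layers(1200, [2, 1, 2])   # k=2, m=1, n=2
    print(per_channel, layers)
```

Under this rule the layer budgets are simply proportional to the channel counts, which is where the k:m:n name comes from.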
The ninth embodiment is the near-k:m:n configuration, an intermediate state between the two allocation schemes of the seventh and eighth embodiments. The rationale for this configuration and its characteristics are similar to those of the fifth embodiment and are not repeated here.
In this scheme, the number of bytes allocated to each full-band channel in the base layer is (total bytes of the data frame) / (number of full-band channels in the base layer), with (total bytes of the data frame / 2) > (total bytes of the data frame / number of full-band channels in the base layer) > (total bytes of the data frame / (number of full-band channels in the base layer + total number of full-band channels in all enhancement layers)). The number of bytes allocated to a certain channel of the first enhancement layer is greater than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (total number of full-band channels in all enhancement layers), and the number of bytes allocated to at least one of the remaining channels is less than that same value.
Taking a three-layer structure of one base layer and two enhancement layers as an example, let the total number of bytes in an audio frame be D. Each full-band channel in the base layer then has D/k bytes, with D/2 > D/k > D/(k+m+n). The full-band channels in the first enhancement layer are allocated more bytes than the full-band channels in the second enhancement layer. For example, for 5.1 surround sound, the first enhancement layer carries the center channel and the LFE (subwoofer) channel, and the second enhancement layer carries the left-surround and right-surround channels, so m = 1 and n = 2. The full-band center channel should be allocated more than D(1-1/k)/3 bytes, and each channel of the left/right surround pair fewer than D(1-1/k)/3 bytes. The left-surround and right-surround channels of the second enhancement layer receive the same allocation (or are coded identically as a channel pair).
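The bound used in the 5.1 example can be checked with a few lines of code. This sketch takes the expression D(1-1/k)/(m+n) literally from the text; the function names and the particular byte counts in the usage note are hypothetical.

```python
def near_kmn_threshold(total_bytes, k, enhancement_channels):
    """Return D * (1 - 1/k) / (m + n), the bound quoted in the text."""
    return total_bytes * (1 - 1 / k) / enhancement_channels


def satisfies_near_kmn(total_bytes, k, center_bytes, surround_bytes):
    """Check the 5.1 example: center above the bound, surrounds below it.

    surround_bytes -- byte counts for the left- and right-surround channels
    """
    # m = 1 (center) and n = 2 (left/right surround) as in the example.
    bound = near_kmn_threshold(total_bytes, k, 1 + 2)
    return center_bytes > bound and all(b < bound for b in surround_bytes)
```

With D = 1200 and k = 2 the bound is 200 bytes, so a center channel of 220 bytes with two surround channels of 180 bytes each meets the condition, while a center channel of exactly 200 bytes does not.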
The tenth embodiment is the restricted configuration, which depends on the requirements of the channel coding. The rationale for this configuration and its characteristics are similar to those of the sixth embodiment and are not repeated here. This scheme allocates bytes to the base layer and to each of the at least one enhancement layer according to the LDPC coding block size in each transmission frame, the channel coding conditions, and the characteristics of the base layer and/or the enhancement layers.
In each of the above embodiments, the invention proposes that the base layer and the at least one enhancement layer are each coded with the DRA coding algorithm. A bandwidth-extension enhancement coding tool may also be applied separately to the base layer and/or the at least one enhancement layer.
The following are application examples of the layering and coding schemes that the invention proposes for different types of audio signal.
Referring to the two-layer structure for stereo left and right channels shown in Fig. 4, a stereo audio signal has only two independent full-band channels, so the base layer carries the left channel and the enhancement layer carries the right channel. In this case the two layers should generally use the uniform configuration: the left and right channels are configured for the same sound quality and therefore receive the same number of bytes. The base layer and the enhancement layer may each apply the bandwidth-extension enhancement coding tool, shown as dashed boxes in the figure.
Referring to the two-layer structure for stereo sum and difference channels shown in Fig. 5, this example also has only two full-band channels, so only a two-layer scheme exists. Stereo signals are usually sum/difference coded to improve coding efficiency. Because the two channels of a stereo signal are correlated to some degree, the difference signal statistically has a smaller dynamic range than the right channel and therefore needs less data to represent. In addition, in some applications, such as karaoke stereo signals, one channel carries the accompaniment and the other carries the vocals, and the sum channel, being a mix of the two, conveys the information of both. Based on these two observations, the sum channel (with optional bandwidth extension) should serve as the base layer and the difference channel (with optional bandwidth extension) as the enhancement layer, using the base-layer-emphasis configuration. When only the base layer can be decoded correctly, this example performs better than left/right channel layering.
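The sum/difference idea can be illustrated with a short sketch. The 1/2 scaling and the function names are assumptions made for the illustration; the text does not specify how the sum and difference channels are normalized.

```python
def to_sum_difference(left, right):
    """Convert left/right sample lists into sum and difference channels."""
    total = [(l + r) / 2 for l, r in zip(left, right)]   # base layer
    diff = [(l - r) / 2 for l, r in zip(left, right)]    # enhancement layer
    return total, diff


def from_sum_difference(total, diff):
    """Recover left/right when both layers are available."""
    left = [s + d for s, d in zip(total, diff)]
    right = [s - d for s, d in zip(total, diff)]
    return left, right


# Correlated channels give a small difference signal.
left = [1.0, 0.5, -0.25]
right = [0.5, 0.5, -0.75]
total, diff = to_sum_difference(left, right)
```

If only the base layer is decoded, the listener still hears `total`, a mix of both channels; with both layers, `from_sum_difference` restores the original left and right samples.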
Several embodiments for 5.1 surround sound are given below. Referring to the two-layer surround-sound structure shown in Fig. 6, this embodiment is 5.1 surround sound, with 5 full-band channels and 1 LFE (subwoofer) channel. The base layer carries the stereo left channel (shown as L) and right channel (shown as R). The other channels are carried in the enhancement layer, ordered as center channel (shown as C), LFE channel (shown as LFE), and left-surround and right-surround channels (shown as LS and RS respectively). Each full-band channel may optionally use the bandwidth-extension enhancement tool (dashed lines in the figure) to improve coding efficiency. In addition, each channel pair (shown as L&R and LS&RS) may optionally use the parametric stereo coding tool to reduce information redundancy, in which case the pair is downmixed to mono (shown as M0 and M1 respectively) for core coding. Either the two-layer base-layer-emphasis configuration or the near-k:m configuration may be used.
Referring to the two-layer surround-sound structure shown in Fig. 7, the audio layering is similar to the previous embodiment, except that the channel order in the enhancement layer is adjusted so that the left-surround and right-surround channels are coded first, followed by the center channel and the LFE channel.
Referring to the three-layer surround-sound structure shown in Fig. 8, in this example the 5.1 channels are coded in three layers. The base layer codes the left and right channels (L and R), optionally using the bandwidth-extension enhancement tool and the parametric stereo coding tool to improve coding efficiency. The first enhancement layer codes the center channel (C), optionally with the bandwidth-extension enhancement tool, followed by the LFE channel (LFE). The second enhancement layer carries the left-surround and right-surround channels (LS and RS), optionally with the bandwidth-extension and parametric stereo enhancement tools. If the parametric stereo enhancement tool is selected, core coding of the stereo pair is changed to coding the mono downmix of the pair; for example, L&R is downmixed to M0 and LS&RS is downmixed to M1. The preferred data structure for this example is the near-k:m:n configuration.
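The three-layer arrangement above can be written down as a small data structure. The `Layer` class, the `LAYERS_5_1` constant and the assumption that a receiver decodes a prefix of the layers (base first) are illustrative and not part of the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    channels: tuple  # channel labels in coding order


# Three-layer 5.1 layout as described for Fig. 8.
LAYERS_5_1 = (
    Layer("base", ("L", "R")),
    Layer("enhancement 1", ("C", "LFE")),
    Layer("enhancement 2", ("LS", "RS")),
)


def decodable_channels(layers, layers_received):
    """List the channels available when only the first layers arrive."""
    channels = []
    for layer in layers[:layers_received]:
        channels.extend(layer.channels)
    return channels
```

A receiver that gets only the base layer can play stereo (`L`, `R`); adding the first enhancement layer brings in `C` and `LFE`, and the second completes the 5.1 set.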
Referring to the three-layer surround-sound structure shown in Fig. 9, the audio layering is similar to the previous embodiment, but the first enhancement layer and the second enhancement layer are exchanged.
Referring to the DRA & DRA+ surround-sound layered structure shown in Fig. 10, the two-layer surround-sound structure is used to form a base layer and an enhancement layer. In the base layer, the left and right channels form a stereo pair that is stereo-coded with DRA (Digital Rise Audio), optionally with SBR (Spectral Band Replication) bandwidth extension and PS (Parametric Stereo) parametric stereo coding. If parametric stereo coding is selected, the DRA coding stage is changed to code only the mono downmix; if SBR is also selected, the DRA coding stage is further changed to code only the low-band portion of the mono downmix. In the enhancement layer, the center channel C is DRA-coded first, optionally with SBR bandwidth extension; the LFE channel is then DRA-coded; finally, the left-surround and right-surround channels (LS and RS) are DRA-coded as a stereo pair, optionally with SBR bandwidth extension and PS parametric stereo coding, to improve the coding efficiency of the surround pair. The preferred data structure for this example is the near-k:m:n setting, or the restricted setting when applied to digital audio broadcasting.
In each of the above embodiments, the invention proposes that the base layer and the at least one enhancement layer are each coded with the DRA coding algorithm.
The invention can split an audio signal into four or even more layers, but a two-layer or three-layer structure is generally used because it is easy to implement. Layering is done by channel, and flexibly controlling the quality of the channels in each layer yields the best overall sound quality. Channel coding requirements are also easily met: because LDPC channel coding requires every coding block to have a fixed size, coarse channel-based layering can be arranged to satisfy the channel requirements. The various constraints of fine-grained layering are avoided, such as the requirement in MPEG AAC-BSAC audio coding that MDCT coefficients be arithmetic-coded in groups of 32 together with their associated side information, which reduces overall coding efficiency. Coarse layering can therefore ensure more efficient compression.
The method of the invention for encoding multichannel digital audio achieves the stated purpose and effect by the means disclosed above. The foregoing is, however, only a preferred embodiment of the invention and does not limit the scope of its rights; other equivalent modifications or changes made in accordance with the invention shall all fall within the scope claimed by the invention.

Claims (15)

1. A method for encoding multichannel digital audio, characterized by comprising:
dividing multichannel audio into one base layer and at least one enhancement layer;
allocating a number of bytes to the base layer and to each of the at least one enhancement layer;
encoding the base layer and the at least one enhancement layer separately.
2. The method for encoding multichannel digital audio according to claim 1, characterized in that:
the multichannel audio signal is divided into one base layer and one enhancement layer;
the base layer contains at least one full-band channel, and the enhancement layer contains at least one full-band channel;
the number of full-band channels in the base layer is not greater than the number of full-band channels in the enhancement layer.
3. The method for encoding multichannel digital audio according to claim 2, characterized in that, for the case where the number of full-band channels in the base layer is less than the number of full-band channels in the enhancement layer, the method further comprises:
allocating (total bytes of the data frame)/2 to the base layer, the number of bytes per base-layer channel being (total bytes of the data frame) / (2 × number of full-band channels in the base layer);
allocating (total bytes of the data frame)/2 to the enhancement layer, the number of bytes per enhancement-layer channel being (total bytes of the data frame) / (2 × number of full-band channels in the enhancement layer).
4. The method for encoding multichannel digital audio according to claim 2, characterized in that, for the case where the number of full-band channels in the base layer is equal to the number of full-band channels in the enhancement layer, the method further comprises:
allocating more than (total bytes of the data frame)/2 to the base layer;
allocating fewer than (total bytes of the data frame)/2 to the enhancement layer.
5. The method for encoding multichannel digital audio according to claim 2, characterized by further comprising:
allocating the same number of bytes to every full-band channel, namely (total bytes of the data frame) / (number of full-band channels in the base layer + number of full-band channels in the enhancement layer).
6. The method for encoding multichannel digital audio according to claim 2, characterized by further comprising:
allocating to each full-band channel in the base layer (total bytes of the data frame) / (number of full-band channels in the base layer) bytes, with (total bytes of the data frame / 2) > (total bytes of the data frame / number of full-band channels in the base layer) > (total bytes of the data frame / (number of full-band channels in the base layer + number of full-band channels in the enhancement layer));
allocating to a certain channel of the enhancement layer more than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (number of full-band channels in the enhancement layer) bytes, and allocating to at least one of the remaining channels fewer than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (number of full-band channels in the enhancement layer) bytes.
7. The method for encoding multichannel digital audio according to claim 2, characterized in that:
bytes are allocated to the base layer and to the enhancement layer according to the LDPC coding block size in each transmission frame, the channel coding conditions, and the characteristics of the base layer and/or the enhancement layer.
8. The method for encoding multichannel digital audio according to claim 1, characterized in that the multichannel audio signal is divided into one base layer and a plurality of enhancement layers;
the base layer contains at least one full-band channel, and each of the plurality of enhancement layers contains at least one full-band channel;
the number of full-band channels in the base layer is not greater than the total number of full-band channels in all enhancement layers.
9. The method for encoding multichannel digital audio according to claim 8, characterized in that:
the number of bytes allocated to the base layer is (total bytes of the data frame)/2, the number of bytes per base-layer channel being (total bytes of the data frame) / (2 × number of full-band channels in the base layer);
the numbers of bytes allocated to the at least one enhancement layer sum to (total bytes of the data frame)/2, wherein the number of bytes of each full-band channel of the first enhancement layer is greater than (total bytes of the data frame) / (2 × (number of full-band channels in the enhancement layers + number of full-band channels in the base layer)), and the number of bytes of each full-band channel of the remaining at least one enhancement layer is less than (total bytes of the data frame) / (2 × (number of full-band channels in the enhancement layers + number of full-band channels in the base layer)).
10. The method for encoding multichannel digital audio according to claim 8, characterized in that:
the same number of bytes is allocated to every full-band channel, namely (total bytes of the data frame) / (number of full-band channels in the base layer + total number of full-band channels in all enhancement layers).
11. The method for encoding multichannel digital audio according to claim 8, characterized in that:
each full-band channel in the base layer is allocated (total bytes of the data frame) / (number of full-band channels in the base layer) bytes, with (total bytes of the data frame / 2) > (total bytes of the data frame / number of full-band channels in the base layer) > (total bytes of the data frame / (number of full-band channels in the base layer + total number of full-band channels in all enhancement layers));
a certain channel of the first enhancement layer is allocated more than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (total number of full-band channels in all enhancement layers) bytes, and at least one of the remaining channels is allocated fewer than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (total number of full-band channels in all enhancement layers) bytes.
12. The method for encoding multichannel digital audio according to claim 8, characterized in that: bytes are allocated to the base layer and to each of the at least one enhancement layer according to the LDPC coding block size in each transmission frame, the channel coding conditions, and the characteristics of the base layer and/or the enhancement layers.
13. The method for encoding multichannel digital audio according to any one of claims 1 to 12, characterized by further comprising: coding the base layer and the at least one enhancement layer each with the DRA coding algorithm.
14. A method for encoding multichannel digital audio, characterized by comprising:
dividing a multichannel audio signal into one base layer and one enhancement layer, wherein the base layer contains at least one full-band channel and the enhancement layer contains at least one full-band channel, and the number of full-band channels in the base layer is not greater than the number of full-band channels in the enhancement layer;
allocating a number of bytes to the base layer and to the enhancement layer, wherein each full-band channel in the base layer is allocated (total bytes of the data frame) / (number of full-band channels in the base layer) bytes, with (total bytes of the data frame / 2) > (total bytes of the data frame / number of full-band channels in the base layer) > (total bytes of the data frame / (number of full-band channels in the base layer + number of full-band channels in the enhancement layer)); a certain channel of the enhancement layer is allocated more than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (number of full-band channels in the enhancement layer) bytes, and at least one of the remaining channels is allocated fewer than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (number of full-band channels in the enhancement layer) bytes;
coding the base layer and the enhancement layer each with the DRA coding algorithm.
15. A method for encoding multichannel digital audio, characterized by comprising:
dividing a multichannel audio signal into one base layer and a plurality of enhancement layers, wherein the base layer contains at least one full-band channel and each of the plurality of enhancement layers contains at least one full-band channel, and the number of full-band channels in the base layer is not greater than the total number of full-band channels in all enhancement layers;
allocating a number of bytes to the base layer and to each of the at least one enhancement layer, wherein each full-band channel in the base layer is allocated (total bytes of the data frame) / (number of full-band channels in the base layer) bytes, with (total bytes of the data frame / 2) > (total bytes of the data frame / number of full-band channels in the base layer) > (total bytes of the data frame / (number of full-band channels in the base layer + total number of full-band channels in all enhancement layers)); a certain channel of the first enhancement layer is allocated more than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (total number of full-band channels in all enhancement layers) bytes, and at least one of the remaining channels is allocated fewer than (total bytes of the data frame) × (1 − 1/(number of full-band channels in the base layer)) / (total number of full-band channels in all enhancement layers) bytes;
coding the base layer and the at least one enhancement layer each with the DRA coding algorithm.
CN201280000959.4A 2012-07-06 2012-07-06 Method for coding multi-channel digital audio Active CN103650036B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/078306 WO2014005327A1 (en) 2012-07-06 2012-07-06 Method for encoding multichannel digital audio

Publications (2)

Publication Number Publication Date
CN103650036A true CN103650036A (en) 2014-03-19
CN103650036B CN103650036B (en) 2016-05-11

Family

ID=49881272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280000959.4A Active CN103650036B (en) 2012-07-06 2012-07-06 Method for coding multi-channel digital audio

Country Status (2)

Country Link
CN (1) CN103650036B (en)
WO (1) WO2014005327A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10573326B2 (en) * 2017-04-05 2020-02-25 Qualcomm Incorporated Inter-channel bandwidth extension

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1756086A (en) * 2004-07-14 2006-04-05 三星电子株式会社 Multichannel audio data encoding/decoding method and equipment
CN101908938A (en) * 2010-07-27 2010-12-08 北京海尔集成电路设计有限公司 Vehicle-mounted broadcasting equipment
WO2011080916A1 (en) * 2009-12-28 2011-07-07 パナソニック株式会社 Audio encoding device and audio encoding method
CN102272829A (en) * 2008-12-29 2011-12-07 摩托罗拉移动公司 Method and apparatus for generating an enhancement layer within a multiple-channel audio coding system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100908117B1 (en) * 2002-12-16 2009-07-16 삼성전자주식회사 Audio coding method, decoding method, encoding apparatus and decoding apparatus which can adjust the bit rate
KR100818268B1 (en) * 2005-04-14 2008-04-02 삼성전자주식회사 Apparatus and method for audio encoding/decoding with scalability
CN101206860A (en) * 2006-12-20 2008-06-25 华为技术有限公司 Method and apparatus for encoding and decoding layered audio
KR101336891B1 (en) * 2008-12-19 2013-12-04 한국전자통신연구원 Encoder/Decoder for improving a voice quality in G.711 codec
US8386266B2 (en) * 2010-07-01 2013-02-26 Polycom, Inc. Full-band scalable audio codec

Also Published As

Publication number Publication date
WO2014005327A1 (en) 2014-01-09
CN103650036B (en) 2016-05-11

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220524

Address after: 510530 No. 10, Nanxiang 2nd Road, Science City, Luogang District, Guangzhou, Guangdong

Patentee after: Guangdong Guangsheng research and Development Institute Co.,Ltd.

Address before: 518057 6th floor, software building, No. 9, Gaoxin Zhongyi Road, high tech Zone, Nanshan District, Shenzhen, Guangdong Province

Patentee before: SHENZHEN RISING SOURCE TECHNOLOGY Co.,Ltd.