WO2005122639A1 - Audio signal encoding device and audio signal decoding device - Google Patents
Audio signal encoding device and audio signal decoding device
- Publication number
- WO2005122639A1 (application PCT/JP2005/010811)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- channel
- downmix
- coefficient table
- unit
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/02—Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
Definitions
- the present invention relates to an audio signal encoding device that encodes a multi-channel signal and an audio signal decoding device that decodes an encoded signal.
- an audio encoder sound signal encoding device
- converting a multi-channel signal into a signal with a reduced number of channels is generally referred to as downmixing.
- a multi-channel encoder and a multi-channel decoder based on the MPEG2 audio standard have been researched and developed. This multi-channel encoder performs the following operations:
- the multi-channel signal L, R, l, r is downmixed into a two-channel signal L0, R0, and a signal l0, r0 for returning the downmixed signal to the multi-channel signal is generated; these are encoded as a first encoded signal and a second encoded signal, respectively.
- a conventional inexpensive decoder (decoding device) for reproducing two-channel signals decodes only the first encoded signal to obtain L0, R0.
- the multi-channel decoder, in contrast, performs the following operation:
- the original multi-channel signal L, R, l, r is decoded from the first encoded signal (L0, R0) and the second encoded signal (l0, r0).
- a known system combines three components. The multi-channel encoder takes a multi-channel signal as input, encodes it into two sub-streams (a sub-stream of the signal L0, R0 downmixed to two channels, and a sub-stream of the signal l0, r0 for returning the downmixed signal L0, R0 to a multi-channel signal), and multiplexes them into one stream.
- the two-channel decoder divides the multiplexed stream into its sub-streams and decodes only the sub-stream of the downmixed signal L0, R0, so that its decoding process is performed on the downmixed two-channel signal.
- the multi-channel decoder decodes the original multi-channel signal using both sub-streams: the sub-stream of the downmixed signal L0, R0 and the sub-stream of l0, r0 that converts the downmixed signal back to a multi-channel signal.
- Fig. 7 shows a block diagram of an audio signal decoding apparatus that combines the conventional two-channel decoder and multi-channel decoder in order to reproduce a two-channel signal in which the original spatial information is restored.
- a down-converted signal obtained by down-mixing a multi-channel signal having a predetermined number of channels is referred to as a “down-mix signal”.
- the audio signal decoding device 70 includes a demultiplexing unit 71 that extracts a downmix encoded signal and an auxiliary information encoded signal from a bit stream B, a first decoding unit 72 that generates downmix signals L0 and R0, which are two-channel audio signals in the frequency domain, from the downmix encoded signal, and a second decoding unit 73 that generates auxiliary information l0 and r0 from the auxiliary information encoded signal.
- it also has a head-related transfer function simulation unit 77. By performing the spatial information synthesis calculations in the head-related transfer function simulation unit 77, high-quality two-channel audio signals L1 and R1 that reproduce the original spatial information, and can be listened to with headphones or the like, are obtained.
- Patent Document 1 Japanese Patent Application Publication No. 2002-541524
- in the conventional device, however, the decoded downmix signal has been downmixed by a predetermined matrix operation at each sample time, so the spatial information of the original multi-channel signal has been lost. Therefore, to reproduce a high-quality two-channel signal that reproduces the original spatial information, that is, a two-channel signal that has been subjected to virtual surround processing, the audio signal decoding device described above must first decode the multi-channel signal from the first encoded signals L0 and R0 and the second encoded signals l0 and r0, and the head-related transfer function simulation unit 77 must then filter it based on the head-related transfer functions held in the coefficient table 76. The problem is that this filtering requires a great deal of arithmetic processing.
- the present invention has been made to solve this conventional problem. Its object is to provide an audio signal encoding device that generates encoded information from which the original multi-channel spatial information can be reproduced simply by reproducing the downmix signal, and an audio signal decoding device that reproduces the original multi-channel spatial information simply by reproducing the downmix signal from that encoded signal.
- an audio signal encoding apparatus of the present invention includes time-frequency conversion means for converting an N-channel signal into the frequency domain,
- first signal output means for downmixing the N-channel frequency domain signal to generate a two-channel downmix signal,
- second signal output means for generating auxiliary information for converting the downmix signal back to a multi-channel signal,
- first encoding means for encoding the downmix signal to generate a first encoded signal,
- second encoding means for encoding the auxiliary information to generate a second encoded signal,
- multiplexing means for multiplexing the first encoded signal and the second encoded signal, and
- a coefficient table in which coefficients for realizing transfer characteristics are described for each frequency, wherein N is an integer of 3 or more and the coefficient table is an N×N square matrix.
- the coefficient table includes coefficients represented by a 2×N matrix that simulates the head transfer characteristics at the time of multi-channel reproduction, and the remaining coefficients, represented by an (N-2)×N matrix, are the coefficients represented by the 2×N matrix.
- the first signal output means downmixes the N-channel frequency domain signal to the two-channel signal according to the coefficient table, and the second signal output means generates the auxiliary information of the downmix signal according to the coefficient table.
- with this configuration, the downmix signal becomes a signal that has been filtered with a desired transfer function, so the spatial information of the multi-channel signal is reflected even when only the first encoded signal is reproduced.
- in addition, an encoded signal from which the original multi-channel signal can be reproduced can be generated.
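The structure of the coefficient table described above can be written compactly for the four-channel case (N = 4). This is only a sketch of the stated structure: the symbols M(f), H(f), and G(f) are names introduced here for illustration and do not appear in the original text. H(f) stands for the 2×N block that simulates the head transfer characteristics, and G(f) for the remaining (N-2)×N block that produces the auxiliary information.

```latex
\begin{pmatrix} L_0(f) \\ R_0(f) \\ l_0(f) \\ r_0(f) \end{pmatrix}
= \mathbf{M}(f)
\begin{pmatrix} L(f) \\ R(f) \\ l(f) \\ r(f) \end{pmatrix},
\qquad
\mathbf{M}(f) =
\begin{pmatrix} \mathbf{H}(f) \\ \mathbf{G}(f) \end{pmatrix},
\quad
\mathbf{H}(f) \in \mathbb{C}^{2 \times N},\;
\mathbf{G}(f) \in \mathbb{C}^{(N-2) \times N}
```

Written this way, the original channels can be recovered from the downmix signal and the auxiliary information only if M(f) is invertible at every frequency for which it is defined.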
- the audio signal encoding apparatus of the present invention may include a plurality of the coefficient tables, each containing coefficients that realize a different transfer characteristic, and may further include a coefficient table selecting unit that selects a coefficient table according to the intended use.
- in that case, the multiplexing unit may multiplex, together with the first encoded signal and the second encoded signal, an index indicating the coefficient table selected by the coefficient table selecting unit.
- with this configuration, a coefficient table is selected according to the purpose, and an index specifying the selected coefficient table is multiplexed and can be transmitted.
- an audio signal decoding device of the present invention includes demultiplexing means for extracting only the downmix code from the bit stream generated by the audio signal encoding device, decoding means for generating a two-channel audio signal in the frequency domain from the downmix code, and frequency-time conversion means for generating an audio signal in the time domain from the audio signal in the frequency domain.
- another audio signal decoding apparatus of the present invention includes a demultiplexing unit that extracts a downmix code and an auxiliary information code from the bit stream generated by the audio signal encoding apparatus, first decoding means for generating, from the downmix code, a downmix signal that is a two-channel audio signal in the frequency domain, second decoding means for generating auxiliary information from the auxiliary information code,
- inverse mixing means for generating a multi-channel signal from the downmix signal and the auxiliary information,
- frequency-time conversion means for generating a time-domain audio signal from the multi-channel signal, and
- a coefficient table that is the inverse matrix of an N×N square matrix including coefficients represented by a 2×N matrix simulating head transfer characteristics during multi-channel reproduction, wherein the inverse mixing means generates the multi-channel signal using the coefficient table.
- with this configuration, the downmix signal and the auxiliary information are extracted and decoded, and the multi-channel signal is generated from them using a coefficient table that is the inverse of a matrix simulating the head-related transfer characteristics. The original multi-channel signal can therefore be reproduced even though the downmix signal includes a transfer characteristic.
- the audio signal decoding device of the present invention may include output channel switching means for selectively switching between outputting the downmix signal and outputting the multi-channel signal, and the frequency-time conversion means may generate the time-domain audio signal from the signal selectively output by the output channel switching means.
- the acoustic signal encoding device of the present invention may be configured so that the coefficient table includes a coefficient simulating a spatial transfer characteristic.
- the present invention provides first signal output means for downmixing an N-channel frequency domain signal to generate a two-channel downmix signal, second signal output means for generating auxiliary information for returning the downmix signal to a multi-channel signal,
- multiplexing means for multiplexing a first encoded signal generated by encoding the downmix signal and a second encoded signal generated by encoding the auxiliary information, and
- a coefficient table in which coefficients for realizing transfer characteristics are described for each frequency, wherein N is an integer of 3 or more, and the first signal output means and the second signal output means perform their processing according to the coefficient table.
- as a result, the downmix signal becomes a signal filtered with a desired transfer function.
- FIG. 1 is a block diagram of an audio signal encoding device according to a first embodiment of the present invention.
- FIG. 3 is a block diagram of an audio signal encoding device according to a second embodiment of the present invention.
- FIG. 4 is a block diagram of an audio signal decoding device according to a third embodiment of the present invention.
- FIG. 5 is a block diagram of an audio signal decoding device according to a fourth embodiment of the present invention.
- FIG. 6 is a block diagram of an audio signal decoding device according to a fifth embodiment of the present invention.
- FIG. 7 is a block diagram of a conventional acoustic signal decoding apparatus that reproduces spatial information using an encoded signal.
- the audio signal encoding apparatus 10 includes a time-frequency conversion unit 11 that converts an N-channel multi-channel signal into a frequency domain signal, a first signal output unit 12 that downmixes the N-channel frequency domain signal to generate a two-channel downmix signal, a first encoding unit 13 that encodes the downmix signal to generate a first encoded signal, a second signal output unit 14 that generates auxiliary information for returning the downmix signal to the original N-channel multi-channel signal,
- a second encoding unit 15 that encodes the auxiliary information to generate a second encoded signal, a multiplexing unit 16 that multiplexes the first encoded signal and the second encoded signal, and
- a coefficient table 17 in which coefficients that realize transfer characteristics are described for each frequency.
- N is an integer of 3 or more, and the coefficient table 17 is stored in a storage medium such as a memory (not shown).
- in the following, the input N-channel multi-channel signal is assumed to consist of four channels: a front left sound signal L, a front right sound signal R, a rear left sound signal l, and a rear right sound signal r.
- the time-frequency converter 11 converts the input four-channel signals L, R, l, and r into frequency domain signals using a method such as a Fourier transform, a discrete cosine transform, or a sub-band filter.
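As an illustration of the time-frequency conversion step, the following minimal sketch applies a naive discrete Fourier transform to one short frame of each of the four channels. The frame length, the sample values, and the function names `dft` and `to_frequency_domain` are assumptions made for this example; the text does not fix a particular transform or frame size, and a practical encoder would use a fast transform or a sub-band filter bank instead.

```python
import cmath


def dft(frame):
    """Naive discrete Fourier transform of one frame of real samples."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def to_frequency_domain(channels):
    """Transform every channel independently (dict: channel name -> samples)."""
    return {name: dft(samples) for name, samples in channels.items()}


# One made-up 4-sample frame per channel (L, R, l, r).
frame = {
    "L": [1.0, 0.0, 0.0, 0.0],
    "R": [1.0, 1.0, 1.0, 1.0],
    "l": [0.0, 1.0, 0.0, -1.0],
    "r": [0.0, 0.0, 0.0, 0.0],
}
spectra = to_frequency_domain(frame)
```

Each entry of `spectra` then holds one complex value per frequency bin, which is the form the per-frequency coefficient table operates on.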
- the first signal output unit 12 downmixes the frequency-domain signal converted by the time-frequency conversion unit 11 by the calculation expressed by Equation 3, using the coefficients stored in the coefficient table 17.
- the coefficients a, b, c, and d used here are the head-related transfer functions shown in FIG. 2, and are represented by a 2×N matrix.
- in FIG. 2, a front left speaker 61, a front right speaker 62, a rear left speaker 63, and a rear right speaker 64 are arranged around a listener's head 65.
- L is the signal output from the front left speaker, R is the signal output from the front right speaker, l is the signal output from the rear left speaker, r is the signal output from the rear right speaker, Le is the signal reaching the left ear, and Re is the signal reaching the right ear.
- the coefficient a is a transfer characteristic from the front left speaker 61 to the left ear
- the coefficient b is a transfer characteristic from the front right speaker 62 to the left ear
- the coefficient c is a transfer characteristic from the rear left speaker 63 to the left ear
- the coefficient d is a transfer characteristic from the right rear speaker 64 to the left ear, and a set of these is called a “head-related transfer function”.
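The downmix by a per-frequency 2×N coefficient matrix can be sketched as follows. The left-ear row uses the coefficients a, b, c, and d as defined above. The right-ear row is built by mirroring the left-ear row, which is an assumption of a symmetric head and speaker layout made only for this example, since the text above defines the left-ear coefficients only. The function names and all numeric values are hypothetical.

```python
def hrtf_rows(a, b, c, d):
    """Build a 2x4 coefficient matrix for the channel order (L, R, l, r).

    Row 0 is the left ear, using a, b, c, d as defined in the text.
    Row 1 (right ear) mirrors row 0; this ASSUMES a symmetric head and
    speaker layout and is not taken from the source.
    """
    return [[a, b, c, d], [b, a, d, c]]


def downmix(table, spectra):
    """Downmix N channel spectra to two channels, one frequency bin at a time.

    table   -- list with one 2xN matrix per frequency bin
    spectra -- list of N spectra, each a list of complex bin values
    """
    left, right = [], []
    for k, matrix in enumerate(table):
        column = [channel[k] for channel in spectra]
        left.append(sum(c * x for c, x in zip(matrix[0], column)))
        right.append(sum(c * x for c, x in zip(matrix[1], column)))
    return left, right


# Two frequency bins with made-up coefficient values.
table = [hrtf_rows(1.0, 0.5, 0.8, 0.25), hrtf_rows(0.5j, 0.25, 0.5, 0.125)]
spectra = [[1.0, 2.0], [0.0, 0.0], [0.0, 0.0], [2.0, 0.0]]  # L, R, l, r
L0, R0 = downmix(table, spectra)
```

Because the coefficients differ per bin, the downmix acts as a filter on each channel, which is why the resulting two-channel signal already carries the simulated transfer characteristic.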
- the first encoding unit 13 encodes the downmix signals L0 and R0 output from the first signal output unit 12 to generate the first encoded signal.
- the encoding performed by the first encoding unit 13 may be, for example, an encoding method defined by the MPEG standard or the like.
- the second signal output unit 14 processes the frequency domain signal converted by the time-frequency conversion unit 11 by the operation expressed by Equation 4, using the coefficients stored in the coefficient table 17,
- and thereby generates auxiliary information l0, r0 for returning the downmixed signal to a multi-channel signal.
- the coefficients a, b, c, and d used here are represented by an (N-2)×N matrix, that is, a 2×N matrix in the present embodiment.
- the second encoding unit 15 encodes the auxiliary information l0 and r0 output from the second signal output unit 14 to generate a second encoded signal.
- the encoding method by the second encoding unit 15 may be the encoding method defined by the MPEG standard or the like, like the first encoding unit 13.
- the multiplexing unit 16 multiplexes the first encoded signal generated by the first encoding unit 13 and the second encoded signal generated by the second encoding unit 15 to generate one bit stream B.
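The multiplexing step can be pictured with a minimal sketch. The container layout used here (each sub-stream preceded by a 4-byte big-endian length) is purely hypothetical and chosen for illustration; the text does not specify the bit stream syntax, and the payloads are placeholder bytes rather than real encoded audio.

```python
import struct


def multiplex(first_encoded: bytes, second_encoded: bytes) -> bytes:
    """Join two sub-streams into one stream, each prefixed by its byte length."""
    parts = []
    for payload in (first_encoded, second_encoded):
        parts.append(struct.pack(">I", len(payload)))  # 4-byte big-endian length
        parts.append(payload)
    return b"".join(parts)


# Placeholder payloads standing in for the first and second encoded signals.
bitstream_b = multiplex(b"DOWNMIX", b"AUX")
```

Keeping the downmix sub-stream first and self-delimited is one way to let a simple decoder read it and stop without parsing the auxiliary information.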
- Equation 7 is obtained
- Equation 9
- Equation 10
- as described above, the first embodiment provides a coefficient table 17 including coefficients represented by a 2×N matrix that simulates head transfer characteristics during multi-channel signal reproduction, a first signal output unit 12 that downmixes the N-channel frequency domain signal according to the coefficient table 17 to generate a two-channel downmix signal, and a second signal output unit 14 that generates auxiliary information for returning the downmix signal to a multi-channel signal.
- the downmix signal becomes a signal that has been filtered with a desired transfer function, and the spatial information of the multi-channel signal is reflected even when only the first encoded signal is reproduced.
- an encoded signal capable of reproducing the original multi-channel signal can be generated.
- the audio signal encoding apparatus 20 includes a time-frequency conversion unit 21 that converts an N-channel multi-channel signal into a frequency domain signal,
- a first signal output unit 22 that downmixes the converted N-channel frequency domain signal to generate a two-channel downmix signal, a first encoding unit 23 that encodes the downmix signal to generate a first encoded signal,
- a second signal output unit 24 that generates auxiliary information for returning the downmix signal to a multi-channel signal, a second encoding unit 25 that encodes the auxiliary information to generate a second encoded signal,
- a coefficient table selecting section 26 that selects, according to the purpose, the transfer function to be used in the first signal output unit 22 and the second signal output unit 24, a coefficient table group 27 consisting of a plurality of coefficient tables in which coefficients for realizing various transfer characteristics are described for each frequency, a third encoding unit 28 that generates a third encoded signal serving as an index for specifying the coefficient table selected by the coefficient table selecting section 26, and
- a multiplexing unit 29 that multiplexes the first encoded signal, the second encoded signal, and the third encoded signal.
- N is an integer of 3 or more.
- the coefficient table group 27 is shown in the figure, and is assumed to be stored in a storage medium such as a memory.
- the time-frequency conversion unit 21, the first signal output unit 22, the first encoding unit 23, the second signal output unit 24, and the second encoding unit 25 are the same as the time-frequency conversion unit 11, the first signal output unit 12, the first encoding unit 13, the second signal output unit 14, and the second encoding unit 15 described in the first embodiment, respectively.
- in the following, the input N-channel multi-channel signal is assumed to consist of four channels: a front left sound signal L, a front right sound signal R, a rear left sound signal l, and a rear right sound signal r.
- the time-frequency converter 21 converts the input four-channel signals into frequency-domain signals using a method such as a Fourier transform, a discrete cosine transform, or a sub-band filter.
- the coefficient table selecting section 26 selects, from the plurality of coefficient table groups 27, a coefficient table in which the coefficients constituting the transfer characteristics to be simulated in the first signal output section 22 are described.
- the coefficient table group 27 includes various coefficients that simulate the head transfer characteristics during reproduction. This makes it possible to select an appropriate coefficient table according to the size of the user's head when listening with headphones or two speakers, so that a two-channel signal with simple virtual surround processing can be reproduced whether the user is, for example, an adult or a child.
- the coefficient table group 27 may include not only the head transfer coefficients to be simulated but also spatial transfer coefficients that simulate the spatial transfer characteristics of the space in which the sound is heard. When two speakers are used, this makes it possible, for example, to reproduce a two-channel signal with virtual surround processing appropriate to the size of the room.
- the first signal output unit 22 takes the frequency-domain signal converted by the time-frequency conversion unit 21 and, using the coefficients stored in the coefficient table selected by the coefficient table selection unit 26,
- performs the downmix by matrix calculation.
- the coefficients a, b, c, d used here are represented by a 2×N matrix.
- first encoding unit 23 encodes the downmix signal output from first signal output unit 22, and generates a first encoded signal.
- the encoding performed by the first encoding unit 23 may be an encoding method defined by the MPEG standard or the like, similarly to the first encoding unit 13 in the first embodiment.
- the second signal output unit 24 takes the frequency-domain signal converted by the time-frequency conversion unit 21 and, using the coefficients stored in the coefficient table selected by the coefficient table selection unit 26,
- generates auxiliary information for returning the downmixed signal to a multi-channel signal.
- the coefficients a, b, c, and d used here are represented by an (N-2)×N matrix, that is, a 2×N matrix in the present embodiment.
- the second encoding unit 25 encodes the auxiliary information output from the second signal output unit 24 to generate a second encoded signal.
- the encoding performed by the second encoding unit 25 may be an encoding system defined by the MPEG standard or the like, similarly to the first encoding unit 23.
- an index n, such as a table number, from which it can be determined what kind of transfer characteristic the coefficients selected by the coefficient table selecting section 26 simulate, is written by the third encoding section 28 as the third encoded signal.
- the multiplexing section 29 multiplexes the first encoded signal generated by the first encoding section 23, the second encoded signal generated by the second encoding section 25, and the third encoded signal generated by the third encoding section 28
- to generate the bit stream B.
- a plurality of coefficient table groups 27 in which coefficients for realizing various transfer characteristics are described for each frequency, and the plurality of coefficient tables
- a third encoding unit 28 that generates a third encoded signal serving as an index for identifying the coefficient table selected by the coefficient table selecting unit 26.
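The selection of a coefficient table and the encoding of its index n, as described for the second embodiment, might look like the sketch below. The table contents, the use labels, and the one-byte index encoding are all hypothetical; the text only states that a table is selected according to the intended use and that an index identifying it is encoded as the third encoded signal.

```python
# Hypothetical table group: index n -> (intended use, coefficients a, b, c, d).
# A real table would hold one coefficient set per frequency.
COEFFICIENT_TABLE_GROUP = {
    0: ("headphones, adult", (1.0, 0.5, 0.8, 0.25)),
    1: ("headphones, child", (1.0, 0.6, 0.7, 0.3)),
    2: ("two speakers, small room", (0.9, 0.4, 0.6, 0.2)),
}


def select_table(use):
    """Return (index n, coefficients) for the requested use."""
    for n, (label, coefficients) in COEFFICIENT_TABLE_GROUP.items():
        if label == use:
            return n, coefficients
    raise KeyError(f"no coefficient table for use: {use!r}")


def encode_index(n):
    """Third encoded signal: in this sketch, simply the index as one byte."""
    return bytes([n])


n, coefficients = select_table("headphones, child")
third_encoded = encode_index(n)
```

Transmitting only the index keeps the side information small, on the assumption that the decoder holds a matching set of tables.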
- FIG. 4 shows a configuration diagram of an audio signal decoding device according to a third embodiment of the present invention.
- the audio signal decoding apparatus 30 includes a demultiplexing unit 31 that extracts, from a bit stream B in which a first encoded signal and a second encoded signal are multiplexed, only the first encoded signal, in which the downmix signal is encoded,
- a decoding unit 32 that generates, from the first encoded signal, a first signal that is a two-channel audio signal in the frequency domain, and a frequency-time conversion unit 33 that generates time-domain audio signals L' and R' from the first signal.
- the first encoded signal is a signal in which the downmix signal is encoded,
- and the second encoded signal is a signal in which the auxiliary information for returning the downmix signal to a multi-channel signal is encoded.
- from the bit stream B generated by the audio signal encoding apparatus according to the first or second embodiment (in which the first encoded signal and the second encoded signal are multiplexed), the demultiplexing unit 31 extracts only the first encoded signal.
- the decoding unit 32 decodes the first encoded signal, which is the downmix code extracted by the demultiplexing unit 31, and generates first signals L0 and R0, in which the two-channel downmix signal is described in the frequency domain.
- the frequency-time conversion unit 33 converts the first signals L0 and R0, which are frequency-domain audio signals generated by the decoding unit 32, into time-domain audio signals using a method such as a Fourier transform, a discrete cosine transform, or a sub-band filter.
- as described above, the third embodiment provides the demultiplexing unit 31, which extracts only the downmix code from the bit stream in which the downmix signal and the auxiliary information are multiplexed, and the decoding unit 32, which generates a two-channel frequency domain audio signal from the downmix code.
- because only the downmix signal is extracted and decoded and no decoding is performed on the auxiliary information, the downmix signal can be reproduced with a reduced amount of computation.
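The saving described here comes from never touching the auxiliary information. A minimal sketch of such a demultiplexer follows, assuming a hypothetical container in which each sub-stream is preceded by a 4-byte big-endian length and the downmix code comes first; neither the layout nor the function name comes from the source.

```python
import struct


def extract_downmix_only(bitstream: bytes) -> bytes:
    """Return the first sub-stream and ignore everything after it."""
    (length,) = struct.unpack_from(">I", bitstream, 0)
    return bitstream[4:4 + length]


# Stream laid out as: length + downmix code, then length + auxiliary code.
stream = (struct.pack(">I", 7) + b"DOWNMIX") + (struct.pack(">I", 3) + b"AUX")
downmix_code = extract_downmix_only(stream)
```

The auxiliary sub-stream is neither parsed nor decoded, so the cost of this path is that of an ordinary two-channel decoder.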
- the audio signal decoding apparatus 40 includes a demultiplexing unit 41 that extracts, from a bit stream B in which the first encoded signal and the second encoded signal are multiplexed, the first encoded signal in which the downmix signal is encoded and the second encoded signal in which the auxiliary information is encoded, a first decoding unit 42 that generates, from the first encoded signal, downmix signals L0 and R0, which are two-channel audio signals in the frequency domain, a second decoding unit 43 that generates auxiliary information l0 and r0 from the second encoded signal, an inverse mixing unit 44 that generates a multi-channel signal from the downmix signal and the auxiliary information, a frequency-time conversion unit 45 that generates time-domain audio signals L, R, l, and r, and a coefficient table 46 that is the inverse matrix of an N×N square matrix including coefficients represented by a 2×N matrix that simulates head transfer characteristics during multi-channel signal reproduction.
- the coefficient table 46 is illustrated, and is stored in a storage medium such as a memory.
- the demultiplexing unit 41 extracts the first encoded signal and the second encoded signal from the bit stream generated by the audio signal encoding device according to the first or second embodiment.
- the first decoding unit 42 decodes the first encoded signal, which is the downmix code extracted by the demultiplexing unit 41, and generates first signals L0 and R0, in which the two-channel downmix signal is described in the frequency domain.
- the second decoding unit 43 decodes the second encoded signal, which is the auxiliary information code extracted by the demultiplexing unit 41, and generates second signals l0 and r0 as auxiliary information for generating a multi-channel signal from the first signal.
- the inverse mixing section 44 performs a matrix operation using the coefficient table 46 on the first signals L0 and R0 generated by the first decoding section 42 and the second signals l0 and r0 generated by the second decoding section 43, and obtains the multi-channel signals L, R, l, and r.
- the coefficients arranged in the coefficient table 46 form the inverse matrix of the matrix described in the first embodiment. For example, when signals of four channels have been downmixed, the original four-channel signals L, R, l, and r can be extracted using the matrix expression given by Equation 13.
- Equation 14
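The inverse mixing step can be illustrated for a single frequency bin. The sketch below inverts a 4×4 mixing matrix by Gauss-Jordan elimination and checks that applying the inverse to the mixed values returns the original channels. The matrix entries are made up, and in particular the two auxiliary rows (which simply pass the rear channels through) are a hypothetical choice that keeps the matrix invertible; they are not the coefficients of the source's equations, which are not reproduced in this text.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I].
    aug = [[complex(v) for v in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("coefficient matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def apply(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(c * x for c, x in zip(row, vector)) for row in matrix]


# Made-up 4x4 mixing matrix for one frequency bin, channel order (L, R, l, r).
mixing = [
    [1.0, 0.5, 0.8, 0.25],  # L0: left-ear row
    [0.5, 1.0, 0.25, 0.8],  # R0: right-ear row (symmetry assumed)
    [0.0, 0.0, 1.0, 0.0],   # l0: hypothetical auxiliary row
    [0.0, 0.0, 0.0, 1.0],   # r0: hypothetical auxiliary row
]
inverse_table = invert(mixing)  # plays the role of the decoder's table

original = [0.3 + 0.1j, -0.2, 0.5j, 0.7]
mixed = apply(mixing, original)            # encoder side: L0, R0, l0, r0
recovered = apply(inverse_table, mixed)    # decoder side: L, R, l, r
```

Since the inverse depends only on the coefficient table, it can be computed once per frequency and stored, leaving the decoder with a single matrix-vector product per bin.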
- in this embodiment one coefficient table 46 is stored in the storage medium, but the present invention is not limited to this; needless to say, a plurality of coefficient tables may be stored in the storage medium.
- in that case, the inverse mixing unit 44 can extract, from the third encoded signal included in the bit stream, the index n indicating the coefficients used at the time of downmixing, and select the appropriate coefficient table from the plurality of coefficient tables based on the index n.
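Selecting the decoder-side table from the index n could be sketched as below. The one-byte index format and the contents of `INVERSE_TABLES` are assumptions for illustration; the strings are stand-ins for real inverse coefficient matrices.

```python
# Hypothetical decoder-side tables, keyed by the same index n that the encoder
# multiplexed as the third encoded signal. Strings stand in for real matrices.
INVERSE_TABLES = {
    0: "inverse table for headphones, adult",
    1: "inverse table for headphones, child",
    2: "inverse table for two speakers, small room",
}


def select_inverse_table(third_encoded: bytes):
    """Decode the index n (one byte in this sketch) and look up its table."""
    n = third_encoded[0]
    if n not in INVERSE_TABLES:
        raise KeyError(f"no inverse coefficient table for index {n}")
    return n, INVERSE_TABLES[n]


n, table = select_inverse_table(b"\x02")
```

Raising an error for an unknown index is a design choice made here; a decoder could equally fall back to downmix-only reproduction.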
- the frequency-time conversion unit 45 converts each of the frequency-domain multi-channel signals output from the inverse mixing unit 44 into time domain audio signals L, R, l, r, using a method such as a Fourier transform, a discrete cosine transform, or a sub-band filter.
- as described above, the fourth embodiment provides the demultiplexing section 41, which extracts a downmix code and an auxiliary information code from the bit stream, the inverse mixing unit 44, which generates a multi-channel signal based on the downmix signal and the auxiliary information, and
- a coefficient table 46 that is the inverse of a matrix including coefficients represented by a 2×N matrix simulating the head transfer characteristics at the time of reproducing a multi-channel signal. Because the inverse mixing unit 44 generates the multi-channel signal using the coefficient table 46, the original multi-channel signal can be reproduced even if the downmix signal includes a transfer characteristic.
- the audio signal decoding apparatus 50 includes a demultiplexing section 51 that extracts, from the bit stream B in which the first encoded signal and the second encoded signal are multiplexed, the first encoded signal in which the downmix signal is encoded and the second encoded signal in which the auxiliary information is encoded,
- a first decoding unit 52 that generates, from the first encoded signal, downmix signals L0 and R0, which are two-channel audio signals in the frequency domain,
- a second decoding unit 53 that generates auxiliary information l0 and r0 from the second encoded signal,
- an inverse mixing section 54 that generates a multi-channel signal from the downmix signal and the auxiliary information, an output channel switching section 55 that selectively outputs the downmix signal or the multi-channel signal, a frequency-time conversion unit 56 that generates a time-domain audio signal from the signal selectively output by the output channel switching section 55, and
- a coefficient table 57 that is the inverse matrix of an N×N square matrix including coefficients represented by a 2×N matrix that simulates head transfer characteristics during multi-channel playback.
- The coefficient table 57 is stored in a storage medium such as a memory, as shown in FIG.
- The demultiplexing section 51 separates the bit stream B generated by the audio signal encoding apparatus according to the first or second embodiment into a first encoded signal and a second encoded signal.
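The separation step can be sketched as below. The text does not specify the bit stream layout, so this example assumes a purely hypothetical one in which each encoded signal is stored as a 2-byte big-endian length followed by its payload; the function name `demultiplex` and the sample payloads are likewise illustrative only.

```python
import struct


def demultiplex(bitstream):
    """Split a bit stream into (first, second) encoded signals.

    Assumed layout (not from the patent): two fields, each a 2-byte
    big-endian length followed by that many payload bytes.
    """
    fields = []
    offset = 0
    for _ in range(2):
        (length,) = struct.unpack_from(">H", bitstream, offset)
        offset += 2
        payload = bitstream[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated bit stream")
        fields.append(payload)
        offset += length
    return tuple(fields)


# Hypothetical bit stream: a 3-byte downmix code and a 2-byte auxiliary code.
stream = struct.pack(">H", 3) + b"dmx" + struct.pack(">H", 2) + b"ax"
first_encoded, second_encoded = demultiplex(stream)
```

The first field would then go to the first decoding unit 52 and the second to the second decoding unit 53.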
- The first decoding unit 52 decodes the first encoded signal, which is the downmix code extracted by the demultiplexing section 51, and generates first signals L0 and R0 that describe the two-channel downmix signal in the frequency domain.
- The second decoding section 53 decodes the second encoded signal, which is the auxiliary information code extracted by the demultiplexing section 51, and generates second signals l0 and r0 as auxiliary information for generating a multi-channel signal from the first signals.
- The inverse mixing unit 54 obtains a multi-channel signal by performing a matrix operation, using the coefficient table 57, on the first signals L0 and R0 generated by the first decoding unit 52 and the second signals l0 and r0 generated by the second decoding unit 53.
- The coefficients arranged in the coefficient table 57 form the inverse of the matrix described in the first embodiment. For example, when signals of four channels have been downmixed, the original four-channel signals L, R, l, and r can be extracted by the matrix equation expressed by Equation (15).
- x and y are given by Equation (16).
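The inverse-mixing matrix operation can be illustrated with a small sketch. It is not a reproduction of Equation (15): the 4×4 matrix `MIX`, its real-valued entries, and the helper names `invert` and `apply_matrix` are assumptions chosen only to show that multiplying (L0, R0, l0, r0) by the inverse of the encoder-side matrix returns (L, R, l, r). In practice the coefficients would model transfer characteristics and would generally be complex and frequency-dependent.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("mixing matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def apply_matrix(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


# Hypothetical encoder-side matrix (illustrative values only):
# rows 1-2 produce the downmix (L0, R0), rows 3-4 the auxiliary signals (l0, r0).
MIX = [
    [1.0, 0.3, 0.7, 0.2],
    [0.3, 1.0, 0.2, 0.7],
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
]
COEFF_TABLE = invert(MIX)  # plays the role of coefficient table 57

original = [0.5, -0.25, 0.125, 1.0]            # L, R, l, r for one frequency bin
encoded = apply_matrix(MIX, original)          # L0, R0, l0, r0
recovered = apply_matrix(COEFF_TABLE, encoded) # back to L, R, l, r
```

Storing the precomputed inverse, as the coefficient table does, means the decoder only needs one matrix-vector product per frequency bin rather than a matrix inversion.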
- One coefficient table 57 is stored in the storage medium.
- The present invention is not limited to this, however; needless to say, a plurality of coefficient tables may be stored in the storage medium.
- When reproducing a bit stream B generated by the audio signal encoding device according to the second embodiment, the inverse mixing unit 54 can extract, from the third encoded signal included in the bit stream B, an index n indicating the coefficients used at the time of downmixing, and can select the appropriate coefficient table from the plurality of coefficient tables based on the index n.
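Selecting a table by the index n amounts to a bounds-checked lookup, sketched below. The table contents (small 2×2 placeholders rather than real inverse matrices), the list `COEFF_TABLES`, and the function name `select_table` are hypothetical; how n is encoded in the third encoded signal is not shown here.

```python
# Hypothetical placeholder tables standing in for the stored inverse matrices.
COEFF_TABLES = [
    [[1.0, 0.0], [0.0, 1.0]],    # table for index 0
    [[0.5, 0.5], [0.5, -0.5]],   # table for index 1
]


def select_table(tables, n):
    """Return the coefficient table identified by the index n from the bit stream."""
    if not 0 <= n < len(tables):
        raise ValueError(f"bit stream refers to unknown coefficient table {n}")
    return tables[n]


table = select_table(COEFF_TABLES, 1)
```

Rejecting an out-of-range index explicitly avoids silently decoding with the wrong table if the encoder and decoder do not hold the same set of tables.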
- The output channel switching unit 55 selects whether to output the frequency-domain downmix signals L0 and R0 output from the first decoding unit 52 or the frequency-domain multi-channel signals L, R, l, and r output from the inverse mixing unit 54.
- The output channel switching unit 55 is set, for example, to output the signals L0 and R0 from the first decoding unit 52 when headphones or two-channel speakers are used, and to output the signals L, R, l, and r from the inverse mixing unit 54 when four-channel speakers are used.
- Alternatively, a detection unit that detects the device connected to the output side may be provided. When it detects that headphones or two-channel speakers are connected, the output channel switching unit 55 is operated to output the signals L0 and R0 from the first decoding unit 52; when it detects that four-channel speakers are connected, the output channel switching unit 55 is operated to output the signals L, R, l, and r from the inverse mixing unit 54.
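The switching rule can be written as a small selection function. This is only a sketch of the behaviour described in the text; the device labels `"headphones"`, `"2ch_speakers"`, and `"4ch_speakers"` and the function name `select_output` are hypothetical stand-ins for whatever the detection unit would actually report.

```python
def select_output(connected_device, downmix, multichannel):
    """Choose which frequency-domain signals go to the frequency-time converter."""
    if connected_device in ("headphones", "2ch_speakers"):
        return downmix        # L0, R0 from the first decoding unit
    if connected_device == "4ch_speakers":
        return multichannel   # L, R, l, r from the inverse mixing unit
    raise ValueError(f"unsupported output device: {connected_device}")


# Hypothetical single-bin signals for illustration.
downmix = {"L0": 0.8, "R0": 0.6}
multichannel = {"L": 0.5, "R": 0.4, "l": 0.3, "r": 0.2}

to_headphones = select_output("headphones", downmix, multichannel)
to_speakers = select_output("4ch_speakers", downmix, multichannel)
```

Because the switch sits before the frequency-time conversion, the same converter serves both the two-channel and the four-channel case.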
- The frequency-time conversion unit 56 converts the frequency-domain signals L, R, l, r or L0, R0 selected and output by the output channel switching unit 55 into time-domain audio signals.
- With the demultiplexing unit 51 that extracts the downmix code and the auxiliary information code from the bit stream, the inverse mixing unit 54 that generates the multi-channel signal from the downmix signal and the auxiliary information, the output channel switching unit 55 that selectively switches between outputting the downmix signal and the multi-channel signal, and the frequency-time conversion unit 56 that generates a time-domain audio signal from the signal output by the output channel switching unit 55, common components can reproduce the two-channel downmix signal when, for example, headphones or two speakers are used, and the multi-channel signal when, for example, four speakers are used.
- The present invention is not limited to this example, in which the multi-channel signal has four channels. The multi-channel signal may have any number of channels, provided that there are three or more.
- Needless to say, a channel configuration that is in general wide use may be used, for example.
- The audio signal encoding device and the audio signal decoding device make the downmix signal a signal filtered with a desired transfer function, so that the spatial information of the multi-channel signal is reflected even when only the first encoded signal is reproduced, and the original multi-channel signal can be reproduced when the second encoded signal is also used. Because decoding the encoded downmix signal yields either a two-channel signal that reflects the spatial information or the restored original multi-channel signal, the invention can be applied to inexpensive decoders and to portable devices that require particularly small equipment, such as headphones.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Stereophonic System (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05748600A EP1768451A4 (en) | 2004-06-14 | 2005-06-13 | ACOUSTIC SIGNAL ENCODING DEVICE AND ACOUSTIC SIGNAL DECODING DEVICE |
US11/570,471 US20080052089A1 (en) | 2004-06-14 | 2005-06-13 | Acoustic Signal Encoding Device and Acoustic Signal Decoding Device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004-175656 | 2004-06-14 | ||
JP2004175656A JP2005352396A (ja) | 2004-06-14 | 2004-06-14 | 音響信号符号化装置および音響信号復号装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2005122639A1 (ja) | 2005-12-22 |
Family
ID=35503542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/010811 WO2005122639A1 (ja) | 2004-06-14 | 2005-06-13 | 音響信号符号化装置および音響信号復号装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20080052089A1 (ja) |
EP (1) | EP1768451A4 (ja) |
JP (1) | JP2005352396A (ja) |
WO (1) | WO2005122639A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008542815A (ja) * | 2005-05-26 | 2008-11-27 | エルジー エレクトロニクス インコーポレイティド | オーディオ信号のデコーディング方法及び装置 |
JP2009531886A (ja) * | 2006-03-24 | 2009-09-03 | ドルビー スウェーデン アクチボラゲット | 多チャンネル信号のパラメータ表現からの空間ダウンミックスの生成 |
CN102292768B (zh) * | 2009-01-20 | 2013-03-27 | Lg电子株式会社 | 用于处理音频信号的装置及其方法 |
US8620008B2 (en) | 2009-01-20 | 2013-12-31 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
Families Citing this family (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102004043521A1 (de) * | 2004-09-08 | 2006-03-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vorrichtung und Verfahren zum Erzeugen eines Multikanalsignals oder eines Parameterdatensatzes |
DE102005010057A1 (de) | 2005-03-04 | 2006-09-07 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vorrichtung und Verfahren zum Erzeugen eines codierten Stereo-Signals eines Audiostücks oder Audiodatenstroms |
EP1905002B1 (en) * | 2005-05-26 | 2013-05-22 | LG Electronics Inc. | Method and apparatus for decoding audio signal |
WO2007004833A2 (en) * | 2005-06-30 | 2007-01-11 | Lg Electronics Inc. | Method and apparatus for encoding and decoding an audio signal |
EP1922721A4 (en) | 2005-08-30 | 2011-04-13 | Lg Electronics Inc | AUDIO SIGNAL DECODING METHOD |
RU2419249C2 (ru) * | 2005-09-13 | 2011-05-20 | Кониклейке Филипс Электроникс Н.В. | Аудиокодирование |
WO2007083952A1 (en) * | 2006-01-19 | 2007-07-26 | Lg Electronics Inc. | Method and apparatus for processing a media signal |
JP4951985B2 (ja) * | 2006-01-30 | 2012-06-13 | ソニー株式会社 | 音声信号処理装置、音声信号処理システム、プログラム |
EP1984913A4 (en) | 2006-02-07 | 2011-01-12 | Lg Electronics Inc | DEVICE AND METHOD FOR CODING / DECODING A SIGNAL |
CA2646278A1 (en) * | 2006-02-09 | 2007-08-16 | Lg Electronics Inc. | Method for encoding and decoding object-based audio signal and apparatus thereof |
KR101358700B1 (ko) * | 2006-02-21 | 2014-02-07 | 코닌클리케 필립스 엔.브이. | 오디오 인코딩 및 디코딩 |
KR100829560B1 (ko) | 2006-08-09 | 2008-05-14 | 삼성전자주식회사 | 멀티채널 오디오 신호의 부호화/복호화 방법 및 장치,멀티채널이 다운믹스된 신호를 2 채널로 출력하는 복호화방법 및 장치 |
US8271290B2 (en) * | 2006-09-18 | 2012-09-18 | Koninklijke Philips Electronics N.V. | Encoding and decoding of audio objects |
EP2082396A1 (en) * | 2007-10-17 | 2009-07-29 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio coding using downmix |
CN101809653A (zh) * | 2007-12-06 | 2010-08-18 | Lg电子株式会社 | 用于处理音频信号的方法和装置 |
EP2229677B1 (en) | 2007-12-18 | 2015-09-16 | LG Electronics Inc. | A method and an apparatus for processing an audio signal |
JP2011002574A (ja) * | 2009-06-17 | 2011-01-06 | Nippon Hoso Kyokai <Nhk> | 3次元音響符号化装置、3次元音響復号装置、符号化プログラム及び復号プログラム |
US20100324915A1 (en) * | 2009-06-23 | 2010-12-23 | Electronic And Telecommunications Research Institute | Encoding and decoding apparatuses for high quality multi-channel audio codec |
JP5345024B2 (ja) * | 2009-08-28 | 2013-11-20 | 日本放送協会 | 3次元音響符号化装置、3次元音響復号装置、符号化プログラム及び復号プログラム |
JP5680391B2 (ja) * | 2010-12-07 | 2015-03-04 | 日本放送協会 | 音響符号化装置及びプログラム |
US9412385B2 (en) * | 2013-05-28 | 2016-08-09 | Qualcomm Incorporated | Performing spatial masking with respect to spherical harmonic coefficients |
CN107533844B (zh) * | 2015-04-30 | 2021-03-23 | 华为技术有限公司 | 音频信号处理装置和方法 |
EP3411875B1 (en) * | 2016-02-03 | 2020-04-08 | Dolby International AB | Efficient format conversion in audio coding |
CN110853658B (zh) * | 2019-11-26 | 2021-12-07 | 中国电影科学技术研究所 | 音频信号的下混方法、装置、计算机设备及可读存储介质 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000156038A (ja) * | 1998-11-16 | 2000-06-06 | Victor Co Of Japan Ltd | 音声符号化装置、記録媒体、音声復号化装置及び音声伝送方法並びにコンピュータ記録媒体 |
JP2001195096A (ja) * | 1998-11-16 | 2001-07-19 | Victor Co Of Japan Ltd | 音声符号化装置 |
JP2002217841A (ja) * | 2001-01-15 | 2002-08-02 | Sony Corp | オーディオ信号再生装置及び方法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3263484B2 (ja) * | 1993-06-07 | 2002-03-04 | 三洋電機株式会社 | 音声帯域分割復号化装置 |
US5438623A (en) * | 1993-10-04 | 1995-08-01 | The United States Of America As Represented By The Administrator Of National Aeronautics And Space Administration | Multi-channel spatialization system for audio signals |
JP2766466B2 (ja) * | 1995-08-02 | 1998-06-18 | 株式会社東芝 | オーディオ方式、その再生方法、並びにその記録媒体及びその記録媒体への記録方法 |
JPH09224300A (ja) * | 1996-02-16 | 1997-08-26 | Sanyo Electric Co Ltd | 音像位置の補正方法及び装置 |
US5912976A (en) * | 1996-11-07 | 1999-06-15 | Srs Labs, Inc. | Multi-channel audio enhancement system for use in recording and playback and methods for providing same |
DE19721487A1 (de) * | 1997-05-23 | 1998-11-26 | Thomson Brandt Gmbh | Verfahren und Vorrichtung zur Fehlerverschleierung bei Mehrkanaltonsignalen |
JPH1132400A (ja) * | 1997-07-14 | 1999-02-02 | Matsushita Electric Ind Co Ltd | デジタル信号再生装置 |
US6757659B1 (en) * | 1998-11-16 | 2004-06-29 | Victor Company Of Japan, Ltd. | Audio signal processing apparatus |
EP1905002B1 (en) * | 2005-05-26 | 2013-05-22 | LG Electronics Inc. | Method and apparatus for decoding audio signal |
- 2004
  - 2004-06-14 JP JP2004175656A patent/JP2005352396A/ja active Pending
- 2005
  - 2005-06-13 WO PCT/JP2005/010811 patent/WO2005122639A1/ja active Application Filing
  - 2005-06-13 EP EP05748600A patent/EP1768451A4/en not_active Withdrawn
  - 2005-06-13 US US11/570,471 patent/US20080052089A1/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See also references of EP1768451A4 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008542815A (ja) * | 2005-05-26 | 2008-11-27 | エルジー エレクトロニクス インコーポレイティド | オーディオ信号のデコーディング方法及び装置 |
JP2009531886A (ja) * | 2006-03-24 | 2009-09-03 | ドルビー スウェーデン アクチボラゲット | 多チャンネル信号のパラメータ表現からの空間ダウンミックスの生成 |
CN102292768B (zh) * | 2009-01-20 | 2013-03-27 | Lg电子株式会社 | 用于处理音频信号的装置及其方法 |
US8620008B2 (en) | 2009-01-20 | 2013-12-31 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
US9484039B2 (en) | 2009-01-20 | 2016-11-01 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
US9542951B2 (en) | 2009-01-20 | 2017-01-10 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
Also Published As
Publication number | Publication date |
---|---|
EP1768451A1 (en) | 2007-03-28 |
JP2005352396A (ja) | 2005-12-22 |
EP1768451A4 (en) | 2009-02-25 |
US20080052089A1 (en) | 2008-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2005122639A1 (ja) | 音響信号符号化装置および音響信号復号装置 | |
US20200335115A1 (en) | Audio encoding and decoding | |
KR101158698B1 (ko) | 복수-채널 인코더, 입력 신호를 인코딩하는 방법, 저장 매체, 및 인코딩된 출력 데이터를 디코딩하도록 작동하는 디코더 | |
KR100754220B1 (ko) | Mpeg 서라운드를 위한 바이노럴 디코더 및 그 디코딩방법 | |
JP4943418B2 (ja) | スケーラブルマルチチャネル音声符号化方法 | |
JP5185340B2 (ja) | マルチチャネルオーディオ信号を表示するための装置と方法 | |
JP5592974B2 (ja) | 多チャネルダウンミックスされたオブジェクト符号化における強化された符号化及びパラメータ表現 | |
KR100888474B1 (ko) | 멀티채널 오디오 신호의 부호화/복호화 장치 및 방법 | |
CN101356573B (zh) | 对双耳音频信号的解码的控制 | |
WO2005112002A1 (ja) | オーディオ信号符号化装置及びオーディオ信号復号化装置 | |
KR20060060052A (ko) | 호환성 다중-채널 코딩/디코딩 | |
KR20160033734A (ko) | 렌더러 제어 공간 업믹스 | |
JP5483813B2 (ja) | マルチチャネル音声音響信号符号化装置および方法、並びにマルチチャネル音声音響信号復号装置および方法 | |
WO2006011367A1 (ja) | オーディオ信号符号化装置および復号化装置 | |
MX2008010631A (es) | Codificacion y decodificacion de audio | |
MX2008009565A (en) | Apparatus and method for encoding/decoding signal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 11570471; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWW | Wipo information: withdrawn in national office | Country of ref document: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 2005748600; Country of ref document: EP |
| | WWP | Wipo information: published in national office | Ref document number: 2005748600; Country of ref document: EP |
| | WWP | Wipo information: published in national office | Ref document number: 11570471; Country of ref document: US |