WO2015041070A1 - Encoding device and method, decoding device and method, and program - Google Patents
- Publication number
- WO2015041070A1 (PCT/JP2014/073465)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- gain
- value
- difference
- encoding
- difference value
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
Definitions
- The present technology relates to an encoding apparatus and method, a decoding apparatus and method, and a program, and in particular to an encoding apparatus and method, a decoding apparatus and method, and a program capable of obtaining sound at an appropriate volume with a smaller code amount.
- MPEG Motion Picture Experts Group
- AAC Advanced Audio Coding
- DRC Dynamic Range Compression
- the audio signal can be downmixed on the playback side, or appropriate volume control can be performed by DRC.
- However, playback environments vary, for example 2ch, 5.1ch, and 7.1ch, and a single downmix coefficient may fail to give sufficient sound pressure or may cause clipping.
- Auxiliary information such as downmix and DRC information is encoded as a gain in the MDCT (Modified Discrete Cosine Transform) domain. Therefore, when an 11.1ch bit stream is played back as is in 11.1ch, or downmixed to 2ch and played back, the sound pressure level may be low or, conversely, the signal may be heavily clipped, and it was difficult to obtain sound at an appropriate volume.
- MDCT Modified Discrete Cosine Transform
- the bitstream code amount increases.
- the present technology has been made in view of such a situation, and makes it possible to obtain sound with an appropriate volume with a smaller code amount.
- The encoding device includes a gain calculation unit that calculates a first gain value and a second gain value for volume correction for each frame of an audio signal;
- and a gain encoding unit that obtains a first difference value between the first gain value and the second gain value, or obtains a second difference value between the first gain value and the first gain value of an adjacent frame or between the first difference value and the first difference value of an adjacent frame, and encodes information based on the first difference value or the second difference value.
- The gain encoding unit can obtain the first difference value between the first gain value and the second gain value at a plurality of positions in the frame, or obtain the second difference value between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- the gain encoding unit can determine the second difference value based on a gain change point at which a slope of the first gain value or the first difference value in the frame changes.
- the gain encoding unit can obtain the second difference value by obtaining a difference between the gain change point and another gain change point.
- the gain encoding unit can obtain the second difference value by obtaining a difference between the gain change point and a value predicted by first-order prediction using another gain change point.
- the gain encoding unit can encode information based on the number of gain change points in the frame and the second difference value at the gain change points.
- the gain calculation unit can calculate the second gain value for each of the audio signals having different numbers of channels obtained by downmixing.
- the gain encoding unit can select whether or not to obtain the first difference value based on the correlation between the first gain value and the second gain value.
- the gain encoding unit can variable-length encode the first difference value or the second difference value.
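- The encoding method or program includes the steps of calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of an adjacent frame or between the first difference value and the first difference value of an adjacent frame; and encoding information based on the first difference value or the second difference value.
- In this aspect, a first gain value and a second gain value for volume correction are calculated for each frame of the audio signal; the first difference value between the first gain value and the second gain value is obtained, or the second difference value is obtained between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; and information based on the first difference value or the second difference value is encoded.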
- The decoding device includes a demultiplexing unit that demultiplexes an input code string into a gain code string and a signal code string obtained by encoding the audio signal, where the gain code string is generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or by obtaining a second difference value between the first gain value and the first gain value of an adjacent frame or between the first difference value and the first difference value of an adjacent frame;
- a signal decoding unit that decodes the signal code string; and a gain decoding unit that decodes the gain code string and outputs the first gain value or the second gain value for the volume correction.
- The first difference value can be encoded by obtaining a difference value between the first gain value and the second gain value at a plurality of positions in the frame, and the second difference value can be encoded by obtaining a difference value between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- the second difference value may be encoded by being obtained from a gain change point at which a slope of the first gain value or the first difference value in the frame changes.
- the second difference value can be encoded by being obtained from a difference between the gain change point and another gain change point.
- the second difference value can be encoded by being obtained from a difference between the gain change point and a value predicted by first-order prediction using another gain change point.
- the information based on the number of gain change points in the frame and the second difference value at the gain change point may be encoded as the second difference value.
- The decoding method or program includes the steps of demultiplexing an input code string into a gain code string and a signal code string obtained by encoding the audio signal, where the gain code string is generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or by obtaining a second difference value between the first gain value and the first gain value of an adjacent frame or between the first difference value and the first difference value of an adjacent frame; decoding the signal code string; and decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.
- In this aspect, the input code string is demultiplexed into the gain code string, generated from the first difference value or the second difference value as described above, and the signal code string obtained by encoding the audio signal; the signal code string is decoded; and the gain code string is decoded and the first gain value or the second gain value for the volume correction is output.
- FIG. 1 is a diagram showing information for one frame included in a bit stream obtained by encoding an audio signal.
- the information for one frame includes auxiliary information and main information.
- The main information is the principal information for constructing the output time-series signal, namely the encoded audio signal such as scale factors and MDCT coefficients, and the auxiliary information is supplementary information about the output time-series signal, generally called metadata.
- This auxiliary information includes gain information and downmix information.
- the downmix information is obtained by encoding gain coefficients for converting an audio signal composed of a plurality of channels such as 11.1ch into an audio signal having a smaller number of channels in the form of an index.
- At decoding time, the MDCT coefficient of each channel is multiplied by the gain coefficient obtained from the downmix information, and the gain-weighted MDCT coefficients of the channels are summed to obtain the MDCT coefficient of each output channel after downmixing.
- The gain information is obtained by encoding, in the form of an index, a gain coefficient for converting a group consisting of all channels or of specific channels to another signal level.
- The gain information is applied at decoding time by multiplying the MDCT coefficient of each channel by the gain coefficient obtained from the gain information.
- FIG. 2 is a diagram showing a configuration of a decoding apparatus that performs MPEG AAC decoding processing.
- The demultiplexing circuit 21 demultiplexes the input code string into a signal code string corresponding to the main information, and gain information and downmix information corresponding to the auxiliary information.
- The decoding / inverse quantization circuit 22 decodes and inversely quantizes the signal code string supplied from the demultiplexing circuit 21 and supplies the resulting MDCT coefficients to the gain application circuit 23. Based on the downmix control information and the DRC control information, the gain application circuit 23 multiplies the MDCT coefficients by the gain coefficients obtained from the gain information and from the downmix information supplied by the demultiplexing circuit 21, and outputs the resulting gain-applied MDCT coefficients.
- the downmix control information and the DRC control information are information given from the host control device, and are information indicating whether or not to perform the downmix and DRC processing.
- the inverse MDCT circuit 24 performs inverse MDCT processing on the gain application MDCT coefficient from the gain application circuit 23 and supplies the obtained inverse MDCT signal to the windowing / OLA circuit 25. Then, the windowing / OLA circuit 25 performs windowing and overlap addition processing on the supplied inverse MDCT signal to obtain an output time-series signal that is an output of the MPEG AAC decoding device 11.
- Auxiliary information for downmix and DRC is encoded as a gain in the MDCT domain. Therefore, for example, when an 11.1ch bit stream is played back as is in 11.1ch, or downmixed to 2ch and played back, the sound pressure level may be low or, conversely, the signal may be heavily clipped, and in some cases sound at an appropriate volume could not be obtained.
- L, R, C, Sl, and Sr indicate the left, right, center, side left, and side right channel signals of the 5.1ch signal, respectively.
- Lt and Rt indicate left channel and right channel signals after downmixing to 2ch, respectively.
- k is a coefficient for adjusting the mixing ratio of the side channels, and takes one of the values 1/sqrt(2), 1/2, 1/(2 sqrt(2)), and 0.
- Clipping after downmixing occurs in the case where all the channels carry signals of the maximum amplitude. Assuming that the amplitudes of the L, R, C, Sl, and Sr channel signals are all 1.0, Equation (1) gives Lt and Rt amplitudes of 1.0 regardless of the value of k. That is, Equation (1) is a downmix formula that is guaranteed not to cause clip distortion.
- L, R, C, Sl, Sr, Lt, Rt, and k are the same as those in the formula (1).
- the above example is a case where the number of channels is 5.1, and in the case where 11.1 channels or more channels are encoded and downmixed, the occurrence of clip distortion and the level change become more remarkable.
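Equations (1) and (2) are not reproduced in this text. The following is a minimal sketch that assumes Equation (1) has the form of the normalized AAC matrix-mixdown, Lt = (L + C/sqrt(2) + k·Sl) / (1 + 1/sqrt(2) + k), and contrasts it with an unnormalized sum. The function names and the unnormalized variant are illustrative assumptions, not taken from the patent.

```python
import math

# Mixing coefficients k listed in the text.
K_VALUES = (1 / math.sqrt(2), 0.5, 1 / (2 * math.sqrt(2)), 0.0)


def downmix_normalized(l, r, c, sl, sr, k):
    """Assumed form of Equation (1): the sum is scaled so full-scale input stays at 1.0."""
    norm = 1.0 / (1.0 + 1.0 / math.sqrt(2) + k)
    lt = norm * (l + c / math.sqrt(2) + k * sl)
    rt = norm * (r + c / math.sqrt(2) + k * sr)
    return lt, rt


def downmix_unnormalized(l, r, c, sl, sr, k):
    """Hypothetical variant without the scale factor; full-scale input can exceed 1.0."""
    lt = l + c / math.sqrt(2) + k * sl
    rt = r + c / math.sqrt(2) + k * sr
    return lt, rt


if __name__ == "__main__":
    for k in K_VALUES:
        print(k, downmix_normalized(1, 1, 1, 1, 1, k), downmix_unnormalized(1, 1, 1, 1, 1, k))
```

With every input at amplitude 1.0, the normalized form stays at 1.0 for each k, while the unnormalized sum exceeds 1.0 and would clip. The normalization also lowers the level of ordinary content, which is the trade-off the text describes.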
- a method of encoding an index of a known DRC characteristic can be considered.
- At decoding time, DRC processing is performed on the decoded PCM (Pulse Code Modulation) signal, that is, the above-described output time-series signal, so that the DRC characteristic of this index is obtained. This makes it possible to suppress the decrease in sound pressure level and the clipping that depend on whether downmixing is performed.
- PCM Pulse Code Modulation
- However, with this method the decoding device holds the DRC characteristic information, so the content creator cannot express arbitrary DRC characteristics, and the decoding device has to perform the DRC processing itself, which increases the amount of calculation.
- The number of channel patterns to be downmixed also increases. For example, an 11.1ch signal may be downmixed to 7.1ch, 5.1ch, and 2ch, and when a plurality of gains are sent as described above, the code amount increases fourfold.
- With the present technology, the content creator can freely set the DRC gain on the encoding device side, and the code amount required for transmission can be reduced while reducing the calculation load on the decoding device side. That is, the present technology makes it possible to obtain sound at an appropriate volume with a smaller code amount.
- FIG. 3 is a diagram illustrating a functional configuration example of an embodiment of an encoding device to which the present technology is applied.
- The encoding device 51 shown in FIG. 3 includes a first sound pressure level calculation circuit 61, a first gain calculation circuit 62, a downmix circuit 63, a second sound pressure level calculation circuit 64, a second gain calculation circuit 65, a gain encoding circuit 66, a signal encoding circuit 67, and a multiplexing circuit 68.
- The first sound pressure level calculation circuit 61 calculates the sound pressure level of each channel of the input time-series signal, which is the supplied multi-channel audio signal, and obtains a representative value of the per-channel sound pressure levels as the first sound pressure level.
- As the sound pressure level calculation method, the maximum value, the RMS (Root Mean Square), or the like of the audio signal of each channel within a time frame is used, and a sound pressure level is obtained for each channel constituting the input time-series signal in each time frame.
- As the method of calculating the representative value used as the first sound pressure level, for example, the maximum of the per-channel sound pressure levels can be used as the representative value, or a single representative value can be calculated from the per-channel sound pressure levels with a specific calculation formula.
- the representative value can be calculated using a loudness calculation formula described in ITU-R BS.1770-2 (03/2011).
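The per-channel level and representative-value steps described above can be sketched as follows. This is a minimal illustration that assumes samples normalized to the range [-1.0, 1.0] and levels expressed in dBFS; the function names and the small floor used to avoid log(0) are assumptions. The ITU-R BS.1770 loudness formula is not implemented here; only the peak, RMS, and maximum-over-channels options are shown.

```python
import math


def channel_level_dbfs(samples, method="rms"):
    """Sound pressure level of one channel in one time frame, in dBFS."""
    if method == "peak":
        value = max(abs(s) for s in samples)
    else:  # RMS over the time frame
        value = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Floor avoids log10(0) for digital silence (illustrative choice).
    return 20.0 * math.log10(max(value, 1e-10))


def representative_level_dbfs(frame, method="rms"):
    """Representative value: here, the maximum of the per-channel levels."""
    return max(channel_level_dbfs(channel, method) for channel in frame)


if __name__ == "__main__":
    frame = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, 0.5]]
    print(representative_level_dbfs(frame, "peak"))
    print(representative_level_dbfs(frame, "rms"))
```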
- The time frame that is the processing unit in the first sound pressure level calculation circuit 61 is synchronized with the time frame of the input time-series signal processed by the signal encoding circuit 67 described later, and is shorter than the time frame in the signal encoding circuit 67.
- the first sound pressure level calculation circuit 61 supplies the obtained first sound pressure level to the first gain calculation circuit 62.
- the first sound pressure level obtained in this way indicates a representative sound pressure level of an input time-series signal channel composed of audio signals of a predetermined number of channels such as 11.1ch.
- the first gain calculation circuit 62 calculates a first gain based on the first sound pressure level supplied from the first sound pressure level calculation circuit 61 and supplies the first gain to the gain encoding circuit 66.
- the first gain indicates a gain when the volume of the input time-series signal is corrected so that sound having an optimal volume can be obtained when the input time-series signal is reproduced on the decoding device side.
- sound with an optimal volume can be obtained on the playback side by correcting the volume of the input time-series signal with the first gain.
- DRC characteristics as shown in FIG. 4 can be used.
- In FIG. 4, the horizontal axis indicates the input sound pressure level (dBFS), that is, the first sound pressure level, and the vertical axis indicates the output sound pressure level (dBFS), that is, the sound pressure level of the input time-series signal after level correction (volume correction) by the DRC processing.
- the broken line C1 and the broken line C2 indicate the relationship between the input / output sound pressure levels.
- the sound volume is corrected so that the sound pressure level of the input time-series signal becomes -27 dBFS. Therefore, in this case, the first gain is -27 dBFS.
- the first gain is set to -21 dBFS.
- DRC_MODE1 a mode in which volume correction is performed with the DRC characteristic indicated by the broken line C1
- DRC_MODE2 a mode in which the volume is corrected with the DRC characteristic indicated by the broken line C2.
- the first gain is determined according to the DRC characteristic of the designated mode such as DRC_MODE1 or DRC_MODE2. This first gain is output as a gain waveform synchronized with the time frame of the signal encoding circuit 67. That is, the first gain calculation circuit 62 calculates the first gain for each sample constituting the time frame that is the processing target of the input time-series signal.
- The downmix circuit 63 performs downmix processing on the input time-series signal supplied to the encoding device 51 using the downmix information supplied from the host control device, and supplies the resulting downmix signal to the second sound pressure level calculation circuit 64.
- One downmix signal may be output from the downmix circuit 63, or a plurality of downmix signals may be output.
- For example, downmix processing may be performed on an 11.1ch input time-series signal to generate a downmix signal that is a 2ch audio signal, a downmix signal that is a 5.1ch audio signal, and a downmix signal that is a 7.1ch audio signal.
- the second sound pressure level calculation circuit 64 calculates a second sound pressure level based on the downmix signal that is a multi-channel audio signal supplied from the downmix circuit 63 and supplies the second sound pressure level to the second gain calculation circuit 65.
- the second sound pressure level calculation circuit 64 calculates the second sound pressure level for each downmix signal by using the same method as the first sound pressure level calculation method in the first sound pressure level calculation circuit 61.
- The second gain calculation circuit 65 calculates, for each downmix signal, a second gain based on the second sound pressure level of that downmix signal supplied from the second sound pressure level calculation circuit 64, and supplies the second gain to the gain encoding circuit 66.
- the second gain is calculated by the DRC characteristic and the gain calculation method used in the first gain calculation circuit 62.
- the second gain indicates a gain when the volume of the downmix signal is corrected so that an audio having an optimum volume can be obtained when the input time-series signal is downmixed and reproduced on the decoding device side.
- sound having an optimal volume can be obtained by correcting the volume of the obtained downmix signal with the second gain.
- Such a second gain can be said to be a gain for correcting the sound volume to a more optimal volume according to the DRC characteristic and correcting the sound pressure level that changes due to the downmix.
- the gain waveform g (k, n) in the time frame k can be obtained by calculating the following equation (3).
- In Equation (3), n represents a time sample that takes a value from 0 to N-1 when the time frame length is N, and Gt(k) represents the target gain in time frame k.
- A in Equation (3) is a value determined by the following Equation (4).
- In Equation (4), Fs represents the sampling frequency (Hz), Tc(k) represents the time constant in time frame k, and exp(x) represents the exponential function.
- Gt(k) can be obtained from the first sound pressure level or the second sound pressure level calculated by the first sound pressure level calculation circuit 61 or the second sound pressure level calculation circuit 64 and from the DRC characteristic shown in FIG. 4.
- The time constant Tc(k) can be obtained from the difference between Gt(k) described above and the gain g(k-1, N-1) at the end of the previous time frame.
- This time constant generally differs depending on the desired DRC characteristics. For example, a device that records and reproduces a person's voice, such as a voice recorder, typically has a short time constant, whereas a device used for recording and reproducing music, such as a portable music player, commonly has a long release time constant. In the description of the present embodiment, for simplicity, if Gt(k) - g(k-1, N-1) is less than zero, the time constant is set to 20 milliseconds as an attack, and if it is greater than or equal to zero, the time constant is set to 2 seconds as a release.
- the gain waveform g (k, n) serving as the first gain and the second gain can be obtained.
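Equations (3) and (4) are not reproduced in this text. The sketch below assumes a common one-pole smoother, g(k, n) = A·Gt(k) + (1 - A)·g(k, n-1) with A = 1 - exp(-1 / (Fs·Tc(k))), which is consistent with the symbols defined above but may differ from the patent's exact formulas. The attack and release values follow the text; the function names are illustrative.

```python
import math

ATTACK_S = 0.020   # 20 ms when the gain must fall (Gt(k) - g(k-1, N-1) < 0)
RELEASE_S = 2.0    # 2 s when the gain may rise or stay (difference >= 0)


def time_constant(target_gain, prev_gain):
    """Select Tc(k) from the sign of Gt(k) - g(k-1, N-1)."""
    return ATTACK_S if target_gain - prev_gain < 0 else RELEASE_S


def gain_waveform(target_gain, prev_gain, frame_len, fs):
    """Return g(k, 0..N-1) under the ASSUMED one-pole smoothing recursion."""
    tc = time_constant(target_gain, prev_gain)
    a = 1.0 - math.exp(-1.0 / (fs * tc))  # assumed form of Equation (4)
    g = prev_gain                         # g(k, -1) is g(k-1, N-1)
    waveform = []
    for _ in range(frame_len):
        g = a * target_gain + (1.0 - a) * g  # assumed form of Equation (3)
        waveform.append(g)
    return waveform


if __name__ == "__main__":
    attack = gain_waveform(-6.0, 0.0, 1024, 48000)
    release = gain_waveform(0.0, -6.0, 1024, 48000)
    print(attack[-1], release[-1])
```

Under these assumptions the gain moves most of the way toward a lower target within one 1024-sample frame at 48 kHz, but barely moves toward a higher target, which is the attack and release asymmetry the text describes.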
- The gain encoding circuit 66 encodes the first gain supplied from the first gain calculation circuit 62 and the second gain supplied from the second gain calculation circuit 65, and supplies the resulting gain code string to the multiplexing circuit 68.
- the difference is appropriately calculated and encoded.
- the difference between different gains is a difference between the first gain and the second gain, or a difference between different second gains.
- The signal encoding circuit 67 encodes the supplied input time-series signal by a predetermined encoding method, for example a general encoding method typified by MPEG AAC, and supplies the resulting signal code string to the multiplexing circuit 68.
- The multiplexing circuit 68 multiplexes the gain code string supplied from the gain encoding circuit 66, the downmix information supplied from the host control device, and the signal code string supplied from the signal encoding circuit 67, and outputs the resulting output code string.
- the gain waveforms shown in FIG. 5 are obtained as the first gain and the second gain supplied to the gain encoding circuit 66.
- the horizontal axis indicates time, and the vertical axis indicates gain (dB).
- a polygonal line C21 represents the gain of the 11.1ch input time series signal obtained as the first gain
- a polygonal line C22 represents the gain of the 5.1ch downmix signal obtained as the second gain.
- the 5.1ch downmix signal is an audio signal obtained by downmixing the 11.1ch input time-series signal.
- the broken line C23 represents the difference between the first gain and the second gain.
- Hereinafter, among the gain information of the first gain and the second gain, the main gain information from which differences are taken is also referred to as a master gain sequence, and gain information for which a difference value from the master gain sequence is obtained is also referred to as a slave gain sequence.
- When the master gain sequence and the slave gain sequence need not be distinguished, they are simply referred to as gain sequences.
- FIG. 6 is a diagram illustrating an example of the relationship between the master gain sequence and the slave gain sequence.
- the horizontal axis indicates a time frame, and the vertical axis indicates each gain sequence.
- GAIN_SEQ0 represents the 11.1ch gain sequence, that is, the first gain of the 11.1ch input time series signal that is not downmixed.
- GAIN_SEQ1 represents a 7.1ch gain sequence, that is, the second gain of the 7.1ch downmix signal obtained by downmixing.
- GAIN_SEQ2 represents the 5.1ch gain sequence, that is, the second gain of the 5.1ch downmix signal
- GAIN_SEQ3 represents the 2ch gain sequence, that is, the second gain of the 2ch downmix signal.
- M1 represents the first master gain sequence
- M2 represents the second master gain sequence
- end point of the arrow marked “M1” or “M2” indicates a slave gain sequence with respect to the master gain sequence represented by “M1” or “M2”.
- the 11.1ch gain sequence is the master gain sequence.
- the other 7.1ch, 5.1ch, and 2ch gain sequences are slave gain sequences with respect to the 11.1ch gain sequence.
- the 11.1ch gain sequence that is the master gain sequence is encoded as it is.
- For the 7.1ch, 5.1ch, and 2ch gain sequences, which are slave gain sequences, the difference from the master gain sequence is obtained, and that difference is encoded.
- Information obtained by encoding each gain sequence in this way is used as a gain code string.
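The master and slave relationship above can be sketched as follows: master gain sequences are kept as is, and each slave is replaced by its sample-wise difference from its master. This is a simplified illustration on integer gain values; the quantization, intra-frame differences, and variable-length coding described elsewhere in the document are omitted, and the function names and the `master_of` mapping are hypothetical.

```python
def encode_gain_sequences(sequences, master_of):
    """sequences: name -> list of gains; master_of: slave name -> master name."""
    payload = {}
    for name, seq in sequences.items():
        master = master_of.get(name)
        if master is None:
            payload[name] = list(seq)  # master: encoded as is
        else:
            payload[name] = [s - m for s, m in zip(seq, sequences[master])]
    return payload


def decode_gain_sequences(payload, master_of):
    """Rebuild every gain sequence; masters are restored before their slaves."""
    decoded = {n: list(v) for n, v in payload.items() if n not in master_of}
    for name, diffs in payload.items():
        if name in master_of:
            master = decoded[master_of[name]]
            decoded[name] = [d + m for d, m in zip(diffs, master)]
    return decoded


if __name__ == "__main__":
    sequences = {
        "GAIN_SEQ0": [-3, -4, -5, -5],  # 11.1ch (master)
        "GAIN_SEQ1": [-3, -5, -5, -6],  # 7.1ch
        "GAIN_SEQ2": [-4, -5, -6, -6],  # 5.1ch
        "GAIN_SEQ3": [-6, -7, -8, -8],  # 2ch
    }
    master_of = {"GAIN_SEQ1": "GAIN_SEQ0", "GAIN_SEQ2": "GAIN_SEQ0", "GAIN_SEQ3": "GAIN_SEQ0"}
    print(encode_gain_sequences(sequences, master_of))
```

When the slave gains track the master closely, the differences are small and cluster near zero, which is what makes them cheaper to code than the raw gains.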
- Information indicating the gain encoding mode, which is the relationship between the master gain sequence and the slave gain sequences, is encoded into a gain encoding mode header HD11 and added to the output code string together with the gain code string.
- the gain encoding mode header is generated when the gain encoding mode in the time frame to be processed is different from the gain encoding mode in the immediately preceding time frame, and is added to the output code string.
- When the gain encoding mode in time frame J+1, the frame following time frame J, is the same as that of time frame J, the gain encoding mode header is not encoded.
- When the gain encoding mode changes, the gain encoding mode header HD12 is added to the output code string.
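The header rule can be illustrated with a short sketch that marks, for a series of time frames, which frames carry a gain encoding mode header. It assumes that the first frame always carries a header because there is no preceding mode to compare with; the function name and the string mode labels are hypothetical.

```python
def frames_with_header(modes):
    """Return one boolean per time frame: True if a mode header is emitted."""
    flags = []
    previous = None  # no mode before the first frame (assumption: header is sent)
    for mode in modes:
        flags.append(mode != previous)
        previous = mode
    return flags


if __name__ == "__main__":
    # Frames J and J+1 share a mode; a later frame switches to a second mode.
    print(frames_with_header(["one_master", "one_master", "two_masters"]))
```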
- the 11.1ch gain sequence is the master gain sequence
- the 7.1ch gain sequence is the slave gain sequence for the 11.1ch gain sequence.
- the 5.1ch gain sequence is the second master gain sequence
- the 2ch gain sequence is the slave gain sequence for the 5.1ch gain sequence.
- the bit stream output from the encoding device 51 includes an output code string of each time frame, and each output code string includes auxiliary information and main information.
- A gain encoding mode header corresponding to the gain encoding mode header HD11 shown in FIG. 6, a gain code string, and downmix information are included in the output code string as components of the auxiliary information.
- the gain code string is information obtained by encoding four gain sequences of 11.1ch to 2ch in the example of FIG.
- the downmix information is the same as the downmix information shown in FIG. 1 and is information (index) for obtaining a gain coefficient necessary for downmixing the input time-series signal on the decoding device side.
- the output code string of the time frame J includes a signal code string as main information.
- In a time frame in which the gain encoding mode does not change, the auxiliary information does not include the gain encoding mode header; the output code string includes the gain code string and downmix information as auxiliary information and a signal code string as main information.
- In a time frame in which the gain encoding mode changes, the output code string includes a gain encoding mode header, a gain code string, and downmix information as auxiliary information, and a signal code string as main information.
- gain encoding mode header and gain code string shown in FIG. 7 will be described in detail below.
- the gain encoding mode header included in the output code string is configured as shown in FIG.
- the gain encoding mode header shown in FIG. 8 includes GAIN_SEQ_NUM, GAIN_SEQ0, GAIN_SEQ1, GAIN_SEQ2, and GAIN_SEQ3, and these data are encoded in 2 bytes each.
- the data of each gain sequence mode of GAIN_SEQ0 to GAIN_SEQ3 is configured as shown in FIG. 9, for example.
- the gain sequence mode data includes MASTER_FLAG, DIFF_SEQ_ID, DMIX_CH_CFG_ID, and DRC_MODE_ID, and these four elements are each encoded with 4 bits.
- MASTER_FLAG is an identifier indicating whether or not the gain sequence described in the data of the gain sequence mode is a master gain sequence.
- the gain sequence is a master gain sequence
- the gain sequence is a slave gain sequence
- DIFF_SEQ_ID is an identifier indicating the master gain sequence with respect to which the difference of the gain sequence described in the gain sequence mode data is calculated, and is read when the value of MASTER_FLAG is “0”.
- DMIX_CH_CFG_ID is channel configuration information corresponding to this gain sequence, that is, information indicating the number of channels of multi-channel audio signals such as 11.1ch and 7.1ch.
- DRC_MODE_ID is an identifier representing the characteristics of DRC used in gain calculation in the first gain calculation circuit 62 or the second gain calculation circuit 65. For example, in the example shown in FIG. 4, it is information indicating either DRC_MODE1 or DRC_MODE2.
- DRC_MODE_ID may differ between the master gain sequence and slave gain sequence. That is, a difference may be obtained between gain sequences for which gains are obtained according to different DRC characteristics.
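The four 4-bit elements described above fill exactly the 2 bytes allotted to each gain sequence mode. The following is a minimal sketch of that packing, assuming the fields are stored in the order MASTER_FLAG, DIFF_SEQ_ID, DMIX_CH_CFG_ID, DRC_MODE_ID with the most significant bits first. The class name and the numeric identifier values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class GainSeqMode:
    master_flag: int     # 1 = master gain sequence, 0 = slave gain sequence
    diff_seq_id: int     # referenced master sequence (read only when master_flag == 0)
    dmix_ch_cfg_id: int  # channel configuration identifier
    drc_mode_id: int     # DRC characteristic identifier

    def pack(self) -> bytes:
        """Pack the four 4-bit fields into 2 bytes (field order is an assumption)."""
        word = 0
        for field in (self.master_flag, self.diff_seq_id,
                      self.dmix_ch_cfg_id, self.drc_mode_id):
            if not 0 <= field <= 0xF:
                raise ValueError("each field must fit in 4 bits")
            word = (word << 4) | field
        return word.to_bytes(2, "big")

    @classmethod
    def unpack(cls, data: bytes) -> "GainSeqMode":
        """Inverse of pack(): split 2 bytes back into four 4-bit fields."""
        word = int.from_bytes(data[:2], "big")
        return cls((word >> 12) & 0xF, (word >> 8) & 0xF,
                   (word >> 4) & 0xF, word & 0xF)


# Hypothetical identifier values, chosen only for this illustration.
CH_11_1, CH_7_1 = 0, 1
DRC_MODE1 = 0

seq0 = GainSeqMode(master_flag=1, diff_seq_id=0,
                   dmix_ch_cfg_id=CH_11_1, drc_mode_id=DRC_MODE1)
seq1 = GainSeqMode(master_flag=0, diff_seq_id=0,
                   dmix_ch_cfg_id=CH_7_1, drc_mode_id=DRC_MODE1)
```

Here `seq0` plays the role of a master gain sequence and `seq1` that of a slave gain sequence referring to master 0.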
- MASTER_FLAG is set to 1
- DIFF_SEQ_ID is set to 0
- DMIX_CH_CFG_ID is set to an identifier indicating 11.1ch
- DRC_MODE_ID is set to an identifier indicating DRC_MODE1, for example, and the gain sequence mode is encoded.
- in GAIN_SEQ1, in which information related to the 7.1ch gain sequence is stored, MASTER_FLAG is 0, DIFF_SEQ_ID is 0, DMIX_CH_CFG_ID is an identifier indicating 7.1ch, and DRC_MODE_ID is an identifier indicating DRC_MODE1, for example, and the gain sequence mode is encoded.
- MASTER_FLAG is set to 0
- DIFF_SEQ_ID is set to 0
- DMIX_CH_CFG_ID is set to an identifier indicating 5.1ch
- DRC_MODE_ID is set to an identifier indicating DRC_MODE1, for example, and the gain sequence mode is encoded.
- MASTER_FLAG is set to 0
- DIFF_SEQ_ID is set to 0
- DMIX_CH_CFG_ID is an identifier indicating 2ch
- DRC_MODE_ID is an identifier indicating DRC_MODE1, for example, and the gain sequence mode is encoded.
- the gain encoding mode header is not inserted into the bit stream.
- the gain encoding mode header is encoded.
- the 5.1ch gain sequence (GAIN_SEQ2), which was the slave gain sequence so far, is the second master gain sequence.
- the 2ch gain sequence (GAIN_SEQ3) is the slave gain sequence of the 5.1ch gain sequence.
- GAIN_SEQ0 and GAIN_SEQ1 of the gain encoding mode header are the same as those in time frame J, but GAIN_SEQ2 and GAIN_SEQ3 change.
- DIFF_SEQ_ID is 0, DMIX_CH_CFG_ID is an identifier indicating 5.1ch, and DRC_MODE_ID is an identifier indicating DRC_MODE1, for example.
- MASTER_FLAG is set to 0, DIFF_SEQ_ID is set to 2, DMIX_CH_CFG_ID is an identifier indicating 2ch, and DRC_MODE_ID is an identifier indicating DRC_MODE1, for example.
- the value of DIFF_SEQ_ID may be any value.
- the gain code string included in the auxiliary information of the output code string shown in FIG. 7 is configured as shown in FIG.
- GAIN_SEQ_NUM indicates the number of gain sequences encoded in the gain encoding mode header. Then, the gain sequence information for the number indicated in GAIN_SEQ_NUM is described after GAIN_SEQ_NUM.
- hld_mode, arranged following GAIN_SEQ_NUM, is a flag indicating whether or not to hold the gain of the immediately preceding time frame, and is encoded with 1 bit.
- uimsbf represents Unsigned Integer Most Significant Bit First, and represents that an unsigned integer is encoded with the MSB side as the first bit.
- the gain of the previous time frame, that is, for example, the first gain and the second gain obtained by decoding, is used as it is as the gain of the current time frame. Therefore, in this case, it can be said that the first gain and the second gain are encoded by obtaining a difference between time frames.
- the gain obtained from the information described after hld_mode is used as the gain of the current time frame.
- when the value of hld_mode is 0, cmode is described with 2 bits following hld_mode, and gpnum is described with 6 bits.
- cmode represents an encoding method for generating a gain waveform from the gain change points encoded thereafter.
- the lower 1 bit of cmode represents the differential encoding mode at the gain change point. Specifically, when the value of the lower 1 bit of cmode is 0, it indicates that the gain encoding method is the 0th-order prediction difference mode (hereinafter also referred to as DIFF1 mode), and when the value of the lower 1 bit of cmode is 1, it indicates that the gain encoding method is the primary prediction differential mode (hereinafter also referred to as DIFF2 mode).
- the gain change point refers to a time at which the gain gradient changes before and after the gain waveform including the gain at each time (sample) of the time frame.
- the time (sample) to be a candidate point for the gain change point is determined in advance, and among these candidate points, the candidate point whose gain slope changes at the previous and subsequent times is set as the gain change point.
- the gain change point is the time at which the slope of the gain (difference) changes at the previous and subsequent times in the gain difference waveform from the master gain sequence.
- in the 0th-order prediction difference mode, when a gain waveform composed of the gain at each time, that is, at each sample, is encoded, encoding is performed by calculating the difference between the gain at each gain change point and the gain at the previous gain change point.
- the 0th-order prediction difference mode is a mode in which the gain at each time is decoded using the difference from the gain at another time when the gain waveform is decoded.
- in the primary prediction differential mode, when a gain waveform is encoded, the gain at each gain change point is predicted by a linear function passing through the immediately preceding gain change point, that is, by primary prediction, and encoding is performed by obtaining the difference between the predicted value (primary prediction value) and the actual gain.
- the upper 1 bit of cmode indicates whether or not to encode the gain at the beginning of the time frame. Specifically, when the upper 1 bit of cmode is 0, the gain at the beginning of the time frame is encoded with a fixed length of 12 bits and is described as gval_abs_id0 in FIG. 10.
- the MSB 1 bit of gval_abs_id0 is a sign bit, and the remaining 11 bits are the value (gain) of “gval_abs_id0” determined by the following equation (5) in 0.25 dB steps.
- gain_abs_linear indicates a linear gain, that is, a first gain or a second gain that is a gain of the master gain sequence, or a gain difference between the master gain sequence and the slave gain sequence.
- gain_abs_linear is the gain of the first sample position in the time frame.
- “^” represents a power.
- when the upper 1 bit of cmode is 1, it indicates that the gain value at the end of the previous time frame at the time of decoding is used as the gain value at the beginning of the current time frame.
- gpnum indicates the number of gain change points.
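The fixed-length fields described so far (hld_mode with 1 bit, cmode with 2 bits, gpnum with 6 bits) can be illustrated with a small bit-string sketch. It covers only those three fields, not gval_abs_id0 or the per-change-point data. The function names and the use of a string of '0'/'1' characters are assumptions made for readability, not part of the described bit stream format.

```python
def pack_gain_frame_header(hld_mode, encode_head_gain=True, use_diff2=False, gpnum=0):
    """Return hld_mode, cmode and gpnum as a bit string, MSB first.

    Upper bit of cmode: 0 = the gain at the head of the frame is encoded.
    Lower bit of cmode: 0 = DIFF1 (0th-order), 1 = DIFF2 (primary prediction).
    """
    if hld_mode:
        return "1"  # hold the previous frame's gain; nothing else is written here
    if not 0 <= gpnum < 64:
        raise ValueError("gpnum must fit in 6 bits")
    cmode = ((0 if encode_head_gain else 1) << 1) | (1 if use_diff2 else 0)
    return "0" + format(cmode, "02b") + format(gpnum, "06b")


def parse_gain_frame_header(bits):
    """Parse the fields written by pack_gain_frame_header()."""
    if bits[0] == "1":
        return {"hld_mode": 1}
    cmode = int(bits[1:3], 2)
    return {
        "hld_mode": 0,
        "cmode": cmode,
        "head_gain_encoded": (cmode >> 1) == 0,
        "diff_mode": "DIFF2" if cmode & 1 else "DIFF1",
        "gpnum": int(bits[3:9], 2),
    }


header_bits = pack_gain_frame_header(hld_mode=0, encode_head_gain=True,
                                     use_diff2=True, gpnum=2)
fields = parse_gain_frame_header(header_bits)
```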
- following gpnum or gval_abs_id0, gloc_id [k] and gval_diff_id [k] are described for as many gain change points as indicated by gpnum.
- gloc_id [k] and gval_diff_id [k] indicate the gain change point and the encoded gain of the gain change point.
- k in gloc_id [k] and gval_diff_id [k] is an index for specifying the gain change point, and indicates the gain change point.
- gloc_id [k] is described in 3 bits
- gval_diff_id [k] is described in any number of bits from 1 to 11 bits.
- vlclbf in FIG. 10 represents Variable Length Code Left Bit First, which means that encoding is performed starting from the left bit of the variable length code.
- DIFF1 mode: 0th-order prediction difference mode
- DIFF2 mode: first-order prediction difference mode
- the 0th-order prediction difference mode will be described with reference to FIG.
- the horizontal axis indicates time (sample), and the vertical axis indicates gain.
- the broken line C31 indicates the gain of the gain sequence to be processed, more specifically, the gain of the master gain sequence (first gain or second gain), or the difference value between the gains of the master gain sequence and the slave gain sequence.
- two gain change points G11 and G12 are detected from the time frame J to be processed, and PREV11 indicates the start position of the time frame J, that is, the end position of the time frame J-1.
- the position gloc [0] of the gain change point G11 is encoded with 3 bits as position information representing the time sample value from the beginning of the time frame J.
- the gain change point is encoded based on the table shown in FIG. 12.
- gloc_id indicates a value described as gloc_id [k] in the gain code string shown in FIG. 10, and gloc [gloc_id] indicates the position of the candidate point of the gain change point, that is, the number of samples from the sample at the beginning of the time frame or from the previous gain change point to the candidate point sample.
- each of the 0th, 16th, 32nd, 64th, 128th, 256th, 512th, and 1024th samples from the beginning of the time frame, which are arranged at unequal intervals in the time frame, is set as a candidate point for a gain change point.
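As an illustration of this position coding, the sketch below maps change-point positions to 3-bit gloc_id values using the eight candidate offsets listed above. It assumes that each offset is measured from the beginning of the time frame for the first change point and from the previous change point thereafter, as the text describes. The function names are hypothetical.

```python
# Candidate offsets in samples, indexed by gloc_id (values taken from the text above).
GLOC_TABLE = (0, 16, 32, 64, 128, 256, 512, 1024)


def positions_to_gloc_ids(positions):
    """Convert absolute change-point positions (in samples) to gloc_id values.

    Raises ValueError if an offset is not one of the candidate offsets.
    """
    ids = []
    prev = 0
    for pos in positions:
        ids.append(GLOC_TABLE.index(pos - prev))
        prev = pos
    return ids


def gloc_ids_to_positions(ids):
    """Inverse mapping: accumulate table offsets back into absolute positions."""
    positions = []
    prev = 0
    for gloc_id in ids:
        prev += GLOC_TABLE[gloc_id]
        positions.append(prev)
    return positions


def gloc_ids_to_bits(ids):
    """Write each gloc_id as a 3-bit field, MSB first."""
    return "".join(format(gloc_id, "03b") for gloc_id in ids)


ids = positions_to_gloc_ids([128, 640])  # offsets 128 and 512
```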
- the gain value gval [0] at the gain change point G11 is encoded as a difference from the gain value at the start position PREV11 of the time frame J. This difference is encoded with a variable length code of 1 to 11 bits as gval_diff_id [k] of the gain code string shown in FIG. 10.
- the difference between the gain value gval [0] at the gain change point G11 and the gain value at the head position PREV11 is encoded using the encoding table (codebook) shown in FIG. 13.
- when the gain value difference is +0.3 or more or 0 or less, the code “000” and an 8-bit fixed-length code indicating the gain value difference are described as gval_diff_id [k].
- the position and gain value of the next gain change point G12 are then each encoded as a difference from the previous gain change point G11.
- the position gloc [1] of the gain change point G12 is encoded as position information indicating the time sample value from the position gloc [0] of the previous gain change point G11, as in the case of the position of the gain change point G11.
- the gain value gval [1] at the gain change point G12 is encoded as a difference from the gain value gval [0] at the gain change point G11, using the encoding table shown in FIG. 13 as in the case of the gain value at the gain change point G11.
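The difference chain of the 0th-order prediction difference mode can be sketched as follows. The sketch computes only the differences that would then be passed to the variable length coder; the codebook itself is not reproduced here. Gain values are assumed to be expressed in dB, and the function names are hypothetical.

```python
def diff1_encode(head_gain, gvals):
    """Difference of each change-point gain from the preceding one.

    head_gain is the gain at the start position of the time frame;
    gvals are the gains at the gain change points, in order.
    """
    diffs = []
    prev = head_gain
    for gain in gvals:
        diffs.append(gain - prev)
        prev = gain
    return diffs


def diff1_decode(head_gain, diffs):
    """Rebuild the change-point gains by accumulating the differences."""
    gvals = []
    prev = head_gain
    for diff in diffs:
        prev = prev + diff
        gvals.append(prev)
    return gvals


diffs = diff1_encode(0.0, [-1.5, -0.25])
restored = diff1_decode(0.0, diffs)
```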
- the gloc table shown in FIG. 12 is not limited to this, and a table with improved time resolution by setting the minimum interval of gloc (candidate point of gain change point) to 1 may be used. In an application that can ensure a high bit rate, it is of course possible to take a difference for each sample of the gain waveform.
- the primary prediction difference mode (DIFF2 mode) will be described with reference to FIG. 14. In FIG. 14, the horizontal axis indicates time (sample), and the vertical axis indicates gain.
- the broken line C32 indicates the gain of the gain sequence being processed, more specifically, the gain of the master gain sequence (first gain or second gain), or the difference between the gains of the master gain sequence and the slave gain sequence.
- two gain change points G21 and G22 are detected from the time frame J to be processed, and PREV21 indicates the start position of the time frame J.
- the position gloc [0] of the gain change point G21 is encoded with 3 bits as position information representing the time sample value from the beginning of the time frame J. In this encoding, processing similar to that at the gain change point G11 described with reference to FIG. 11 is performed.
- the gain value gval [0] of the gain change point G21 is encoded as a difference from the primary predicted value of the gain value gval [0].
- the gain waveform of the time frame J-1 is extended from the start position PREV21 of the time frame J, and the point P11 at the position gloc [0] on the extension line is obtained. Then, the gain value at the point P11 is set as the primary predicted value of the gain value gval [0].
- the straight line that passes through the head position PREV21 with the slope of the end portion of the gain waveform of the time frame J-1 is taken as the straight line obtained by extending the gain waveform of the time frame J-1, and the primary predicted value of the gain value gval [0] is calculated using a linear function representing that straight line.
- the difference between the primary prediction value thus obtained and the actual gain value gval [0] is obtained, and the difference is encoded with a variable length code of, for example, 1 to 11 bits based on the encoding table shown in FIG. 13.
- the position and gain value of the next gain change point G22 are each encoded as a difference from the immediately preceding gain change point G21.
- the position gloc [1] of the gain change point G22 is encoded as position information representing the time sample value from the position gloc [0] of the previous gain change point G21, as in the case of the position of the gain change point G21.
- the gain value gval [1] of the gain change point G22 is encoded as a difference from the primary predicted value of the gain value gval [1].
- the slope for obtaining the primary prediction value is updated to the slope of the straight line connecting (passing through) the head position PREV21 and the previous gain change point G21, and the point P12 at the position gloc [1] on that straight line is obtained. Then, the gain value at the point P12 is set as the primary predicted value of the gain value gval [1].
- the primary predicted value of the gain value gval [1] is calculated using a linear function that passes through the previous gain change point G21 and represents a straight line having the updated slope. Further, the difference between the primary prediction value thus obtained and the actual gain value gval [1] is obtained, and the difference is encoded with a variable length code of, for example, 1 to 11 bits based on the encoding table shown in FIG. 13.
- the encoding table used for variable length encoding of the gain value at the gain change point is not limited to the encoding table shown in FIG. 13, and any table may be used.
- different encoding tables may be used for variable length encoding according to the number of downmix channels, the difference in DRC characteristics shown in FIG. 4 above, and the differential encoding mode such as the 0th-order prediction difference mode or the first-order prediction difference mode. By doing so, the gain encoding efficiency of each gain sequence can be further enhanced.
- in general, the former is called attack, and the latter is called release.
- in terms of human hearing, the attack should be fast, and unless the release is slow compared to the attack, the sound may be heard as unstable and wavering, which is undesirable.
- the horizontal axis indicates the time frame
- the vertical axis indicates the gain difference value (dB).
- the difference in the minus direction is small in frequency but the absolute value is large.
- the difference in the positive direction has a high frequency but a small absolute value.
- such a probability density distribution of time frame differences is generally the distribution shown in FIG. 16. In FIG. 16, the horizontal axis indicates the time frame difference, and the vertical axis indicates the appearance probability of the time frame difference.
- the appearance probability of a positive value from around 0 is very high, but the appearance probability becomes extremely small from a certain level (time frame difference).
- although the appearance probability is small in the minus direction, there is a certain appearance probability even when the value becomes small.
- such a probability density distribution changes depending on whether the encoding is performed in the 0th-order prediction difference mode or the first-order prediction difference mode, and on the contents of the gain encoding mode header, so by configuring a variable length code table accordingly, the gain information can be efficiently encoded.
- when the input time series signal is supplied for one time frame, the encoding device 51 performs an encoding process of encoding the input time series signal and outputting an output code string.
- the encoding process performed by the encoding device 51 will be described with reference to the flowchart of FIG.
- in step S11, the first sound pressure level calculation circuit 61 calculates the first sound pressure level of the input time series signal based on the supplied input time series signal and supplies the first sound pressure level to the first gain calculation circuit 62.
- the first gain calculation circuit 62 calculates a first gain based on the first sound pressure level supplied from the first sound pressure level calculation circuit 61 and supplies the first gain to the gain encoding circuit 66.
- the first gain calculation circuit 62 calculates the first gain according to the DRC characteristic of the mode such as DRC_MODE1 or DRC_MODE2 designated by the host controller.
- in step S13, the downmix circuit 63 performs downmix processing on the supplied input time-series signal using the downmix information supplied from the host controller, and supplies the resulting downmix signal to the second sound pressure level calculation circuit 64.
- in step S14, the second sound pressure level calculation circuit 64 calculates the second sound pressure level based on the downmix signal supplied from the downmix circuit 63 and supplies the second sound pressure level to the second gain calculation circuit 65.
- in step S15, the second gain calculation circuit 65 calculates the second gain of the downmix signal based on the second sound pressure level supplied from the second sound pressure level calculation circuit 64, and supplies the second gain to the gain encoding circuit 66.
- in step S16, the gain encoding circuit 66 performs gain encoding processing to encode the first gain supplied from the first gain calculation circuit 62 and the second gain supplied from the second gain calculation circuit 65. Then, the gain encoding circuit 66 supplies the gain encoding mode header and the gain code string obtained by the gain encoding process to the multiplexing circuit 68.
- in the gain encoding process, a difference between gain sequences, a difference between time frames, and a difference within a time frame are obtained for gain sequences such as the first gain and the second gain, and the gains are encoded. Also, the gain encoding mode header is generated only when necessary.
- in step S17, the signal encoding circuit 67 encodes the supplied input time-series signal according to a predetermined encoding method, and supplies the signal code string obtained as a result to the multiplexing circuit 68.
- in step S18, the multiplexing circuit 68 multiplexes the gain encoding mode header and the gain code string from the gain encoding circuit 66, the downmix information supplied from the host controller, and the signal code string from the signal encoding circuit 67, and outputs the output code string obtained as a result.
- when the output code string for one time frame is output as a bit stream in this way, the encoding process ends. Then, encoding processing for the next time frame is performed.
- the encoding device 51 calculates the first gain of the original input time-series signal before the downmix and the second gain of the downmix signal after the downmix, and encodes those gains by appropriately obtaining the differences between them. As a result, it is possible to obtain sound with an appropriate volume with a smaller code amount.
- since the DRC characteristics can be freely set on the encoding device 51 side, more appropriate sound volume can be obtained on the decoding side. Moreover, by obtaining the gain difference and encoding it efficiently, more information can be transmitted with a smaller code amount, and the calculation load on the decoding device side can be reduced.
- the gain encoding circuit 66 determines the gain encoding mode based on an instruction from the host control device. That is, for each gain sequence, it is determined whether the gain sequence is a master gain sequence or a slave gain sequence and, when it is a slave gain sequence, with respect to which gain sequence the difference is calculated.
- the gain encoding circuit 66 actually calculates the difference between the gains (first gain or second gain) of each gain sequence, and obtains the gain correlation. Based on the difference between the gains, for example, the gain encoding circuit 66 sets a gain sequence having a high gain correlation (small gain difference) with any other gain sequence as a master gain sequence, and sets the other gain sequences as slave gain sequences.
- gain sequences may be master gain sequences.
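One way to picture the correlation-based choice of a master gain sequence is the sketch below. It picks the sequence whose summed absolute gain difference to all other sequences is smallest and then expresses the remaining sequences as differences from it. The sum-of-absolute-differences criterion, the single-master restriction, the sign convention (slave minus master), and the function names are all assumptions made for illustration; the text does not specify them.

```python
def pick_master(gain_seqs):
    """Return the name of the sequence closest, in total, to all the others.

    gain_seqs maps a sequence name to its per-sample gains for one time frame.
    """
    def total_diff(name):
        return sum(
            abs(a - b)
            for other, seq in gain_seqs.items() if other != name
            for a, b in zip(gain_seqs[name], seq)
        )
    return min(gain_seqs, key=total_diff)


def slave_differences(gain_seqs, master):
    """Per-sample difference of every other sequence from the master."""
    return {
        name: [s - m for s, m in zip(seq, gain_seqs[master])]
        for name, seq in gain_seqs.items() if name != master
    }


gain_seqs = {
    "11.1ch": [0.0, -1.0, -2.0],
    "7.1ch": [0.0, -1.5, -2.5],
    "2ch": [-1.0, -3.0, -4.0],
}
master = pick_master(gain_seqs)
diffs = slave_differences(gain_seqs, master)
```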
- in step S42, the gain encoding circuit 66 determines whether or not the gain encoding mode of the current time frame to be processed is the same as the gain encoding mode of the time frame immediately before that time frame.
- in step S43, the gain encoding circuit 66 generates a gain encoding mode header and adds it to the auxiliary information. For example, the gain encoding circuit 66 generates the gain encoding mode header shown in FIG. 8.
- when the gain encoding mode header is generated in step S43, the process proceeds to step S44.
- if it is determined in step S42 that the gain encoding mode is the same, the gain encoding mode header is not added to the output code string, so the process of step S43 is not performed, and the process proceeds to step S44.
- in step S44, the gain encoding circuit 66 obtains the difference of each gain sequence according to the gain encoding mode.
- the 7.1ch gain sequence as the second gain is a slave gain sequence
- the master gain sequence for the slave gain sequence is the 11.1ch gain sequence as the first gain.
- the gain encoding circuit 66 obtains the difference between the 7.1ch gain sequence and the 11.1ch gain sequence. At this time, the difference calculation is not performed for the 11.1ch gain sequence that is the master gain sequence, and is encoded as it is in the subsequent processing.
- by obtaining the gain sequence difference in this way, the gain sequence is encoded as a difference between gain sequences.
- in step S45, the gain encoding circuit 66 selects one gain sequence as the gain sequence to be processed, and determines whether the gain is constant within the gain sequence and is the same as the gain of the immediately preceding time frame.
- the 11.1ch gain sequence as the master gain sequence is selected as the gain sequence to be processed.
- the gain encoding circuit 66 determines whether the gain is constant within the gain sequence.
- the gain encoding circuit 66 also calculates the difference between the gain of each sample in the 11.1ch gain sequence in the time frame J and the gain of each sample in the 11.1ch gain sequence in the time frame J-1, which is the immediately preceding time frame, and determines whether the gain is substantially the same as the gain of the previous time frame.
- when the gain sequence to be processed is a slave gain sequence, it is determined whether the gain difference obtained in step S44 is constant in the time frame and is the same as the gain difference in the immediately preceding time frame.
- when it is determined in step S45 that the gain is constant in the gain sequence and is the same as the gain of the immediately preceding time frame, in step S46, the gain encoding circuit 66 sets the value of hld_mode to 1, and the process proceeds to step S51. That is, 1 is described as hld_mode of the gain code string.
- when it is determined in step S45 that the gain is not constant in the gain sequence or is not the same as the gain of the immediately preceding time frame, in step S47, the gain encoding circuit 66 sets the value of hld_mode to 0. That is, 0 is described as hld_mode of the gain code string.
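The decision made in steps S45 to S47 can be sketched as a small predicate. The tolerance used to judge "constant" and "substantially the same" is a hypothetical parameter; the text does not give a threshold. For a slave gain sequence, the same check would be applied to the gain difference from the master gain sequence.

```python
def decide_hld_mode(current, previous, tol=1e-6):
    """Return 1 if the frame can simply hold the previous frame's gain, else 0.

    current and previous are the per-sample gains (or gain differences)
    of the current and immediately preceding time frames.
    """
    constant = all(abs(g - current[0]) <= tol for g in current)
    same_as_previous = all(abs(c - p) <= tol for c, p in zip(current, previous))
    return 1 if constant and same_as_previous else 0


held = decide_hld_mode([-3.0] * 4, [-3.0] * 4)
not_held = decide_hld_mode([-3.0, -3.0, -2.0, -3.0], [-3.0] * 4)
```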
- in step S48, the gain encoding circuit 66 extracts the gain change points of the gain sequence to be processed.
- the gain encoding circuit 66 specifies whether or not the slope of the gain time waveform has changed before and after the sample position for a predetermined sample position in the time frame. Thus, it is specified whether the sample position is the gain change point.
- the gain change point is extracted from the time waveform of the gain difference from the master gain sequence obtained for the gain sequence.
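The slope test used to extract gain change points can be sketched as below. Each predetermined candidate position is kept only if the one-sample slope just before it differs from the one-sample slope just after it. Measuring the slope over a single sample on each side, the tolerance, and the function name are assumptions of this sketch; the candidate positions are passed in explicitly rather than derived from the gloc table.

```python
def extract_gain_change_points(gains, candidates, tol=1e-9):
    """Return the candidate positions at which the gain slope changes.

    gains is the per-sample gain waveform (or gain difference waveform for a
    slave gain sequence); candidates are sample indices to examine.
    """
    points = []
    for pos in candidates:
        if 0 < pos < len(gains) - 1:
            slope_before = gains[pos] - gains[pos - 1]
            slope_after = gains[pos + 1] - gains[pos]
            if abs(slope_after - slope_before) > tol:
                points.append(pos)
    return points


# Flat until sample 16, falling 0.5 per sample until sample 32, then flat again.
waveform = [0.0] * 17 + [-0.5 * i for i in range(1, 17)] + [-8.0] * 16
change_points = extract_gain_change_points(waveform, [16, 32, 40])
```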
- when the gain encoding circuit 66 extracts the gain change points, it describes the number of the extracted gain change points as gpnum in the gain code string shown in FIG. 10.
- in step S49, the gain encoding circuit 66 determines cmode.
- the gain encoding circuit 66 actually performs encoding in the 0th-order prediction difference mode and encoding in the first-order prediction difference mode for the gain sequence to be processed, and selects the differential encoding mode that yields the smaller code amount as a result of the encoding.
- the gain encoding circuit 66 determines whether or not to encode the gain at the beginning of the time frame, for example, in accordance with an instruction from the host control device. This determines the cmode.
- when the cmode is determined, the gain encoding circuit 66 describes a value indicating the determined cmode in the gain code string shown in FIG. 10. At this time, if the upper 1 bit of cmode is 0, the gain encoding circuit 66 calculates the above-described equation (5) for the gain sequence to be processed, and describes the value of “gval_abs_id0” obtained as a result, together with the sign bit, in gval_abs_id0 in the gain code string shown in FIG. 10.
- in step S50, the gain encoding circuit 66 encodes the gain at each gain change point extracted in step S48, using the differential encoding mode selected in step S49. Then, the gain encoding circuit 66 describes the gain encoding result of each gain change point as gloc_id [k] and gval_diff_id [k] in the gain code string shown in FIG. 10.
- the entropy encoding circuit provided in the gain encoding circuit 66 encodes the gain values while appropriately switching the entropy codebook table, such as the encoding table shown in FIG. 13, according to the differential encoding mode or the like.
- in step S51, the gain encoding circuit 66 determines whether or not all gain sequences have been encoded. For example, when all gain sequences have been selected as processing targets and processed, it is determined that all gain sequences have been encoded.
- if it is determined in step S51 that not all gain sequences have been encoded, the process returns to step S45, and the above-described processes are repeated. In other words, a gain sequence that has not yet been processed is encoded as the next gain sequence to be processed.
- when it is determined in step S51 that all the gain sequences have been encoded, the gain code string has been obtained, so the gain encoding circuit 66 supplies the generated gain encoding mode header and the gain code string to the multiplexing circuit 68. Note that when the gain encoding mode header is not generated, only the gain code string is output.
- the encoding device 51 obtains the difference between the gain sequences, the difference between the time frames of the gain sequence, and the difference within the time frame of the gain sequence, encodes the gain, and generates a gain code string.
- by encoding the gain through obtaining the difference between the gain sequences, the difference between the time frames of the gain sequence, and the difference within the time frame of the gain sequence, the first gain and the second gain can be encoded more efficiently, and the code amount obtained as a result of the encoding can be further reduced.
- FIG. 19 is a diagram illustrating a functional configuration example of an embodiment of a decoding device to which the present technology is applied.
- the decoding device 91 shown in FIG. 19 includes a demultiplexing circuit 101, a signal decoding circuit 102, a gain decoding circuit 103, and a gain application circuit 104.
- the demultiplexing circuit 101 demultiplexes the supplied input code string, that is, the output code string received from the encoding device 51.
- the demultiplexing circuit 101 supplies the gain encoding mode header and the gain code string obtained by the demultiplexing to the gain decoding circuit 103, and supplies the signal code string and the downmix information to the signal decoding circuit 102. If the gain encoding mode header is not included in the input code string, the gain encoding mode header is not supplied to the gain decoding circuit 103.
- the signal decoding circuit 102 decodes and downmixes the signal code string supplied from the demultiplexing circuit 101, based on the downmix information supplied from the demultiplexing circuit 101 and the downmix control information supplied from the higher-level control device, and supplies the obtained time series signal to the gain application circuit 104.
- the time-series signal is, for example, an 11.1ch or 7.1ch audio signal, and the audio signal of each channel constituting the time-series signal is a PCM signal.
- the gain decoding circuit 103 decodes the gain encoding mode header and the gain code string supplied from the demultiplexing circuit 101 and, of the gain information obtained as a result, supplies the gain information specified by the downmix control information and the DRC control information supplied from the higher-level control device to the gain application circuit 104.
- the gain information output from the gain decoding circuit 103 is information corresponding to the first gain and the second gain described above.
- the gain application circuit 104 adjusts the gain of the time series signal supplied from the signal decoding circuit 102 based on the gain information supplied from the gain decoding circuit 103, and outputs the obtained output time series signal.
- when the input code string is supplied for one time frame, the decoding device 91 performs a decoding process of decoding the input code string and outputting an output time series signal.
- the decoding process performed by the decoding device 91 will be described with reference to the flowchart of FIG.
- in step S81, the demultiplexing circuit 101 demultiplexes the input code string, supplies the gain encoding mode header and the gain code string obtained as a result to the gain decoding circuit 103, and supplies the signal code string and downmix information to the signal decoding circuit 102.
- in step S82, the signal decoding circuit 102 decodes the signal code string supplied from the demultiplexing circuit 101.
- the signal decoding circuit 102 performs decoding and inverse quantization on the signal code string to obtain the MDCT coefficients of each channel. Then, based on the downmix control information supplied from the higher-level control device, the signal decoding circuit 102 multiplies the MDCT coefficients of each channel by the gain coefficients obtained from the downmix information supplied from the demultiplexing circuit 101 and adds the results, thereby calculating the gain-applied MDCT coefficients of each channel after downmixing.
- the signal decoding circuit 102 performs inverse MDCT processing on the gain-applied MDCT coefficients of each channel, and performs windowing and overlap addition processing on the obtained inverse MDCT signal to generate a time-series signal composed of the signals of each channel after downmixing. Note that the downmix process may be performed in the MDCT domain or in the time domain.
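The multiply-and-add downmix in the MDCT domain can be sketched as a weighted sum per output channel. The channel names and gain coefficients below are made-up values for illustration only; in the described system the coefficients are obtained from the downmix information, and the function name is hypothetical.

```python
def downmix_mdct(mdct, mix):
    """Weighted sum of per-channel MDCT coefficients.

    mdct maps an input channel name to its list of MDCT coefficients.
    mix maps an output channel name to {input channel: gain coefficient}.
    """
    n = len(next(iter(mdct.values())))
    out = {}
    for out_ch, weights in mix.items():
        acc = [0.0] * n
        for in_ch, weight in weights.items():
            for i, coeff in enumerate(mdct[in_ch]):
                acc[i] += weight * coeff
        out[out_ch] = acc
    return out


# Illustrative 3-channel to 2-channel downmix with hypothetical coefficients.
mdct = {"L": [1.0, 2.0], "R": [3.0, 4.0], "C": [2.0, 2.0]}
mix = {"L'": {"L": 1.0, "C": 0.5}, "R'": {"R": 1.0, "C": 0.5}}
stereo = downmix_mdct(mdct, mix)
```

Because the MDCT is linear, the same weighted sum could equally be applied to the time-domain signals after the inverse transform.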
- the signal decoding circuit 102 supplies the time series signal thus obtained to the gain application circuit 104.
- In step S83, the gain decoding circuit 103 performs gain decoding processing, that is, it decodes the gain encoding mode header and the gain code string supplied from the demultiplexing circuit 101 and supplies the resulting gain information to the gain application circuit 104. Details of the gain decoding process will be described later.
- In step S84, the gain application circuit 104 adjusts the gain of the time series signal supplied from the signal decoding circuit 102 based on the gain information supplied from the gain decoding circuit 103, and outputs the resulting output time series signal.
- the decoding device 91 decodes the gain encoding mode header and the gain code string, applies the obtained gain information to the time series signal, and adjusts the gain in the time domain.
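- As an illustration of the time-domain gain adjustment in step S84, the following minimal sketch applies a per-sample gain waveform to one channel of a time series signal. It assumes that the gain information is expressed in decibels and is converted to a linear factor as 10^(g/20). The function name `apply_gain_db` and this conversion are assumptions made for illustration and are not stated in this description.

```python
def apply_gain_db(signal, gain_db):
    """Scale each sample by its gain; gain_db[n] is assumed to be in decibels."""
    if len(signal) != len(gain_db):
        raise ValueError("signal and gain waveform must have the same length")
    # Convert each decibel gain to a linear amplitude factor and apply it.
    return [x * 10.0 ** (g / 20.0) for x, g in zip(signal, gain_db)]


# Hypothetical three-sample frame: unchanged, boosted by 20 dB, cut by 20 dB.
adjusted = apply_gain_db([1.0, 2.0, -1.0], [0.0, 20.0, -20.0])
```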
- the gain code string is obtained by encoding a gain by obtaining a difference between gain sequences, a difference between time frames of the gain sequence, and a difference within the time frame of the gain sequence. Therefore, the decoding apparatus 91 can obtain more appropriate gain information with a gain code string having a smaller code amount. That is, it is possible to obtain sound with an appropriate volume with a smaller code amount.
- In step S121, the gain decoding circuit 103 determines whether or not there is a gain encoding mode header in the input code string. For example, when a gain encoding mode header is supplied from the demultiplexing circuit 101, it is determined that there is a gain encoding mode header.
- If it is determined in step S121 that there is a gain encoding mode header, the gain decoding circuit 103 decodes the gain encoding mode header supplied from the demultiplexing circuit 101 in step S122. Thereby, information on each gain sequence, such as the gain encoding mode, is obtained.
- When the gain encoding mode header has been decoded, the process proceeds to step S123.
- On the other hand, if it is determined in step S121 that there is no gain encoding mode header, the process proceeds to step S123.
- In step S123, the gain decoding circuit 103 decodes all the gain sequences. That is, the gain decoding circuit 103 decodes the gain code string shown in FIG. 10 and extracts the information necessary for obtaining the gain waveform of each gain sequence, that is, the first gain or the second gain.
- In step S124, the gain decoding circuit 103 sets one gain sequence as the processing target and determines whether or not the value of hld_mode of that gain sequence is 0.
- If it is determined in step S124 that the value of hld_mode is not 0, that is, it is 1, the process proceeds to step S125.
- In step S125, the gain decoding circuit 103 uses the gain waveform of the immediately preceding time frame as it is as the gain waveform of the current time frame.
- In step S126, the gain decoding circuit 103 determines whether or not cmode is greater than 1, that is, whether or not the upper 1 bit of cmode is 1.
- If it is determined in step S126 that cmode is greater than 1, that is, the upper 1 bit of cmode is 1, the gain value at the end of the previous time frame is set as the gain value at the beginning of the current time frame, and the process proceeds to step S128.
- the gain decoding circuit 103 holds the gain value at the end position of the time frame as prev, and when decoding the gain, the value of this prev is appropriately used as the gain value at the start position of the current time frame. Thus, the gain of the gain sequence is obtained.
- On the other hand, if it is determined in step S126 that cmode is 1 or less, that is, the upper 1 bit of cmode is 0, the process of step S127 is performed.
- In step S127, the gain decoding circuit 103 substitutes gval_abs_id0 obtained by decoding the gain code string into the above equation (5), calculates the gain value at the beginning of the current time frame, and updates the value of prev. That is, the gain value obtained by the calculation of Expression (5) is set as the new prev value.
- When the gain sequence to be processed is a slave gain sequence, the value of prev is the difference value from the master gain sequence at the head position of the current time frame.
- If the value of prev has been updated in step S127, or if cmode is determined to be greater than 1 in step S126, the gain decoding circuit 103 generates a gain waveform of the gain sequence to be processed in step S128.
- The gain decoding circuit 103 refers to the cmode obtained by decoding the gain code string and identifies whether the mode is the 0th-order prediction difference mode or the first-order prediction difference mode. Then, using the value of prev and the gloc_id[k] and gval_diff_id[k] of each gain change point obtained by decoding the gain code string, the gain decoding circuit 103 obtains the gain at each sample position in the current time frame in accordance with the identified differential encoding mode, and uses the result as the gain waveform.
- For example, the gain decoding circuit 103 adds the gain value (difference value) indicated by gval_diff_id[0] to the value of prev, and sets the result as the gain value at the sample position specified by gloc_id[0].
- Between the head of the time frame and the sample position specified by gloc_id[0], the gain value is assumed to change linearly from the prev value to the gain value at the sample position specified by gloc_id[0], and the gain value at each sample position is obtained accordingly.
- Thereafter, in the same manner, the gain value at the gain change point of interest is obtained from the gain value at the previous gain change point and the gloc_id[k] and gval_diff_id[k] of the gain change point of interest, and a gain waveform consisting of the gain value at each sample position is obtained.
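- The waveform reconstruction described above can be sketched as follows, reading the steps as the 0th-order prediction difference mode. The sketch assumes that the indices gloc_id[k] and gval_diff_id[k] have already been converted to sample positions `gloc[k]` (strictly increasing and greater than 0) and to decibel differences `gval_diff[k]`, and that the linear change is taken in the decibel domain. These conversions, the function name, and the choice of domain are assumptions for illustration. Samples after the last gain change point are outside the scope of this sketch.

```python
def decode_zero_order(prev, gloc, gval_diff):
    """Return gains for sample positions 0 .. gloc[-1] (inclusive).

    prev      : gain at the head of the time frame
    gloc      : sample positions of the gain change points (assumed dequantized)
    gval_diff : difference of each change point from the preceding gain value
    """
    waveform = [prev]
    start_pos, start_gain = 0, prev
    for pos, diff in zip(gloc, gval_diff):
        # 0th-order prediction: the previous gain value plus the coded difference.
        end_gain = start_gain + diff
        span = pos - start_pos
        # Linear change from the previous point to the current change point.
        for n in range(1, span + 1):
            waveform.append(start_gain + (end_gain - start_gain) * n / span)
        start_pos, start_gain = pos, end_gain
    return waveform


# Hypothetical frame: prev = 0 dB, change points at samples 4 and 8.
gains = decode_zero_order(0.0, [4, 8], [-2.0, 1.0])
```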
- When the gain sequence to be processed is a slave gain sequence, the gain value (gain waveform) obtained by the above processing is a difference value from the gain waveform of the master gain sequence.
- The gain decoding circuit 103 therefore refers to the MASTER_FLAG and DIFF_SEQ_ID shown in FIG. 9 in the gain sequence mode of the gain sequence to be processed, determines whether the gain sequence to be processed is a slave gain sequence, and identifies the corresponding master gain sequence.
- If the gain sequence to be processed is a master gain sequence, the gain decoding circuit 103 uses the gain waveform obtained by the above processing as the final gain information of the gain sequence to be processed.
- If the gain sequence to be processed is a slave gain sequence, the gain decoding circuit 103 adds the gain information (gain waveform) of the corresponding master gain sequence to the gain waveform obtained by the above processing to obtain the final gain information of the gain sequence to be processed.
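- The master and slave handling above amounts to a per-sample addition, which the following minimal sketch illustrates. It assumes that both waveforms are lists of gain values in the same domain (for example decibels) with one value per sample. The function name `final_gain` and the use of `None` to mark a master gain sequence are hypothetical conventions chosen for this example.

```python
def final_gain(decoded_waveform, master_waveform=None):
    """Return the final gain information of one gain sequence.

    decoded_waveform : waveform obtained by difference decoding
    master_waveform  : final waveform of the master gain sequence, or None
                       when the sequence being processed is itself a master
    """
    if master_waveform is None:
        # Master gain sequence: the decoded waveform is already final.
        return list(decoded_waveform)
    if len(decoded_waveform) != len(master_waveform):
        raise ValueError("waveforms must cover the same sample positions")
    # Slave gain sequence: the decoded waveform is a difference from the master.
    return [m + d for m, d in zip(master_waveform, decoded_waveform)]


master = final_gain([0.0, -1.0, -2.0])
slave = final_gain([0.5, 0.5, 0.0], master)
```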
- When the gain waveform (gain information) of the gain sequence to be processed has been obtained as described above, the process proceeds to step S129.
- When a gain waveform has been generated in step S128 or step S125, the process of step S129 is performed.
- In step S129, the gain decoding circuit 103 holds the gain value at the end position of the current time frame of the gain waveform of the gain sequence to be processed as the prev value for the next time frame.
- When the gain sequence to be processed is a slave gain sequence, the value at the end position of the time frame in the gain waveform obtained by prediction in the 0th-order prediction difference mode or the first-order prediction difference mode, that is, in the time waveform of the difference from the gain waveform of the master gain sequence, is set as the value of prev.
- In step S130, the gain decoding circuit 103 determines whether or not the gain waveforms of all gain sequences have been obtained. For example, when all gain sequences indicated in the gain encoding mode header have been processed as the gain sequence to be processed and their gain waveforms (gain information) have been obtained, it is determined that the gain waveforms of all gain sequences have been obtained.
- If it is determined in step S130 that the gain waveforms of all gain sequences have not yet been obtained, the process returns to step S124, and the above-described processes are repeated. That is, the next gain sequence is processed, and its gain waveform (gain information) is obtained.
- On the other hand, if it is determined in step S130 that the gain waveforms of all gain sequences have been obtained, the gain decoding process ends, and the process then proceeds to step S84 in FIG.
- The gain decoding circuit 103 supplies to the gain application circuit 104 the gain information of the gain sequence that, among the gain sequences, has the number of channels after downmixing indicated by the downmix control information and whose gain was calculated with the DRC characteristic indicated by the DRC control information. That is, with reference to DMIX_CH_CFG_ID and DRC_MODE_ID of each gain sequence mode shown in FIG. 9, the gain information of the gain sequence specified by the downmix control information and the DRC control information is output.
- the decoding device 91 decodes the gain encoding mode header and the gain code string, and calculates gain information of each gain sequence.
- The master gain sequence may change every time frame, and the decoding device 91 uses the value of prev to decode the gain sequences. For this reason, the decoding device 91 needs to calculate, in every time frame, gain waveforms other than that of the downmix pattern actually used by the decoding device 91.
- The calculation load on the decoding device 91 side is not so large. However, where a further reduction in calculation load is required, as in a portable terminal, the amount of calculation can be reduced at the expense of some reproducibility of the gain waveform.
- In many cases, the 0th-order prediction difference mode is used, the number of gain change points gpnum in a time frame is as small as 2 or less, and gval_diff_id[k], which is the gain difference value at a gain change point, has a small value.
- The difference value between the gain value gval[0] at the gain change point G11 and the gain value at the head position PREV11 is gval_diff[0], and the difference value between the gain value gval[1] at the gain change point G12 and the gain value gval[0] at the gain change point G11 is gval_diff[1].
- Accordingly, the gain value gval[1] of the gain change point G12 is obtained by adding, in decibels, the gain value of the head position PREV11, which is the value of prev, and the difference value gval_diff[0], and then adding the difference value gval_diff[1] to the result.
- The sum of the gain value of the head position PREV11, the difference value gval_diff[0], and the difference value gval_diff[1] obtained in this way is also referred to as the gain addition value.
- The prev value of the next time frame J+1 is the gain value of the Nth sample obtained when the gain is linearly interpolated, as a linear value, between the position gloc[0] of the gain change point G11 and the position gloc[1] of the gain change point G12, and the resulting straight line is extended to the position of the Nth sample of the time frame J, which corresponds to the head of the time frame J+1.
- If the slope of the straight line connecting the gain change point G11 and the gain change point G12 is small, there is no particular problem even if the gain addition value obtained by adding up to the difference value gval_diff[1] is used as the prev value of the time frame J+1.
- The slope of the straight line connecting the gain change point G11 and the gain change point G12 can be obtained easily by utilizing the fact that the position gloc[k] of each gain change point is a power of 2. That is, in the example of FIG. 11, instead of dividing by the number of samples at the position gloc[1], the slope of the straight line can be obtained by shifting the sum of the difference values to the right by the number of bits corresponding to that number of samples.
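- The shift-based slope calculation can be illustrated with the following minimal sketch. It assumes that the sum of the difference values is held as an integer in some fixed-point scaling and that the number of samples is an exact power of 2. The function name and the fixed-point representation are assumptions made for this example. Note that in Python a right shift of a negative integer rounds toward negative infinity, which may differ from the rounding of a division in other implementations.

```python
def slope_by_shift(diff_sum_fixed, num_samples):
    """Approximate diff_sum_fixed / num_samples with a right shift.

    diff_sum_fixed : sum of the gain difference values as a fixed-point integer
    num_samples    : number of samples spanned; must be a power of 2
    """
    if num_samples <= 0 or num_samples & (num_samples - 1):
        raise ValueError("num_samples must be a power of 2")
    # For a power of 2, log2(num_samples) equals bit_length() - 1.
    shift = num_samples.bit_length() - 1
    return diff_sum_fixed >> shift


# Hypothetical values: a fixed-point sum of 1024 spread over 256 samples.
slope = slope_by_shift(1024, 256)
```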
- For example, when the slope is smaller than a threshold value, the gain addition value is set as the prev value of the next time frame J+1, and when the slope is equal to or larger than the threshold value, the gain waveform may be obtained by the method described in the first embodiment and the gain value at the end of the time frame may be set as the prev value.
- the gain waveform is directly obtained by the method described in the first embodiment, and the value at the end of the time frame may be set as the prev value.
- the calculation load of the decoding device 91 can be reduced.
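- The threshold-based shortcut for updating prev can be sketched as follows. The sketch assumes that gain values and differences are in decibels, that the comparison uses the magnitude of the slope, and that the full waveform decoder is passed in as a callable so that it runs only when needed. The function name, the parameter names, and the use of the magnitude are assumptions for illustration.

```python
def next_prev(prev, gval_diff, slope, threshold, decode_full_waveform):
    """Choose the prev value for the next time frame.

    prev                 : gain value at the head of the current time frame
    gval_diff            : difference values of the gain change points
    slope                : slope of the line through the last two change points
    threshold            : slope magnitude below which the shortcut is used
    decode_full_waveform : callable returning the complete gain waveform
    """
    if abs(slope) < threshold:
        # Shortcut: use the gain addition value instead of the full waveform.
        return prev + sum(gval_diff)
    # Otherwise decode the waveform and take the value at the end of the frame.
    return decode_full_waveform()[-1]


# Hypothetical frame with a gentle slope, so the shortcut is taken.
value = next_prev(0.0, [-1.0, 0.5], 0.01, 0.1, lambda: [0.0, -0.4])
```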
- the encoding device 51 actually performs the downmix and calculates the sound pressure level of the obtained downmix signal as the second sound pressure level.
- the sound pressure level after downmixing may be obtained directly from the sound pressure level of the channel. In this case, the sound pressure level fluctuates somewhat depending on the correlation between the channels of the input time series signal, but the amount of calculation can be reduced.
- the encoding device is configured as shown in FIG. 22, for example.
- parts corresponding to those in FIG. 3 are denoted by the same reference numerals, and description thereof will be omitted as appropriate.
- The encoding device 131 includes a first sound pressure level calculation circuit 61, a first gain calculation circuit 62, a second sound pressure level estimation circuit 141, a second gain calculation circuit 65, a gain encoding circuit 66, a signal encoding circuit, and a multiplexing circuit 68.
- the first sound pressure level calculation circuit 61 calculates the sound pressure level of each channel constituting the input time series signal based on the supplied input time series signal and supplies it to the second sound pressure level estimation circuit 141.
- the representative value of the sound pressure level of each channel is supplied to the first gain calculation circuit 62 as the first sound pressure level.
- The second sound pressure level estimation circuit 141 calculates a second sound pressure level by estimation based on the sound pressure level of each channel supplied from the first sound pressure level calculation circuit 61, and supplies it to the second gain calculation circuit 65.
- Since the processes of step S161 and step S162 are the same as the processes of step S11 and step S12 of FIG. 17, their description is omitted.
- the first sound pressure level calculation circuit 61 supplies the second sound pressure level estimation circuit 141 with the sound pressure level of each channel constituting the input time series signal obtained from the input time series signal.
- The second sound pressure level estimation circuit 141 calculates the second sound pressure level based on the sound pressure level of each channel supplied from the first sound pressure level calculation circuit 61, and supplies it to the second gain calculation circuit 65.
- the second sound pressure level estimation circuit 141 calculates one second sound pressure level by weighted addition (linear combination) of sound pressure levels of each channel using a coefficient prepared in advance.
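- The weighted addition described above can be sketched as follows. The coefficient values, the function name, and the level values in the usage line are hypothetical, since this description states only that coefficients prepared in advance are used.

```python
def estimate_second_level(channel_levels, coefficients):
    """Estimate one downmix sound pressure level as a linear combination.

    channel_levels : sound pressure level of each input channel
    coefficients   : one weight per channel, prepared in advance (assumed values)
    """
    if len(channel_levels) != len(coefficients):
        raise ValueError("one coefficient is required per channel")
    # Weighted addition of the per-channel sound pressure levels.
    return sum(w * level for w, level in zip(coefficients, channel_levels))


# Hypothetical three-channel input with illustrative weights.
level = estimate_second_level([1.0, 2.0, 3.0], [0.5, 0.25, 0.25])
```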
- Thereafter, the processes of step S164 to step S167 are performed, and the encoding process ends.
- These processes are the same as the processes of step S15 to step S18 in FIG., so their description is omitted.
- The encoding device 131 calculates the second sound pressure level based on the sound pressure level of each channel of the input time series signal, and, as appropriate, obtains and encodes the difference between the second gain obtained from the second sound pressure level and the first gain. As a result, it is possible to obtain sound with an appropriate volume with a smaller code amount and to perform encoding with a smaller amount of calculation.
- The encoding device 171 shown in FIG. 24 includes a window length selection / windowing circuit 181, an MDCT circuit 182, a first sound pressure level calculation circuit 183, a first gain calculation circuit 184, a downmix circuit 185, a second sound pressure level calculation circuit 186, a second gain calculation circuit 187, a gain encoding circuit 189, an adaptive bit allocation circuit 190, a quantization / encoding circuit 191, and a multiplexing circuit 192.
- The window length selection / windowing circuit 181 selects a window length, performs a windowing process on the supplied input time series signal with the selected window length, and supplies the resulting time frame signal to the MDCT circuit 182.
- The MDCT circuit 182 performs MDCT processing on the time frame signal supplied from the window length selection / windowing circuit 181, and supplies the resulting MDCT coefficients to the first sound pressure level calculation circuit 183, the downmix circuit 185, and the adaptive bit allocation circuit 190.
- the first sound pressure level calculation circuit 183 calculates the first sound pressure level of the input time series signal based on the MDCT coefficient supplied from the MDCT circuit 182 and supplies it to the first gain calculation circuit 184.
- the first gain calculation circuit 184 calculates a first gain based on the first sound pressure level supplied from the first sound pressure level calculation circuit 183 and supplies the first gain to the gain encoding circuit 189.
- The downmix circuit 185 calculates the MDCT coefficients of each channel after downmixing based on the downmix information supplied from the host controller and the MDCT coefficients of each channel of the input time series signal supplied from the MDCT circuit 182, and supplies them to the second sound pressure level calculation circuit 186.
- the second sound pressure level calculation circuit 186 calculates the second sound pressure level based on the MDCT coefficient supplied from the downmix circuit 185, and supplies the second sound pressure level to the second gain calculation circuit 187.
- the second gain calculation circuit 187 calculates a second gain based on the second sound pressure level supplied from the second sound pressure level calculation circuit 186 and supplies the second gain to the gain encoding circuit 189.
- The gain encoding circuit 189 encodes the first gain supplied from the first gain calculation circuit 184 and the second gain supplied from the second gain calculation circuit 187, and supplies the resulting gain code string to the multiplexing circuit 192.
- Based on the MDCT coefficients supplied from the MDCT circuit 182, the adaptive bit allocation circuit 190 generates bit allocation information indicating a target code amount for encoding the MDCT coefficients, and supplies the MDCT coefficients and the bit allocation information to the quantization / encoding circuit 191.
- The quantization / encoding circuit 191 quantizes and encodes the MDCT coefficients from the adaptive bit allocation circuit 190 based on the bit allocation information supplied from the adaptive bit allocation circuit 190, and supplies the resulting signal code string to the multiplexing circuit 192.
- The multiplexing circuit 192 multiplexes the gain code string supplied from the gain encoding circuit 189, the downmix information supplied from the higher-level control device, and the signal code string supplied from the quantization / encoding circuit 191, and outputs the resulting output code string.
- In step S191, the window length selection / windowing circuit 181 selects a window length, performs a windowing process on the supplied input time series signal with the selected window length, and supplies the resulting time frame signal to the MDCT circuit 182. Thereby, the signal of each channel constituting the input time series signal is divided into time frame signals.
- In step S192, the MDCT circuit 182 performs MDCT processing on the time frame signal supplied from the window length selection / windowing circuit 181, and supplies the resulting MDCT coefficients to the first sound pressure level calculation circuit 183, the downmix circuit 185, and the adaptive bit allocation circuit 190.
- In step S193, the first sound pressure level calculation circuit 183 calculates the first sound pressure level of the input time series signal based on the MDCT coefficients supplied from the MDCT circuit 182, and supplies the first sound pressure level to the first gain calculation circuit 184.
- The first sound pressure level calculated by the first sound pressure level calculation circuit 183 is the same as that calculated by the first sound pressure level calculation circuit 61 of FIG. 3, but in step S193 the sound pressure level of the input time series signal is calculated in the MDCT domain.
- In step S194, the first gain calculation circuit 184 calculates a first gain based on the first sound pressure level supplied from the first sound pressure level calculation circuit 183, and supplies the first gain to the gain encoding circuit 189.
- the first gain is calculated according to the DRC characteristic shown in FIG.
- In step S195, the downmix circuit 185 performs downmixing based on the downmix information supplied from the host controller and the MDCT coefficients of each channel of the input time series signal supplied from the MDCT circuit 182, calculates the MDCT coefficients of each channel after downmixing, and supplies them to the second sound pressure level calculation circuit 186.
- The MDCT coefficients of each channel after downmixing are calculated by multiplying the MDCT coefficients of each channel by the gain coefficients obtained from the downmix information and adding the MDCT coefficients multiplied by the gain coefficients.
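- The multiply-and-add downmix in the MDCT domain can be sketched as follows. The sketch assumes that the gain coefficients obtained from the downmix information are arranged as a matrix with one row per output channel and one column per input channel. This layout, the function name, and the example values are assumptions made for illustration.

```python
def downmix_mdct(mdct, gain_coeffs):
    """Downmix MDCT coefficients by weighted addition across input channels.

    mdct        : mdct[c][k] is coefficient k of input channel c
    gain_coeffs : gain_coeffs[o][c] is the gain from input c to output o
    """
    num_bins = len(mdct[0])
    output = []
    for row in gain_coeffs:
        if len(row) != len(mdct):
            raise ValueError("one gain coefficient is required per input channel")
        # For each frequency bin, scale every input channel and add the results.
        output.append([sum(g * channel[k] for g, channel in zip(row, mdct))
                       for k in range(num_bins)])
    return output


# Hypothetical stereo-to-mono downmix with equal weights of 0.5.
mono = downmix_mdct([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5]])
```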
- In step S196, the second sound pressure level calculation circuit 186 calculates the second sound pressure level based on the MDCT coefficients supplied from the downmix circuit 185, and supplies the second sound pressure level to the second gain calculation circuit 187.
- The second sound pressure level is obtained by the same calculation as the first sound pressure level.
- In step S197, the second gain calculation circuit 187 calculates a second gain based on the second sound pressure level supplied from the second sound pressure level calculation circuit 186, and supplies the second gain to the gain encoding circuit 189.
- the second gain is calculated according to the DRC characteristic shown in FIG.
- In step S198, the gain encoding circuit 189 performs a gain encoding process and encodes the first gain supplied from the first gain calculation circuit 184 and the second gain supplied from the second gain calculation circuit 187. Then, the gain encoding circuit 189 supplies the gain encoding mode header and the gain code string obtained by the gain encoding process to the multiplexing circuit 192.
- In the gain encoding process, a difference between time frames is obtained for each gain sequence, such as the first gain and the second gain, and each gain is encoded. The gain encoding mode header is generated only when necessary.
- In step S199, the adaptive bit allocation circuit 190 generates bit allocation information based on the MDCT coefficients supplied from the MDCT circuit 182, and supplies the MDCT coefficients and the bit allocation information to the quantization / encoding circuit 191.
- In step S200, the quantization / encoding circuit 191 quantizes and encodes the MDCT coefficients from the adaptive bit allocation circuit 190 based on the bit allocation information supplied from the adaptive bit allocation circuit 190, and supplies the resulting signal code string to the multiplexing circuit 192.
- In step S201, the multiplexing circuit 192 multiplexes the gain code string and the gain encoding mode header supplied from the gain encoding circuit 189, the downmix information supplied from the higher-level control device, and the signal code string supplied from the quantization / encoding circuit 191, and outputs the resulting output code string.
- the output code string shown in FIG. 7 is obtained.
- the gain code string is different from that shown in FIG.
- The encoding device 171 calculates the first gain and the second gain in the MDCT domain, that is, from the MDCT coefficients, and obtains and encodes the difference between these gains. As a result, it is possible to obtain sound with an appropriate volume with a smaller code amount.
- step S231 to step S234 is the same as the processing from step S41 to step S44 in FIG.
- In step S235, the gain encoding circuit 189 selects one gain sequence as the gain sequence to be processed, and obtains the difference value between the gain (gain waveform) of the current time frame of that gain sequence and the gain of the immediately preceding time frame.
- That is, the difference between the gain value at each sample position in the current time frame of the gain sequence to be processed and the gain value at each sample position in the time frame immediately before the current time frame of that gain sequence is obtained. In other words, the difference between time frames of the gain sequence is obtained.
- For a slave gain sequence, a difference value between time frames is obtained for the time waveform that is the difference from the master gain sequence obtained in step S234. That is, the difference value between the time waveform of the difference from the master gain sequence in the current time frame and the time waveform of the difference from the master gain sequence in the immediately preceding time frame is obtained.
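- The difference between time frames described above, together with the corresponding reconstruction on the decoding side, can be sketched as follows. The sketch assumes that each time frame holds one gain value per sample position and that both frames have the same length. The function names are hypothetical, and the subsequent encoding of the difference values into a code string is not modelled.

```python
def frame_difference(current, previous):
    """Per-sample difference between the current and the preceding time frame."""
    if len(current) != len(previous):
        raise ValueError("time frames must have the same number of samples")
    return [c - p for c, p in zip(current, previous)]


def restore_frame(previous, difference):
    """Inverse operation: add the difference to the preceding time frame."""
    if len(previous) != len(difference):
        raise ValueError("time frames must have the same number of samples")
    return [p + d for p, d in zip(previous, difference)]


# Hypothetical gain waveforms of two consecutive time frames.
previous_frame = [0.0, -1.0, -2.0]
current_frame = [0.0, -1.5, -2.0]
diff = frame_difference(current_frame, previous_frame)
restored = restore_frame(previous_frame, diff)
```

- When consecutive time frames have similar gain waveforms, most difference values are zero or small, which is what makes this representation compact.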
- In step S236, the gain encoding circuit 189 determines whether or not all gain sequences have been encoded. For example, when all gain sequences have been processed as the gain sequence to be processed, it is determined that all gain sequences have been encoded.
- If it is determined in step S236 that not all gain sequences have been encoded, the process returns to step S235, and the above-described process is repeated. That is, a gain sequence that has not yet been processed is taken as the next gain sequence to be processed and is encoded.
- The gain encoding circuit 189 uses the gain difference values obtained for each gain sequence in step S235 as the gain code string. Then, the gain encoding circuit 189 supplies the generated gain encoding mode header and gain code string to the multiplexing circuit 192. Note that when the gain encoding mode header is not generated, only the gain code string is output.
- the encoding device 171 encodes the gain by obtaining the difference between the gain sequences and the difference between the time frames of the gain sequence, and generates a gain code string.
- the first gain and the second gain can be encoded more efficiently by obtaining the difference between the gain sequences and the difference between the time frames of the gain sequence and encoding the gain. That is, the amount of codes obtained as a result of encoding can be further reduced.
- FIG. 27 is a diagram illustrating a configuration example of an embodiment of a decoding device to which the present technology is applied.
- The decoding device 231 shown in FIG. 27 includes a demultiplexing circuit 241, a decoding / inverse quantization circuit 242, a gain decoding circuit 243, a gain application circuit 244, an inverse MDCT circuit 245, and a windowing / OLA circuit 246.
- the demultiplexing circuit 241 demultiplexes the supplied input code string.
- The demultiplexing circuit 241 supplies the gain encoding mode header and the gain code string obtained by the demultiplexing to the gain decoding circuit 243, supplies the signal code string to the decoding / inverse quantization circuit 242, and supplies the downmix information to the gain application circuit 244.
- the decoding / inverse quantization circuit 242 performs decoding and inverse quantization on the signal code string supplied from the demultiplexing circuit 241 and supplies the MDCT coefficient obtained as a result to the gain application circuit 244.
- the gain decoding circuit 243 decodes the gain encoding mode header and the gain code string supplied from the demultiplexing circuit 241, and supplies the gain information obtained as a result to the gain application circuit 244.
- Based on the downmix control information and the DRC control information supplied from the host controller, the gain application circuit 244 multiplies the MDCT coefficients supplied from the decoding / inverse quantization circuit 242 by the gain coefficients obtained from the downmix information supplied from the demultiplexing circuit 241 and by the gain information supplied from the gain decoding circuit 243, and supplies the resulting gain-applied MDCT coefficients to the inverse MDCT circuit 245.
- the inverse MDCT circuit 245 performs inverse MDCT processing on the gain application MDCT coefficient supplied from the gain application circuit 244 and supplies the obtained inverse MDCT signal to the windowing / OLA circuit 246.
- the windowing / OLA circuit 246 performs windowing and overlap addition processing on the inverse MDCT signal supplied from the inverse MDCT circuit 245, and outputs an output time series signal obtained thereby.
- When an input code string for one time frame is supplied, the decoding device 231 performs a decoding process of decoding the input code string and outputting an output time series signal.
- the decoding process performed by the decoding device 231 will be described with reference to the flowchart of FIG.
- In step S261, the demultiplexing circuit 241 demultiplexes the supplied input code string. Then, the demultiplexing circuit 241 supplies the gain encoding mode header and the gain code string obtained by the demultiplexing to the gain decoding circuit 243, supplies the signal code string to the decoding / inverse quantization circuit 242, and supplies the downmix information to the gain application circuit 244.
- In step S262, the decoding / inverse quantization circuit 242 performs decoding and inverse quantization on the signal code string supplied from the demultiplexing circuit 241, and supplies the resulting MDCT coefficients to the gain application circuit 244.
- In step S263, the gain decoding circuit 243 performs gain decoding processing, that is, it decodes the gain encoding mode header and the gain code string supplied from the demultiplexing circuit 241 and supplies the resulting gain information to the gain application circuit 244. Details of the gain decoding process will be described later.
- In step S264, based on the downmix control information and the DRC control information from the higher-level control device, the gain application circuit 244 performs gain adjustment by multiplying the MDCT coefficients from the decoding / inverse quantization circuit 242 by the gain coefficients obtained from the downmix information from the demultiplexing circuit 241 and by the gain information from the gain decoding circuit 243.
- the gain application circuit 244 multiplies the MDCT coefficient by a gain coefficient obtained from the downmix information supplied from the demultiplexing circuit 241 in accordance with the downmix control information. Then, the gain application circuit 244 calculates the MDCT coefficient of the channel after downmixing by adding the MDCT coefficient multiplied by the gain coefficient.
- the gain application circuit 244 multiplies the MDCT coefficient of each channel after downmixing by the gain information supplied from the gain decoding circuit 243 in accordance with the DRC control information to obtain a gain application MDCT coefficient.
- the gain application circuit 244 supplies the gain application MDCT coefficient thus obtained to the inverse MDCT circuit 245.
- In step S265, the inverse MDCT circuit 245 performs inverse MDCT processing on the gain-applied MDCT coefficients supplied from the gain application circuit 244, and supplies the resulting inverse MDCT signal to the windowing / OLA circuit 246.
- In step S266, the windowing / OLA circuit 246 performs windowing and overlap addition processing on the inverse MDCT signal supplied from the inverse MDCT circuit 245, and outputs the resulting output time series signal.
- the decoding process ends.
- the decoding device 231 decodes the gain encoding mode header and the gain code string, applies the obtained gain information to the MDCT coefficient, and adjusts the gain.
- the gain code string is obtained by obtaining a difference between gain sequences and a difference between time frames of the gain sequence. Therefore, the decoding apparatus 231 can obtain more appropriate gain information with a gain code string having a smaller code amount. That is, it is possible to obtain sound with an appropriate volume with a smaller code amount.
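The gain adjustment of step S264 can be illustrated with a minimal Python sketch. It is not the implementation of the gain application circuit 244: the channel names, the downmix weights, and the use of a single linear gain factor per channel and frame are assumptions made only for illustration, and the functions `downmix_mdct` and `apply_gain` are hypothetical helpers.

```python
from typing import Dict, List, Sequence


def downmix_mdct(mdct: Dict[str, Sequence[float]],
                 weights: Dict[str, Dict[str, float]]) -> Dict[str, List[float]]:
    """Form each output channel as a weighted sum of input-channel MDCT coefficients."""
    out: Dict[str, List[float]] = {}
    for out_ch, in_weights in weights.items():
        acc: List[float] = []
        for in_ch, w in in_weights.items():
            coeffs = mdct[in_ch]
            if not acc:
                acc = [0.0] * len(coeffs)
            # Multiply by the downmix gain coefficient and accumulate.
            for k, c in enumerate(coeffs):
                acc[k] += w * c
        out[out_ch] = acc
    return out


def apply_gain(mdct: Dict[str, Sequence[float]],
               gain: Dict[str, float]) -> Dict[str, List[float]]:
    """Multiply every coefficient of a channel by that channel's gain factor."""
    return {ch: [gain[ch] * c for c in coeffs] for ch, coeffs in mdct.items()}


# Hypothetical 3-channel frame downmixed to stereo, then scaled by a DRC gain.
frame = {"L": [1.0, 2.0], "R": [3.0, 4.0], "C": [2.0, 2.0]}
stereo_weights = {"Lo": {"L": 1.0, "C": 0.5}, "Ro": {"R": 1.0, "C": 0.5}}
downmixed = downmix_mdct(frame, stereo_weights)
gain_applied = apply_gain(downmixed, {"Lo": 0.5, "Ro": 0.5})
```

Because both the downmix and the gain multiplication are linear, they can be carried out directly on the MDCT coefficients before the inverse MDCT, which is the order the steps above follow.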
- or step S293 is the same as the process of step S121 thru
- In step S293, the gain difference value at each sample position in the time frame is obtained by decoding for each gain sequence included in the gain code string.
- In step S294, the gain decoding circuit 243 takes one gain sequence as the processing target, and obtains the gain values of the current time frame from the gain values of the time frame immediately before the current time frame of that gain sequence and the gain difference values of the current time frame.
- Specifically, the gain decoding circuit 243 refers to MASTER_FLAG and DIFF_SEQ_ID shown in FIG. 9 in the gain sequence mode of the gain sequence to be processed, determines whether that gain sequence is a slave gain sequence, and identifies the corresponding master gain sequence.
- When the gain sequence to be processed is not a slave gain sequence, the gain decoding circuit 243 adds the difference value at each sample position in the current time frame of the gain sequence to be processed, obtained by decoding the gain code string, to the gain value at each sample position in the time frame immediately before the current time frame of that gain sequence. The resulting gain value at each sample position of the current time frame is used as the time waveform of the gain of the current time frame, that is, as the final gain information of the gain sequence to be processed.
- When the gain sequence to be processed is a slave gain sequence, the gain decoding circuit 243 obtains, for the time frame immediately before the current time frame, the gain difference value at each sample position between the master gain sequence and the gain sequence to be processed.
- The gain decoding circuit 243 then adds the difference values obtained in this way to the difference values at each sample position of the current time frame of the gain sequence to be processed, obtained by decoding the gain code string. The gain decoding circuit 243 further adds the gain information (gain waveform) of the master gain sequence of the current time frame to the gain waveform resulting from that addition, and uses the result as the final gain information of the gain sequence to be processed.
- In step S295, the gain decoding circuit 243 determines whether the gain waveforms of all gain sequences have been obtained. For example, when all gain sequences indicated in the gain encoding mode header have been processed and their gain waveforms (gain information) obtained, it is determined that the gain waveforms of all gain sequences have been obtained.
- If it is determined in step S295 that the gain waveforms of all gain sequences have not yet been obtained, the process returns to step S294 and the above-described processing is repeated. That is, the next gain sequence is processed and its gain waveform (gain information) is obtained.
- If, on the other hand, it is determined in step S295 that the gain waveforms of all gain sequences have been obtained, the gain decoding process ends, and the process then proceeds to step S264 of the decoding process.
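The reconstruction in steps S294 and S295 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the gain decoding circuit 243 itself: the sign convention (slave minus master), the dictionary `master_of` standing in for MASTER_FLAG and DIFF_SEQ_ID, and the function names are all hypothetical, and a master gain sequence is assumed not to be a slave of another sequence.

```python
from typing import Dict, List, Optional, Sequence


def decode_gain_frame(diff: Sequence[float],
                      prev: Sequence[float],
                      master_cur: Optional[Sequence[float]] = None,
                      master_prev: Optional[Sequence[float]] = None) -> List[float]:
    """Rebuild the gain waveform of the current frame from decoded differences."""
    if master_cur is None or master_prev is None:
        # Non-slave sequence: previous-frame gain plus decoded difference.
        return [p + d for p, d in zip(prev, diff)]
    # Slave sequence: previous-frame offset from the master, plus the decoded
    # difference, plus the master's gain waveform in the current frame.
    return [mc + (p - mp) + d
            for mc, mp, p, d in zip(master_cur, master_prev, prev, diff)]


def decode_all(diffs: Dict[int, Sequence[float]],
               prev: Dict[int, Sequence[float]],
               master_of: Dict[int, Optional[int]]) -> Dict[int, List[float]]:
    """Process every gain sequence; masters are decoded before their slaves."""
    cur: Dict[int, List[float]] = {}
    for seq_id in sorted(diffs, key=lambda s: master_of[s] is not None):
        m = master_of[seq_id]
        if m is None:
            cur[seq_id] = decode_gain_frame(diffs[seq_id], prev[seq_id])
        else:
            cur[seq_id] = decode_gain_frame(diffs[seq_id], prev[seq_id],
                                            cur[m], prev[m])
    return cur


# Hypothetical two-sample frame: sequence 0 is the master, sequence 1 its slave.
prev_frame = {0: [1.0, 1.0], 1: [3.0, 2.0]}
decoded_diffs = {0: [1.0, 1.0], 1: [1.0, 0.0]}
gains = decode_all(decoded_diffs, prev_frame, {0: None, 1: 0})
```

The loop in `decode_all` plays the role of the check in step S295: it ends once a gain waveform has been obtained for every gain sequence.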
- the decoding device 231 decodes the gain encoding mode header and the gain code string, and calculates gain information of each gain sequence.
- According to the present technology, encoded sound can be reproduced at an appropriate volume level in various reproduction environments, with or without downmixing, and no clipping noise is generated in those environments. Furthermore, because the required code amount is small, a large amount of gain information can be encoded efficiently. In addition, because the amount of computation in the decoding device is small, the present technology can also be applied to mobile terminals and the like.
- In the above description, gain correction by DRC is performed as the volume correction of the input time-series signal.
- However, other correction processing, such as loudness correction, may be performed as the volume correction.
- For example, the loudness representing the sound pressure level of the entire content can be described for each frame as auxiliary information, and such a loudness correction value is also encoded as a gain value.
- The gain of the loudness correction can therefore also be encoded and transmitted by being included in the gain code string.
- As with DRC, a gain value corresponding to each downmix pattern is required.
- Encoding may also be performed by obtaining the difference between gain change points across time frames.
- The above-described series of processing can be executed by hardware or by software.
- When the series of processing is executed by software, a program constituting the software is installed in a computer.
- Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose computer capable of executing various functions when various programs are installed.
- FIG. 30 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
- In the computer, a CPU (Central Processing Unit) 501, a ROM (Read Only Memory) 502, and a RAM (Random Access Memory) 503 are connected to one another by a bus 504.
- An input / output interface 505 is further connected to the bus 504.
- An input unit 506, an output unit 507, a recording unit 508, a communication unit 509, and a drive 510 are connected to the input / output interface 505.
- the input unit 506 includes a keyboard, a mouse, a microphone, an image sensor, and the like.
- the output unit 507 includes a display, a speaker, and the like.
- the recording unit 508 includes a hard disk, a nonvolatile memory, and the like.
- the communication unit 509 includes a network interface or the like.
- the drive 510 drives a removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- In the computer configured as described above, the CPU 501, for example, loads the program recorded in the recording unit 508 into the RAM 503 via the input / output interface 505 and the bus 504 and executes it, whereby the above-described series of processing is performed.
- the program executed by the computer (CPU 501) can be provided by being recorded in, for example, a removable medium 511 as a package medium or the like.
- the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- the program can be installed in the recording unit 508 via the input / output interface 505 by attaching the removable medium 511 to the drive 510. Further, the program can be received by the communication unit 509 via a wired or wireless transmission medium and installed in the recording unit 508. In addition, the program can be installed in advance in the ROM 502 or the recording unit 508.
- The program executed by the computer may be a program whose processing is performed in time series in the order described in this specification, or a program whose processing is performed in parallel or at a necessary timing, such as when a call is made.
- the present technology can take a cloud computing configuration in which one function is shared by a plurality of devices via a network and is jointly processed.
- each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
- When one step includes a plurality of processes, the plurality of processes included in that one step can be executed by one apparatus or shared among a plurality of apparatuses.
- the present technology can be configured as follows.
- (1) An encoding device comprising: a gain calculation unit that calculates a first gain value and a second gain value for volume correction for each frame of an audio signal; and a gain encoding unit that obtains a first difference value between the first gain value and the second gain value, or obtains a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encodes information based on the first difference value or the second difference value.
- (2) The encoding device according to (1), wherein the gain encoding unit obtains the first difference value between the first gain value and the second gain value at a plurality of positions in the frame, or obtains the second difference value between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- (3) The encoding device according to (1) or (2), wherein the gain encoding unit obtains the second difference value based on a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
- (4) The encoding device according to (3), wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and another gain change point.
- (5) The encoding device according to (3), wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and a predicted value obtained by first-order prediction using another gain change point.
- (11) A program for causing a computer to execute processing including the steps of: calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; and obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encoding information based on the first difference value or the second difference value.
- (13) The decoding device according to (12), wherein the first difference value is encoded by obtaining difference values between the first gain value and the second gain value at a plurality of positions in the frame, and the second difference value is encoded by obtaining difference values between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- (14) The decoding device according to (12) or (13), wherein the second difference value is encoded by being obtained from a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
- (15) The decoding device according to (14), wherein the second difference value is encoded by being obtained from the difference between the gain change point and another gain change point.
- (18) A decoding method comprising the steps of: demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; decoding the signal code string; and decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.
- (19) A program for causing a computer to execute processing including the steps of: demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; decoding the signal code string; and decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.
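The first and second difference values named in configuration (1) can be illustrated with a short Python sketch. The sign conventions (second gain minus first gain, current frame minus adjacent previous frame), the function names, and the sample values are assumptions chosen for illustration; the sketch covers only the differencing, not the subsequent encoding of the resulting values.

```python
from typing import List, Sequence


def first_difference(first_gain: Sequence[float],
                     second_gain: Sequence[float]) -> List[float]:
    """Difference between two gain sequences at each position in one frame."""
    return [g2 - g1 for g1, g2 in zip(first_gain, second_gain)]


def second_difference(current: Sequence[float],
                      previous: Sequence[float]) -> List[float]:
    """Difference between the current frame and the adjacent previous frame.

    The inputs may be first gain values or first difference values.
    """
    return [c - p for c, p in zip(current, previous)]


# Hypothetical gain values at two positions in two consecutive frames.
first_prev, second_prev = [0.0, -1.0], [-0.5, -1.5]
first_cur, second_cur = [-1.0, -1.0], [-1.5, -1.25]

d1_prev = first_difference(first_prev, second_prev)
d1_cur = first_difference(first_cur, second_cur)

# Second difference taken between first gain values of adjacent frames.
master_residual = second_difference(first_cur, first_prev)
# Second difference taken between first difference values of adjacent frames.
slave_residual = second_difference(d1_cur, d1_prev)
```

When the two gain values change in a similar way over time, the resulting residuals cluster near zero, which is what lets them be represented with a small code amount.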
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Mathematical Physics (AREA)
- Stereophonic System (AREA)
- Circuit For Audible Band Transducer (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
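〈Overview of the Present Technology〉
First, DRC processing in general MPEG AAC will be described.
Next, specific embodiments to which the present technology is applied will be described.
Here, examples of the first gain and the second gain supplied to the gain encoding circuit 66, and of the gain code string output from the gain encoding circuit 66, will be described.
In the above description, an example was given in which the gain of the 11.1ch input time-series signal is the first gain and the gain of the 5.1ch downmix signal is the second gain. In the following, to explain the relationship between the master gain sequence and the slave gain sequences in detail, the description continues on the assumption that there are additionally the gain of a 7.1ch downmix signal and the gain of a 2ch downmix signal, each obtained by downmixing the 11.1ch input time-series signal. That is, the 7.1ch gain and the 2ch gain are each a second gain obtained in the second gain calculation circuit 65. Therefore, in this example, three second gains are calculated in the second gain calculation circuit 65.
Next, the operation of the encoding device 51 will be described.
Next, the gain encoding process corresponding to the process of step S16 in FIG. 17 will be described with reference to the flowchart of FIG. 18.
Next, a decoding device that receives the output code string output from the encoding device 51 as an input code string and decodes the input code string will be described.
Next, the operation of the decoding device 91 will be described.
Subsequently, the gain decoding process corresponding to the process of step S83 in FIG. 20 will be described with reference to the flowchart of FIG. 21.
〈Configuration Example of Encoding Device〉
In the above, the encoding device 51 actually performs downmixing and calculates the sound pressure level of the resulting downmix signal as the second sound pressure level. However, the sound pressure level after downmixing may instead be obtained directly from the sound pressure level of each channel, without performing the downmix. In this case, the sound pressure level fluctuates somewhat depending on the correlation between the channels of the input time-series signal, but the amount of computation can be reduced.
Subsequently, the operation of the encoding device 131 will be described. The encoding process performed by the encoding device 131 is described below with reference to the flowchart of FIG. 23.
〈Configuration Example of Encoding Device〉
In the above, an example in which DRC processing is performed in the time domain has been described, but DRC processing may instead be performed in the MDCT domain. In such a case, the encoding device is configured, for example, as shown in FIG. 24.
Next, the operation of the encoding device 171 will be described. The encoding process performed by the encoding device 171 is described below with reference to the flowchart of FIG. 25.
Next, the gain encoding process corresponding to the process of step S198 in FIG. 25 will be described with reference to the flowchart of FIG. 26. Since the processes of steps S231 to S234 are the same as those of steps S41 to S44 in FIG. 18, their description is omitted.
Next, a decoding device that receives the output code string output from the encoding device 171 as an input code string and decodes the input code string will be described.
Subsequently, the operation of the decoding device 231 will be described.
Further, the gain decoding process corresponding to the process of step S263 in FIG. 28 will be described with reference to the flowchart of FIG. 29.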
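(1)
An encoding device including:
a gain calculation unit that calculates a first gain value and a second gain value for volume correction for each frame of an audio signal; and
a gain encoding unit that obtains a first difference value between the first gain value and the second gain value, or obtains a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encodes information based on the first difference value or the second difference value.
(2)
The encoding device according to (1), wherein the gain encoding unit obtains the first difference value between the first gain value and the second gain value at a plurality of positions in the frame, or obtains the second difference value between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
(3)
The encoding device according to (1) or (2), wherein the gain encoding unit obtains the second difference value based on a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
(4)
The encoding device according to (3), wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and another gain change point.
(5)
The encoding device according to (3), wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and a predicted value obtained by first-order prediction using another gain change point.
(6)
The encoding device according to (3), wherein the gain encoding unit encodes the number of gain change points in the frame and information based on the second difference value at the gain change points.
(7)
The encoding device according to any one of (1) to (6), wherein the gain calculation unit calculates the second gain value for each of the audio signals having different numbers of channels obtained by downmixing.
(8)
The encoding device according to any one of (1) to (7), wherein the gain encoding unit selects whether to obtain the first difference value based on the correlation between the first gain value and the second gain value.
(9)
The encoding device according to any one of (1) to (8), wherein the gain encoding unit performs variable-length encoding of the first difference value or the second difference value.
(10)
An encoding method including the steps of:
calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; and
obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encoding information based on the first difference value or the second difference value.
(11)
A program for causing a computer to execute processing including the steps of:
calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; and
obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encoding information based on the first difference value or the second difference value.
(12)
A decoding device including:
a demultiplexing unit that demultiplexes an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame;
a signal decoding unit that decodes the signal code string; and
a gain decoding unit that decodes the gain code string and outputs the first gain value or the second gain value for the volume correction.
(13)
The decoding device according to (12), wherein the first difference value is encoded by obtaining difference values between the first gain value and the second gain value at a plurality of positions in the frame, and the second difference value is encoded by obtaining difference values between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
(14)
The decoding device according to (12) or (13), wherein the second difference value is encoded by being obtained from a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
(15)
The decoding device according to (14), wherein the second difference value is encoded by being obtained from the difference between the gain change point and another gain change point.
(16)
The decoding device according to (14), wherein the second difference value is encoded by being obtained from the difference between the gain change point and a predicted value obtained by first-order prediction using another gain change point.
(17)
The decoding device according to any one of (14) to (16), wherein the number of gain change points in the frame and information based on the second difference value at the gain change points are encoded as the second difference value.
(18)
A decoding method including the steps of:
demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame;
decoding the signal code string; and
decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.
(19)
A program for causing a computer to execute processing including the steps of:
demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame;
decoding the signal code string; and
decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.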
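Configurations (4) and (5) describe two ways of deriving the second difference value from gain change points. The Python sketch below illustrates both under stated assumptions: a change point is represented by a hypothetical `ChangePoint` triple of position, gain value, and the slope of the gain after that point, and the first-order prediction is taken to be a straight line extended from the preceding change point with that slope. Neither the representation nor the function names come from the source.

```python
from typing import List, NamedTuple, Sequence, Tuple


class ChangePoint(NamedTuple):
    pos: int      # sample position within the frame (assumed field)
    gain: float   # gain value at that position (assumed field)
    slope: float  # slope of the gain after this point (assumed field)


def diff_to_previous(points: Sequence[ChangePoint]) -> List[Tuple[int, float]]:
    """Configuration (4) style: position and gain differences to the previous point."""
    return [(b.pos - a.pos, b.gain - a.gain) for a, b in zip(points, points[1:])]


def first_order_residuals(points: Sequence[ChangePoint]) -> List[float]:
    """Configuration (5) style: gain minus a linear prediction from the previous point."""
    residuals: List[float] = []
    for a, b in zip(points, points[1:]):
        predicted = a.gain + a.slope * (b.pos - a.pos)
        residuals.append(b.gain - predicted)
    return residuals


# Hypothetical change points in one frame.
pts = [ChangePoint(0, 0.0, -0.5), ChangePoint(4, -2.0, 0.25), ChangePoint(8, -1.5, 0.0)]
plain = diff_to_previous(pts)
residual = first_order_residuals(pts)
```

In this example the first residual is zero because the gain follows the predicted line exactly, whereas the plain difference for the same point is -2.0; a predictor that tracks the waveform well leaves smaller values to encode.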
Claims (19)
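- An encoding device comprising: a gain calculation unit that calculates a first gain value and a second gain value for volume correction for each frame of an audio signal; and a gain encoding unit that obtains a first difference value between the first gain value and the second gain value, or obtains a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encodes information based on the first difference value or the second difference value.
- The encoding device according to claim 1, wherein the gain encoding unit obtains the first difference value between the first gain value and the second gain value at a plurality of positions in the frame, or obtains the second difference value between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- The encoding device according to claim 1, wherein the gain encoding unit obtains the second difference value based on a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
- The encoding device according to claim 3, wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and another gain change point.
- The encoding device according to claim 3, wherein the gain encoding unit obtains the second difference value by obtaining the difference between the gain change point and a predicted value obtained by first-order prediction using another gain change point.
- The encoding device according to claim 3, wherein the gain encoding unit encodes the number of gain change points in the frame and information based on the second difference value at the gain change points.
- The encoding device according to claim 1, wherein the gain calculation unit calculates the second gain value for each of the audio signals having different numbers of channels obtained by downmixing.
- The encoding device according to claim 1, wherein the gain encoding unit selects whether to obtain the first difference value based on the correlation between the first gain value and the second gain value.
- The encoding device according to claim 1, wherein the gain encoding unit performs variable-length encoding of the first difference value or the second difference value.
- An encoding method comprising the steps of: calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; and obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encoding information based on the first difference value or the second difference value.
- A program for causing a computer to execute processing comprising the steps of: calculating a first gain value and a second gain value for volume correction for each frame of an audio signal; and obtaining a first difference value between the first gain value and the second gain value, or obtaining a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame, and encoding information based on the first difference value or the second difference value.
- A decoding device comprising: a demultiplexing unit that demultiplexes an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; a signal decoding unit that decodes the signal code string; and a gain decoding unit that decodes the gain code string and outputs the first gain value or the second gain value for the volume correction.
- The decoding device according to claim 12, wherein the first difference value is encoded by obtaining difference values between the first gain value and the second gain value at a plurality of positions in the frame, and the second difference value is encoded by obtaining difference values between the first gain values at a plurality of positions in the frame or between the first difference values at a plurality of positions in the frame.
- The decoding device according to claim 12, wherein the second difference value is encoded by being obtained from a gain change point at which the slope of the first gain value or the first difference value in the frame changes.
- The decoding device according to claim 14, wherein the second difference value is encoded by being obtained from the difference between the gain change point and another gain change point.
- The decoding device according to claim 14, wherein the second difference value is encoded by being obtained from the difference between the gain change point and a predicted value obtained by first-order prediction using another gain change point.
- The decoding device according to claim 14, wherein the number of gain change points in the frame and information based on the second difference value at the gain change points are encoded as the second difference value.
- A decoding method comprising the steps of: demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; decoding the signal code string; and decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.
- A program for causing a computer to execute processing comprising the steps of: demultiplexing an input code string into a gain code string and a signal code string obtained by encoding an audio signal, the gain code string being generated, for a first gain value and a second gain value for volume correction calculated for each frame of the audio signal, by obtaining a first difference value between the first gain value and the second gain value, or a second difference value between the first gain value and the first gain value of the adjacent frame or between the first difference value and the first difference value of the adjacent frame; decoding the signal code string; and decoding the gain code string and outputting the first gain value or the second gain value for the volume correction.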
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015537641A JP6531649B2 (ja) | 2013-09-19 | 2014-09-05 | Encoding device and method, decoding device and method, and program |
CN201480050373.8A CN105531762B (zh) | 2013-09-19 | 2014-09-05 | Encoding device and method, decoding device and method, and program |
EP14846054.6A EP3048609A4 (en) | 2013-09-19 | 2014-09-05 | Encoding device and method, decoding device and method, and program |
US14/917,825 US9875746B2 (en) | 2013-09-19 | 2014-09-05 | Encoding device and method, decoding device and method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013-193787 | 2013-09-19 | ||
JP2013193787 | 2013-09-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015041070A1 true WO2015041070A1 (ja) | 2015-03-26 |
Family
ID=52688721
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2014/073465 WO2015041070A1 (ja) | Encoding device and method, decoding device and method, and program | 2013-09-19 | 2014-09-05 |
Country Status (5)
Country | Link |
---|---|
US (1) | US9875746B2 (ja) |
EP (1) | EP3048609A4 (ja) |
JP (1) | JP6531649B2 (ja) |
CN (1) | CN105531762B (ja) |
WO (1) | WO2015041070A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021514136A (ja) * | 2018-02-15 | 2021-06-03 | ドルビー ラボラトリーズ ライセンシング コーポレイション | 音量制御方法および装置 |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
MX2007005027A (es) | 2004-10-26 | 2007-06-19 | Dolby Lab Licensing Corp | Calculo y ajuste de la sonoridad percibida y/o el balance espectral percibido de una senal de audio. |
JP5754899B2 (ja) | 2009-10-07 | 2015-07-29 | ソニー株式会社 | 復号装置および方法、並びにプログラム |
TWI529703B (zh) | 2010-02-11 | 2016-04-11 | 杜比實驗室特許公司 | 用以非破壞地正常化可攜式裝置中音訊訊號響度之系統及方法 |
JP5850216B2 (ja) | 2010-04-13 | 2016-02-03 | ソニー株式会社 | 信号処理装置および方法、符号化装置および方法、復号装置および方法、並びにプログラム |
JP5609737B2 (ja) | 2010-04-13 | 2014-10-22 | ソニー株式会社 | 信号処理装置および方法、符号化装置および方法、復号装置および方法、並びにプログラム |
JP5707842B2 (ja) | 2010-10-15 | 2015-04-30 | ソニー株式会社 | 符号化装置および方法、復号装置および方法、並びにプログラム |
JP5743137B2 (ja) | 2011-01-14 | 2015-07-01 | ソニー株式会社 | 信号処理装置および方法、並びにプログラム |
CN103325380B (zh) | 2012-03-23 | 2017-09-12 | 杜比实验室特许公司 | 用于信号增强的增益后处理 |
US10844689B1 (en) | 2019-12-19 | 2020-11-24 | Saudi Arabian Oil Company | Downhole ultrasonic actuator system for mitigating lost circulation |
CN112185400B (zh) | 2012-05-18 | 2024-07-30 | 杜比实验室特许公司 | 用于维持与参数音频编码器相关联的可逆动态范围控制信息的系统 |
US10083700B2 (en) | 2012-07-02 | 2018-09-25 | Sony Corporation | Decoding device, decoding method, encoding device, encoding method, and program |
SG11201502405RA (en) | 2013-01-21 | 2015-04-29 | Dolby Lab Licensing Corp | Audio encoder and decoder with program loudness and boundary metadata |
ES2624419T3 (es) | 2013-01-21 | 2017-07-14 | Dolby Laboratories Licensing Corporation | Sistema y procedimiento para optimizar la sonoridad y el rango dinámico a través de diferentes dispositivos de reproducción |
CN116665683A (zh) | 2013-02-21 | 2023-08-29 | 杜比国际公司 | 用于参数化多声道编码的方法 |
CN104080024B (zh) | 2013-03-26 | 2019-02-19 | 杜比实验室特许公司 | 音量校平器控制器和控制方法以及音频分类器 |
CN105190618B (zh) | 2013-04-05 | 2019-01-25 | 杜比实验室特许公司 | 用于自动文件检测的对来自基于文件的媒体的特有信息的获取、恢复和匹配 |
TWM487509U (zh) | 2013-06-19 | 2014-10-01 | 杜比實驗室特許公司 | 音訊處理設備及電子裝置 |
CN109785851B (zh) | 2013-09-12 | 2023-12-01 | 杜比实验室特许公司 | 用于各种回放环境的动态范围控制 |
CN105531759B (zh) | 2013-09-12 | 2019-11-26 | 杜比实验室特许公司 | 用于下混合音频内容的响度调整 |
JP6593173B2 (ja) | 2013-12-27 | 2019-10-23 | ソニー株式会社 | 復号化装置および方法、並びにプログラム |
CN105142067B (zh) | 2014-05-26 | 2020-01-07 | 杜比实验室特许公司 | 音频信号响度控制 |
EP4060661B1 (en) | 2014-10-10 | 2024-04-24 | Dolby Laboratories Licensing Corporation | Transmission-agnostic presentation-based program loudness |
CN110428381B (zh) * | 2019-07-31 | 2022-05-06 | Oppo广东移动通信有限公司 | 图像处理方法、图像处理装置、移动终端及存储介质 |
CN112992159B (zh) * | 2021-05-17 | 2021-08-06 | 北京百瑞互联技术有限公司 | 一种lc3音频编解码方法、装置、设备及存储介质 |
EP4348643A1 (en) * | 2021-05-28 | 2024-04-10 | Dolby Laboratories Licensing Corporation | Dynamic range adjustment of spatial audio objects |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002373000A (ja) * | 2001-06-15 | 2002-12-26 | Nec Corp | 音声符号化復号方式間の符号変換方法、その装置、そのプログラム及び記憶媒体 |
JP2008261978A (ja) * | 2007-04-11 | 2008-10-30 | Toshiba Microelectronics Corp | 再生音量自動調整方法 |
WO2009001874A1 (ja) * | 2007-06-27 | 2008-12-31 | Nec Corporation | オーディオ符号化方法、オーディオ復号方法、オーディオ符号化装置、オーディオ復号装置、プログラム、およびオーディオ符号化・復号システム |
JP2010212760A (ja) * | 2009-03-06 | 2010-09-24 | Sony Corp | 音響機器及び音響処理方法 |
Family Cites Families (164)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4628529A (en) | 1985-07-01 | 1986-12-09 | Motorola, Inc. | Noise suppression system |
JPH03254223A (ja) | 1990-03-02 | 1991-11-13 | Eastman Kodak Japan Kk | アナログデータ伝送方式 |
JP2655485B2 (ja) | 1994-06-24 | 1997-09-17 | 日本電気株式会社 | 音声セル符号化装置 |
JP3498375B2 (ja) | 1994-07-20 | 2004-02-16 | ソニー株式会社 | ディジタル・オーディオ信号記録装置 |
JP3189598B2 (ja) | 1994-10-28 | 2001-07-16 | 松下電器産業株式会社 | 信号合成方法および信号合成装置 |
JPH1020888A (ja) | 1996-07-02 | 1998-01-23 | Matsushita Electric Ind Co Ltd | 音声符号化・復号化装置 |
US6073100A (en) | 1997-03-31 | 2000-06-06 | Goodridge, Jr.; Alan G | Method and apparatus for synthesizing signals using transform-domain match-output extension |
SE512719C2 (sv) | 1997-06-10 | 2000-05-02 | Lars Gustaf Liljeryd | En metod och anordning för reduktion av dataflöde baserad på harmonisk bandbreddsexpansion |
WO1999003096A1 (fr) | 1997-07-11 | 1999-01-21 | Sony Corporation | Procede et dispositif de codage et decodage d'informations et support de distribution |
SE9903553D0 (sv) | 1999-01-27 | 1999-10-01 | Lars Liljeryd | Enhancing percepptual performance of SBR and related coding methods by adaptive noise addition (ANA) and noise substitution limiting (NSL) |
US6829360B1 (en) | 1999-05-14 | 2004-12-07 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for expanding band of audio signal |
JP3454206B2 (ja) | 1999-11-10 | 2003-10-06 | 三菱電機株式会社 | 雑音抑圧装置及び雑音抑圧方法 |
CA2290037A1 (en) | 1999-11-18 | 2001-05-18 | Voiceage Corporation | Gain-smoothing amplifier device and method in codecs for wideband speech and audio signals |
SE0004163D0 (sv) | 2000-11-14 | 2000-11-14 | Coding Technologies Sweden Ab | Enhancing perceptual performance of high frequency reconstruction coding methods by adaptive filtering |
JP2002268698A (ja) | 2001-03-08 | 2002-09-20 | Nec Corp | 音声認識装置と標準パターン作成装置及び方法並びにプログラム |
SE0101175D0 (sv) | 2001-04-02 | 2001-04-02 | Coding Technologies Sweden Ab | Aliasing reduction using complex-exponential-modulated filterbanks |
MXPA03002115A (es) | 2001-07-13 | 2003-08-26 | Matsushita Electric Ind Co Ltd | DISPOSITIVO DE DECODIFICACION Y CODIFICACION DE SEnAL DE AUDIO. |
US6895375B2 (en) | 2001-10-04 | 2005-05-17 | At&T Corp. | System for bandwidth extension of Narrow-band speech |
US6988066B2 (en) | 2001-10-04 | 2006-01-17 | At&T Corp. | Method of bandwidth extension for narrow-band speech |
JP3926726B2 (ja) | 2001-11-14 | 2007-06-06 | 松下電器産業株式会社 | 符号化装置および復号化装置 |
EP1444688B1 (en) | 2001-11-14 | 2006-08-16 | Matsushita Electric Industrial Co., Ltd. | Encoding device and decoding device |
CN1279512C (zh) | 2001-11-29 | 2006-10-11 | 编码技术股份公司 | 用于改善高频重建的方法和装置 |
WO2003065353A1 (en) | 2002-01-30 | 2003-08-07 | Matsushita Electric Industrial Co., Ltd. | Audio encoding and decoding device and methods thereof |
JP2003255973A (ja) | 2002-02-28 | 2003-09-10 | Nec Corp | 音声帯域拡張システムおよび方法 |
US20030187663A1 (en) | 2002-03-28 | 2003-10-02 | Truman Michael Mead | Broadband frequency translation for high frequency regeneration |
JP2003316394A (ja) | 2002-04-23 | 2003-11-07 | Nec Corp | 音声復号システム、及び、音声復号方法、並びに、音声復号プログラム |
US7447631B2 (en) | 2002-06-17 | 2008-11-04 | Dolby Laboratories Licensing Corporation | Audio coding system using spectral hole filling |
CA2453814C (en) | 2002-07-19 | 2010-03-09 | Nec Corporation | Audio decoding apparatus and decoding method and program |
JP4728568B2 (ja) | 2002-09-04 | 2011-07-20 | マイクロソフト コーポレーション | レベル・モードとラン・レングス/レベル・モードの間での符号化を適応させるエントロピー符号化 |
JP3881943B2 (ja) | 2002-09-06 | 2007-02-14 | 松下電器産業株式会社 | 音響符号化装置及び音響符号化方法 |
SE0202770D0 (sv) | 2002-09-18 | 2002-09-18 | Coding Technologies Sweden Ab | Method for reduction of aliasing introduces by spectral envelope adjustment in real-valued filterbanks |
EP1543307B1 (en) | 2002-09-19 | 2006-02-22 | Matsushita Electric Industrial Co., Ltd. | Audio decoding apparatus and method |
US7330812B2 (en) | 2002-10-04 | 2008-02-12 | National Research Council Of Canada | Method and apparatus for transmitting an audio stream having additional payload in a hidden sub-channel |
EP2665294A2 (en) | 2003-03-04 | 2013-11-20 | Core Wireless Licensing S.a.r.l. | Support of a multichannel audio extension |
US7318035B2 (en) | 2003-05-08 | 2008-01-08 | Dolby Laboratories Licensing Corporation | Audio coding systems and methods using spectral component coupling and spectral component regeneration |
US20050004793A1 (en) | 2003-07-03 | 2005-01-06 | Pasi Ojala | Signal adaptation for higher band coding in a codec utilizing band split coding |
KR20050027179A (ko) | 2003-09-13 | 2005-03-18 | 삼성전자주식회사 | 오디오 데이터 복원 방법 및 그 장치 |
US7844451B2 (en) | 2003-09-16 | 2010-11-30 | Panasonic Corporation | Spectrum coding/decoding apparatus and method for reducing distortion of two band spectrums |
EP2221808B1 (en) | 2003-10-23 | 2012-07-11 | Panasonic Corporation | Spectrum coding apparatus, spectrum decoding apparatus, acoustic signal transmission apparatus, acoustic signal reception apparatus and methods thereof |
KR100587953B1 (ko) | 2003-12-26 | 2006-06-08 | 한국전자통신연구원 | 대역-분할 광대역 음성 코덱에서의 고대역 오류 은닉 장치 및 그를 이용한 비트스트림 복호화 시스템 |
EP1744139B1 (en) | 2004-05-14 | 2015-11-11 | Panasonic Intellectual Property Corporation of America | Decoding apparatus and method thereof |
WO2005112001A1 (ja) | 2004-05-19 | 2005-11-24 | Matsushita Electric Industrial Co., Ltd. | 符号化装置、復号化装置、およびこれらの方法 |
ATE474310T1 (de) | 2004-05-28 | 2010-07-15 | Nokia Corp | Mehrkanalige audio-erweiterung |
KR100608062B1 (ko) | 2004-08-04 | 2006-08-02 | 삼성전자주식회사 | 오디오 데이터의 고주파수 복원 방법 및 그 장치 |
US7716046B2 (en) | 2004-10-26 | 2010-05-11 | Qnx Software Systems (Wavemakers), Inc. | Advanced periodic signal enhancement |
US20060106620A1 (en) | 2004-10-28 | 2006-05-18 | Thompson Jeffrey K | Audio spatial environment down-mixer |
SE0402651D0 (sv) | 2004-11-02 | 2004-11-02 | Coding Tech Ab | Advanced methods for interpolation and parameter signalling |
BRPI0517780A2 (pt) | 2004-11-05 | 2011-04-19 | Matsushita Electric Ind Co Ltd | aparelho de decodificação escalável e aparelho de codificação escalável |
JP4977471B2 (ja) | 2004-11-05 | 2012-07-18 | パナソニック株式会社 | 符号化装置及び符号化方法 |
KR100657916B1 (ko) | 2004-12-01 | 2006-12-14 | 삼성전자주식회사 | 주파수 대역간의 유사도를 이용한 오디오 신호 처리 장치및 방법 |
WO2006075563A1 (ja) | 2005-01-11 | 2006-07-20 | Nec Corporation | オーディオ符号化装置、オーディオ符号化方法およびオーディオ符号化プログラム |
SG161223A1 (en) | 2005-04-01 | 2010-05-27 | Qualcomm Inc | Method and apparatus for vector quantizing of a spectral envelope representation |
KR100933548B1 (ko) | 2005-04-15 | 2009-12-23 | 돌비 스웨덴 에이비 | 비상관 신호의 시간적 엔벨로프 정형화 |
US20070005351A1 (en) | 2005-06-30 | 2007-01-04 | Sathyendra Harsha M | Method and system for bandwidth expansion for voice communications |
JP4899359B2 (ja) | 2005-07-11 | 2012-03-21 | ソニー株式会社 | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 |
KR100813259B1 (ko) | 2005-07-13 | 2008-03-13 | 삼성전자주식회사 | 입력신호의 계층적 부호화/복호화 장치 및 방법 |
WO2007026821A1 (ja) | 2005-09-02 | 2007-03-08 | Matsushita Electric Industrial Co., Ltd. | エネルギー整形装置及びエネルギー整形方法 |
EP1926083A4 (en) | 2005-09-30 | 2011-01-26 | Panasonic Corp | AUDIOCODING DEVICE AND AUDIOCODING METHOD |
EP1953737B1 (en) | 2005-10-14 | 2012-10-03 | Panasonic Corporation | Transform coder and transform coding method |
US8326638B2 (en) | 2005-11-04 | 2012-12-04 | Nokia Corporation | Audio compression |
JP4876574B2 (ja) | 2005-12-26 | 2012-02-15 | ソニー株式会社 | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 |
JP4863713B2 (ja) | 2005-12-29 | 2012-01-25 | 富士通株式会社 | 雑音抑制装置、雑音抑制方法、及びコンピュータプログラム |
US7953604B2 (en) | 2006-01-20 | 2011-05-31 | Microsoft Corporation | Shape and scale parameters for extended-band frequency coding |
US7590523B2 (en) | 2006-03-20 | 2009-09-15 | Mindspeed Technologies, Inc. | Speech post-processing using MDCT coefficients |
US20090248407A1 (en) | 2006-03-31 | 2009-10-01 | Panasonic Corporation | Sound encoder, sound decoder, and their methods |
EP2323131A1 (en) | 2006-04-27 | 2011-05-18 | Panasonic Corporation | Audio encoding device, audio decoding device, and their method |
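JP5190359B2 (ja) | 2006-05-10 | 2013-04-24 | Panasonic Corporation | Encoding device and encoding method |
JP2007316254A (ja) | 2006-05-24 | 2007-12-06 | Sony Corp | Audio signal interpolation method and audio signal interpolation device |
KR20070115637A (ko) | 2006-06-03 | 2007-12-06 | Samsung Electronics Co., Ltd. | Method and apparatus for bandwidth extension encoding and decoding |
JP2007333785A (ja) | 2006-06-12 | 2007-12-27 | Matsushita Electric Ind Co Ltd | Audio signal encoding device and audio signal encoding method |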
WO2007148925A1 (en) | 2006-06-21 | 2007-12-27 | Samsung Electronics Co., Ltd. | Method and apparatus for adaptively encoding and decoding high frequency band |
US8260609B2 (en) | 2006-07-31 | 2012-09-04 | Qualcomm Incorporated | Systems, methods, and apparatus for wideband encoding and decoding of inactive frames |
WO2008032828A1 (fr) | 2006-09-15 | 2008-03-20 | Panasonic Corporation | Audio encoding device and audio encoding method |
JP4918841B2 (ja) | 2006-10-23 | 2012-04-18 | Fujitsu Limited | Encoding system |
US8295507B2 (en) | 2006-11-09 | 2012-10-23 | Sony Corporation | Frequency band extending apparatus, frequency band extending method, player apparatus, playing method, program and recording medium |
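JP5141180B2 (ja) | 2006-11-09 | 2013-02-13 | Sony Corporation | Frequency band extending apparatus and frequency band extending method, playback apparatus and playback method, and program and recording medium |
KR101565919B1 (ko) | 2006-11-17 | 2015-11-05 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding a high-frequency signal |
JP4930320B2 (ja) | 2006-11-30 | 2012-05-16 | Sony Corporation | Playback method and apparatus, program, and recording medium |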
US8560328B2 (en) | 2006-12-15 | 2013-10-15 | Panasonic Corporation | Encoding device, decoding device, and method thereof |
JP4984983B2 (ja) | 2007-03-09 | 2012-07-25 | Fujitsu Limited | Encoding device and encoding method |
US8015368B2 (en) | 2007-04-20 | 2011-09-06 | Siport, Inc. | Processor extensions for accelerating spectral band replication |
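KR101355376B1 (ko) | 2007-04-30 | 2014-01-23 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding a high-frequency region |
WO2009004727A1 (ja) | 2007-07-04 | 2009-01-08 | Fujitsu Limited | Encoding device, encoding method, and encoding program |
JP5045295B2 (ja) * | 2007-07-30 | 2012-10-10 | Sony Corporation | Signal processing device and method, and program |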
US8041577B2 (en) | 2007-08-13 | 2011-10-18 | Mitsubishi Electric Research Laboratories, Inc. | Method for expanding audio signal bandwidth |
ES2704286T3 (es) | 2007-08-27 | 2019-03-15 | Ericsson Telefon Ab L M | Method and device for perceptual spectral decoding of an audio signal, including filling of spectral holes |
PT2186090T (pt) | 2007-08-27 | 2017-03-07 | ERICSSON TELEFON AB L M (publ) | Transient detector and method for supporting encoding of an audio signal |
EP2571024B1 (en) | 2007-08-27 | 2014-10-22 | Telefonaktiebolaget L M Ericsson AB (Publ) | Adaptive transition frequency between noise fill and bandwidth extension |
US8554349B2 (en) | 2007-10-23 | 2013-10-08 | Clarion Co., Ltd. | High-frequency interpolation device and high-frequency interpolation method |
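KR101373004B1 (ko) | 2007-10-30 | 2014-03-26 | Samsung Electronics Co., Ltd. | Apparatus and method for encoding and decoding a high-frequency signal |
JP4733727B2 (ja) | 2007-10-30 | 2011-07-27 | Nippon Telegraph and Telephone Corporation | Speech and musical-sound pseudo-wideband device, speech and musical-sound pseudo-wideband method, program therefor, and recording medium therefor |
JP5404412B2 (ja) | 2007-11-01 | 2014-01-29 | Panasonic Corporation | Encoding device, decoding device, and methods thereof |
BRPI0818927A2 (pt) | 2007-11-02 | 2015-06-16 | Huawei Tech Co Ltd | Method and apparatus for audio decoding |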
US20090132238A1 (en) | 2007-11-02 | 2009-05-21 | Sudhakar B | Efficient method for reusing scale factors to improve the efficiency of an audio encoder |
EP2220646A1 (en) | 2007-11-06 | 2010-08-25 | Nokia Corporation | Audio coding apparatus and method thereof |
JP2009116275A (ja) | 2007-11-09 | 2009-05-28 | Toshiba Corp | Method and apparatus for noise suppression, speech spectrum smoothing, speech feature extraction, speech recognition, and speech model training |
AU2008326957B2 (en) | 2007-11-21 | 2011-06-30 | Lg Electronics Inc. | A method and an apparatus for processing a signal |
US8688441B2 (en) | 2007-11-29 | 2014-04-01 | Motorola Mobility Llc | Method and apparatus to facilitate provision and use of an energy value to determine a spectral envelope shape for out-of-signal bandwidth content |
EP2224432B1 (en) | 2007-12-21 | 2017-03-15 | Panasonic Intellectual Property Corporation of America | Encoder, decoder, and encoding method |
JPWO2009084221A1 (ja) | 2007-12-27 | 2011-05-12 | Panasonic Corporation | Encoding device, decoding device, and methods thereof |
ATE500588T1 (de) | 2008-01-04 | 2011-03-15 | Dolby Sweden Ab | Audio encoder and decoder |
CN101925953B (zh) | 2008-01-25 | 2012-06-20 | Matsushita Electric Industrial Co., Ltd. | Encoding device, decoding device, and methods thereof |
KR101413968B1 (ko) | 2008-01-29 | 2014-07-01 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding an audio signal |
US8433582B2 (en) | 2008-02-01 | 2013-04-30 | Motorola Mobility Llc | Method and apparatus for estimating high-band energy in a bandwidth extension system |
US20090201983A1 (en) | 2008-02-07 | 2009-08-13 | Motorola, Inc. | Method and apparatus for estimating high-band energy in a bandwidth extension system |
CA2716817C (en) | 2008-03-03 | 2014-04-22 | Lg Electronics Inc. | Method and apparatus for processing audio signal |
KR101449434B1 (ko) * | 2008-03-04 | 2014-10-13 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding/decoding multi-channel audio using a plurality of variable-length code tables |
EP3273442B1 (en) | 2008-03-20 | 2021-10-20 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for synthesizing a parameterized representation of an audio signal |
KR20090122142A (ko) | 2008-05-23 | 2009-11-26 | LG Electronics Inc. | Method and apparatus for processing an audio signal |
US8498344B2 (en) | 2008-06-20 | 2013-07-30 | Rambus Inc. | Frequency responsive bus coding |
AU2009267525B2 (en) | 2008-07-11 | 2012-12-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio signal synthesizer and audio signal encoder |
EP4407613A1 (en) | 2008-07-11 | 2024-07-31 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder, audio decoder, methods for encoding and decoding an audio signal, audio stream and computer program |
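JP5203077B2 (ja) | 2008-07-14 | 2013-06-05 | NTT Docomo, Inc. | Speech encoding device and method, speech decoding device and method, and speech bandwidth extension device and method |
JP5419876B2 (ja) | 2008-08-08 | 2014-02-19 | Panasonic Corporation | Spectrum smoothing device, encoding device, decoding device, communication terminal device, base station device, and spectrum smoothing method |
JP2010079275A (ja) | 2008-08-29 | 2010-04-08 | Sony Corp | Frequency band extending device and method, encoding device and method, decoding device and method, and program |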
US8407046B2 (en) | 2008-09-06 | 2013-03-26 | Huawei Technologies Co., Ltd. | Noise-feedback for spectral envelope quantization |
US8352279B2 (en) | 2008-09-06 | 2013-01-08 | Huawei Technologies Co., Ltd. | Efficient temporal envelope coding approach by prediction between low band signal and high band signal |
US8532983B2 (en) | 2008-09-06 | 2013-09-10 | Huawei Technologies Co., Ltd. | Adaptive frequency prediction for encoding or decoding an audio signal |
US8798776B2 (en) | 2008-09-30 | 2014-08-05 | Dolby International Ab | Transcoding of audio metadata |
GB0822537D0 (en) | 2008-12-10 | 2009-01-14 | Skype Ltd | Regeneration of wideband speech |
GB2466201B (en) | 2008-12-10 | 2012-07-11 | Skype Ltd | Regeneration of wideband speech |
CN101770776B (zh) | 2008-12-29 | 2011-06-08 | Huawei Technologies Co., Ltd. | Encoding method and device, decoding method and device, and processing system for transient signals |
EP2380172B1 (en) | 2009-01-16 | 2013-07-24 | Dolby International AB | Cross product enhanced harmonic transposition |
US8457975B2 (en) | 2009-01-28 | 2013-06-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio decoder, audio encoder, methods for decoding and encoding an audio signal and computer program |
JP4945586B2 (ja) | 2009-02-02 | 2012-06-06 | Toshiba Corporation | Signal band extension device |
US8463599B2 (en) | 2009-02-04 | 2013-06-11 | Motorola Mobility Llc | Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder |
CN101853663B (zh) | 2009-03-30 | 2012-05-23 | Huawei Technologies Co., Ltd. | Bit allocation method, encoding device, and decoding device |
EP2239732A1 (en) | 2009-04-09 | 2010-10-13 | Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. | Apparatus and method for generating a synthesis audio signal and for encoding an audio signal |
CO6440537A2 (es) | 2009-04-09 | 2012-05-15 | Fraunhofer Ges Forschung | Apparatus and method for generating a synthesis audio signal and for encoding an audio signal |
JP5223786B2 (ja) | 2009-06-10 | 2013-06-26 | Fujitsu Limited | Voice band extension device, voice band extension method, computer program for voice band extension, and telephone |
US8515768B2 (en) | 2009-08-31 | 2013-08-20 | Apple Inc. | Enhanced audio decoder |
JP5754899B2 (ja) | 2009-10-07 | 2015-07-29 | Sony Corporation | Decoding device and method, and program |
US8600749B2 (en) | 2009-12-08 | 2013-12-03 | At&T Intellectual Property I, L.P. | System and method for training adaptation-specific acoustic models for automatic speech recognition |
US8447617B2 (en) | 2009-12-21 | 2013-05-21 | Mindspeed Technologies, Inc. | Method and system for speech bandwidth extension |
EP2357649B1 (en) | 2010-01-21 | 2012-12-19 | Electronics and Telecommunications Research Institute | Method and apparatus for decoding audio signal |
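TWI529703B (zh) * | 2010-02-11 | 2016-04-11 | Dolby Laboratories Licensing Corporation | System and method for non-destructively normalizing the loudness of audio signals in portable devices |
JP5375683B2 (ja) | 2010-03-10 | 2013-12-25 | Fujitsu Limited | Communication device and power correction method |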
EP2555188B1 (en) | 2010-03-31 | 2014-05-14 | Fujitsu Limited | Bandwidth extension apparatuses and methods |
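JP5850216B2 (ja) | 2010-04-13 | 2016-02-03 | Sony Corporation | Signal processing device and method, encoding device and method, decoding device and method, and program |
JP5652658B2 (ja) | 2010-04-13 | 2015-01-14 | Sony Corporation | Signal processing device and method, encoding device and method, decoding device and method, and program |
JP5609737B2 (ja) | 2010-04-13 | 2014-10-22 | Sony Corporation | Signal processing device and method, encoding device and method, decoding device and method, and program |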
US8793126B2 (en) | 2010-04-14 | 2014-07-29 | Huawei Technologies Co., Ltd. | Time/frequency two dimension post-processing |
US9047875B2 (en) | 2010-07-19 | 2015-06-02 | Futurewei Technologies, Inc. | Spectrum flatness control for bandwidth extension |
PL4016527T3 (pl) | 2010-07-19 | 2023-05-22 | Dolby International Ab | Processing of audio signals during high-frequency reconstruction |
US8560330B2 (en) | 2010-07-19 | 2013-10-15 | Futurewei Technologies, Inc. | Energy envelope perceptual correction for high band coding |
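JP6075743B2 (ja) | 2010-08-03 | 2017-02-08 | Sony Corporation | Signal processing device and method, and program |
JP2012058358A (ja) | 2010-09-07 | 2012-03-22 | Sony Corp | Noise suppression device, noise suppression method, and program |
JP5707842B2 (ja) | 2010-10-15 | 2015-04-30 | Sony Corporation | Encoding device and method, decoding device and method, and program |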
US9230551B2 (en) | 2010-10-18 | 2016-01-05 | Nokia Technologies Oy | Audio encoder or decoder apparatus |
JP5743137B2 (ja) | 2011-01-14 | 2015-07-01 | Sony Corporation | Signal processing device and method, and program |
JP5704397B2 (ja) | 2011-03-31 | 2015-04-22 | Sony Corporation | Encoding device and method, and program |
JP6024077B2 (ja) | 2011-07-01 | 2016-11-09 | Yamaha Corporation | Signal transmitting device and signal processing device |
JP5942358B2 (ja) | 2011-08-24 | 2016-06-29 | Sony Corporation | Encoding device and method, decoding device and method, and program |
JP6037156B2 (ja) | 2011-08-24 | 2016-11-30 | Sony Corporation | Encoding device and method, and program |
JP5975243B2 (ja) | 2011-08-24 | 2016-08-23 | Sony Corporation | Encoding device and method, and program |
JP5845760B2 (ja) | 2011-09-15 | 2016-01-20 | Sony Corporation | Audio processing device and method, and program |
KR101585852B1 (ko) | 2011-09-29 | 2016-01-15 | Dolby International AB | High-quality detection in FM stereo radio signals |
JPWO2013154027A1 (ja) | 2012-04-13 | 2015-12-17 | Sony Corporation | Decoding device and method, audio signal processing device and method, and program |
JP5997592B2 (ja) | 2012-04-27 | 2016-09-28 | NTT Docomo, Inc. | Speech decoding device |
TWI517142B (zh) | 2012-07-02 | 2016-01-11 | Sony Corp | Audio decoding apparatus and method, audio coding apparatus and method, and program |
RU2652468C2 (ru) | 2012-07-02 | 2018-04-26 | Sony Corporation | Decoding device, decoding method, encoding device, encoding method, and program |
WO2014007096A1 (ja) | 2012-07-02 | 2014-01-09 | Sony Corporation | Decoding device and method, encoding device and method, and program |
US10083700B2 (en) | 2012-07-02 | 2018-09-25 | Sony Corporation | Decoding device, decoding method, encoding device, encoding method, and program |
JP2014123011A (ja) | 2012-12-21 | 2014-07-03 | Sony Corp | Noise detection device and method, and program |
- 2014
- 2014-09-05 CN CN201480050373.8A patent/CN105531762B/zh active Active
- 2014-09-05 JP JP2015537641A patent/JP6531649B2/ja active Active
- 2014-09-05 US US14/917,825 patent/US9875746B2/en active Active
- 2014-09-05 WO PCT/JP2014/073465 patent/WO2015041070A1/ja active Application Filing
- 2014-09-05 EP EP14846054.6A patent/EP3048609A4/en not_active Withdrawn
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
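JP2002373000A (ja) * | 2001-06-15 | 2002-12-26 | Nec Corp | Method for converting codes between speech encoding/decoding schemes, device therefor, program therefor, and storage medium |
JP2008261978A (ja) * | 2007-04-11 | 2008-10-30 | Toshiba Microelectronics Corp | Method for automatically adjusting playback volume |
WO2009001874A1 (ja) * | 2007-06-27 | 2008-12-31 | Nec Corporation | Audio encoding method, audio decoding method, audio encoding device, audio decoding device, program, and audio encoding/decoding system |
JP2010212760A (ja) * | 2009-03-06 | 2010-09-24 | Sony Corp | Audio apparatus and audio processing method |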
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021514136A (ja) * | 2018-02-15 | 2021-06-03 | Dolby Laboratories Licensing Corporation | Volume control method and device |
JP7309734B2 (ja) | 2018-02-15 | 2023-07-18 | Dolby Laboratories Licensing Corporation | Volume control method and device |
Also Published As
Publication number | Publication date |
---|---|
CN105531762A (zh) | 2016-04-27 |
CN105531762B (zh) | 2019-10-01 |
EP3048609A4 (en) | 2017-05-03 |
US20160225376A1 (en) | 2016-08-04 |
JP6531649B2 (ja) | 2019-06-19 |
US9875746B2 (en) | 2018-01-23 |
EP3048609A1 (en) | 2016-07-27 |
JPWO2015041070A1 (ja) | 2017-03-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015041070A1 (ja) | | Encoding device and method, decoding device and method, and program |
JP6753499B2 (ja) | | Decoding device and method, and program |
US11563411B2 (en) | | Metadata for loudness and dynamic range control |
US10276173B2 (en) | | Encoded audio extended metadata-based dynamic range control |
JP2012504260A (ja) | | Transcoding of audio metadata |
KR20090122145A (ko) | | Method and apparatus for processing a signal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WWE | WIPO information: entry into national phase | Ref document number: 201480050373.8; Country of ref document: CN |
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14846054; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2015537641; Country of ref document: JP; Kind code of ref document: A |
 | WWE | WIPO information: entry into national phase | Ref document number: 14917825; Country of ref document: US |
 | REEP | Request for entry into the European phase | Ref document number: 2014846054; Country of ref document: EP |
 | WWE | WIPO information: entry into national phase | Ref document number: 2014846054; Country of ref document: EP |
 | NENP | Non-entry into the national phase | Ref country code: DE |