WO1995015032A1 - Methods and apparatuses for compressing and decompressing information, apparatuses for recording/transmitting and receiving compressed information, and recording medium - Google Patents
- Publication number
- WO1995015032A1 (application PCT/JP1994/002005)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- processing block
- input signal
- signal
- block
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/00007—Time or data compression or expansion
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/007—Two-channel systems in which the audio signals are in digital form
Definitions
- The present invention relates to an information compression method and apparatus for bit-compressing a digital audio signal or the like, a compressed information decompression method and apparatus, a compressed information recording/transmitting apparatus for recording or transmitting compressed information, a recording medium on which compressed information is recorded, a compressed information reproducing apparatus for reproducing compressed information from the recording medium, and a compressed information receiving apparatus. In particular, it relates to techniques that change the temporal size of a processing block according to a change in the amplitude of the input signal's waveform on the time axis.
- The applicant has previously proposed a technique in which an input digital audio signal is bit-compressed and recorded in bursts on a recording medium, with a predetermined amount of data as the recording unit.
- Such a technique is proposed in, for example, the specification and drawings of U.S. Patent No. 5,243,588.
- This technique uses a magneto-optical disc as the recording medium and records and reproduces AD (Adaptive Differential) PCM audio data as defined in the so-called CD-I (CD-Interactive) and CD-ROM XA audio formats.
- For example, 32 sectors of the ADPCM audio data together with several linking sectors for interleave processing are taken as the recording unit, and the ADPCM audio data is recorded on the magneto-optical disc in bursts.
- Several modes can be selected with respect to the reproduction time of a normal CD.
- A sampling rate of 18.9 kHz is specified for one of these modes. In the case of level B, for example, the digital audio data is compressed to approximately 1/4, so the playback time (play time) of a disc recorded in the level B mode is four times that of the standard CD format (CD-DA format). The same recording/reproducing time can therefore be obtained with a disc smaller than the standard 12 cm disc, which allows the device to be downsized.
- The rotation speed of the disc in this recording/reproducing device is the same as that of a standard CD.
- Consequently, within a given time, level B compressed data corresponding to four times that time in reproduction is obtained from the disc.
- The same compressed data is therefore read out four times in time units such as sectors or clusters, and only one of the four readings is used for audio playback.
- Specifically, a track jump back to the original track position is performed on every rotation, and the playback operation proceeds with the same track being traced four times in succession.
- This means that normal compressed data need only be obtained at least once in the four duplicate readings, which makes the scheme resistant to errors caused by disturbances and the like.
- IC cards and the like using semiconductor memory are expected to gain storage capacity and fall in price as semiconductor technology advances.
- For now, however, their capacity falls short and they are expensive. It is therefore quite conceivable that content is transferred to an IC card or the like from an inexpensive, large-capacity recording medium such as the above-mentioned magneto-optical disc, and is frequently rewritten.
- For example, a desired song is dubbed to the IC card and replaced with another song when it is no longer needed. By frequently rewriting the contents of the IC card in this way, a variety of songs can be enjoyed outdoors with only a small number of IC cards on hand.
- Transform coding, which uses an orthogonal transform, is one such high-efficiency compression method.
- This technique is particularly effective against the pre-echo that occurs when a signal with a large amplitude change is input.
- Pre-echo here refers to the phenomenon in which, when compression and decompression are performed while a large amplitude change occurs within a block that is the unit of the orthogonal transform (hereinafter referred to as an orthogonal transform block), quantization noise spreads over the whole block and becomes audible ahead of the amplitude rise.
- The orthogonal transform block of one channel may be shortened while that of the other channel is lengthened.
- DISCLOSURE OF THE INVENTION The present invention has been made in view of the above circumstances. Its object is to provide a method capable of determining an orthogonal transform block size that is better adapted to actual, complex input signals, thereby preventing sound quality degradation at low bit rates and improving sound quality at the same bit rate.
- The present invention has been proposed to achieve the above object. The information compression method of the present invention divides each input signal of at least two channels into processing blocks whose length varies adaptively with the input signal of each channel, and compresses the information in units of processing blocks. It is characterized in that the processing blocks of the respective channels at the same time have the same length.
- The information compression apparatus of the present invention comprises block dividing means that, when dividing each input signal of at least two channels into processing blocks, varies the length of the processing block in accordance with the input signal of each channel and makes the processing blocks of the respective channels at the same time the same length, and information compression means that applies predetermined information compression processing to the signal of each processing block.
- In other words, in the information compression method and apparatus of the present invention, the processing block length is the same in all of the at least two channels. Further, the correlation of the signals between at least two channels may be checked, and the processing blocks on the corresponding channels made the same length only when the correlation is determined to be high. This correlation may be determined from the change in the input signal of the processing block concerned and/or of other processing blocks, and/or from power, energy, or peak information.
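The block-length rule described above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the two block lengths, the peak-ratio attack detector, the zero-lag correlation measure, both thresholds, and the choice of the shorter length when the channels are unified.

```python
from math import sqrt

LONG, SHORT = 1024, 128  # hypothetical block lengths in samples


def peak_ratio(block, n_sub=8):
    """Largest rise in peak level between consecutive sub-blocks."""
    size = len(block) // n_sub
    peaks = [max(abs(x) for x in block[i * size:(i + 1) * size]) + 1e-9
             for i in range(n_sub)]
    return max(b / a for a, b in zip(peaks, peaks[1:]))


def wanted_length(block, threshold=4.0):
    """Per-channel choice: a short block when the amplitude rises sharply."""
    return SHORT if peak_ratio(block) > threshold else LONG


def correlation(left, right):
    """Normalized cross-correlation at zero lag, in [-1, 1]."""
    num = sum(l * r for l, r in zip(left, right))
    den = sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0


def choose_lengths(left, right, corr_threshold=0.8):
    """Use one common length only when the channels are highly correlated."""
    l_len, r_len = wanted_length(left), wanted_length(right)
    if abs(correlation(left, right)) >= corr_threshold:
        common = min(l_len, r_len)  # assumption: the shorter block wins
        return common, common
    return l_len, r_len
```

With weakly correlated channels each channel keeps its own choice; with strongly correlated channels both receive the same length, which is the behaviour the text attributes to the invention.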
- The information compression method and apparatus of the present invention also calculate the degree of a predetermined masking effect according to the input signal and use it in determining the processing block length of each channel.
- The degree of this masking effect is calculated from the change in the input signal of the processing block concerned and/or of other processing blocks, and/or from power, energy, or peak information; from the change in the input signal and/or the power, energy, or peak information of processing blocks adjacent to the block concerned; and further from the change in the input signal and/or the power, energy, or peak information of processing blocks at the same time as the block concerned.
- The ratio with which each element contributes to determining the length of the corresponding processing block may be fixed or adapted to the input signal, and may also be made variable according to frequency.
- An orthogonal transform is used to convert the time-axis signal into a plurality of bands on the frequency axis, and the shape of the window function used in the orthogonal transform is changed together with the variable orthogonal transform size.
- The signal is divided into a plurality of bands, a block consisting of a plurality of samples is formed for each of the divided bands, and an orthogonal transform is performed on each block to obtain coefficient data.
- The division frequency widths used when the time-axis signal is divided into a plurality of bands on the frequency axis before the orthogonal transform are broader toward higher frequencies, and are the same in the two continuous bands in the lowest frequency range.
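One band layout that satisfies this description (equal widths in the two lowest bands, wider bands above) is a split in which each band above the second is as wide as everything below it. The sketch below is only an illustration: the function name `band_edges`, the three-band default, and the 44.1 kHz CD-DA sampling rate used in the usage note are assumptions, not values stated in this passage.

```python
def band_edges(sample_rate_hz, n_bands=3):
    """Split 0..fs/2 so the two lowest bands are equal and upper bands double."""
    nyquist = sample_rate_hz / 2
    width = nyquist / 2 ** (n_bands - 1)   # width of each of the two lowest bands
    edges = [0.0, width]
    for _ in range(n_bands - 1):
        edges.append(edges[-1] * 2)        # each upper edge doubles the previous one
    return list(zip(edges, edges[1:]))
```

Calling `band_edges(44100)` yields the bands 0 to 5512.5 Hz, 5512.5 to 11025 Hz, and 11025 to 22050 Hz.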
- the assignment of main information and compression information or sub-information to a signal component in a band substantially equal to or higher than the signal pass band is prohibited.
- The compressed information decompression method of the present invention decompresses information compressed by the information compression method or the information compression device of the present invention.
- When an orthogonal transform was performed at the time of information compression, an inverse orthogonal transform is used to convert the plurality of bands on the frequency axis back into a signal on the time axis: the inverse orthogonal transform is applied to each block of each band, and the outputs of the inverse orthogonal transforms are synthesized to obtain a synthesized signal on the time axis.
- The synthesis frequency widths used in going from the plurality of bands on the frequency axis to the time-axis signal after the inverse orthogonal transform are broader toward higher frequency bands, and are the same in the two continuous bands in the lowest range.
- The compressed information decompression device of the present invention decompresses the compressed information of each channel that was subjected to predetermined compression processing in units of processing blocks whose length varies according to the input signals of at least two channels and is the same for each channel at the same time; it applies to each channel a decompression process corresponding to the predetermined compression process.
- In other words, the compressed information decompression method and device of the present invention decompress compressed information produced by the above-described information compression method or information compression device of the present invention.
- The information compression method and device and the compressed information decompression method and device (high-efficiency coding method and compression or decompression device) of the present invention make the temporal size of the orthogonal transform block of the compression process variable so as to adapt to amplitude changes of the input signal. The temporal size of the orthogonal transform block is determined from the amplitude change of the time-axis signal in the frequency band of the block concerned and/or the energy or power of other frequency bands, and additionally from the energy or power of frequency bands of the other channel, so that signals that are highly correlated between channels are taken into account.
- The compressed information recording/transmission apparatus of the present invention comprises block dividing means that, when dividing each input signal of at least two channels into processing blocks, varies the processing block length adaptively to the input signal of each channel and makes the processing blocks of the channels at the same time the same length, and information compression means that applies predetermined information compression processing to the signal of each processing block.
- Compressed information produced by the information compression method and apparatus of the present invention is recorded on a recording medium or transmitted to a transmission medium.
- On the recording medium of the present invention, the compressed information of each channel is recorded after being subjected to predetermined compression processing in units of processing blocks whose length varies according to the input signals of at least two channels and is the same for each corresponding channel. That is, the medium holds compressed information produced by the information compression method or the information compression device of the present invention.
- The compressed information reproducing apparatus of the present invention decompresses and reproduces the compressed information from a recording medium on which it was recorded by the compressed information recording apparatus of the present invention.
- The compressed information receiving device of the present invention receives, decompresses, and reproduces the compressed information transmitted from the compressed information transmitting device of the present invention.
- When the correlation between the channels is determined to be high to some extent, the time lengths of the orthogonal transform blocks of the channels are made the same. This suppresses sound quality differences between channels and improves sound image localization, so that good sound quality is obtained.
- FIG. 1 is a block circuit diagram showing a specific configuration of a compressed data recording / reproducing apparatus to which the present invention is applied.
- FIG. 2 is a block circuit diagram showing a specific configuration of a high efficiency compression encoding apparatus to which the present invention is applied.
- FIG. 3 is a diagram showing the structure of the orthogonal transform block at the time of bit compression.
- FIG. 4 is a block circuit diagram showing a configuration example of the orthogonal transform block size determination circuit.
- FIG. 5 is a diagram showing a relationship between a change in the temporal length of the orthogonal transform block adjacent in time and a window shape used in the orthogonal transform.
- FIG. 6 is a diagram showing a detailed example of a window shape used at the time of orthogonal transformation.
- FIG. 7 is a diagram for explaining the masking effect of the pre-echo in the block determination circuit.
- FIG. 8 is a diagram for explaining the determination of the orthogonal transform block size in the block determination circuit and the correlation between the channels.
- FIG. 9 is a block circuit diagram showing a specific configuration of the bit allocation calculation circuit.
- FIG. 10 is a diagram illustrating the spectrum of each critical band and a band divided in consideration of block floating.
- FIG. 11 is a diagram showing a masking spectrum.
- FIG. 1 is a block circuit diagram showing a configuration of an embodiment of a compressed data recording / reproducing apparatus to which an information compression method and apparatus and a compressed information expansion method and apparatus according to the present invention are applied.
- a magneto-optical disc 1 which is driven to rotate by a spindle motor 51 is used as a recording medium.
- When data is recorded, a modulated magnetic field corresponding to the recording data is applied by a magnetic head 54 while laser light is irradiated by an optical head 53, that is, so-called magnetic field modulation recording is performed, so that the data is recorded along the recording track of the magneto-optical disc 1.
- When data is reproduced, the recording track of the magneto-optical disc 1 is traced with laser light by the optical head 53 and the data is reproduced magneto-optically.
- The optical head 53 comprises, for example, a laser light source such as a laser diode, optical components such as a collimating lens, an objective lens, a polarizing beam splitter, and a cylindrical lens, and a photodetector having a light receiving section of a predetermined pattern.
- The optical head 53 is provided at a position facing the magnetic head 54 across the magneto-optical disc 1.
- When data is recorded on the magneto-optical disc 1, the magnetic head 54 is driven by a head drive circuit 66 of the recording system, described later, to apply a modulated magnetic field corresponding to the recording data, and the target track of the magneto-optical disc 1 is irradiated with laser light by the optical head 53, so that thermomagnetic recording is performed by the magnetic field modulation method.
- The optical head 53 detects the reflected light of the laser light applied to the target track, and detects a focus error by, for example, the so-called astigmatism method and a tracking error by, for example, the so-called push-pull method.
- When reproducing data from the magneto-optical disc 1, the optical head 53 detects the focus error and tracking error and at the same time detects the difference in the polarization angle (Kerr rotation angle) of the laser light reflected from the target track, and generates a reproduced signal.
- the output of the optical head 53 is supplied to an RF circuit 55.
- The RF circuit 55 extracts the focus error signal and the tracking error signal from the output of the optical head 53 and supplies them to the servo control circuit 56, and binarizes the reproduced signal and supplies it to the playback decoder 71 described later.
- The servo control circuit 56 includes, for example, a focus servo control circuit, a tracking servo control circuit, a spindle motor servo control circuit, and a thread servo control circuit.
- The focus servo control circuit performs focus control of the optical system of the optical head 53 so that the focus error signal becomes zero.
- The tracking servo control circuit performs tracking control of the optical system of the optical head 53 so that the tracking error signal becomes zero.
- The spindle motor servo control circuit controls the spindle motor 51 so as to rotate the magneto-optical disc 1 at a predetermined rotation speed (for example, a constant linear velocity).
- The thread servo control circuit moves the optical head 53 and the magnetic head 54 to the target track position of the magneto-optical disc 1 specified by the system controller 57.
- the servo control circuit 56 that performs such various control operations sends information indicating the operation state of each unit controlled by the servo control circuit 56 to the system controller 57.
- the key input operation unit 58 and the display unit 59 are connected to the system controller 57.
- The system controller 57 controls the recording system and the reproduction system in the operation mode specified by the operation input information from the key input operation unit 58. Based on sector-unit address information reproduced from the recording track of the magneto-optical disc 1 by the so-called header time, subcode Q data, or the like, the system controller 57 also manages the recording position and the reproduction position on the recording track traced by the optical head 53 and the magnetic head 54. In addition, the system controller 57 performs control to display the playback time on the display unit 59 based on the data compression ratio and the reproduction position information on the recording track.
- For this playback time display, the sector-unit address information (absolute time information) reproduced from the recording track of the magneto-optical disc 1 by the header time or subcode Q data is multiplied by the reciprocal of the data compression ratio (for example, 4 in the case of 1/4 compression) to obtain actual time information, which is displayed on the display unit 59. During recording as well, if absolute time information is recorded in advance (preformatted) on the recording track of the magneto-optical disc or the like, the current position can be displayed as an actual recording time by reading this preformatted absolute time information and multiplying it by the reciprocal of the data compression ratio.
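The time conversion described above amounts to one multiplication. The following Python sketch assumes the absolute time is available as a sector address counted at the CD-DA rate of 75 sectors per second; the function name `displayed_time` and the `MM:SS` display format are hypothetical.

```python
from fractions import Fraction

SECTORS_PER_SECOND = 75  # standard CD-DA sector rate


def displayed_time(sector_address, compression_ratio=Fraction(1, 4)):
    """Convert an absolute sector address to actual playing time (MM:SS)."""
    absolute_seconds = Fraction(sector_address, SECTORS_PER_SECOND)
    # Multiply by the reciprocal of the compression ratio (4 for 1/4 compression).
    actual_seconds = absolute_seconds / compression_ratio
    minutes, seconds = divmod(int(actual_seconds), 60)
    return f"{minutes:02d}:{seconds:02d}"
```

For example, sector address 4500 corresponds to one minute of absolute time, so with 1/4 compression the display would read four minutes.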
- The analog audio input signal AIN from the input terminal 60 is supplied to the A/D converter 62 via the low-pass filter 61, and the A/D converter 62 quantizes the analog audio input signal AIN, converting it into, for example, a 16-bit digital audio signal.
- The digital audio signal from the A/D converter 62 is supplied to an ATC (Adaptive Transform Coding) encoder 63.
- a digital audio input signal D IN from an input terminal 67 is supplied to the ATC encoder 63 via a digital input interface circuit 68.
- The ATC encoder 63 performs bit compression (data compression) on the digital audio signal of a predetermined transfer rate obtained by quantizing the analog audio input signal AIN in the A/D converter 62.
- In this embodiment the compression ratio is set to four times; however, the configuration does not depend on this ratio, which can be selected arbitrarily according to the application.
- Writing and reading of data in the memory 64 are controlled by the system controller 57. The memory 64 temporarily stores the compressed digital audio signal supplied from the ATC encoder 63 (hereinafter referred to as ATC audio data) and is used as a buffer memory for recording it on the magneto-optical disc 1 as needed. That is, the data transfer rate of the ATC audio data supplied from the ATC encoder 63 is reduced to 1/4 of the data transfer rate of the standard CD-DA format (75 sectors/sec), that is, to 18.75 sectors/sec, and this ATC audio data is written continuously to the memory 64. As described above, this ATC audio data only needs to be recorded in one sector for every four sectors.
- In practice, recording is performed over continuous sectors, in bursts.
- The overall data transfer rate including the recording pause periods is as low as 18.75 sectors/sec, but the instantaneous data transfer rate within each burst is the standard 75 sectors/sec. Therefore, when the disc rotation speed is the same (constant linear velocity) as in the standard CD-DA format, recording is made with the same recording density and storage pattern as in the CD-DA format.
- The ATC audio data, that is, the recording data, burst-read from the memory 64 at the (instantaneous) transfer rate of 75 sectors/sec is supplied to the encoder 65.
- The unit recorded continuously in one recording operation is a cluster composed of a plurality of sectors (for example, 32 sectors) together with several cluster-linking sectors.
- These cluster-linking sectors are set to be longer than the interleave length in the encoder 65, so that even when the data is interleaved, the data of other clusters is not affected.
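The relationship between the continuous 18.75 sectors/sec input and the 75 sectors/sec bursts can be checked with a little arithmetic. The Python sketch below uses the 32-sector cluster and 1/4 compression ratio given as examples in the text; it ignores the linking sectors, and the function names are hypothetical.

```python
from fractions import Fraction

CD_DA_RATE = Fraction(75)  # sectors/sec, the instantaneous (burst) rate


def average_rate(compression_ratio):
    """Continuous rate at which compressed data arrives in the buffer."""
    return CD_DA_RATE * compression_ratio


def burst_schedule(cluster_sectors=32, compression_ratio=Fraction(1, 4)):
    """Return (time to accumulate one cluster, burst duration, pause), in seconds."""
    fill_time = cluster_sectors / average_rate(compression_ratio)
    burst_time = cluster_sectors / CD_DA_RATE
    return fill_time, burst_time, fill_time - burst_time
```

Under these assumptions the recorder is writing for one quarter of the time and pausing for the remaining three quarters, which is why the overall rate is a quarter of the burst rate.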
- The encoder 65 performs coding processing for error correction (for example, parity addition and interleaving) and EFM coding on the recording data supplied in bursts from the memory 64 as described above.
- The recording data encoded by the encoder 65 is supplied to the magnetic head drive circuit 66.
- The magnetic head drive circuit 66 is connected to the magnetic head 54 and drives it so as to apply a modulated magnetic field corresponding to the recording data to the magneto-optical disc 1.
- The system controller 57 performs the above-described memory control on the memory 64 and controls the recording position so that the recording data burst-read from the memory 64 by this memory control is recorded continuously on the recording track of the magneto-optical disc 1.
- This recording position control is performed by the system controller 57 managing the recording position of the recording data burst-read from the memory 64 and supplying the servo control circuit 56 with a control signal specifying the recording position on the recording track of the magneto-optical disc 1.
- This reproducing system reproduces the recording data recorded continuously on the recording track of the magneto-optical disk 1 by the recording system described above. It includes a decoder 71 that receives the reproduced output obtained by the optical head 53 after it has been binarized by the RF circuit 55.
- In addition, this reproducing system can read not only magneto-optical disks but also read-only optical discs of the same type as a CD (Compact Disc).
- The decoder 71 corresponds to the encoder 65 in the recording system described above. It applies error-correction decoding, EFM decoding, and related processing to the reproduced signal binarized by the RF circuit 55, and reproduces the ATC audio data at a transfer rate of 75 sectors/second, which is faster than the normal transfer rate.
- the reproduction data obtained by the decoder 71 is supplied to the memory 72.
- Writing to and reading from the memory 72 are controlled by the system controller 57. The reproduced data supplied from the decoder 71 at a transfer rate of 75 sectors/second are written into the memory 72 in bursts at that transfer rate.
- The reproduced data written in bursts at 75 sectors/second are read out continuously at the normal transfer rate of 18.75 sectors/second. That is, the system controller 57 performs memory control such that the reproduced data are written into the memory 72 at 75 sectors/second and read from the memory 72 continuously at 18.75 sectors/second.
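The relationship between the two rates above can be checked with a short calculation. The following is a minimal sketch in Python; the function names are hypothetical, and only the two rates (75 and 18.75 sectors/second) come from the text.

```python
BURST_RATE = 75.0        # sectors/second, burst write rate stated in the text
CONTINUOUS_RATE = 18.75  # sectors/second, continuous read rate stated in the text


def burst_duty_cycle(burst_rate: float, continuous_rate: float) -> float:
    """Fraction of time the burst side must be active to keep the buffer level steady."""
    return continuous_rate / burst_rate


def buffer_level_after(seconds_bursting: float, seconds_idle: float,
                       start_level: float = 0.0) -> float:
    """Buffer fill (in sectors) after a burst phase followed by an idle phase.

    During the burst phase the buffer is written and read at the same time;
    during the idle phase it is only read.
    """
    level = start_level + (BURST_RATE - CONTINUOUS_RATE) * seconds_bursting
    level -= CONTINUOUS_RATE * seconds_idle
    return level
```

With these rates the burst side needs to be active only a quarter of the time, which is what leaves room for repositioning the head between bursts.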
- The system controller 57 performs the memory control described above on the memory 72 and controls the playback position so that the reproduced data written into the memory 72 in bursts under this memory control are reproduced continuously from the recording track of the magneto-optical disk 1.
- The playback position is controlled by having the system controller 57 manage the playback position of the reproduced data and supply a control signal specifying the playback position on the recording track of the magneto-optical disk 1 to the servo control circuit 56.
- The reproduced data read out continuously from the memory 72 at a transfer rate of 18.75 sectors/second, that is, the ATC audio data, are supplied to the ATC decoder 73.
- The ATC decoder 73 expands the ATC audio data by a factor of four (bit expansion) to reproduce, for example, 16-bit digital audio data.
- The digital audio data from the ATC decoder 73 are supplied to a D/A converter 74.
- The D/A converter 74 converts the digital audio data supplied from the ATC decoder 73 into an analog signal and forms an analog audio output signal AOUT.
- The analog audio signal AOUT obtained by the D/A converter 74 is output from the output terminal 76 via the low-pass filter 75.
- In this compressed data recording/reproducing device, the ATC audio data from the ATC encoder 63 can also be converted into a predetermined transmission format by a modulator 77 so that the data can be transmitted via an antenna 78.
- The input digital signal is divided into multiple frequency bands such that the two lowest adjacent bands have the same bandwidth and, above them, the bandwidth becomes wider as the frequency becomes higher. An orthogonal transform is performed for each frequency band, and the resulting spectral data on the frequency axis are coded with adaptive bit allocation: in the low band, for each so-called critical band, which reflects the human auditory characteristics described later; in the middle and high bands, for each band obtained by subdividing the critical bands in consideration of block floating efficiency. Usually, this block is the block in which quantization noise is generated.
- FIG. 2 shows a circuit configuration for encoding an input digital signal for one channel. In FIG. 2, for example, when the sampling frequency is 44.1 kHz, audio PCM signals of 0 to 22 kHz for a plurality of channels are supplied to the input terminal 200.
- This input signal is divided by a band division filter 201, composed for example of a filter such as a so-called QMF, into a 0 to 11 kHz band signal and an 11 kHz to 22 kHz band signal. The 0 to 11 kHz band signal is in turn divided by the band division filter 202, likewise composed of a filter such as a QMF, into a 0 to 5.5 kHz band signal and a 5.5 kHz to 11 kHz band signal.
- The signal in the 11 kHz to 22 kHz band from the band division filter 201 is supplied to an MDCT (Modified Discrete Cosine Transform) circuit 203, which is an example of an orthogonal transform circuit.
- The 5.5 kHz to 11 kHz band signal from the band division filter 202 is supplied to the MDCT circuit 204, and the 0 to 5.5 kHz band signal from the band division filter 202 is supplied to the MDCT circuit 205, where each is subjected to MDCT processing.
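The two-stage band division can be illustrated with the simplest possible QMF pair. The sketch below, in Python, uses the 2-tap Haar pair as a stand-in; this filter choice and all function names are assumptions for illustration, since the text does not specify the filters used in the band division filters 201 and 202.

```python
import math

K = 1.0 / math.sqrt(2.0)  # normalisation of the 2-tap Haar QMF pair (assumed filter)


def qmf_split(x):
    """Split an even-length signal into (low, high) half-rate bands."""
    low = [K * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [K * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def qmf_merge(low, high):
    """Inverse of qmf_split: rebuild the full-rate signal from the two bands."""
    x = []
    for l, h in zip(low, high):
        x.append(K * (l + h))
        x.append(K * (l - h))
    return x


def three_band_split(x):
    """First stage: 0-11 kHz / 11-22 kHz. Second stage: 0-5.5 kHz / 5.5-11 kHz."""
    low_half, band_high = qmf_split(x)          # role of filter 201
    band_low, band_mid = qmf_split(low_half)    # role of filter 202
    return band_low, band_mid, band_high


def three_band_merge(band_low, band_mid, band_high):
    """Undo three_band_split."""
    return qmf_merge(qmf_merge(band_low, band_mid), band_high)
```

The two lowest bands end up with the same bandwidth and the highest band is twice as wide, matching the band layout described above. Practical QMF banks use much longer filters for better band separation.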
- A filter such as the above-mentioned QMF is described in R. E. Crochiere, "Digital coding of speech in subbands," Bell Syst. Tech. J., Vol. 55, No. 8, 1976.
- An equal-bandwidth filter division method is described in Joseph H. Rothweiler, "Polyphase Quadrature Filters - A New Subband Coding Technique," ICASSP 83, Boston.
- As the orthogonal transform mentioned above, the input audio signal is, for example, divided into blocks of a predetermined unit time (frame), and a transform such as the fast Fourier transform (FFT), the discrete cosine transform (DCT), or the modified DCT (MDCT) is applied to each block to convert the time axis into the frequency axis.
- FIG. 3 shows a specific example of a block for each band with respect to a standard input signal supplied to each of the circuits 203, 204, and 205.
- The signal divided into three bands has a plurality of orthogonal transform block sizes independently for each band, so that the time resolution can be switched according to the temporal characteristics and frequency distribution of the signal. If the signal is quasi-stationary in time, the orthogonal transform block size is set to the large value of 11.6 ms, that is, the Long Mode shown in FIG. 3A. If the signal is non-stationary, the orthogonal transform block size is further divided into two or four. In the Short Mode shown in FIG. 3B, the whole block is divided into four, giving an orthogonal transform block size of 2.9 ms; in the Middle Mode shown in FIG. 3C, part of the block is divided into two (5.8 ms) and part into four (2.9 ms). With these time resolutions, the coding adapts to actual complex input signals.
- the division of the orthogonal transform block size can be performed more adaptively on the input signal by increasing the number of divisions and the division pattern.
- The orthogonal transform block size is determined by the block size determining circuits 206, 207, and 208 in FIG. 2. The determined orthogonal transform block size is supplied to the MDCT circuits 203, 204, and 205 and is also output from the output terminals 216, 217, and 218 as the block size information of the corresponding block.
- A specific circuit configuration of the block size determination circuit 206 of FIG. 2 is shown in FIG. 4 and described below.
- In the block size determination circuit 206, the signal in the 11 kHz to 22 kHz band is supplied to the power calculation circuit 404 via the input terminal 401 shown in FIG. 4.
- The signal in the 5.5 kHz to 11 kHz band is supplied to the power calculation circuit 405 via the input terminal 402 shown in FIG. 4, and the signal in the 0 to 5.5 kHz band is supplied to the power calculation circuit 406 via the input terminal 403 shown in FIG. 4.
- The block size determination circuits 207 and 208 shown in FIG. 2 operate in the same way as the block size determination circuit 206, except that different signals are input to the input terminals 401, 402, and 403 shown in FIG. 4. In the block size determination circuit 207, the 5.5 kHz to 11 kHz band signal from the band division filter 202 of FIG. 2 is input to the input terminal 401, the 11 kHz to 22 kHz band signal from the band division filter 201 is input to the input terminal 402, and the 0 to 5.5 kHz band signal from the band division filter 202 is input to the input terminal 403.
- In the block size determination circuit 208, the 0 to 5.5 kHz band signal from the band division filter 202 of FIG. 2 is input to the input terminal 401, the 11 kHz to 22 kHz band signal from the band division filter 201 is input to the input terminal 402, and the 5.5 kHz to 11 kHz band signal from the band division filter 202 is input to the input terminal 403.
- The block size determination circuits 206, 207, and 208 are provided for each channel. Alternatively, block size determination circuits 206, 207, and 208 for only one channel may be provided and used to determine the orthogonal transform block sizes for a plurality of channels.
- each of the power calculation circuits 404, 405, and 406 calculates the power of each frequency band by integrating the input time waveform for a certain period of time. At this time, the integration time width must be equal to or smaller than the minimum of the orthogonal transform block sizes described above. In addition to the above calculation method, for example, the absolute value or the average value of the maximum amplitude within the minimum time width of the orthogonal transform block size may be used as the representative power.
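The power calculation described above can be sketched as follows in Python. This is only an illustration: the function names are hypothetical, the integration is approximated by a sum of squared samples, and the window width in samples is left to the caller, subject to the stated constraint that it not exceed the smallest orthogonal transform block size.

```python
def block_power(samples, width):
    """Integrate the squared waveform over consecutive windows of `width` samples."""
    return [
        sum(s * s for s in samples[i:i + width])
        for i in range(0, len(samples) - width + 1, width)
    ]


def block_peak(samples, width):
    """Alternative representative value: maximum absolute amplitude per window."""
    return [
        max(abs(s) for s in samples[i:i + width])
        for i in range(0, len(samples) - width + 1, width)
    ]
```

Either list can serve as the per-band power information that the later stages compare; which one is preferable depends on the implementation.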
- The power information output from the power calculation circuit 404 is supplied to the memory 410, the inter-channel correlation coefficient calculation circuit 411, the change extraction circuit 407, and the power comparison circuit 409, and the power information from the power calculation circuits 405 and 406 is supplied to the power comparison circuit 409. The power calculation circuits 404, 405, and 406 may be provided for each channel so that each channel's circuits calculate the power information of that channel.
- The change extraction circuit 407 differentiates the power information supplied from the power calculation circuit 404 to obtain a differential coefficient and supplies it, as power change information, to the block size primary judgment circuit 412 and the memory 408.
- The memory 408 accumulates the power change information supplied from the change extraction circuit 407 for a period equal to the maximum orthogonal transform block size described above. Because temporally adjacent orthogonal transform blocks influence each other through the windowing applied during the orthogonal transform, the block size primary judgment circuit 412 needs the power change information of the immediately preceding, temporally adjacent block.
- The block size primary judgment circuit 412 determines the orthogonal transform block size of the corresponding frequency band from the temporal change of the power in that band, based on the power change information of the corresponding block supplied from the change extraction circuit 407 and the power change information of the immediately preceding, temporally adjacent block supplied from the memory 408. Specifically, the block size primary judgment circuit 412 selects a temporally shorter orthogonal transform block size when, for example, a displacement equal to or greater than a certain threshold value is recognized. A fixed threshold is effective, but it is more effective to make the threshold proportional to the frequency, so that in the high frequency band a temporally shorter block size is selected only for a large displacement, while in the low frequency band it is selected for a smaller displacement than in the high frequency band.
- the orthogonal transform block size determined as described above is supplied to the block size secondary determination circuit 413.
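The change extraction and primary judgment steps might be sketched as below. This Python sketch makes several assumptions not found in the text: the derivative is replaced by a first difference, only two block sizes ("long" and "short") are distinguished, and the function names and threshold values are hypothetical.

```python
def power_change(powers):
    """First difference of successive power values (discrete stand-in for the derivative)."""
    return [b - a for a, b in zip(powers, powers[1:])]


def primary_block_size(prev_changes, cur_changes, threshold):
    """Pick 'short' when the preceding or current block shows a change at or above the threshold."""
    largest = max(
        (abs(c) for c in list(prev_changes) + list(cur_changes)),
        default=0.0,
    )
    return "short" if largest >= threshold else "long"
```

A frequency-dependent threshold, as the text recommends, could be obtained by passing a different `threshold` for each band.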
- The power comparison circuit 409 compares the power information of each frequency band supplied from the power calculation circuits 404, 405, and 406, both at the same time and over the time width in which the masking effect occurs, determines the influence of the other frequency bands on the frequency band handled by the power calculation circuit 404, and supplies the resulting masking information to the block size secondary judgment circuit 413.
- The block size secondary judgment circuit 413 uses the masking information supplied from the power comparison circuit 409 to correct the orthogonal transform block size supplied from the block size primary judgment circuit 412 toward a temporally longer block size, and supplies the corrected orthogonal transform block size to the block size tertiary judgment circuit 414. That is, the block size secondary judgment circuit 413 modifies the orthogonal transform block size for the corresponding frequency band by taking advantage of the fact that masking by signals in other frequency bands may reduce the effect of pre-echo or make it inaudible.
- Masking is a phenomenon in which a certain signal masks another signal and makes it inaudible due to human auditory characteristics.
- The masking effect includes a time-axis masking effect caused by a signal on the time axis and a simultaneous masking effect caused by a signal on the frequency axis.
- The block size secondary judgment circuit 413 described above utilizes the simultaneous masking effect. Because of these masking effects, any noise in the masked part is inaudible to humans. For this reason, in an actual audio signal, noise within the masked range is regarded as noise that causes no audible problem.
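The secondary judgment can be pictured as a correction applied on top of the primary result. The Python sketch below is a deliberately simplified assumption: masking is declared whenever some other band carries at least `ratio` times the power of the band in question. The text does not give the actual criterion, and both the names and the default ratio are hypothetical.

```python
def is_masked(target_power, other_powers, ratio):
    """Assume masking when another band has at least `ratio` times the target band's power."""
    return any(p >= ratio * target_power for p in other_powers)


def secondary_block_size(primary, target_power, other_powers, ratio=10.0):
    """Lengthen a 'short' decision when pre-echo is expected to be masked by other bands."""
    if primary == "short" and is_masked(target_power, other_powers, ratio):
        return "long"
    return primary
```

The correction only ever moves toward the longer block, which favors frequency resolution when pre-echo is not expected to be audible.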
- The inter-channel correlation coefficient calculation circuit 411 uses the power information of the plurality of channels from the power calculation circuit 404 and the memory 410 to calculate the correlation coefficient between the powers of the plurality of channels.
- The memory 410 is used to supply power information for a plurality of channels at the same time as the corresponding block to the inter-channel correlation coefficient calculation circuit 411. Power information for the plurality of channels is supplied to the memory 410 from the power calculation circuit 404 sequentially in time.
- For example, with left and right channels, the power calculation circuit 404 supplies to the memory 410 the power information for the left channel of the corresponding block, followed by the power information for the right channel of the corresponding block, followed by the power information for the left channel of the next, temporally adjacent block, and then the power information for the right channel of that next block.
- The memory 410 holds the power information of each channel so that the power information of the blocks on each channel having the same time relationship as the corresponding block can be output to the inter-channel correlation coefficient calculation circuit 411. The memory 410 therefore has a storage capacity proportional to the number of channels. For example, if the capacity of the memory 410 for two channels is C, the capacity Cn of the memory 410 for n channels is given by equation (1).
- The inter-channel correlation coefficient calculation circuit 411 uses the power information of one or more channels stored in the memory 410, together with the power information for the one channel not stored in the memory 410, which is input from the power calculation circuit 404, to calculate the correlation coefficient of the power information of the blocks on the multiple channels having the same time relationship as the corresponding block. For example, when the number of channels is two, the correlation coefficient r is defined by equation (2).
- The value of the correlation coefficient r is in the range −1 ≤ r ≤ +1. If the correlation between Xi and Yi is high, the value is close to +1; if the correlation is low, the value is close to −1. In equation (2), b is an integer that determines the number of blocks to be summed, that is, the time range. A fixed value is effective, but it is more effective to vary it with the frequency so that the difference between b and n is large in the low frequency band and small in the high frequency band.
- Ax and Ay are the average values of the power information included in the range from b to n.
- The correlation coefficients are calculated for all possible pairs of channels, and their average value is used as the output of the inter-channel correlation coefficient calculation circuit 411.
- The number of all possible pairs is N(N−1)/2, where N is the number of channels.
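Equation (2) is not legible in this text. The Python sketch below therefore assumes the standard sample correlation coefficient, which is consistent with the symbols described above (per-block power values Xi and Yi and their averages Ax and Ay over the summed range); the function names are hypothetical.

```python
import math
from itertools import combinations


def correlation(x, y):
    """Sample correlation coefficient of two equal-length power sequences (assumed form)."""
    n = len(x)
    ax = sum(x) / n  # average of X over the summed range
    ay = sum(y) / n  # average of Y over the summed range
    num = sum((xi - ax) * (yi - ay) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - ax) ** 2 for xi in x) * sum((yi - ay) ** 2 for yi in y))
    return num / den if den else 0.0  # constant sequences are treated as uncorrelated


def mean_pairwise_correlation(channels):
    """Average the coefficient over all N(N-1)/2 channel pairs."""
    pairs = list(combinations(channels, 2))
    return sum(correlation(a, b) for a, b in pairs) / len(pairs)
```

For two channels, `mean_pairwise_correlation` reduces to the single coefficient r.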
- The block size tertiary judgment circuit 414 reviews the orthogonal transform block size determined by the block size secondary judgment circuit 413, based on the correlation coefficient r obtained by the inter-channel correlation coefficient calculation circuit 411, the masking information obtained by the power comparison circuit 409, and the power information of the immediately preceding, temporally adjacent orthogonal transform block held in the memory 408, and finally determines the orthogonal transform block size of the corresponding block.
- The correlation coefficient r sent from the inter-channel correlation coefficient calculation circuit 411 takes a value from −1 to +1 as described above, and a value closer to +1 indicates a higher correlation between channels. The block size tertiary judgment circuit 414 is therefore provided with threshold values. When a correlation coefficient exceeding its threshold is input, the simultaneous masking effect can be expected, and the power information from the memory 408 exceeds its threshold, all the orthogonal transform block sizes of the plurality of channels having the same time relationship are made longer, for example 11.6 ms, the same size as the Long Mode shown in FIG. 3A.
- In other cases, the block size tertiary judgment circuit 414 makes all the orthogonal transform block sizes of the plurality of channels having the same time relationship shorter, for example the same size as the Short Mode shown in FIG. 3B. Each of the above threshold values is effective even when fixed, but it is more effective to make it variable according to the frequency.
- the value of the power information of each channel may be compared instead of obtaining the correlation coefficient. For example, if the number of channels is two, the absolute value of the difference between each piece of information is determined. In the case of three or more channels, the absolute value of the difference is calculated for every possible pair, and the average value is calculated. Then, this value is supplied to the block size tertiary judgment circuit 414.
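The alternative measure described above, the absolute power difference averaged over all channel pairs, can be sketched in a few lines of Python. The function name is hypothetical, and each channel is represented here by a single power value for the block in question.

```python
from itertools import combinations


def mean_abs_power_difference(powers):
    """Average of |Pi - Pj| over every pair of channels; for two channels, just |P1 - P2|."""
    pairs = list(combinations(powers, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)
```

This avoids the square root and the division by the variance that a correlation coefficient needs, at the cost of being sensitive to the overall signal level.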
- In this case, the block size tertiary determination circuit 414 determines the corresponding orthogonal transform block size based on the difference value of the power information obtained by the inter-channel correlation coefficient calculation circuit 411, the masking information obtained by the power comparison circuit 409, and the power information of the temporally immediately preceding block stored in the memory 408.
- The block size tertiary determination circuit 414 makes all the orthogonal transform block sizes of a plurality of channels having the same time relationship longer, for example the same size as the Long Mode shown in FIG. 3A.
- When the difference value of the power information is lower than a certain threshold value, the simultaneous masking effect cannot be sufficiently expected, and the power information of the immediately preceding orthogonal transform block exists, the block size tertiary determination circuit 414 makes all the orthogonal transform block sizes of a plurality of channels having the same time relationship shorter, for example the same size as the Short Mode shown in FIG. 3B.
- Although each of the above threshold values is effective even when fixed, it is more effective to make them variable according to the frequency.
- The orthogonal transform block size BS determined by the block size tertiary determination circuit 414 is output to the MDCT circuit 203 shown in FIG. 2 via the output terminal 416 and is also supplied to the window shape determination circuit 415.
- The window shape determination circuit 415 determines the window shape based on the orthogonal transform block size BS.
- Fig. 5 shows the state of adjacent windows and window shapes.
- The window used for the orthogonal transform has a portion that overlaps between temporally adjacent blocks, and this embodiment employs a shape that overlaps up to the center of the adjacent block. The window shape therefore changes depending on the orthogonal transform block size of the adjacent block.
- FIG. 6 shows the details of the window shape.
- The window functions f(n) and s(n + N) satisfy the following equations (3) and (4).
- f(n) × f(L − 1 − n) = s(n) × s(L − 1 − n)
- Here, L is the orthogonal transform block size when the adjacent orthogonal transform block sizes are the same. When the adjacent orthogonal transform block sizes differ, with the temporally shorter block size denoted L and the temporally longer block size denoted K, the following equations are used in the region where the windows do not overlap.
- The shape of the window used for the orthogonal transform is determined after the sizes of three temporally consecutive orthogonal transform blocks have been fixed.
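The window conditions are not fully legible in this text. The Python sketch below assumes two conditions commonly required of overlapping MDCT windows: the symmetry relation f(n)·f(L−1−n) = s(n)·s(L−1−n) and power complementarity f(n)² + s(n)² = 1. It then checks that a sine window pair satisfies both. The choice of the sine window and the function names are assumptions, not the window specified by the patent.

```python
import math


def sine_windows(L):
    """Rising half f(n) and falling half s(n) of a sine window over an overlap of L samples."""
    f = [math.sin(math.pi * (n + 0.5) / (2 * L)) for n in range(L)]
    s = [math.cos(math.pi * (n + 0.5) / (2 * L)) for n in range(L)]
    return f, s


def satisfies_conditions(f, s, tol=1e-12):
    """Check the assumed symmetry and power-complementarity conditions."""
    L = len(f)
    symmetric = all(
        abs(f[n] * f[L - 1 - n] - s[n] * s[L - 1 - n]) < tol for n in range(L)
    )
    complementary = all(abs(f[n] ** 2 + s[n] ** 2 - 1.0) < tol for n in range(L))
    return symmetric and complementary
```

When adjacent blocks have different sizes, the overlap region is governed by the shorter block, which is why the window shape cannot be fixed until the neighbouring block sizes are known.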
- The block size determination circuits 206, 207, and 208 shown in FIG. 2 may be configured with the power calculation circuits 405 and 406 and the power comparison circuit 409 shown in FIG. 4 omitted.
- The block size determination circuits 206, 207, and 208 may also be configured with the block size secondary determination circuit 413 and/or the block size tertiary determination circuit 414 shown in FIG. 4 omitted.
- Such configurations are effective in that a configuration with a small delay can be adopted.
- When the threshold value is set low, the time lengths of all the processing blocks at the same time can be made the same. This is particularly effective for input signals with high correlation between channels.
- Next, the operation of the block size primary determination circuit 412, the block size secondary determination circuit 413, the block size tertiary determination circuit 414, and the like will be described.
- Assume that the signal in each band is a sine wave, and that the level (amplitude) of the signal in the 11 kHz to 22 kHz band of the input signal shown in FIG. 7A is the same as that of the signal in the 11 kHz to 22 kHz band of the input signal shown in FIG. 7B.
- If the orthogonal transform block size of the corresponding block N were determined only by the amplitude change of the corresponding frequency band, the same orthogonal transform block size would be determined for the input signal shown in FIG. 7A and the input signal shown in FIG. 7B.
- However, focusing on the signals in the 0 to 5.5 kHz and 5.5 kHz to 11 kHz bands, in the input signal shown in FIG. 7A the power of the signals in these other bands is lower than the power (energy) of the signal in the 11 kHz to 22 kHz band, so the pre-echo generated in the 11 kHz to 22 kHz band is not masked. In this embodiment, therefore, for the input signal shown in FIG. 7A, block N in the 11 kHz to 22 kHz band is given an orthogonal transform block size with a shorter time width than for the input signal shown in FIG. 7B.
- In the input signal shown in FIG. 7B, by contrast, the power of the signal in the 0 to 5.5 kHz or 5.5 kHz to 11 kHz band is large enough, compared with the power of the signal in the 11 kHz to 22 kHz band, to mask the pre-echo. The pre-echo generated in the 11 kHz to 22 kHz band is therefore masked and is unlikely to cause audible problems. In this embodiment, frequency resolution is given priority for the input signal shown in FIG. 7B, and an orthogonal transform block size with a longer time width than for the input signal shown in FIG. 7A is determined.
- In this way, through the operation of the power calculation circuits 404, 405, and 406, the power comparison circuit 409, and the block size secondary judgment circuit 413 shown in FIG. 4, different orthogonal transform block sizes are determined for the input signal shown in FIG. 7A and the input signal shown in FIG. 7B.
- Next, assume that a signal in a certain band, for example the 11 kHz to 22 kHz band, is a sine wave, and that input signals whose levels rise at slightly different times are input.
- For example, the input signal shown in FIG. 8A is the left channel signal and the input signal shown in FIG. 8B is the right channel signal of a two-channel stereo signal.
- Such a slight phase difference is often found in music signals actually recorded in stereo.
- If the orthogonal transform block size of the corresponding block N is determined only by the amplitude change of the signal, an orthogonal transform block size with a shorter time width is determined for the input signal shown in FIG. 8A, and an orthogonal transform block size with a longer time width is determined for the input signal shown in FIG. 8B.
- This is because, for the absolute value of the difference between the maximum amplitude values in block N−1 and block N, that is, Da in FIG. 8A and Db in FIG. 8B, the relationship Da > T > Db holds with respect to the threshold T, albeit by a small margin.
- In this embodiment, when the simultaneous masking effect and/or the time-axis masking effect can be obtained, an orthogonal transform block size with a longer time width is determined for each channel; when they cannot be obtained, an orthogonal transform block size with a shorter time width is determined for each channel.
- In this way, the block size tertiary decision circuit 414 can make the orthogonal transform block sizes of the channels equal for an input signal with high correlation between channels such as that shown in FIG. 8. It is also effective to give the processing blocks on at least two of the channels the same time length.
- The spectral data on the frequency axis, or MDCT coefficient data, obtained by the MDCT processing in the MDCT circuits 203, 204, and 205 are grouped into so-called critical bands in the low band and, in the middle and high bands, into bands obtained by subdividing the critical bands in consideration of the effectiveness of block floating, and are supplied to the adaptive bit allocation coding circuits 210, 211, and 212.
- The critical band is a frequency band divided in consideration of human auditory characteristics: it is the bandwidth of narrow-band noise when a pure tone is masked by narrow-band noise of the same intensity in the vicinity of the frequency of that tone. The higher the frequency, the wider the critical band, and the entire frequency band from 0 to 22 kHz is divided into, for example, 25 critical bands.
- The bit allocation calculation circuit 209 calculates, from the spectral data divided in consideration of the critical bands and block floating, the masking amount for each divided band, taking the so-called masking effect and the like into account. Based on this masking amount and on the energy or peak value of each divided band, it determines the number of allocated bits for each band. The adaptive bit allocation coding circuits 210, 211, and 212 then requantize each spectral data (or MDCT coefficient data) according to the number of bits allocated to each band. The data encoded in this manner are taken out via the output terminals 213, 214, and 215.
- FIG. 9 is a block circuit diagram showing a configuration of a specific example of the bit allocation calculation circuit 209.
- The spectral data on the frequency axis or MDCT coefficient data from each of the MDCT circuits 203, 204, and 205 are sent to the energy calculation circuit 901 for each band via the input terminal 900.
- The energy calculation circuit 901 for each band obtains the energy of each divided band, defined in consideration of the masking amount, the critical bands, and block floating, for example by calculating the sum of the amplitude values in the band. Instead of the energy for each band, a peak value or an average value of the amplitude values may be used.
- the spectrum of the total value of each band is shown as SB in FIG.
- The number of divided bands defined in consideration of the masking amount, the critical bands, and block floating is represented as 12 bands (B1 to B12).
- The convolution filter circuit 902 is composed of a plurality of delay elements for sequentially delaying input data, a plurality of multipliers (for example, 25 multipliers corresponding to the bands) for multiplying the outputs of these delay elements by filter coefficients (weighting functions), and a sum adder that calculates the sum of the multiplier outputs.
- the coefficient of the multiplier M corresponding to an arbitrary band is represented by
- the coefficient 0.15 is applied to the multiplier M−1
- the coefficient 0.00019 is applied to the multiplier M−2
- the coefficient 0.0000086 is applied to the multiplier M−3.
- M is any integer from 1 to 25.
- the output of the convolution filter circuit 902 is sent to a subtractor 905.
- The subtractor 905 obtains a level α corresponding to the allowable noise level, described later, in the convolved region.
- The level α corresponding to the allowable noise level is the level that yields the allowable noise level for each critical band after inverse convolution processing is performed, as described later.
- An allowance function (a function expressing the masking level) for obtaining the level α is supplied to the subtractor 905.
- The level α is controlled by increasing or decreasing the allowance function.
- The allowance function is supplied from the (n − ai) function generating circuit 904 described below.
- The level α corresponding to the allowable noise level can be obtained by the following equation (7), where i is the number assigned sequentially to the critical bands starting from the lowest band.
- α = S − (n − ai)   (7)
- Here, n and a are constants with a > 0,
- S is the intensity of the convolution-processed Bark spectrum, and
- (n − ai) in equation (7) is the allowance function.
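Equation (7) can be evaluated band by band as in the following Python sketch. The function name is hypothetical, and the values used for n and a in any call are placeholders chosen by the caller; the text only requires that they be constants with a > 0.

```python
def allowable_level(s, n, a):
    """Evaluate alpha_i = S_i - (n - a*i) for bands numbered i = 1, 2, ... from the lowest band.

    `s` holds the convolved Bark spectrum intensity per band, and `n` and `a`
    are the constants of the allowance function (a > 0).
    """
    if a <= 0:
        raise ValueError("a must be positive")
    return [s_i - (n - a * i) for i, s_i in enumerate(s, start=1)]
```

Because the allowance function (n − ai) shrinks as i grows, the allowed level sits closer to the convolved spectrum in higher bands.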
- In this way the level α is obtained, and this data is supplied to the subtractor 905.
- The level α is a level in the convolved region. A masking spectrum is therefore obtained from the level α by performing inverse convolution processing, and this masking spectrum becomes the allowable noise spectrum.
- Although the inverse convolution processing described above requires a complicated operation, in the present embodiment the inverse convolution is performed in simplified form using the subtractor 905.
- The masking spectrum is supplied to a subtracter 907 via a synthesis circuit 906.
- The output from the energy calculation circuit 901 for each band, that is, the spectrum SB described above, is supplied to the subtracter 907 via the delay circuit 908. The subtracter 907 therefore performs a subtraction between the masking spectrum and the spectrum SB, so that, as shown in the figure, the portion of the spectrum SB below the level indicated by the masking spectrum MS is masked.
- The output from the subtracter 907 is taken out via the allowable noise correction circuit 911 and the output terminal 912,
- and is sent to a ROM or the like (not shown) in which information on the number of allocated bits is stored in advance.
- According to the output obtained from the subtracter 907 via the allowable noise correction circuit 911 (the level of the difference between the energy of each band and the output of the noise level setting means), the ROM or the like outputs information on the number of bits allocated to each band.
- The information on the number of allocated bits is sent to the adaptive bit allocation coding circuits 210, 211, and 212, so that each spectrum data on the frequency axis from the MDCT circuits 203, 204, and 205 is quantized with the number of bits allocated to each band.
- In other words, the adaptive bit allocation coding circuits 210, 211, and 212 take into account the above-described masking amount, with each band divided in consideration of the critical bands and block floating, and quantize the spectrum data of each band with the number of bits allocated according to the level of the difference between the energy of that band and the output of the noise level setting means.
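- The mapping from level difference to bit count can be sketched as follows. The description uses a ROM table whose contents are not given here, so this example substitutes the common rule of thumb of roughly 6.02 dB of signal-to-noise ratio per bit; the function name `bits_per_band`, the 16-bit cap, and the sample values are all assumptions.

```python
import math


def bits_per_band(band_energy_db, allowed_noise_db, max_bits=16):
    """Pick a word length per band from (band energy - allowable noise).

    Stand-in for the ROM lookup: about 6.02 dB per bit, which is an
    assumption of this sketch and not taken from the description.
    """
    bits = []
    for energy, noise in zip(band_energy_db, allowed_noise_db):
        diff = energy - noise
        if diff <= 0:
            # The band lies at or below the allowable noise level: no bits needed.
            bits.append(0)
        else:
            bits.append(min(max_bits, math.ceil(diff / 6.02)))
    return bits


# Hypothetical band energies and allowable noise levels (dB).
example_bits = bits_per_band([70.0, 40.0, 30.0], [40.0, 35.0, 32.0])
print(example_bits)  # [5, 1, 0]
```

A table lookup, as in the description, would allow a non-linear mapping; the linear rule here only shows the direction of the dependency.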
- The delay circuit 908 is provided to delay the spectrum SB from the energy calculation circuit 901 for each band in consideration of the amount of delay in each circuit before the synthesis circuit 906.
- In the synthesis circuit 906, data indicating the so-called minimum audible curve RC, which is a human auditory characteristic supplied from the minimum audible curve generation circuit 909 as shown in the figure,
- and the masking spectrum MS can be synthesized. If the absolute noise level is below this minimum audible curve, the noise will not be heard.
- This minimum audible curve differs depending on, for example, the playback volume, even if the coding is the same. In a realistic digital system, however, there is not much difference in how music fits into, for example, a 16-bit dynamic range, so if the quantization noise in the most audible frequency band around 4 kHz is not audible, quantization noise below the level of this minimum audible curve in the other frequency bands is considered inaudible as well.
- When the allowable noise level is obtained by synthesizing the minimum audible curve RC and the masking spectrum MS together, the allowable noise level can extend up to the shaded portion in the figure. In this embodiment, the 4 kHz level of the minimum audible curve is matched to the lowest level corresponding to, for example, 20 bits.
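- One way to picture the synthesis of the two curves is to take, in each band, the larger of the masking level and the minimum audible level. The sketch below makes that assumption explicitly; the description does not state the exact synthesis rule. The minimum audible curve is approximated with Terhardt's well-known threshold-in-quiet formula, which is a substitute for the unspecified output of circuit 909, and the names `threshold_in_quiet_db` and `allowable_noise` are hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Terhardt's approximation of the threshold in quiet (dB SPL).

    Used here as a stand-in for the minimum audible curve RC.
    """
    f = freq_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def allowable_noise(masking_db, band_centers_hz, offset_db=0.0):
    """Combine masking spectrum and minimum audible curve per band.

    Assumption: "synthesis" is modelled as the per-band maximum.
    offset_db shifts the curve, e.g. to align its 4 kHz level with the
    lowest level of the system word length.
    """
    return [max(m, threshold_in_quiet_db(f) + offset_db)
            for m, f in zip(masking_db, band_centers_hz)]


# Hypothetical masking levels (dB) at two band centre frequencies (Hz).
levels = allowable_noise([10.0, -20.0], [1000.0, 3300.0])
```

In the first band the masking level dominates; in the second the minimum audible curve is higher, so noise may rise up to that curve without being heard.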
- FIG. 12 also shows the signal spectrum SS.
- The allowable noise correction circuit 911 corrects the allowable noise level in the output from the subtracter 907 based on, for example, information on the equal loudness curve sent from the correction information output circuit 910.
- The equal loudness curve is a characteristic curve relating to human auditory characteristics. It is obtained, for example, by finding the sound pressure at each frequency that sounds as loud as a pure tone of 1 kHz and connecting these points with a curve. It is also called the equal-sensitivity curve of loudness. This equal loudness curve is similar to the minimum audible curve RC shown in FIG. 12.
- In the equal loudness curve, near 4 kHz, for example, a sound is heard as loud as at 1 kHz even when the sound pressure is 8 to 10 dB lower than at 1 kHz. Conversely, near 50 Hz, a sound is not heard as equally loud unless the sound pressure is about 15 dB higher than at 1 kHz. It is therefore clear that noise exceeding the level of the minimum audible curve (the allowable noise level) should have a frequency characteristic given by a curve corresponding to the equal loudness curve. For this reason, correcting the allowable noise level in consideration of the equal loudness curve suits human auditory characteristics.
- Here, the correction information output circuit 910 may correct the allowable noise level based on information on the error between the detected output information amount (data amount) at the time of quantization by the adaptive bit allocation coding circuits 210, 211, and 212
- and the target bit rate of the final encoded data. This is because the total number of bits obtained by performing a provisional adaptive bit allocation for all bit allocation unit blocks in advance may have an error with respect to the fixed number of bits (target value) determined by the bit rate of the final encoded output data, in which case the bits are allocated again so that the error becomes zero.
- That is, when the total number of allocated bits is smaller than the target value, the difference bits are distributed among the unit blocks and added;
- when the total number of allocated bits is larger than the target value, the difference bits are distributed among the unit blocks so as to reduce the number of bits.
- To do this, the correction information output circuit 910 detects the error of the total number of allocated bits from the target value and outputs correction data for correcting each allocated bit number according to the error data.
- When the error data indicates a bit shortage, a large number of bits is being used per unit block, so the data amount can be considered larger than the target value.
- When the error data indicates a bit surplus, the number of bits per unit block is small, so the data amount can be considered smaller than the target value.
- Accordingly, the correction information output circuit 910 outputs, according to this error data, correction value data for correcting the allowable noise level in the output from the subtracter 907 based on, for example, the information data of the equal loudness curve.
- The correction value is supplied to the allowable noise correction circuit 911, so that the allowable noise level from the subtracter 907 is corrected.
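- The reallocation step that drives the error to zero can be sketched directly on the bit counts. This is a simplified stand-in: the embodiment corrects the allowable noise level, whereas the sketch below just spreads the difference bits over the unit blocks one at a time. The name `fit_to_target`, the round-robin order, and the 16-bit cap are assumptions.

```python
def fit_to_target(provisional_bits, target_total, max_bits=16):
    """Spread the bit error over unit blocks until the total hits the target.

    provisional_bits: bits per unit block from the provisional allocation.
    target_total: fixed bit budget set by the output bit rate.
    Blocks are visited round-robin; a block that cannot change (already at
    0 or at max_bits) is skipped. If no block can change, the loop stops.
    """
    bits = list(provisional_bits)
    error = target_total - sum(bits)  # > 0: spare bits, < 0: over budget
    step = 1 if error > 0 else -1
    idx = 0
    stalled = 0
    while error != 0 and stalled < len(bits):
        candidate = bits[idx] + step
        if 0 <= candidate <= max_bits:
            bits[idx] = candidate
            error -= step
            stalled = 0
        else:
            stalled += 1
        idx = (idx + 1) % len(bits)
    return bits


# Hypothetical provisional allocation with a total of 10 bits.
over_budget = fit_to_target([5, 3, 0, 2], target_total=8)
under_budget = fit_to_target([5, 3, 0, 2], target_total=13)
print(over_budget, under_budget)  # [4, 2, 0, 2] [6, 4, 1, 2]
```

A perceptually weighted order, for example one guided by the equal loudness curve, would replace the plain round-robin used here.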
- In the system described above, data obtained by processing the orthogonal transform output spectrum with the sub-information is obtained as the main information, and a scale factor indicating the state of block floating and a word length are obtained as the sub-information. These are sent from the encoder to the decoder.
- FIG. 13 shows a specific configuration of the decoder 73 shown in FIG. 1, that is, the decoding circuit for decoding again the signal that has been encoded with high efficiency as described above.
- The quantized MDCT coefficients, that is, the data equivalent to the output signals of the output terminals 213, 214, and 215 in FIG. 2, are supplied to the input terminals 300, 302, and 304.
- The processing block size information that was used, that is, data equivalent to the output signals of the output terminals 216, 217, and 218 in FIG. 2, is supplied to the input terminals 301, 303, and 305, and is supplied to the decoding circuits 306, 307, and 308.
- The decoding circuits 306, 307, and 308 release the bit allocation using the adaptive bit allocation information.
- The IMDCT circuits 309, 310, and 311 then convert the signals on the frequency axis into signals on the time axis.
- These partial-band signals on the time axis are passed to the IQMF circuit 312,
- decoded into a full-band signal, and output to the D/A converter 74 shown in FIG. 1 via the output terminal 314.
- The present invention is not limited to the above-described embodiment.
- For example, the one recording/reproducing medium and the other recording/reproducing medium need not be integrated; they may instead be connected by a data transfer line.
- the present invention can be applied not only to audio PCM signals but also to signal processing devices for digital audio (speech) signals, digital video signals, and the like.
- A configuration in which the above-described minimum audible curve synthesizing process is not performed may also be adopted. In this case, the minimum audible curve generating circuit 909 and the synthesizing circuit 906 in FIG. 9 are not required, and the output from the subtractor 905 is supplied directly to the subtracter 907.
- There are also various bit allocation methods: most simply, fixed bit allocation, simple bit allocation according to the energy of each band of the signal, or bit allocation combining a fixed component and a variable component can be used.
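- The combination of a fixed component and a variable component mentioned above could look like the following sketch. The split ratio, the fixed pattern, and the use of linear (non-negative) band energies for the variable part are assumptions; `combined_allocation` is a hypothetical name, and the result is left as fractional bits without rounding.

```python
def combined_allocation(band_energy, fixed_pattern, total_bits, fixed_ratio=0.5):
    """Split a bit budget into a fixed part and an energy-dependent part.

    band_energy: non-negative linear energy per band (variable component).
    fixed_pattern: non-negative weights per band (fixed component).
    fixed_ratio: share of total_bits given to the fixed pattern (0..1).
    Returns fractional bits per band; rounding is left to the caller.
    """
    if len(band_energy) != len(fixed_pattern):
        raise ValueError("band_energy and fixed_pattern must have equal length")
    if not 0.0 <= fixed_ratio <= 1.0:
        raise ValueError("fixed_ratio must lie in [0, 1]")
    fixed_budget = total_bits * fixed_ratio
    variable_budget = total_bits - fixed_budget
    fixed_sum = sum(fixed_pattern)
    energy_sum = sum(band_energy)
    allocation = []
    for energy, weight in zip(band_energy, fixed_pattern):
        fixed_part = fixed_budget * weight / fixed_sum if fixed_sum else 0.0
        variable_part = variable_budget * energy / energy_sum if energy_sum else 0.0
        allocation.append(fixed_part + variable_part)
    return allocation


# Hypothetical three-band example with a 40-bit budget.
mixed = combined_allocation([8.0, 1.0, 1.0], [1.0, 1.0, 2.0], total_bits=40)
```

Setting `fixed_ratio` to 1.0 gives pure fixed allocation, and 0.0 gives allocation by band energy alone, which are the two simpler methods named in the text.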
- According to the present invention, the temporal size and the window shape of the orthogonal transform block are changed in response to a sudden change in the amplitude of the input signal.
- In addition, the time lengths of the orthogonal transform blocks of the channels are set to be the same, which suppresses differences in sound quality between the channels.
- As a result, the sense of sound image localization is improved and good sound quality is obtained. Better sound quality can thus be obtained at the same bit rate, and the same sound quality can be obtained at a lower bit rate.
- That is, the present invention provides a method for deciding the temporal length of a processing block that is also desirable from the viewpoint of hearing when compressing an information signal that fluctuates over time, so that high-efficiency compression and decompression with high sound quality can be performed.
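- The idea of choosing one block length for all channels can be sketched as below. The transient test (peak ratio between consecutive sub-blocks), the threshold, and the block lengths 512 and 128 are assumptions for illustration and are not the decision rule of the embodiment; `needs_short_block` and `common_block_length` are hypothetical names.

```python
def needs_short_block(samples, sub_blocks=4, ratio_threshold=4.0):
    """Flag a sudden amplitude rise inside one block of samples.

    The block is cut into sub_blocks pieces; a rise is flagged when the
    peak of one piece exceeds ratio_threshold times the previous peak.
    """
    size = len(samples) // sub_blocks
    if size == 0:
        return False
    peaks = [max(abs(x) for x in samples[k * size:(k + 1) * size])
             for k in range(sub_blocks)]
    for prev, cur in zip(peaks, peaks[1:]):
        # The small floor avoids a zero reference after digital silence.
        if cur > ratio_threshold * max(prev, 1e-9):
            return True
    return False


def common_block_length(channels, long_len=512, short_len=128):
    """Use the short block on every channel if any channel has a transient."""
    if any(needs_short_block(ch) for ch in channels):
        return short_len
    return long_len


# Hypothetical stereo block: an attack in the left channel only.
left = [0.01] * 256 + [1.0] * 256
right = [0.5] * 512
chosen = common_block_length([left, right])
print(chosen)  # 128
```

Because the decision is shared, both channels switch together, which is the property the text credits with avoiding sound quality differences between channels.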
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP95901611A EP0691751B1 (en) | 1993-11-29 | 1994-11-29 | Method and device for compressing information, and device for recording/transmitting compressed information |
US08/491,973 US5717670A (en) | 1993-11-29 | 1994-11-29 | Information compacting method and apparatus, compacted information expanding method and apparatus, compacted information recording/transmitting apparatus, compacted information receiving apparatus and recording medium |
KR1019950703186A KR100339325B1 (ko) | 1993-11-29 | 1994-11-29 | 신호처리방법,정보압축용장치,압축정보신장장치,압축정보기록/전송장치 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP5/298304 | 1993-11-29 | ||
JP29830493A JP3175446B2 (ja) | 1993-11-29 | 1993-11-29 | 情報圧縮方法及び装置、圧縮情報伸張方法及び装置、圧縮情報記録/伝送装置、圧縮情報再生装置、圧縮情報受信装置、並びに記録媒体 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1995015032A1 (fr) | 1995-06-01 |
Family
ID=17857917
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP1994/002005 WO1995015032A1 (fr) | 1993-11-29 | 1994-11-29 | Procedes et appareils de compression et decompression d'informations, appareils d'enregistrement/emission et de reception d'informations comprimees, et support d'enregistrement |
Country Status (6)
Country | Link |
---|---|
US (1) | US5717670A (ja) |
EP (1) | EP0691751B1 (ja) |
JP (1) | JP3175446B2 (ja) |
KR (1) | KR100339325B1 (ja) |
ES (1) | ES2313718T3 (ja) |
WO (1) | WO1995015032A1 (ja) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3152109B2 (ja) * | 1995-05-30 | 2001-04-03 | 日本ビクター株式会社 | オーディオ信号の圧縮伸張方法 |
EP0924962B1 (en) * | 1997-04-10 | 2012-12-12 | Sony Corporation | Encoding method and device, decoding method and device, and recording medium |
JPH1132399A (ja) * | 1997-05-13 | 1999-02-02 | Sony Corp | 符号化方法及び装置、並びに記録媒体 |
US6356211B1 (en) * | 1997-05-13 | 2002-03-12 | Sony Corporation | Encoding method and apparatus and recording medium |
ATE231666T1 (de) * | 1997-06-23 | 2003-02-15 | Liechti Ag | Verfahren für die kompression der aufnahmen von umgebungsgeräuschen, verfahren für die erfassung von programmelementen darin, vorrichtung und computer-programm dafür |
US6178147B1 (en) | 1997-08-22 | 2001-01-23 | Sony Corporation | Recording method, recording apparatus, reproducing method and reproducing apparatus |
US6578169B1 (en) * | 2000-04-08 | 2003-06-10 | Advantest Corp. | Data failure memory compaction for semiconductor test system |
JP2002272736A (ja) * | 2001-03-21 | 2002-09-24 | Fuji Photo Film Co Ltd | 超音波診断装置 |
JP4625709B2 (ja) * | 2005-03-25 | 2011-02-02 | 株式会社東芝 | ステレオオーディオ信号符号化装置 |
US8032368B2 (en) | 2005-07-11 | 2011-10-04 | Lg Electronics Inc. | Apparatus and method of encoding and decoding audio signals using hierarchical block swithcing and linear prediction coding |
KR100790362B1 (ko) * | 2006-12-08 | 2008-01-03 | 한국전자통신연구원 | 공간지각 단서에 의한 서라운드 음장 시각화 장치 및 그방법 |
EP2717265A1 (en) * | 2012-10-05 | 2014-04-09 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Encoder, decoder and methods for backward compatible dynamic adaption of time/frequency resolution in spatial-audio-object-coding |
JP6721977B2 (ja) * | 2015-12-15 | 2020-07-15 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America | 音声音響信号符号化装置、音声音響信号復号装置、音声音響信号符号化方法、及び、音声音響信号復号方法 |
JP6881931B2 (ja) * | 2016-09-30 | 2021-06-02 | 株式会社モバイルテクノ | 信号圧縮装置、信号伸長装置、信号圧縮プログラム、信号伸長プログラム及び通信装置 |
EP3616197A4 (en) | 2017-04-28 | 2021-01-27 | DTS, Inc. | AUDIO ENCODER WINDOW SIZES AND TIME-FREQUENCY TRANSFORMATIONS |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS63182700A (ja) * | 1987-01-26 | 1988-07-27 | 株式会社日立製作所 | 音響信号処理回路 |
JPH0352332A (ja) * | 1989-07-19 | 1991-03-06 | Sony Corp | 信号符号化装置 |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3134455B2 (ja) * | 1992-01-29 | 2001-02-13 | ソニー株式会社 | 高能率符号化装置及び方法 |
DE4345611B4 (de) * | 1992-04-16 | 2011-06-16 | Mitsubishi Denki K.K. | Wiedergabe-Gerät |
JP3230319B2 (ja) * | 1992-07-09 | 2001-11-19 | ソニー株式会社 | 音響再生装置 |
-
1993
- 1993-11-29 JP JP29830493A patent/JP3175446B2/ja not_active Expired - Fee Related
-
1994
- 1994-11-29 WO PCT/JP1994/002005 patent/WO1995015032A1/ja active IP Right Grant
- 1994-11-29 ES ES95901611T patent/ES2313718T3/es not_active Expired - Lifetime
- 1994-11-29 US US08/491,973 patent/US5717670A/en not_active Expired - Lifetime
- 1994-11-29 EP EP95901611A patent/EP0691751B1/en not_active Expired - Lifetime
- 1994-11-29 KR KR1019950703186A patent/KR100339325B1/ko not_active IP Right Cessation
Non-Patent Citations (2)
Title |
---|
See also references of EP0691751A4 * |
SUZUKI J: "ANALYSIS OF STERE SPEECH SIGNAL BY OPTIMUM ORTHOGONAL TRANSFORM", IEICE THESIS JOURNAL, vol. J71-A, no. 2, 1988, pages 443 - 452, XP009035887 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1995034956A1 (fr) * | 1994-06-13 | 1995-12-21 | Sony Corporation | Procede et dispositif de codage de signal, procede et dispositif de decodage de signal, support d'enregistrement et dispositif de transmission de signaux |
US6061649A (en) * | 1994-06-13 | 2000-05-09 | Sony Corporation | Signal encoding method and apparatus, signal decoding method and apparatus and signal transmission apparatus |
Also Published As
Publication number | Publication date |
---|---|
EP0691751A4 (en) | 2005-06-22 |
KR960700571A (ko) | 1996-01-20 |
US5717670A (en) | 1998-02-10 |
JPH07154265A (ja) | 1995-06-16 |
JP3175446B2 (ja) | 2001-06-11 |
EP0691751B1 (en) | 2008-10-01 |
KR100339325B1 (ko) | 2002-11-18 |
ES2313718T3 (es) | 2009-03-01 |
EP0691751A1 (en) | 1996-01-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3123286B2 (ja) | ディジタル信号処理装置又は方法、及び記録媒体 | |
JP3173218B2 (ja) | 圧縮データ記録方法及び装置、圧縮データ再生方法、並びに記録媒体 | |
JP3123290B2 (ja) | 圧縮データ記録装置及び方法、圧縮データ再生方法、記録媒体 | |
US6741965B1 (en) | Differential stereo using two coding techniques | |
JP3186307B2 (ja) | 圧縮データ記録装置及び方法 | |
JPH06180948A (ja) | ディジタル信号処理装置又は方法、及び記録媒体 | |
JP3531177B2 (ja) | 圧縮データ記録装置及び方法、圧縮データ再生方法 | |
JP3175446B2 (ja) | 情報圧縮方法及び装置、圧縮情報伸張方法及び装置、圧縮情報記録/伝送装置、圧縮情報再生装置、圧縮情報受信装置、並びに記録媒体 | |
JP3185415B2 (ja) | 圧縮データ再生記録装置及び方法 | |
JPH08162964A (ja) | 情報圧縮装置及び方法、情報伸張装置及び方法、並びに記録媒体 | |
JP4470304B2 (ja) | 圧縮データ記録装置、記録方法、圧縮データ記録再生装置、記録再生方法および記録媒体 | |
JP3334374B2 (ja) | ディジタル信号圧縮方法及び装置 | |
JP3304717B2 (ja) | ディジタル信号圧縮方法及び装置 | |
JP3186489B2 (ja) | ディジタル信号処理方法及び装置 | |
JP3175456B2 (ja) | ディジタル信号処理装置 | |
JP3334375B2 (ja) | ディジタル信号圧縮方法及び装置 | |
JPH06338861A (ja) | ディジタル信号処理装置及び方法、並びに記録媒体 | |
JP3477735B2 (ja) | 圧縮データ変換装置及び方法 | |
JP3552239B2 (ja) | 圧縮データ記録装置及び方法、並びに圧縮データ再生方法 | |
JP3134368B2 (ja) | 圧縮データ記録再生装置 | |
JP3084815B2 (ja) | データ記録方法及び装置 | |
JPH0590973A (ja) | 信号処理方法及び圧縮データ記録再生装置 | |
JPH10261265A (ja) | データダビング装置 | |
JPH07231259A (ja) | ディジタル信号処理方法及び装置、並びに記録媒体 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): KR US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): ES FR GB IT |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1995901611 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 08491973 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1019950703186 Country of ref document: KR |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWP | Wipo information: published in national office |
Ref document number: 1995901611 Country of ref document: EP |
|
WWG | Wipo information: grant in national office |
Ref document number: 1995901611 Country of ref document: EP |