WO2010137650A1 - Audio playback device, audio playback method, and program - Google Patents
Audio playback device, audio playback method, and program
- Publication number
- WO2010137650A1 PCT/JP2010/058994 JP2010058994W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- volume
- audio
- sound
- audio signal
- common component
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G9/00—Combinations of two or more types of control, e.g. gain control and tone control
- H03G9/02—Combinations of two or more types of control, e.g. gain control and tone control in untuned amplifiers
- H03G9/025—Combinations of two or more types of control, e.g. gain control and tone control in untuned amplifiers frequency-dependent volume compression or expansion, e.g. multiple-band systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G9/00—Combinations of two or more types of control, e.g. gain control and tone control
- H03G9/005—Combinations of two or more types of control, e.g. gain control and tone control of digital or coded signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/50—Customised settings for obtaining desired overall acoustical characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2225/00—Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
- H04R2225/43—Signal processing in hearing aids to enhance the speech intelligibility
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/03—Synergistic effects of band splitting and sub-band processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/07—Synergistic effects of band splitting and sub-band processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/40—Visual indication of stereophonic sound image
Definitions
- The present invention relates to an audio playback device, an audio playback method, and a program. More specifically, it relates to a sound reproducing device that optimizes the characteristics of the output sound so that, when sound such as broadcast waves or recorded content is played back, the output is easy to hear and does not become annoying; to a sound reproducing method; and to a program for realizing the functions of the sound reproducing device.
- When viewing TV broadcasts or recorded video and audio content, an apparatus for reproducing the sound of that content is used.
- The audio reproduction apparatus is applied to equipment having an audio playback function, such as a television, an audio system, or a PC. It receives the audio signal of a broadcast wave or of recorded content and outputs the sound from a speaker, for example an external speaker connected to the equipment.
- When listening to playback sound with such a device, the user usually adjusts the volume according to preference or necessity.
- When an elderly person listens to the reproduced sound, for example, quiet sounds are difficult to hear because of the decline in auditory function seen in the elderly.
- Speech and vocals in the playback sound are also harder for the elderly to hear than for younger people, so as control of the sound characteristics for the elderly it is preferable to emphasize the frequency range containing the human voice.
- FIG. 14 is a diagram schematically showing a state of hearing loss due to aging. As shown in FIG. 14, generally, a person's hearing function gradually decreases with age, and it becomes difficult to hear a sound with a small volume. In particular, there is a large drop in how the high frequency band is heard, and it becomes more difficult to hear sounds in the high frequency band than in the low frequency band.
- FIG. 15 is a diagram schematically illustrating an example of how sound is heard under the over-supplement phenomenon.
- The perceived sound pressure rises sharply from the point where the sound pressure exceeds 60 dBSPL, and at about 80 dBSPL the sound is heard at the same sound pressure as by a person with normal hearing (for example, a young person).
- Above 80 dBSPL, the sound is felt as louder than it is by a person with normal hearing.
- Such a phenomenon can be said to be a phenomenon peculiar to the elderly although there are individual differences.
- Elderly people thus have difficulty hearing sounds at low volumes, which makes it hard for them to make out vocals and spoken voices.
- Even if control for emphasizing the human voice is performed, there is the problem that at high volume the sound seems louder than it does to a young person and is felt as annoying. Therefore, for broadcast waves and content being played back, it is necessary to emphasize the human voice and suppress noise and music depending on the situation, and to control the output sound characteristics optimally so that the sound is not felt as annoying.
- Patent Document 1 discloses a vocal sound band emphasizing circuit that emphasizes vocals and dialogue so that they can be heard clearly at low volumes, and that emphasizes them moderately while maintaining the balance of the original sound at medium volumes and above.
- This vocal sound band emphasizing circuit includes: an in-phase component extraction circuit that extracts the in-phase component A of both channels from the L/R channel signals; a bandpass filter that extracts the vocal sound band B from the in-phase component A; a notch filter that absorbs and attenuates a predetermined range of frequency component C from the vocal sound band B; an automatic level control (ALC) circuit; a microcomputer that controls the amplification level of the ALC circuit; and first and second synthesis circuits that combine the output signal E with the input L/R channels and output the results as vocal-sound-band-emphasized L/R channel signals Lout and Rout.
- the microcomputer determines the signal level of the original audio signal and / or the set volume value, and controls the amplification level of the automatic level control circuit in a substantially inversely proportional relationship.
- However, Patent Document 1 discloses a general control method for automatic level control (ALC); it does not disclose any technical idea for optimizing sound characteristics in order to eliminate the difficulty in hearing and the annoyance caused by the decline in hearing function of elderly people.
- The present invention has been made in view of the above circumstances, and its object is to provide a sound reproducing device, an audio playback method, and a program capable of controlling sound reproduction so that the sound is heard in a state optimal for the hearing function specific to the elderly.
- The first technical means of the present invention includes frequency characteristic setting means for setting a frequency characteristic of an input audio signal and volume setting means for variably controlling the volume when the audio signal is output as sound, and is characterized in that the frequency characteristic setting means emphasizes a voice band including the human voice band or attenuates bands other than the voice band, and the volume setting means compresses the dynamic range.
- the second technical means is an audio reproduction device having frequency characteristic setting means for setting the frequency characteristic of the input audio signal and volume setting means for variably controlling the sound volume when outputting the audio signal.
- the third technical means is characterized in that, in the second technical means, the voice band is in a range of approximately 1 kHz to 8 kHz.
- The fourth technical means is characterized in that, in the second or third technical means, listener selection means is provided for selecting, according to a user operation, whether the listener is an elderly person or a young person, and when the elderly person is selected, the frequency characteristic is changed in accordance with the increase in the volume set by the volume setting means.
- The fifth technical means is an audio reproduction device comprising dynamic range setting means for setting a dynamic range of an input audio signal and volume setting means for variably controlling the volume when outputting the audio signal, characterized in that the dynamic range setting means gradually increases the dynamic range compression ratio in accordance with the increase in the volume set by the volume setting means.
- The sixth technical means is characterized in that, in the fifth technical means, listener selection means is provided for selecting, according to a user operation, whether the listener is an elderly person or a young person, and when the elderly person is selected, the compression ratio of the dynamic range is changed in accordance with the increase in the volume set by the volume setting means.
- The seventh technical means is a sound reproduction apparatus comprising: means for extracting a common component from a plurality of audio signals respectively corresponding to a plurality of channels and subtracting the common component from each of the plurality of audio signals to extract components other than the common component; means for changing the gains of the extracted common component and of the components other than the common component and mixing them; and volume setting means for variably controlling the volume when outputting an audio signal, characterized in that the gain of the common component is reduced in accordance with the increase in the volume set by the volume setting means.
- The eighth technical means is characterized in that, in the seventh technical means, listener selection means is provided for selecting, according to a user operation, whether the listener is an elderly person or a young person, and when the elderly person is selected, the mixing ratio and gain are changed in accordance with the increase in the volume set by the volume setting means.
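The common-component processing of the seventh technical means can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the common component is extracted, so the sketch takes it to be the per-sample mean of the two channels, and the gain schedule (1.5 at volume 1 falling linearly to 1.0 at volume 60) and all function names are hypothetical.

```python
def extract_components(left, right):
    """Split a stereo pair into a common part and per-channel remainders.

    ASSUMPTION: the common component is the per-sample mean of L and R;
    the source does not specify the extraction method.
    """
    common = [0.5 * (l + r) for l, r in zip(left, right)]
    l_other = [l - c for l, c in zip(left, common)]   # L minus common
    r_other = [r - c for r, c in zip(right, common)]  # R minus common
    return common, l_other, r_other


def common_gain(volume, max_volume=60, g_low=1.5, g_high=1.0):
    """Hypothetical schedule: the common-component gain falls linearly
    from g_low at volume 1 to g_high at max_volume."""
    v = min(max(volume, 1), max_volume)
    t = (v - 1) / (max_volume - 1)
    return g_low + t * (g_high - g_low)


def remix(left, right, volume, other_gain=1.0):
    """Re-mix the common and non-common components with separate gains."""
    common, l_other, r_other = extract_components(left, right)
    g = common_gain(volume)
    out_l = [g * c + other_gain * o for c, o in zip(common, l_other)]
    out_r = [g * c + other_gain * o for c, o in zip(common, r_other)]
    return out_l, out_r
```

With these placeholder values the centre-panned content (typically speech) is boosted at low volume and left unchanged at the maximum volume, which is the direction of change the seventh technical means describes.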
- The ninth technical means is an audio reproduction method executed by an audio reproduction device that sets a frequency characteristic of an input audio signal and variably controls the volume when the audio signal is output as sound, the method having a step of emphasizing a voice band including the human voice band or attenuating bands other than the voice band, and compressing the dynamic range.
- a tenth technical means is an audio reproduction method executed by an audio reproduction device that sets a frequency characteristic of an input audio signal and variably controls the volume of the audio signal when outputting the audio signal.
- The eleventh technical means is an audio reproduction method executed by an audio reproduction device that sets a dynamic range of an input audio signal and variably controls the volume of the audio signal when outputting the audio signal, the method having a step of gradually increasing the compression ratio of the dynamic range in accordance with the increase in the set volume.
- The twelfth technical means is an audio reproduction method executed by an audio reproduction device that extracts a common component from a plurality of audio signals respectively corresponding to a plurality of channels, subtracts the common component from each of the plurality of audio signals to extract components other than the common component, changes the gains of the extracted common component and of the components other than the common component and mixes them, and variably controls the volume when outputting an audio signal, the method being characterized by a step of reducing the gain of the common component in accordance with the increase in the set volume.
- The thirteenth technical means is a program for causing a computer to realize the functions of the sound reproducing apparatus of any one of the first to eighth technical means.
- According to the present invention, it is possible to provide an audio reproducing apparatus, an audio reproducing method, and a program capable of performing control so that, when audio is reproduced, it can be heard in a state optimal for the hearing characteristics peculiar to elderly people.
- FIG. 2 is a diagram showing a setting example of the coefficient a1 to b2 coefficient table of FIG.
- FIG. 3 is a diagram showing a setting example of the frequency characteristic in the equalizer unit.
- FIG. 4 is a diagram for explaining the effect of emphasizing the frequency band of approximately 1 kHz to 8 kHz by making its gain larger than that of the other frequency bands.
- FIG. 5 is a diagram schematically showing an example of the frequency characteristic changed according to the volume.
- FIG. 6 is a diagram showing a setting example of the range width
- The first embodiment of the sound reproducing device is a sound reproducing device that reproduces an input sound signal and outputs the sound.
- In it, the frequency band corresponding to the human voice is emphasized, and the frequency characteristic of the output sound is gradually changed and flattened (flat in the frequency direction) as the volume of the sound reproducing device increases.
- For this purpose, an equalizer that changes the frequency characteristic of the input audio signal is used, and the equalizer changes the frequency characteristic of the audio signal according to the volume of the audio output.
- FIG. 1 is a diagram illustrating a configuration example of a first embodiment of a sound reproduction device according to the present invention, and illustrates a configuration example of an equalizer unit that changes the frequency characteristic of an input sound signal.
- the embodiment of the sound reproducing device according to the present invention can be applied to a device having means for receiving and outputting a broadcast signal, for example, a device such as a television or a PC.
- the present invention can be applied to an apparatus that reproduces an input audio signal input from an external recording device such as a recorder or an external memory, or an audio signal input from the outside via a network.
- the equalizer unit shown in FIG. 1 converts the frequency characteristic of the input audio signal and outputs it.
- the audio signal is amplified by an amplifier (not shown) and output from a speaker (not shown).
- the audio reproduction device of the present embodiment has volume setting means that enables the volume setting of the output sound from the speaker in accordance with a user operation, and the equalizer unit 10 changes the frequency characteristic in accordance with the volume and outputs it.
- the change of the frequency characteristic according to the sound volume is determined based on the maximum output sound pressure information of the sound reproducing device.
- The equalizer unit 10 of this example is a parametric equalizer configured by cascading biquad (second-order transfer function) digital filters 11a to 11c in three stages. It divides the audio frequency band into several parts and can adjust parameters such as the gain of the pass level for each part.
- The equalizer unit 10 is provided with coefficient a1 to b2 selection units 21 and 23. The coefficient a1 to b2 selection units 21 and 23 select the coefficients a1 to b2 (a1, a2, b1, and b2) of the two subsequent-stage biquad digital filters 11b and 11c in accordance with the volume information of the sound reproducing device and the maximum output sound pressure information of the sound reproducing device, and thereby change the characteristic of the equalizer unit 10.
- These coefficients a1 to b2 are selected by storing coefficient tables 22 and 24 in advance in storage means such as a memory of the audio reproduction device and looking up the coefficient tables based on the volume information and the maximum output sound pressure information.
- the first-stage biquad digital filter 11a is used as a high-pass filter.
- Each biquad digital filter 11 includes mixers 12 and 13 and two delay elements 14 and 15.
- An example of the process will be described.
- The mixer 12 on the input side is initialized with the input signal, and the product of the value D1 on the rear side of the first delay element 14 and the coefficient a1 is subtracted from the mixer 12. Further, the product of the value D2 on the rear side of the second delay element 15 and the coefficient a2 is subtracted from the mixer 12. The value D0 of the mixer 12 is thereby determined.
- The output-side mixer 13 is overwritten with the product of the value D0 of the input-side mixer 12 and the coefficient b0, and the product of the value D1 on the rear side of the first delay element 14 and the coefficient b1 is added to the mixer 13. Further, the product of the value D2 on the rear side of the second delay element 15 and the coefficient b2 is added to the mixer 13.
- Each value is then updated by the action of the delay elements 14 and 15. That is, the rear value D2 of the second delay element 15 is updated with its front value D1, and the rear value D1 of the first delay element 14 is updated with its front value D0. Further, the value of the output-side mixer 13 is passed on as the input value of the mixer 12 for the next stage.
- the equalizer unit 10 repeats the processing of each biquad digital filter 11 for the number of stages, and outputs the output of the mixer 13 on the output side as an output signal.
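The mixer and delay-element steps described above correspond to a standard biquad in which the feedback terms are applied first. A minimal sketch, assuming the coefficients are normalized so that the leading feedback coefficient is 1 (the class and function names are hypothetical, not taken from the text):

```python
class Biquad:
    """One biquad stage following the mixer/delay description in the text."""

    def __init__(self, b0, b1, b2, a1, a2):
        self.b0, self.b1, self.b2 = b0, b1, b2
        self.a1, self.a2 = a1, a2
        self.d1 = 0.0  # value on the rear side of the first delay element
        self.d2 = 0.0  # value on the rear side of the second delay element

    def step(self, x):
        # Input-side mixer: start from the input, subtract the feedback terms.
        d0 = x - self.a1 * self.d1 - self.a2 * self.d2
        # Output-side mixer: weighted sum of D0, D1 and D2.
        y = self.b0 * d0 + self.b1 * self.d1 + self.b2 * self.d2
        # Delay update: D2 takes the old D1, D1 takes D0.
        self.d2, self.d1 = self.d1, d0
        return y


def process_cascade(stages, samples):
    """Run each sample through every stage in order (three in the text)."""
    out = []
    for x in samples:
        for stage in stages:
            x = stage.step(x)  # output of one stage feeds the next
        out.append(x)
    return out
```

For example, `process_cascade([Biquad(1, 0, 0, -0.5, 0)], [1, 0, 0])` yields the decaying impulse response 1, 0.5, 0.25 of a simple one-pole filter.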
- the level and sharpness of each frequency band can be changed by selecting the coefficient of each biquad digital filter 11.
- the configuration and operation of the equalizer unit as described above are conventionally known techniques.
- By controlling the coefficients of the equalizer unit 10 of the audio reproduction device according to the volume information and the maximum output sound pressure information of the audio reproduction device, sound with the frequency characteristics optimal for the elderly can be set according to the volume.
- FIG. 2 is a diagram showing a setting example of the coefficient a1 to b2 coefficient table of FIG.
- The first a1 to b2 coefficient table 22, used by the first coefficient a1 to b2 selection unit 21 that selects the coefficients of the second-stage biquad digital filter 11b, holds coefficients corresponding to the volume for each level of the maximum output of the playback device.
- The third-stage biquad digital filter 11c has the same configuration as the second stage. That is, the second a1 to b2 coefficient table 24, used by the second coefficient a1 to b2 selection unit 23 that selects the coefficients of the third-stage biquad digital filter 11c, holds coefficients corresponding to the volume for each level of the maximum output of the playback device.
- the first stage filter functions as a high-pass filter (HPF), the gain is always zero, and the output characteristics are determined only by the cut-off frequency (Fc) and Q (Quality factor).
- An audio signal having characteristics adjusted by passing through the first to third stage parametric equalizers is output from the equalizer unit 10.
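The text does not say how table entries are derived from a cutoff frequency, Q, and gain. One common way, shown here purely as an assumption, is the widely used Audio EQ Cookbook formulas. The sketch below computes normalized coefficients in the same sign convention as the mixer description (feedback terms subtracted); the 48 kHz sample rate and the function names are hypothetical.

```python
import cmath
import math


def highpass_coeffs(fc, q, fs):
    """Second-order high-pass (cookbook form), returned as (b0, b1, b2, a1, a2)."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cosw = math.cos(w0)
    a0 = 1 + alpha  # used to normalize all coefficients
    b0 = (1 + cosw) / 2 / a0
    return b0, -(1 + cosw) / a0, b0, -2 * cosw / a0, (1 - alpha) / a0


def peaking_coeffs(fc, q, gain_db, fs):
    """Peaking (bell) filter: boosts or cuts gain_db around fc."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cosw = math.cos(w0)
    a0 = 1 + alpha / amp
    return ((1 + alpha * amp) / a0, -2 * cosw / a0, (1 - alpha * amp) / a0,
            -2 * cosw / a0, (1 - alpha / amp) / a0)


def magnitude_db(coeffs, f, fs):
    """Gain in dB of one biquad at frequency f."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 on the unit circle
    h = (b0 + b1 * z1 + b2 * z1 * z1) / (1 + a1 * z1 + a2 * z1 * z1)
    return 20 * math.log10(abs(h))
```

With `fs = 48000`, `highpass_coeffs(160, 0.7071, fs)` gives a first stage with the 160 Hz cutoff mentioned for FIG. 3, and `peaking_coeffs(3000, 0.7, 6.0, fs)` is one hypothetical way to emphasize part of the 1 kHz to 8 kHz band.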
- the maximum output sound pressure of the player indicates the maximum sound pressure level (unit: dB SPL “Sound pressure level”) that can be reproduced by each player.
- The rated output (W) is guaranteed as a product specification, but the rated output of the amplifier that drives the speakers differs from product to product, for example 10 W + 10 W or 5 W + 5 W for two channels.
- the maximum output sound pressure of the player is 90 dBSPL.
- The sound pressures shown in FIG. 15 described above, and in FIGS. 6, 7, 10 and others described later, are not electrical characteristics; they show how the sound pressure output from the speaker of the sound reproducing device is to be controlled.
- Even at the same position on the volume scale (for example, the maximum value of 60), if the rating is 5 W and the speaker efficiency is 80 dB/W/m, the sound pressure is 85 dBSPL, which is 5 dB lower than with a rating of 10 W and an efficiency of 80 dB/W/m. Even so, the DRC threshold and gain need to be set so that the sound is compressed into the range shown in FIGS.
- Tables are therefore required for as many types as arise from the combinations, that is, (number of rating types) × (number of speaker efficiency types).
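A rough way to relate rated power and speaker efficiency to the maximum sound pressure is the textbook relation SPL = efficiency + 10·log10(P) for a single speaker at 1 m. This relation is an assumption brought in for illustration, not a formula from the text: under it, 10 W at 80 dB/W/m gives 90 dBSPL, while 5 W comes out near 87 dBSPL rather than the rounded 85 dBSPL quoted above. The function names are hypothetical.

```python
import math


def max_spl_db(rated_power_w, efficiency_db_per_w_m):
    """Idealized maximum SPL at 1 m: efficiency plus 10*log10(power in W).

    ASSUMPTION: single speaker, free field, no amplifier or driver limits.
    """
    return efficiency_db_per_w_m + 10 * math.log10(rated_power_w)


def number_of_tables(n_rating_types, n_efficiency_types):
    """One table per (rating, efficiency) combination, as the text states."""
    return n_rating_types * n_efficiency_types
```

For instance, two amplifier ratings combined with three speaker efficiencies would call for `number_of_tables(2, 3)`, that is six, threshold and gain tables.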
- FIG. 3 is a diagram illustrating a setting example of frequency characteristics in the equalizer unit.
- the coefficient set in the biquad digital filter 11 of the equalizer unit 10 is optimized according to the sound volume.
- the example of FIG. 3 shows the frequency characteristics (FIG. 3A) when the volume is medium and the coefficient values (FIG. 3B) at this time.
- Control is performed so as to emphasize the frequency band corresponding to the human voice in the output sound, and in particular the frequency band that is difficult for the elderly to hear.
- The gain in the frequency band of approximately 1 kHz to 8 kHz, which generally corresponds to the higher-order formants in the human voice band, is increased and emphasized compared to the other frequency bands.
- The frequency band of 8 kHz or higher could also be emphasized here, but elderly people in particular find sounds in this high band very difficult to hear, so raising the sound pressure there does not necessarily make them easier to hear.
- the coefficient of the equalizer unit 10 is set so that the frequency characteristic as shown in FIG. FIG. 3B shows an example of setting coefficients at this time, and shows examples of coefficients set in the parametric equalizers 1 to 3 (from the first stage to the third stage).
- The first-stage high-pass filter has a cutoff frequency of 160 Hz, and the relationship between frequency and gain [dB] shows that the characteristic of the first-stage high-pass filter is flat at 300 Hz and above. This is because, for low-frequency sound such as 300 Hz or less, elderly people have about the same level of hearing ability as young people, and no over-supplement phenomenon occurs.
- Since the frequency characteristic of the equalizer unit is meant to reduce the response particularly on the low-frequency side, which contains sound effects and noise outside the frequency band corresponding to the human voice, it is preferable that the characteristic of the first-stage high-pass filter not be flat over all frequency bands. The cutoff frequency is therefore set as described above.
- FIG. 4 is a diagram for explaining the effect when the gain in the frequency band of approximately 1 kHz to 8 kHz is emphasized by increasing it compared to the other frequency bands.
- An ordinary human voice has a characteristic frequency distribution in which energy is concentrated on A, B, C, and D on the frequency as shown by S1 in FIG. These A, B, C, and D are called the fundamental frequency, the first formant, the second formant, and the third formant, respectively.
- the fundamental frequency is the strongest, and the formant attenuates as the order increases.
- The first formant, the second formant, and the third formant, which are normally attenuated, are amplified to give S2, thereby adjusting the sound quality so that the voice carries well (is easy to hear).
- An intermediate frequency region of about 1 kHz to 8 kHz is thus relatively emphasized, and the frequency characteristic is changed to a flatter shape as the volume increases.
- FIG. 5 is a diagram schematically showing an example of frequency characteristics that are changed in conjunction with the volume.
- the frequency characteristic of the output sound by the equalizer unit 10 is converted according to the volume set by the sound reproducing device.
- the volume level is represented by 1 to 60, for example.
- At VOL 60, the frequency characteristics are substantially flat except for part of the region on the lowest-frequency side. That is, in the present embodiment, as the volume set by the volume setting means is increased, control is performed so that the gain characteristic over frequency is gradually changed from a frequency characteristic in which the voice band including the human voice band is emphasized to a flat frequency characteristic.
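The volume-linked flattening can be pictured as a voice-band boost that shrinks as the volume setting rises. The sketch below is only one hypothetical schedule: the text gives neither the boost values nor the shape of the curve, so the 6 dB starting boost and the linear interpolation are assumptions.

```python
def voice_band_boost_db(volume, max_volume=60, boost_at_min_db=6.0):
    """Hypothetical schedule for the voice-band emphasis.

    Returns the boost (in dB) applied to the roughly 1 kHz to 8 kHz band:
    boost_at_min_db at volume 1, falling linearly to 0 dB (flat) at
    max_volume. Values outside 1..max_volume are clamped.
    """
    v = min(max(volume, 1), max_volume)
    return boost_at_min_db * (max_volume - v) / (max_volume - 1)
```

In a real device the resulting boost would be turned into biquad coefficients stored in the coefficient tables, one entry per volume step, rather than computed at playback time.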
- A playback mode for elderly people and a playback mode for young people may be prepared, and when the playback mode for elderly people is selected by a user operation, the control for changing the frequency characteristic in response to the increase in volume may be performed as described above.
- the means for selecting the playback mode for the elderly and the playback mode for the young at this time corresponds to the listener selection means of the present invention.
- (Embodiment 2) In the case of an elderly person, the minimum audible level rises with aging, and it becomes harder to hear sounds of low sound pressure than it is for younger people. Further, regarding the upper limit of the sound pressure that can be heard comfortably, the over-supplement phenomenon may occur as described above. For example, reproduction at sound pressures exceeding 70 to 80 dBSPL is not suitable for elderly people. That is, for the elderly, the optimum sound pressure level region is generally narrower than that for young people. To cope with this, in this embodiment a dynamic range of the reproduction sound pressure for the elderly is set.
- FIG. 6 is a diagram showing an example of setting the range width of the reproduction sound pressure for elderly people
- FIG. 7 is a diagram showing an example of setting the range width of the reproduction sound pressure for young people.
- These figures show the range of the reproduction sound pressure when the output volume of the television to which the recording/reproduction apparatus is applied is set to maximum.
- the upper limit of the reproduced sound pressure for elderly people is around 70 dBSPL and the lower limit is around 15 dBSPL.
- the reproduction sound pressure range for young people can have an upper limit of 80 dBSPL and a lower limit of 10 dBSPL or less. This is because even if a wide reproduction range is taken, young people can hear the reproduction sound without feeling uncomfortable or annoying.
- the dynamic range of the reproduction sound pressure for elderly people is set.
- a playback mode for the elderly and a playback mode for the young may be prepared, and these modes may be appropriately switched by a user operation.
- the compression ratio of the dynamic range of the reproduction sound pressure for the elderly is changed according to the change in the volume of the sound reproduction apparatus. More specifically, the compression ratio of the dynamic range is increased as the volume of the audio playback device increases. As a result, even when the volume is high, the elderly can listen to the reproduced sound without feeling troublesome due to the over-supplement phenomenon.
- FIG. 8 is a diagram for explaining a second embodiment of the audio reproducing apparatus of the present invention.
- The audio reproduction device of this embodiment includes a dynamic range compressor 31 that compresses the dynamic range of an input audio signal, and an amplifier/attenuator 32 that amplifies or attenuates the audio signal output from the dynamic range compressor 31.
- It further includes a DRC (dynamic range compression) threshold selection unit 33 that selects a threshold of the dynamic range compressor 31 and a gain selection unit 34 that selects a gain in the amplifier/attenuator 32.
- The DRC threshold selection unit 33 and the gain selection unit 34 perform threshold selection of the dynamic range compressor 31 and gain selection of the amplifier/attenuator 32 based on the volume information of the audio playback device and the maximum output sound pressure information of the playback device.
- The dynamic range compressor 31 and the amplifier/attenuator 32 change the level of the output audio signal based on the signal level of the input audio signal. For example, the dynamic range compressor 31 outputs an audio signal whose level is directly proportional to the input signal level until the level of the input audio signal reaches a threshold, and when the level of the audio signal exceeds the threshold, it outputs an audio signal whose level is attenuated relative to the input signal level.
- The amplifier/attenuator 32 amplifies or attenuates the audio signal output from the dynamic range compressor 31 in accordance with a set gain and outputs it. By manipulating the threshold value, the compression ratio (range width) of the dynamic range can be changed arbitrarily.
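The compressor followed by the amplifier/attenuator can be sketched as a static level curve in dB. The text specifies only a threshold and a gain, so the fixed 4:1 ratio above the threshold, the treatment of the gain as a linear factor, and the function names are all assumptions made for illustration.

```python
import math


def compress_db(level_db, threshold_db, ratio=4.0):
    """Static compressor curve: unchanged below the threshold,
    reduced slope (1/ratio) above it. The ratio is a hypothetical value."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def drc_output_db(level_db, threshold_db, gain, ratio=4.0):
    """Compressor followed by the amplifier/attenuator.

    ASSUMPTION: `gain` is a linear factor (for example 1.5), converted to dB.
    """
    return compress_db(level_db, threshold_db, ratio) + 20 * math.log10(gain)
```

Lowering `threshold_db` makes more of the signal fall on the reduced-slope part of the curve, which narrows the output range; this is how changing the threshold changes the range width.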
- With such a configuration of the dynamic range compressor 31 and the amplifier/attenuator 32, the dynamic range of the entire input audio signal is appropriately compressed according to the signal level of the audio signal, and the audio signal can be reproduced more accurately.
- the dynamic range compressor 31 and the amplifier / attenuator 32 can be combined as an ALC (auto level control).
- the threshold value selected by the DRC threshold value selection unit 33 can be stored in advance as a DRC threshold value table 35 in storage means such as a memory.
- a threshold corresponding to the volume is set for each level of the maximum output of the playback device.
- FIG. 9A shows a setting example of the DRC threshold value table.
- the threshold is set to -5 dB for volume 1 and -6 dB for volume 2.
- thresholds are likewise set for volume 2 onward, up to volume 60.
- the sound volumes 1 to 60 simply represent the sound output levels set to 60 levels, and the playback device maximum output levels 1, 2, ... are unique numbers assigned to predetermined levels.
- the gain selected by the gain selection unit 34 can be stored in advance as a gain table 36 in storage means such as a memory.
- a gain corresponding to the volume is set in the gain table 36 for each level of the maximum output of the playback device.
- FIG. 9B shows a setting example of the gain table.
- when the playback device maximum output level is 1, the gain is set to 1.5 for volume 1 and 1.4 for volume 2.
- gains are likewise set for volume 2 and thereafter.
- the sound volumes 1 to 60 simply represent the sound output levels set to 60 levels.
- the playback apparatus maximum output levels 1, 2, ... are unique numbers assigned to predetermined levels.
- the DRC threshold selection unit 33 selects a threshold from the DRC threshold table 35 based on the volume information of the recording / playback apparatus and the maximum output sound pressure information and thereby changes the compression characteristic of the dynamic range compressor 31, and the amplification / attenuation rate is optimized by the gain selected by the gain selection unit 34.
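The table-driven selection can be sketched as a lookup keyed by maximum output level and volume step. The dictionary layout and the name `select_settings` are hypothetical, and the sketch assumes the quoted thresholds (-5 dB, -6 dB) belong to maximum output level 1, as the quoted gains (1.5, 1.4) do. Only the values quoted in the text are filled in.

```python
from typing import Dict, Tuple

# Keys are (maximum output level, volume step). A real table would cover
# volume steps 1 to 60 for every maximum output level; only the entries
# quoted in the text appear here.
DRC_THRESHOLD_DB: Dict[Tuple[int, int], float] = {(1, 1): -5.0, (1, 2): -6.0}
GAIN: Dict[Tuple[int, int], float] = {(1, 1): 1.5, (1, 2): 1.4}


def select_settings(max_output_level: int, volume: int) -> Tuple[float, float]:
    """Return (DRC threshold in dB, gain) for the given device level and volume."""
    key = (max_output_level, volume)
    if key not in DRC_THRESHOLD_DB or key not in GAIN:
        raise KeyError(
            f"no table entry for maximum output level {max_output_level}, "
            f"volume {volume}"
        )
    return DRC_THRESHOLD_DB[key], GAIN[key]


if __name__ == "__main__":
    print(select_settings(1, 1))
    print(select_settings(1, 2))
```

Keeping one table per maximum output level lets the same volume step map to different settings on devices with different amplifier and speaker combinations.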
- FIG. 10 is a diagram for explaining a setting example of the dynamic range compression characteristic to be changed in conjunction with the sound volume.
- the compression ratio of the dynamic range is changed as the volume of the audio playback device increases. Specifically, the compression ratio of the dynamic range is increased by setting the threshold value for the dynamic range compressor 31 low. At this time, the compression ratio of the dynamic range is increased nonlinearly with respect to the increase in sound volume. That is, as the volume increases, the slope of the compression ratio with respect to the volume is made steeper. For example, as shown in FIG. 10:
- the compression upper limit C1 when the volume is 20 is set to around 35 dBSPL.
- the compression upper limit C2 is set to slightly under 70 dBSPL when the volume of the recording / playback apparatus is increased to 50.
- the compression upper limit C3 is slightly over 70 dBSPL at volume 60, the maximum volume in this example.
- the dynamic range width is reduced as the sound volume increases by non-linearly suppressing the increase in the compression upper limit corresponding to the sound volume increase. That is, the dynamic range compression rate is increased as the volume increases.
- as a result, even when the volume of the audio playback device is raised, the level of the maximum sound pressure to be reproduced is kept at or below a certain level (in this example, around 70 dBSPL or less at maximum volume), and elderly listeners can listen to the playback audio without feeling annoyed.
- the lower limit of the dynamic range of the reproduced audio for the elderly is also set higher than that for the young, as shown in FIG. This means that even when an audio signal having a low volume is input, the output value is raised and the sound is reproduced with a higher sound pressure.
- the lower limit value is set by setting the gain for the amplifier / attenuator 32 in FIG. In this case, by setting a fixed value that does not depend on the signal amplitude of the input audio signal, the sound pressure level of the lower limit value of the dynamic range is raised.
- in this way, threshold selection for the dynamic range compressor 31 holds down the upper limit value of the dynamic range, so that the compression ratio of the dynamic range is increased as the volume of the playback device increases, and setting the gain of the amplifier / attenuator 32 raises the lower limit of the dynamic range. As a result, it is possible to perform audio reproduction with an optimal dynamic range for the elderly.
- the audio reproduction device extracts a common component from a plurality of channels of input audio, subtracts the common component from each channel component to calculate the components other than the common component, and mixes the extracted common component and the components other than the common component while changing their ratio.
- a voice signal of a human voice is extracted as a common component.
- a voice signal of a human voice is recorded so as to be localized at the center, for example by collecting sound with a sound collecting microphone, and is distributed to an L channel and an R channel.
- the ratio of common components including human voice and components other than the common components is optimized for elderly people.
- the ratio and gain between the common component including a human voice and a component other than the common component are changed according to the volume of the sound reproducing device.
- when the volume of the recording / reproducing apparatus is low, the common component including the human voice is emphasized by enlarging its proportion and increasing its gain. This makes it easier to hear a human voice at a low volume.
- as the volume of the recording / playback device increases, the gain of the common component including human voices is reduced so that its ratio to the components other than the common component becomes equal. This makes the annoyance caused by the over-recruitment phenomenon (loudness recruitment) harder to feel.
- FIG. 11 is a block diagram showing a third embodiment of the audio reproducing apparatus according to the present invention.
- the audio reproduction device includes an audio signal conversion unit 40 having a function of separating input audio into a common component and components other than the common component and adjusting a mixing ratio and gain of these components.
- the audio signal conversion unit 40 includes a spectrum conversion unit 42 (42a, 42b), a common component extraction unit 43, a multiplication unit 44 (44a, 44b, 44c), an inverse conversion unit 45 (45a, 45b, 45c), subtractors 47 and 48, input terminals 41a and 41b, output terminals 46a and 46b, and adders 49 and 50.
- the audio signal conversion unit 40 receives a plurality of audio signals respectively corresponding to a plurality of channels.
- the audio signal conversion unit 40 receives a 2-channel audio signal digitally encoded by PCM (Pulse Code Modulation).
- Examples of the two-channel audio input signal include a stereo audio signal in television broadcasting. In stereo broadcasting or the like, different audio signals are usually supplied to left and right speakers provided in an audio reproduction device such as a television based on the input two-channel audio signals, and different audio is output from each speaker.
- the left audio signal corresponding to the left channel and the right audio signal corresponding to the right channel are respectively input from the input terminals 41a and 41b to the audio signal conversion unit 40, and the audio output from the audio signal conversion unit 40 is output as sound from the speakers.
- the spectrum conversion unit 42a divides the right audio signal input via the input terminal 41a into, for example, 1024 samples per frame.
- the sampling frequency of the audio signal is 44.1 kHz
- the spectrum converting unit 42a multiplies the frame-divided audio signal by a window function such as a Hanning window.
- the spectrum conversion unit 42a performs a fast Fourier transform (FFT) on the windowed audio signal for each frame, converts the time-domain audio signal into frequency-domain data, that is, a spectrum (hereinafter referred to as a right audio signal spectrum), and outputs it to the common component extraction unit 43 and the subtractor 47.
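The framing and windowing steps that precede the FFT can be sketched as follows. This is a minimal sketch under stated assumptions: the frames are taken as consecutive and non-overlapping (the text does not say whether frames overlap), the symmetric form of the Hann window is used, and the helper names are hypothetical. The FFT itself is left out.

```python
import math
from typing import List, Sequence

FRAME_SIZE = 1024        # samples per frame, as in the text
SAMPLE_RATE_HZ = 44100   # sampling frequency, as in the text


def hann_window(n: int) -> List[float]:
    """Symmetric Hann (Hanning) window of length n, for n >= 2."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def split_frames(samples: Sequence[float], frame_size: int = FRAME_SIZE) -> List[List[float]]:
    """Cut the signal into consecutive frames, dropping an incomplete tail."""
    count = len(samples) // frame_size
    return [list(samples[i * frame_size:(i + 1) * frame_size]) for i in range(count)]


def window_frames(samples: Sequence[float], frame_size: int = FRAME_SIZE) -> List[List[float]]:
    """Multiply every frame by the window, sample by sample."""
    window = hann_window(frame_size)
    return [[s * w for s, w in zip(frame, window)]
            for frame in split_frames(samples, frame_size)]


# Duration of one frame in milliseconds: 1024 / 44100 s, about 23.2 ms.
FRAME_MS = 1000.0 * FRAME_SIZE / SAMPLE_RATE_HZ
```

The window tapers each frame to zero at both ends, which reduces spectral leakage when the frame is transformed.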
- the spectrum conversion unit 42b calculates the spectrum of the left audio signal input through the input terminal 41b (hereinafter referred to as the left audio signal spectrum) by the same processing as the spectrum conversion unit 42a, and outputs it to the common component extraction unit 43 and the subtractor 48.
- the frequency spectrum may be calculated by modified discrete cosine transform (MDCT) instead of FFT, and the method of spectrum conversion is not particularly limited.
- the common component extraction unit 43 extracts a common component of the right audio signal spectrum and the left audio signal spectrum.
- FIG. 12 is a diagram for explaining the common component.
- FIG. 12A is a diagram showing the common component of the right audio signal spectrum and the left audio signal spectrum
- FIG. 12B shows only the common component.
- the common component extraction unit 43 extracts the smaller of the spectra XR(k) and XL(k) as a common component.
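One way to read "the smaller spectrum" is a per-bin comparison of magnitudes. The sketch below takes that reading as an assumption: for every frequency index k it keeps whichever of XR(k) and XL(k) has the smaller magnitude, including that bin's phase. The function name is hypothetical, and the text does not state how phase is handled.

```python
from typing import List


def extract_common(xr: List[complex], xl: List[complex]) -> List[complex]:
    """Per frequency bin, keep the spectrum value with the smaller magnitude.

    Ties go to the right channel. Comparing by magnitude and reusing the
    chosen bin's phase are assumptions made for this sketch.
    """
    if len(xr) != len(xl):
        raise ValueError("spectra must have the same number of bins")
    return [r if abs(r) <= abs(l) else l for r, l in zip(xr, xl)]


if __name__ == "__main__":
    right = [3 + 0j, 1j, 2 + 2j]
    left = [1 + 0j, 2j, 1 + 1j]
    print(extract_common(right, left))
```

A component that is present at the same strength in both channels survives this selection, while a component that appears in only one channel is compared against a near-zero bin and is mostly rejected.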
- the audio signal conversion unit 40 receives a 2-channel input signal in a stereo broadcast program or the like.
- in a general stereo broadcast program, the human voice is recorded by a one-channel microphone for voice recording, and BGM and sound effects other than the voice are recorded in advance by two microphones (stereo) on the left and right.
- the 3-channel signal is downmixed to 2 channels. That is, the voice signal of the human voice recorded by the one-channel microphone for voice recording is mixed with the surrounding sound signals recorded by the two left and right microphones, and a two-channel audio signal is transmitted.
- the ratio at which the human voice signal and the surrounding sound signal are mixed is set in the broadcasting station.
- the right audio signal is an audio signal obtained by mixing audio recorded by the right microphone and the one-channel microphone for audio recording.
- the left audio signal is an audio signal obtained by mixing audio recorded by the left microphone and the 1-channel microphone for audio recording. Therefore, an audio signal representing a human voice is included in common with the left audio signal and the right audio signal.
- vocals are recorded by a one-channel microphone for voice recording, and instrument sounds are recorded by two left and right microphones (stereo) and then downmixed to two channels.
- the common component extraction unit 43 extracts, as a common component, a component of an audio signal mainly representing a human voice that is included in common in the right audio signal and the left audio signal as described above.
- the subtractor 47 subtracts the common component spectrum C(k) output from the common component extraction unit 43 from the right audio signal spectrum XR(k) output from the spectrum conversion unit 42a to calculate the right component spectrum XR′(k), and outputs it to the multiplication unit 44a.
- the subtractor 48 subtracts the common component spectrum C(k) output from the common component extraction unit 43 from the left audio signal spectrum XL(k) output from the spectrum conversion unit 42b to calculate the left component spectrum XL′(k), and outputs it to the multiplication unit 44c.
- the inverse conversion unit 45b converts the common component C ′′ (k) output from the multiplication unit 44b into a time-domain signal waveform by inverse FFT, and distributes the signal to the adders 49 and 50 for output.
- the inverse transform unit 45a converts the right component output spectrum XR ′′ (k), which is information in the frequency domain, into a signal waveform in the time domain by inverse FFT and outputs it.
- the adder 49 adds the right component obtained by the inverse FFT and the common component output from the inverse conversion unit 45b, and outputs the sum as the audio output signal for the right speaker.
- the inverse transform unit 45c converts the left component output spectrum XL′′(k), which is information in the frequency domain, into a signal waveform in the time domain by inverse FFT, and outputs the signal waveform.
- the adder 50 adds the left component obtained by the inverse FFT and the common component output from the inverse transform unit 45b, and outputs the sum as the audio output signal for the left speaker.
- the gain G2 applied to the common component spectrum is a value satisfying 1 ≤ M1, and the gains G1 and G3 applied to the component spectra other than the common component (right component spectrum XR′(k), left component spectrum XL′(k)) are values satisfying 0 ≤ M1 ≤ 1.
- Each of the multiplying units 44a, 44b, and 44c multiplies the spectrum of its input component by the corresponding gain, which makes it possible to change the mixing ratio and gain of the common component spectrum and the spectra other than the common component.
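The subtract, multiply, and add steps can be sketched together on spectra held as plain lists. This is a sketch under assumptions: `g1`, `g2`, and `g3` are taken to be the gains of the multiplying units 44a, 44b, and 44c in that order, the function name is hypothetical, and the addition is done in the frequency domain. The text adds the components after the inverse FFT, but because the inverse FFT is linear, the per-frame result is the same.

```python
from typing import List, Tuple

Spectrum = List[complex]


def remix(xr: Spectrum, xl: Spectrum, c: Spectrum,
          g1: float, g2: float, g3: float) -> Tuple[Spectrum, Spectrum]:
    """Rebalance the common component against the channel-specific components.

    xr, xl : right and left input spectra XR(k), XL(k)
    c      : common component spectrum C(k)
    g1, g3 : gains for the right-only and left-only components
    g2     : gain for the common component
    """
    right_only = [r - k for r, k in zip(xr, c)]   # XR'(k) = XR(k) - C(k)
    left_only = [l - k for l, k in zip(xl, c)]    # XL'(k) = XL(k) - C(k)
    common = [g2 * k for k in c]                  # scaled common component
    out_right = [g1 * r + k for r, k in zip(right_only, common)]
    out_left = [g3 * l + k for l, k in zip(left_only, common)]
    return out_right, out_left


if __name__ == "__main__":
    xr, xl, c = [2 + 0j], [1 + 0j], [1 + 0j]
    print(remix(xr, xl, c, 0.7, 1.5, 0.7))   # common component emphasized
    print(remix(xr, xl, c, 1.0, 1.0, 1.0))   # input reproduced unchanged
```

With all three gains at 1.0 the subtraction and addition cancel, so the outputs equal the inputs. That matches the behavior described for the highest volume step.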
- the gain set in each of the multipliers 44a, 44b, 44c is selected by the gain selector 51.
- the gain value selected by the gain selection unit 51 is stored in advance as a gain table 52 in storage means such as a memory.
- a gain corresponding to the volume is set for each level of the maximum output of the playback device.
- FIG. 13 is a diagram illustrating a setting example of the gain table.
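- when the playback device maximum output level is 1 and the volume is 1, the gain of the multiplier 44a (referred to as the multiplier (1)) is set to 0.7, the gain of the multiplier 44b (referred to as the multiplier (2)) to 1.5, and the gain of the multiplier 44c (referred to as the multiplier (3)) to 0.7.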
- volumes 1 to 60 simply represent sound output levels set to 60 levels. Further, the player maximum output levels 1, 2, ... are unique numbers assigned to predetermined levels.
- the gain selection unit 51 receives the sound volume information of the audio reproduction device and the maximum output sound pressure information of the audio reproduction device, extracts the corresponding gain values by referring to the gain table, and sets them in the multiplication units 44a to 44c. Thereby, the mixing ratio and gain of the common component spectrum and the spectrum other than the common component can be set to values corresponding to the sound volume of the sound reproducing device.
- the gains set in the gain table 52 are values that, when the volume of the sound reproducing device is low, enlarge the proportion of the common component including a human voice and emphasize the common component through its gain. As the volume of the audio playback device increases, the gain of the common component including the human voice is decreased and the gain of the components other than the common component is increased, so that the proportions of the common component and of the other components gradually become equal in the reproduced sound.
- at volume 1 in the example of FIG. 13, the gains of the multiplication units 44a and 44c for the components other than the common component are 0.7, and the gain of the multiplication unit 44b for the common component is 1.5.
- the ratio between the common component output from the multiplication unit 44 and the components other than the common component is therefore 1.5:0.7.
- at volume 60 in the example of FIG. 13, the gains of the multipliers 44a and 44c for the components other than the common component and of the multiplier 44b for the common component are all 1.0.
- the gain of the common component is 1.0, and the gain of the emphasized common component is reduced to the same level as the other components.
- it is preferable to set the gain of the common component to 1 or more at a small volume and to decrease it as the volume of the sound reproducing device increases, so that the mixing ratio of the common component and the components other than the common component gradually becomes uniform. As described above, in the present embodiment, increasing the ratio and gain of the common component at a low volume makes a human voice included in the common component easier to hear, and at a high volume, outputting the common component and the other components evenly while reducing the gain of the common component makes the annoyance caused by the over-recruitment phenomenon harder to feel, so that optimal audio output control for the elderly can be performed.
- a playback mode for the elderly and a playback mode for the young may be prepared, and when the playback mode for the elderly is selected by a user operation, control may be performed that changes the ratio and gain of the common component and the other components according to the increase in volume.
- the program that operates in the audio reproduction device of the present invention is a program (a program that causes a computer to function) that controls a CPU or the like so as to realize the function of each means (or part of each means) according to the present invention.
- This program may be provided with a graphical user interface (GUI) for a display device so that the user can easily use the audio playback device.
- Information handled by the audio playback device is temporarily stored in the RAM during the processing, and then stored in various ROMs and HDDs.
- the CPU reads, modifies, and writes the information as necessary.
- as a recording medium for storing the program, any of a semiconductor medium (for example, ROM, nonvolatile memory card, etc.), an optical recording medium (for example, DVD, MO, MD, CD, BD, etc.), and a magnetic recording medium (for example, magnetic tape, flexible disk, etc.) may be used.
- the program can be stored and distributed in a portable recording medium, or transferred to a server computer connected via a network such as the Internet.
- the audio signal conversion apparatus can emphasize human voices, such as vocals and dialogue, in content being broadcast or reproduced, and thus can be suitably used in a television receiver or the like.
- 42a ... spectrum conversion unit, 42b ... spectrum conversion unit, 43 ... common component extraction unit, 44 ... multiplication unit, 44a, 44b, 44c ... multiplication unit, 45b ... inverse transform unit, 45c ... inverse transform unit, 46a, 46b ... output terminal, 47, 48 ... subtractor, 49, 50 ... adder, 51 ... gain selection unit, 52 ... gain table.
Abstract
Description
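Accordingly, for broadcast waves and content being reproduced, it is necessary, depending on the situation, to emphasize speech (the human voice) and suppress noise, music, and the like, and the output sound characteristics must be controlled optimally so that the listener does not feel annoyed even when the volume is raised or lowered.
Patent Document 1 discloses a general control method for automatic level control (ALC). It discloses nothing about the technical idea of optimizing sound characteristics in order to eliminate the difficulty in hearing and the annoyance caused by the decline in hearing of elderly people.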
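A first embodiment of the audio reproduction device according to the present invention is an audio reproduction device that reproduces an input audio signal and outputs it as sound, characterized in that, when the volume of the audio output is relatively low, an equalizer emphasizes the frequencies of the voice band corresponding to the human voice, and, as the volume of the audio reproduction device increases, the frequency characteristic of the output audio is gradually changed so as to become flat (flat in the frequency direction). To this end, this embodiment uses an equalizer that changes the frequency characteristic of the input audio signal, and the equalizer changes the frequency characteristic of the audio signal according to the volume of the audio output.
The audio reproduction device of this embodiment has volume setting means that allows the volume of the audio output from the speaker to be set in response to a user operation, and the equalizer unit 10 changes the frequency characteristic according to that volume and outputs the result. The change in frequency characteristic according to the volume is determined based on the maximum output sound pressure information of the audio reproduction device.
The equalizer unit 10 is provided with coefficient a1-b2 selection units 21 and 23. The coefficient a1-b2 selection units 21 and 23 select the coefficients a1 to b2 (a1, a2, b1, b2) of the two downstream biquad digital filters 11b and 11c based on the volume information of the audio reproduction device and the maximum output sound pressure information of that device, and thereby change the characteristic of the equalizer unit 10. For this selection, coefficient tables 22 and 24 are stored in advance in storage means such as a memory of the audio reproduction device, and the coefficients are selected from those tables based on the volume information and the maximum output sound pressure information. The first-stage biquad digital filter 11a is used as a high-pass filter.
The output-side mixer 13 is then overwritten with the product of the value D0 of the input-side mixer 12 and the coefficient b0, and the product of the value D1 after the first delay unit 14 and the coefficient b1 is added to the output-side mixer 13. Further, the product of the value D2 after the second delay unit 15 and the coefficient b2 is added to the mixer 13.
The equalizer unit 10 repeats this processing of each biquad digital filter 11 for the number of stages and outputs the output of the output-side mixer 13 as the output signal.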
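The per-stage processing above (value D0 at the input-side mixer, delayed values D1 and D2, output formed from b0, b1, and b2) matches the direct form II structure of a biquad filter. The sketch below is an illustration only: the class and function names are hypothetical, and the sign with which a1 and a2 feed back into the input-side mixer is an assumption, since that step is not described in this passage.

```python
from typing import Iterable, List, Sequence


class Biquad:
    """One biquad stage in direct form II with two delay elements."""

    def __init__(self, b0: float, b1: float, b2: float, a1: float, a2: float):
        self.b0, self.b1, self.b2 = b0, b1, b2
        self.a1, self.a2 = a1, a2
        self.d1 = 0.0  # value after the first delay element (D1)
        self.d2 = 0.0  # value after the second delay element (D2)

    def process(self, x: float) -> float:
        # Input-side mixer: feedback subtracted (assumed sign convention).
        d0 = x - self.a1 * self.d1 - self.a2 * self.d2
        # Output-side mixer: start from b0*D0, then add b1*D1 and b2*D2.
        y = self.b0 * d0 + self.b1 * self.d1 + self.b2 * self.d2
        # Shift the delay line for the next sample.
        self.d2 = self.d1
        self.d1 = d0
        return y


def run_cascade(stages: Sequence[Biquad], samples: Iterable[float]) -> List[float]:
    """Pass every sample through each stage in order, as in a multi-stage equalizer."""
    out = []
    for x in samples:
        for stage in stages:
            x = stage.process(x)
        out.append(x)
    return out


if __name__ == "__main__":
    # Impulse response of a single stage with hypothetical coefficients.
    print(run_cascade([Biquad(0.5, 0.25, 0.125, 0.0, 0.0)], [1.0, 0.0, 0.0, 0.0]))
```

Changing only the five coefficients per stage is what allows a table lookup by volume to reshape the overall frequency response without altering the filter structure.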
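In the first a1-b2 coefficient table 22 used by the first coefficient a1-b2 selection unit 21, which selects the coefficients of the second-stage biquad digital filter 11b, coefficients corresponding to the volume are set for each level of the playback device maximum output. For example, as shown in FIG. 2(A), when the playback device maximum output level is 1, the coefficients for volume 1 are set as a1 = 0.9, ..., b2 = 0.6. Coefficients are likewise set for volume 2 onward, up to volume 60. In this example, volumes 1 to 60 simply represent the audio output level set in 60 steps. The playback device maximum output levels 1, 2, ... are likewise unique numbers assigned to predetermined levels.
An audio signal whose characteristics have been adjusted by passing through the first-stage to third-stage parametric equalizers is output from the equalizer unit 10.
On the other hand, the sound pressures shown in FIG. 15 described above, and in FIGS. 6, 7, 10, and others described later, are not electrical characteristics; they show how the sound pressure output from the speaker of the audio reproduction device is to be controlled. Suppose the volume scale is at the same position (for example, the maximum value of 60). If, for example, the rated output is 5 W and the speaker efficiency is 80 dB/W/m, the result is 85 dBSPL, which is 5 dBSPL lower than in the case where the rated output is 10 W and the speaker efficiency is 80 dB/W/m. Since the DRC threshold and gain only need to compress the signal into the range shown in FIGS. 6, 7, 10, 15, and others, control that is lower by 5 dBSPL is required. That is, the sound pressure level reaching the listener's ears differs depending on how many W + how many W of rated output the audio reproduction device has and on the efficiency, in dBSPL, of its speakers, so tables are needed in a number equal to the combinations of (kinds of rated output) × (kinds of speaker efficiency).
In the example shown in FIG. 3(A), control is performed so that, within the output audio, the frequency band that corresponds to the human voice and that becomes hard to hear for elderly people in particular is emphasized and can be heard well. Here, the gain in the band of approximately 1 kHz to 8 kHz, which generally corresponds to the higher-order formants of the human voice band, is made larger than that of the other frequency bands for emphasis. The band above 8 kHz may also be emphasized, but elderly people in particular find sounds in such a high band very hard to hear, so raising the sound pressure does not necessarily make them audible. In addition, emphasizing a high band can saturate the peak components of the output audio signal and clip the waveform, so there is no need to force the emphasis. From this viewpoint, the coefficients of the equalizer unit 10 are set so that the frequency characteristic shown in FIG. 3(A) is obtained at a medium volume. FIG. 3(B) shows an example of the coefficient settings at this time, giving example coefficients for each of parametric equalizers 1 to 3 (first to third stage).
An ordinary human voice has a characteristic frequency distribution in which energy is concentrated at frequencies A, B, C, and D, as shown by S1 in FIG. 4. A, B, C, and D are called the fundamental frequency, the first formant, the second formant, and the third formant, respectively. In general, the fundamental frequency is the strongest, and the higher the formant order, the more it is attenuated.
In this embodiment, the first, second, and third formants, which would normally be attenuated, are amplified as in S2 to produce a sound quality that carries well (is easy to hear). When the volume is relatively low, the intermediate frequency region of approximately 1 kHz to 8 kHz is relatively emphasized, and as the volume increases the response is changed to a flatter shape.
In this embodiment, a playback mode for the elderly and a playback mode for the young may be provided, and the control that changes the frequency characteristic according to the increase in volume as described above may be performed when the playback mode for the elderly is selected by a user operation. The means for selecting between the playback mode for the elderly and the playback mode for the young corresponds to the listener selection means of the present invention.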
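In elderly people, the minimum audible level rises with age, and sounds of low sound pressure become harder to hear than for young people. On the upper side of the comfortable sound pressure range, the over-recruitment phenomenon (loudness recruitment) may occur as described above, and reproduction at sound pressures exceeding, for example, 70 to 80 dBSPL is not suitable for elderly people.
In other words, for elderly people the region of optimal sound pressure levels is generally narrower than for young people. To accommodate such listeners, this embodiment sets a dynamic range of reproduction sound pressure intended for the elderly.
As shown in FIG. 6, in this example the reproduction sound pressure for the elderly has an upper limit of around 70 dBSPL and a lower limit of around 15 dBSPL. The reasons are that, as described above, elderly people do not like reproduced sound at high sound pressure, as explained by the over-recruitment phenomenon and the like, and that their minimum audible level is high. In contrast, as shown in FIG. 7, the reproduction sound pressure range for the young can have an upper limit of 80 dBSPL and a lower limit of 10 dBSPL or less, because young people can listen to the reproduced sound without discomfort or annoyance even with a wide reproduction range.
In this embodiment, the compression ratio of the dynamic range of the reproduction sound pressure for the elderly is changed according to changes in the volume of the audio reproduction device. More specifically, the compression ratio of the dynamic range is increased as the volume of the audio reproduction device increases. As a result, even at high volume, elderly people can listen to the reproduced audio without feeling the annoyance caused by the over-recruitment phenomenon.
The audio reproduction device of this embodiment includes a dynamic range compressor 31 that compresses the dynamic range of the input audio signal and an amplifier/attenuator 32 that amplifies or attenuates the output audio signal from the dynamic range compressor 31. It also has a DRC (dynamic range compression) threshold selection unit 33 that selects the threshold of the dynamic range compressor 31 and a gain selection unit 34 that selects the gain of the amplifier/attenuator 32. The DRC threshold selection unit 33 and the gain selection unit 34 select the threshold of the dynamic range compressor 31 and the gain of the amplifier/attenuator 32 based on the volume information of the audio reproduction device and the playback device maximum output sound pressure information.
For example, as shown in FIG. 10, when the compression upper limit C1 at volume 20 is around 35 dBSPL, the compression upper limit C2 is set to slightly under 70 dBSPL when the volume of the recording/playback device is raised to 50. Further, at volume 60, the maximum volume in this example, the compression upper limit C3 is slightly over 70 dBSPL.
As a result, even when the volume of the audio reproduction device is raised, the maximum reproduced sound pressure level is kept at or below a certain level (in this example, around 70 dBSPL or less at maximum volume), and elderly people can listen to the reproduced audio without feeling annoyed.
Thus, in this embodiment, threshold selection for the dynamic range compressor 31 holds down the upper limit of the dynamic range, so that the compression ratio of the dynamic range increases as the volume of the playback device increases, and the gain setting of the amplifier/attenuator 32 raises the lower limit of the dynamic range. As a result, audio can be reproduced with the dynamic range best suited to elderly people.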
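The audio reproduction device of this embodiment is configured as a device that extracts a common component from a plurality of channels of the input audio, subtracts the common component from each channel component to calculate the components other than the common component, and mixes the extracted common component and the other components while changing their ratio.
With this configuration, for example, the audio signal of a human voice is taken out as the common component. The audio signal of a human voice is recorded so as to be localized at the center, for example by picking it up with a sound-collecting microphone, and is distributed to the L channel and the R channel. Extracting the component common to the L channel and the R channel from such an audio signal yields a common component that includes the human voice.
The audio signal conversion unit 40 includes spectrum conversion units 42 (42a, 42b), a common component extraction unit 43, multiplication units 44 (44a, 44b, 44c), inverse transform units 45 (45a, 45b, 45c), subtractors 47 and 48, input terminals 41a and 41b, output terminals 46a and 46b, and adders 49 and 50.
The frequency spectrum may instead be calculated by the modified discrete cosine transform (MDCT) in place of the FFT; the spectrum conversion method is not particularly limited.
The common component extraction unit 43 extracts, as the common component, the component of the audio signal that mainly represents the human voice and is contained in both the right audio signal and the left audio signal as described above.
The inverse transform unit 45a converts the right component output spectrum XR′′(k), which is frequency-domain information, into a time-domain signal waveform by inverse FFT and outputs it. The adder 49 adds the inverse-transformed right component and the common component output from the inverse transform unit 45b, and outputs the sum as the audio output signal for the right speaker.
Similarly, the inverse transform unit 45c converts the left component output spectrum XL′′(k), which is frequency-domain information, into a time-domain signal waveform by inverse FFT and outputs it. The adder 50 adds the inverse-transformed left component and the common component output from the inverse transform unit 45b, and outputs the sum as the audio output signal for the left speaker.
FIG. 13 shows a setting example of the gain table. Here, when the playback device maximum output level is 1 and the volume is 1, the gain of the multiplication unit 44a (multiplication unit (1)) is set to 0.7, the gain of the multiplication unit 44b (multiplication unit (2)) to 1.5, and the gain of the multiplication unit 44c (multiplication unit (3)) to 0.7. For volume 2 onward, the gains to be multiplied in the multiplication units 44a to 44c are set in the same way. Volumes 1 to 60 simply represent the audio output level set in 60 steps. The playback device maximum output levels 1, 2, ... are likewise unique numbers assigned to predetermined levels.
On the other hand, at volume 60 in the example of FIG. 13, the gains of the multiplication units 44a and 44c for the components other than the common component and of the multiplication unit 44b for the common component are all 1.0. As a result, the common component and the other components are output in the same proportion. The gain of the common component is 1.0, so the gain of the previously emphasized common component has also dropped to the same level as the other components.
Thus, in this embodiment, increasing the proportion and gain of the common component at low volume makes the human voice contained in the common component easier to hear, and at high volume, outputting the common component and the other components evenly while lowering the gain of the common component makes the annoyance caused by the over-recruitment phenomenon harder to feel, so that audio output control best suited to elderly people can be performed.
As in the embodiments above, a playback mode for the elderly and a playback mode for the young may be provided, and the control that changes the proportion and gain of the common component and the other components according to the increase in volume as described above may be performed when the playback mode for the elderly is selected by a user operation.
Not only are the functions of the embodiments described above realized by executing the loaded program; the functions of the present invention may also be realized by processing performed jointly with an operating system, other application programs, or the like, based on the instructions of that program. When the program is distributed on the market, it can be stored in a portable recording medium and distributed, or transferred to a server computer connected via a network such as the Internet.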
Claims (13)
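- An audio reproduction device having frequency characteristic setting means for setting a frequency characteristic of an input audio signal and volume setting means for variably controlling the volume at which the audio signal is output as sound, wherein the frequency characteristic setting means emphasizes a voice band including the band of the human voice, or attenuates bands other than the voice band, and the volume setting means compresses the dynamic range.
- An audio reproduction device having frequency characteristic setting means for setting a frequency characteristic of an input audio signal and volume setting means for variably controlling the volume at which the audio signal is output as sound, wherein the frequency characteristic setting means changes, according to an increase in the volume set by the volume setting means, from a frequency characteristic in which a voice band including the band of the human voice is emphasized to a frequency characteristic in which the gain as a function of frequency gradually becomes flat.
- The audio reproduction device according to claim 2, wherein the voice band is a range of approximately 1 kHz to 8 kHz.
- The audio reproduction device according to claim 2 or 3, having listener selection means for selecting, in response to a user operation, whether the listener is elderly or young, wherein, when elderly is selected, the frequency characteristic is changed according to an increase in the volume set by the volume setting means.
- An audio reproduction device having dynamic range setting means for setting the dynamic range of an input audio signal and volume setting means for variably controlling the volume at which the audio signal is output as sound, wherein the dynamic range setting means changes the compression ratio of the dynamic range so that it gradually increases according to an increase in the volume set by the volume setting means.
- The audio reproduction device according to claim 5, having listener selection means for selecting, in response to a user operation, whether the listener is elderly or young, wherein, when elderly is selected, the compression ratio of the dynamic range is changed according to an increase in the volume set by the volume setting means.
- An audio reproduction device having means for extracting a common component from a plurality of audio signals respectively corresponding to a plurality of channels, means for extracting components other than the common component by subtracting the common component from each of the plurality of audio signals, means for changing the gains of the extracted common component and the components other than the common component and mixing them, and volume setting means for variably controlling the volume at which the audio signal is output as sound, wherein the gain of the common component is reduced according to an increase in the volume set by the volume setting means.
- The audio reproduction device according to claim 7, having listener selection means for selecting, in response to a user operation, whether the listener is elderly or young, wherein, when elderly is selected, the gain is changed according to an increase in the volume set by the volume setting means.
- An audio reproduction method executed by an audio reproduction device that sets a frequency characteristic of an input audio signal and variably controls the volume at which the audio signal is output as sound, the method comprising a step in which the audio reproduction device emphasizes a voice band including the band of the human voice, or attenuates bands other than the voice band, and compresses the dynamic range.
- An audio reproduction method executed by an audio reproduction device that sets a frequency characteristic of an input audio signal and variably controls the volume at which the audio signal is output as sound, the method comprising a step in which the audio reproduction device changes, according to an increase in the volume set for the audio output, from a frequency characteristic in which a voice band including the band of the human voice is emphasized to a frequency characteristic in which the gain as a function of frequency gradually becomes flat.
- An audio reproduction method executed by an audio reproduction device that sets the dynamic range of an input audio signal and variably controls the volume at which the audio signal is output as sound, the method comprising a step in which the audio reproduction device changes the compression ratio of the dynamic range so that it gradually increases according to an increase in the volume set for the audio output.
- An audio reproduction method executed by an audio reproduction device that variably controls the volume at which an audio signal is output as sound by extracting a common component from a plurality of audio signals respectively corresponding to a plurality of channels, subtracting the common component from each of the plurality of audio signals to extract components other than the common component, and changing the gains of the extracted common component and the components other than the common component and mixing them, the method comprising a step in which the audio reproduction device reduces the gain of the common component according to an increase in the set volume.
- A program for causing a computer to realize the functions of the audio reproduction device according to any one of claims 1 to 8.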
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2011012736A MX2011012736A (es) | 2009-05-29 | 2010-05-27 | Aparato de reproduccion de sonido, metodo de reproduccion de sonido y medio de grabacion. |
US13/375,154 US9093968B2 (en) | 2009-05-29 | 2010-05-27 | Sound reproducing apparatus, sound reproducing method, and recording medium |
JP2011516052A JP5149991B2 (ja) | 2009-05-29 | 2010-05-27 | 音声再生装置、音声再生方法及びプログラム |
CN201080030312.7A CN102461207B (zh) | 2009-05-29 | 2010-05-27 | 声音重放装置、声音重放方法和记录介质 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-131318 | 2009-05-29 | ||
JP2009131318 | 2009-05-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010137650A1 true WO2010137650A1 (ja) | 2010-12-02 |
Family
ID=43222755
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/058994 WO2010137650A1 (ja) | 2009-05-29 | 2010-05-27 | 音声再生装置、音声再生方法及びプログラム |
Country Status (5)
Country | Link |
---|---|
US (1) | US9093968B2 (ja) |
JP (1) | JP5149991B2 (ja) |
CN (1) | CN102461207B (ja) |
MX (1) | MX2011012736A (ja) |
WO (1) | WO2010137650A1 (ja) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012195899A (ja) * | 2011-03-18 | 2012-10-11 | Yamaha Corp | 音声信号処理装置およびプログラム |
JP2014072873A (ja) * | 2012-10-02 | 2014-04-21 | Panasonic Corp | 音声出力装置およびその音声信号処理方法 |
JP2015053047A (ja) * | 2013-09-06 | 2015-03-19 | イマージョン コーポレーションImmersion Corporation | 動的ハプティック変換システム |
JP2018510550A (ja) * | 2015-02-12 | 2018-04-12 | 電信科学技術研究院 | 音色等化器(aeq)のプリセットを決定するための方法及び装置 |
US11758337B2 (en) | 2019-04-23 | 2023-09-12 | Socionext Inc. | Audio processing apparatus |
JP7482632B2 (ja) | 2019-01-03 | 2024-05-14 | ハーマン インターナショナル インダストリーズ インコーポレイテッド | 多重ステップ音嗜好性判定 |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5106651B2 (ja) * | 2011-03-31 | 2012-12-26 | 株式会社東芝 | 信号処理装置及び信号処理方法 |
US20120262233A1 (en) * | 2011-04-15 | 2012-10-18 | Fairchild Semiconductor Corporation | Mixed signal dynamic range compression |
JP5853813B2 (ja) * | 2012-03-27 | 2016-02-09 | 船井電機株式会社 | 音声信号出力機器および音声出力システム |
KR101874836B1 (ko) * | 2012-05-25 | 2018-08-02 | 삼성전자주식회사 | 음향 보정이 가능한 디스플레이 장치, 청각 레벨 제어 장치 및 방법 |
CN103489451B (zh) * | 2012-06-13 | 2016-11-23 | 百度在线网络技术(北京)有限公司 | 移动终端的语音处理方法及移动终端 |
WO2014006893A1 (ja) * | 2012-07-04 | 2014-01-09 | パナソニック株式会社 | 接近警報装置、接近警報システム、移動体装置と、接近警報システムの故障診断方法 |
KR20220140002A (ko) * | 2013-04-05 | 2022-10-17 | 돌비 레버러토리즈 라이쎈싱 코오포레이션 | 향상된 스펙트럼 확장을 사용하여 양자화 잡음을 감소시키기 위한 압신 장치 및 방법 |
EP2992605B1 (en) * | 2013-04-29 | 2017-06-07 | Dolby Laboratories Licensing Corporation | Frequency band compression with dynamic thresholds |
US9706302B2 (en) * | 2014-02-05 | 2017-07-11 | Sennheiser Communications A/S | Loudspeaker system comprising equalization dependent on volume control |
TWI566240B (zh) * | 2014-12-12 | 2017-01-11 | 宏碁股份有限公司 | 音訊處理方法 |
JP2018504857A (ja) * | 2015-02-04 | 2018-02-15 | エティモティック・リサーチ・インコーポレーテッド | 語音了解度向上システム |
US10708690B2 (en) | 2015-09-10 | 2020-07-07 | Yayuma Audio Sp. Z.O.O. | Method of an audio signal correction |
US20190391782A1 (en) * | 2017-02-10 | 2019-12-26 | Cary Randolph Miller | Method and system of processing an audio recording for facilitating production of competitively loud mastered audio recording |
US10795637B2 (en) * | 2017-06-08 | 2020-10-06 | Dts, Inc. | Adjusting volume levels of speakers |
KR102302683B1 (ko) * | 2017-07-07 | 2021-09-16 | 삼성전자주식회사 | 음향 출력 장치 및 그 신호 처리 방법 |
JP2019075670A (ja) * | 2017-10-16 | 2019-05-16 | 晃 大澤 | 既製補聴器系列 |
CN111613197B (zh) * | 2020-05-15 | 2023-05-26 | 腾讯音乐娱乐科技(深圳)有限公司 | 音频信号处理方法、装置、电子设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11261356A (ja) * | 1998-03-09 | 1999-09-24 | Matsushita Electric Ind Co Ltd | 音響再生装置 |
JP2001095082A (ja) * | 1999-09-24 | 2001-04-06 | Yamaha Corp | 指向性拡声装置 |
JP2003230071A (ja) * | 2002-01-31 | 2003-08-15 | Toshiba Corp | テレビ視聴システム |
JP2005086462A (ja) * | 2003-09-09 | 2005-03-31 | Victor Co Of Japan Ltd | オーディオ信号再生装置のボーカル音帯域強調回路 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3198558B2 (ja) | 1991-10-22 | 2001-08-13 | ソニー株式会社 | 音声強調回路 |
JPH11113097A (ja) | 1997-09-30 | 1999-04-23 | Sharp Corp | オーディオ装置 |
CN1249053A (zh) * | 1997-10-28 | 2000-03-29 | 皇家菲利浦电子有限公司 | 改进的声频再现装置和电话终端设备 |
US6807574B1 (en) * | 1999-10-22 | 2004-10-19 | Tellme Networks, Inc. | Method and apparatus for content personalization over a telephone interface |
EP1312162B1 (en) * | 2000-08-14 | 2005-01-12 | Clear Audio Ltd. | Voice enhancement system |
JP4185770B2 (ja) * | 2002-12-26 | 2008-11-26 | パイオニア株式会社 | 音響装置および音響特性の変更方法および音響補正用プログラム |
JP4766491B2 (ja) * | 2006-11-27 | 2011-09-07 | 株式会社ソニー・コンピュータエンタテインメント | 音声処理装置および音声処理方法 |
US20110115987A1 (en) * | 2008-01-15 | 2011-05-19 | Sharp Kabushiki Kaisha | Sound signal processing apparatus, sound signal processing method, display apparatus, rack, program, and storage medium |
JP5627241B2 (ja) * | 2008-01-21 | 2014-11-19 | パナソニック株式会社 | 音声信号処理装置および方法 |
2010
- 2010-05-27: JP application JP2011516052A, patent JP5149991B2 (ja), not active (Expired - Fee Related)
- 2010-05-27: MX application MX2011012736A, patent MX2011012736A (es), status unknown
- 2010-05-27: US application US13/375,154, patent US9093968B2 (en), not active (Expired - Fee Related)
- 2010-05-27: CN application CN201080030312.7A, patent CN102461207B (zh), not active (Expired - Fee Related)
- 2010-05-27: WO application PCT/JP2010/058994, patent WO2010137650A1 (ja), active (Application Filing)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012195899A (ja) * | 2011-03-18 | 2012-10-11 | Yamaha Corp | Audio signal processing device and program |
JP2014072873A (ja) * | 2012-10-02 | 2014-04-21 | Panasonic Corp | Audio output device and audio signal processing method thereof |
JP2015053047A (ja) * | 2013-09-06 | 2015-03-19 | Immersion Corporation | Dynamic haptic conversion system |
JP2018510550A (ja) * | 2015-02-12 | 2018-04-12 | 電信科学技術研究院 | Method and apparatus for determining a preset of an audio equalizer (AEQ) |
JP7482632B2 (ja) | 2019-01-03 | 2024-05-14 | ハーマン インターナショナル インダストリーズ インコーポレイテッド | Multi-step sound preference determination |
US11758337B2 (en) | 2019-04-23 | 2023-09-12 | Socionext Inc. | Audio processing apparatus |
Also Published As
Publication number | Publication date |
---|---|
US9093968B2 (en) | 2015-07-28 |
JPWO2010137650A1 (ja) | 2012-11-15 |
US20120128178A1 (en) | 2012-05-24 |
JP5149991B2 (ja) | 2013-02-20 |
MX2011012736A (es) | 2011-12-16 |
CN102461207B (zh) | 2015-04-22 |
CN102461207A (zh) | 2012-05-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
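JP5149991B2 (ja) | | Audio reproduction device, audio reproduction method, and program |
US8787595B2 (en) | | Audio signal adjustment device and audio signal adjustment method having long and short term gain adjustment |
US8615094B2 (en) | | Automatic level control circuit |
JP4257079B2 (ja) | | Frequency characteristic adjustment device and frequency characteristic adjustment method |
JP4602621B2 (ja) | | Acoustic correction device |
JP5488389B2 (ja) | | Acoustic signal processing device |
JP5917518B2 (ja) | | Dynamic correction of audio signals for improving perceived spectral imbalance |
JP4439579B1 (ja) | | Sound quality correction device, sound quality correction method, and sound quality correction program |
US8406442B2 (en) | | Hearing aid apparatus |
JP4327886B1 (ja) | | Sound quality correction device, sound quality correction method, and sound quality correction program |
US6055502A (en) | | Adaptive audio signal compression computer system and method |
KR20140116152A (ko) | | Bass enhancement system |
JP2005534980A (ja) | | Digital signal processing techniques for improving audio clarity and intelligibility |
JP2009044268A (ja) | | Audio signal processing device, audio signal processing method, audio signal processing program, and recording medium |
WO2006051586A1 (ja) | | Acoustic electronic circuit and volume adjustment method thereof |
US20100303249A1 (en) | | System and method of improving audio signals for the hearing impaired |
JP6015146B2 (ja) | | Channel divider and audio reproduction system including the same |
JP2008527882A (ja) | | Signal processing device, audio system, and method for frequency-dependent amplification of the sound level of an audio signal |
JP5058844B2 (ja) | | Audio signal conversion device, audio signal conversion method, control program, and computer-readable recording medium |
WO1999008380A1 (en) | | Improved listening enhancement system and method |
JP5202021B2 (ja) | | Audio signal conversion device, audio signal conversion method, control program, and computer-readable recording medium |
KR20220071954A (ko) | | Method for performing normalization of an audio signal and apparatus therefor |
JP3627189B2 (ja) | | Volume adjustment method for acoustic electronic circuit |
CN116778949A (zh) | | Personalized loudness compensation method, apparatus, computer device, and storage medium |
JP2011160281A (ja) | | Automatic level control circuit, and audio digital signal processor and electronic device using the same |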
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 201080030312.7; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10780604; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 2011516052; Country of ref document: JP |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | WIPO information: entry into national phase | Ref document number: MX/A/2011/012736; Country of ref document: MX |
WWE | WIPO information: entry into national phase | Ref document number: 9225/CHENP/2011; Country of ref document: IN |
WWE | WIPO information: entry into national phase | Ref document number: 13375154; Country of ref document: US |
122 | EP: PCT application non-entry in European phase | Ref document number: 10780604; Country of ref document: EP; Kind code of ref document: A1 |
REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: PI1015140; Country of ref document: BR |