JP2016045214A

JP2016045214A - Evaluation value calculating method, and spatial characteristics designing method

Info

Publication number: JP2016045214A
Application number: JP2014166771A
Authority: JP
Inventors: Mikinori Yairi (矢入幹記)
Original assignee: Kajima Corp
Current assignee: Kajima Corp
Priority date: 2014-08-19
Filing date: 2014-08-19
Publication date: 2016-04-04
Anticipated expiration: 2034-08-19
Also published as: JP6305273B2

Abstract

PROBLEM TO BE SOLVED: To provide an evaluation value calculation method capable of calculating a speech transmission index in a transmission system in which signal processing using a source coding scheme is interposed between a sound source and a sound receiving point.

SOLUTION: An evaluation value calculation method calculates a speech transmission index for evaluating the listening difficulty of reproduced speech when collected speech signals are encoded by a source coding scheme and speech is reproduced from the encoded signals. The method comprises: an acquisition step S10 of acquiring an impulse response waveform ((A) in Fig. 4) of the transmission path from a first loudspeaker 1a to a first microphone 3a; an attenuation removal step S12 of dividing the impulse response waveform by a Schroeder attenuation curve to calculate an attenuation-removed impulse response waveform ((B) in Fig. 4); and a calculation step S14 of calculating the speech transmission index using the attenuation-removed impulse response waveform.

SELECTED DRAWING: Figure 3

Description

The present invention relates to an evaluation value calculation method and a spatial characteristic design method.

The "listening difficulty" of speech is defined as the percentage of listeners who, on hearing in an architectural space the highest-familiarity word list from a set of word intelligibility test lists controlled by word familiarity, judged the words difficult to hear (see Non-Patent Document 1). The speech transmission performance evaluation index (Speech Transmission Index: STI) is known as a physical measure for predicting listening difficulty (see Non-Patent Document 2). The speech transmission index is calculated from the impulse response representing the transfer characteristic from the sound source to the sound receiving point, the speech spectrum, and the background noise level.

In the field of communications, various kinds of speech-specific information processing have been proposed. They fall broadly into source coding schemes, which presuppose separation of the sound source from articulation, and waveform coding schemes, which perform no such separation (see Non-Patent Document 3).

Non-Patent Document 1: M. Morimoto, et al., "Listening difficulty as a subjective measure for evaluation of speech transmission performance in public spaces," J. Acoust. Soc. Am. 116, 1607-1613, 2014
Non-Patent Document 2: Architectural Institute of Japan (ed.), "AIJES-S0002-2011: Academic Standards and Commentary for Evaluation of Speech Transmission Performance in Urban and Architectural Spaces," December 2011
Non-Patent Document 3: Shuzo Saito and Kazuo Nakata, "Fundamentals of Speech Information Processing," Ohmsha, November 1981

When the speech information processing described above is interposed between the sound source and the sound receiving point, it becomes difficult to obtain an impulse response waveform representing the transfer characteristic from the sound source to the sound receiving point, and as a result the speech transmission index may be impossible to calculate. For example, when a speaker and a listener converse using mobile phones, signal processing that employs a source coding scheme lies between the sound source (the speaker's mouth) and the sound receiving point (the listener's ear). In this case the sound waveform itself is not transmitted between the speaker and the listener, so the very concept of an impulse response cannot be applied, and the speech transmission index cannot be calculated at all.

An object of the present invention is to provide an evaluation value calculation method capable of calculating a speech transmission index in a transmission system in which source-coding signal processing is interposed between the sound source and the sound receiving point. A further object of the present invention is to provide a spatial characteristic design method that uses the speech transmission index calculated by that method.

The inventor found that, in a transmission system in which source-coding signal processing is interposed between the sound source and the sound receiving point, the acoustic characteristics of the space in which the sound source is located affect listening difficulty at the sound receiving point. Further intensive study showed that the reflected-sound structure of the impulse response waveform of that space affects listening difficulty at the sound receiving point, and this finding led to the present invention.

That is, the present invention is an evaluation value calculation method for calculating a speech transmission index used to evaluate the listening difficulty of reproduced speech in the case where speech generated by a sound source in a first space is collected by a microphone in the first space, the collected speech signal is encoded by a source coding scheme, and speech is reproduced from the encoded signal. The method comprises: an acquisition step of acquiring an impulse response waveform of the transmission path from the sound source to the microphone; an attenuation removal step of dividing the impulse response waveform acquired in the acquisition step by a Schroeder attenuation curve to calculate an attenuation-removed impulse response waveform; and a calculation step of calculating the speech transmission index using the attenuation-removed impulse response waveform calculated in the attenuation removal step.

In this evaluation value calculation method, the impulse response waveform of the transmission path from the sound source to the microphone in the first space is divided by the Schroeder attenuation curve to calculate the attenuation-removed impulse response waveform. In the resulting waveform, the features of the reflected-sound structure that are inherent in the impulse response waveform but were weakened by the decay are emphasized. Using this attenuation-removed impulse response waveform as the input to the speech transmission index calculation allows the reflected-sound structure that affects listening difficulty at the sound receiving point to be clearly reflected in the index. The calculated speech transmission index can therefore serve as a highly valid scale of listening difficulty.

In the attenuation removal step, the calculation of the attenuation-removed impulse response waveform may be cut off at the time at which the attenuation curve has fallen by a predetermined level from the level of the direct sound. The predetermined level may be determined based on the dynamic range of the device that reproduces the speech. This configuration allows the processing to be matched to the dynamic range of the reproducing device. Because the speech transmission index can then be calculated from information with the unnecessary part removed, processing cost can be reduced and computation speed increased.

The evaluation value calculation method may further comprise a speech acquisition step of acquiring the speech spectrum of the speech generated by the sound source in the first space. In that case, when the calculation step calculates the speech transmission index from the attenuation-removed impulse response waveform calculated in the attenuation removal step and the speech spectrum acquired in the speech acquisition step, it may apply band-pass processing, which passes the waveform components within the transmission band of the speech signal, to the attenuation-removed impulse response waveform or to the speech spectrum, and calculate the speech transmission index from the band-passed components. This configuration allows the processing to be matched to the transmission band of the speech signal, so the speech transmission index can be calculated from information with the unnecessary part removed.

Furthermore, the spatial characteristic design method according to the present invention comprises a design step of designing the spatial characteristics of the first space using the speech transmission index calculated by the above evaluation value calculation method. With this design method, in a transmission system in which source-coding signal processing is interposed between the sound source and the sound receiving point, listening difficulty on the listener side can be appropriately evaluated by the speech transmission index, and the spatial characteristics of the first space on the speaker side can be designed accordingly.

As described above, the present invention provides a method capable of calculating a speech transmission index in a transmission system in which source-coding signal processing is interposed between the sound source and the sound receiving point, and a method of designing a space.

FIG. 1 shows an example of a scene to which the evaluation value calculation method according to the first embodiment is applied.
FIG. 2 is a schematic diagram representing the scene of FIG. 1.
FIG. 3 is a flowchart of the evaluation value calculation method.
FIG. 4 is a diagram explaining the impulse response waveform and the attenuation-removed impulse response waveform.
FIG. 5 is a flowchart of the spatial characteristic design method according to the second embodiment.
FIG. 6 shows the impulse response waveform and the attenuation-removed impulse response waveform according to the example.
FIG. 7 is a graph showing the correlation between the speech transmission index (STI) and listening difficulty for the comparative example and the example.

Embodiments of the present invention are described below with reference to the accompanying drawings. In the drawings, identical or corresponding parts are given the same reference signs, and duplicate description is omitted.

[First Embodiment]
The evaluation value calculation method according to this embodiment calculates the speech transmission index (hereinafter simply "STI") for a specific scene, namely one in which source-coding signal processing is interposed between the sound source and the sound receiving point. Concrete examples include a scene in which both the speaker and the listener converse over mobile phones, a scene in which the speaker uses a mobile phone and the listener uses a landline phone, and a scene in which the speaker uses a landline phone and the listener uses a mobile phone. The case in which both parties use mobile phones is described below as an example, but the method is not limited to this scene.

FIG. 1 illustrates a scene in which a speaker and a listener converse over mobile phones. As shown in FIG. 1, a speaker 1 and a listener 2 converse using mobile phones 3 and 4. The speaker 1 is talking inside a mobile phone booth 5. The mobile phone booth 5 is a structure installed in, for example, a medical facility or an office, and it defines in its interior a speech space (first space) for talking on a mobile phone or the like. The volume of the speech space is, for example, about 2 m³ to 5 m³. The voice of the speaker 1 is transmitted via the mobile phone 3 to a base station 6, transmitted from the base station 6 to the mobile phone 4 of the listener, and thus reaches the listener.

To predict with the STI the listening difficulty that the listener 2 experiences in this scene, the impulse response of the transmission system from the mouth of the speaker 1 (sound source) to the ear of the listener 2 (sound receiving point) is required. The transmission system from the speaker 1 to the listener 2 can be represented, for example, by the schematic diagram of FIG. 2. In FIG. 2, the mouth of the speaker 1 is represented as a first loudspeaker (sound source) 1a, the microphone of the mobile phone 3 as a first microphone 3a, the mobile phone network as a communication system 6a, the loudspeaker of the mobile phone 4 as a second loudspeaker 4a, and the ear of the listener 2 as a second microphone 2a. That is, speech generated by the first loudspeaker 1a in a speech space 5a is collected by the first microphone 3a in the speech space 5a and transmitted through the communication system 6a.

The communication system 6a includes source-coding signal processing, which presupposes separation of the sound source from articulation. That is, speech is reproduced by the second loudspeaker 4a from the signal encoded in the communication system 6a. When source-coding signal processing is interposed in the transmission system between the speaker 1 and the listener 2 in this way, a pulse fed to the first microphone 3a in order to obtain an impulse response may simply be output unchanged from the second loudspeaker 4a, so an effective impulse response waveform of the whole transmission system from the speaker 1 to the listener 2 cannot be obtained.

The evaluation value calculation method according to this embodiment is described on the basis of the above. In a transmission system in which source-coding signal processing is interposed between the sound source and the sound receiving point, the acoustic characteristics of the space containing the sound source have a considerable effect on listening difficulty at the sound receiving point. The method therefore calculates the STI from the impulse response waveform within the speech space 5a, which represents one part of the transmission system from the speaker 1 to the listener 2. Furthermore, because the features of the reflected-sound components contained in that impulse response waveform are expected to affect the source-coding signal processing, the waveform is processed so that those features are emphasized, and the processed waveform is used as the input for the STI calculation. FIG. 3 is a flowchart of the evaluation value calculation method according to this embodiment. The flowchart may be carried out by a person, or by a device that has a CPU or the like and operates by reading a program.

As shown in FIG. 3, impulse response waveform acquisition processing (S10: acquisition step) is executed first. In S10, with the mouth of the speaker 1 (first loudspeaker 1a) in the speech space 5a as the sound source and the first microphone 3a of the mobile phone 3 as the sound receiving point, the impulse response waveform of the transmission path from the first loudspeaker 1a to the first microphone 3a is acquired. The waveform may be obtained by actual measurement, or derived by calculation using geometrical acoustics, wave acoustics analysis, or the like. This processing yields, for example, the impulse response waveform shown in (A) of FIG. 4, in which the horizontal axis is time and the vertical axis is magnitude [dB]. The impulse response waveform contains a direct sound component I and a reflected sound component Ra. When the acquisition processing ends, the flow proceeds to attenuation removal processing (S12: attenuation removal step).
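As an illustration of deriving an impulse response by geometrical acoustics, the following is a minimal sketch of a first-order image-source calculation. It is not the procedure of the source: the box-shaped booth, its dimensions, the source and microphone positions, the frequency-independent reflection coefficient of 0.8, and the function names are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def first_order_arrivals(room, source, mic, reflection=0.8):
    """Return sorted (delay_s, amplitude) pairs for the direct sound and the
    six first-order wall reflections of a box-shaped room (image-source method)."""
    direct = math.dist(source, mic)
    arrivals = [(direct / SPEED_OF_SOUND_M_S, 1.0 / direct)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # mirror the source in this wall
            path = math.dist(image, mic)
            arrivals.append((path / SPEED_OF_SOUND_M_S, reflection / path))
    return sorted(arrivals)


def to_waveform(arrivals, fs, n_samples):
    """Place each arrival at its nearest sample to form a sparse waveform r[n]."""
    r = [0.0] * n_samples
    for delay, amplitude in arrivals:
        index = round(delay * fs)
        if index < n_samples:
            r[index] += amplitude
    return r


# Hypothetical 2 m^3 booth with the mouth and the phone microphone close together.
room = (1.0, 1.0, 2.0)
mouth = (0.5, 0.5, 1.5)
mic = (0.5, 0.42, 1.45)
arrivals = first_order_arrivals(room, mouth, mic)
r = to_waveform(arrivals, fs=48000, n_samples=1024)
```

The first entry of `arrivals` corresponds to the direct sound component I and the remaining entries to a crude reflected sound component Ra. A practical calculation would include higher-order reflections and frequency-dependent absorption.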

In the attenuation removal processing of S12, the impulse response waveform acquired in S10 is divided by the attenuation curve of the speech space (first space) to calculate the attenuation-removed impulse response waveform. As shown in (A) of FIG. 4, the impulse response waveform decays over time. This decay can be represented by the Schroeder attenuation curve Z. With r(t) denoting the impulse response waveform of the speech space, the Schroeder attenuation curve is given by Equation (1).

[Equation (1): not reproduced in this text]

The Schroeder attenuation curve can thus be obtained from the impulse response waveform r(t). The attenuation-removed impulse response waveform p(t) is then given by Equation (2).

[Equation (2): not reproduced in this text]

(See Toshiki Hanyu, Kazuma Hoshi, and Ryoichi Suzuki, "Calculation of the attenuation-removed impulse response of room sound fields with non-linear decay," Proceedings of the Acoustical Society of Japan.)
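Equations (1) and (2) do not survive in this text. As a hedged reconstruction only, the Schroeder backward integration is commonly written as the energy remaining after time t. The square root in the second line is an assumption chosen so that the division acts on amplitude; it corresponds to subtracting the curve from the waveform on the dB scale used in FIG. 4. The source's exact normalization may differ.

```latex
% Assumed forms; not copied from the source's equation images.
\begin{align}
  Z(t) &= \int_{t}^{\infty} r^{2}(\tau)\, d\tau \\
  p(t) &= \frac{r(t)}{\sqrt{Z(t)}}
\end{align}
```

Under these assumed forms, an impulse response with a purely exponential envelope yields a p(t) whose envelope is constant in time, which is the sense in which the decay is removed.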

The attenuation removal processing of S12 yields, for example, the attenuation-removed impulse response waveform shown in (B) of FIG. 4. This figure schematically shows the attenuation-removed impulse response waveform generated from (A) of FIG. 4; the horizontal axis is time and the vertical axis is magnitude [dB]. The attenuation-removed impulse response waveform contains the direct sound component I and an attenuation-removed reflected sound component Rb obtained by removing the decay from the reflected sound component Ra. Because the decay has been removed, the attenuation-removed reflected sound component Rb is a waveform in which the features of the reflected-sound structure contained in the reflected sound component Ra are emphasized.
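The attenuation removal can be sketched in discrete time as follows. This is a minimal illustration, assuming that the Schroeder curve is the backward-integrated squared impulse response and that the division uses its square root; the function names and the small floor `eps` are hypothetical and not taken from the source.

```python
import math


def schroeder_curve(r):
    """Backward-integrated energy: z[n] = sum of r[k]**2 for all k >= n."""
    z = [0.0] * len(r)
    acc = 0.0
    for n in range(len(r) - 1, -1, -1):
        acc += r[n] * r[n]
        z[n] = acc
    return z


def remove_attenuation(r, eps=1e-30):
    """Divide each sample by the square root of the remaining energy.
    eps only guards against division by zero in an all-zero tail."""
    z = schroeder_curve(r)
    return [x / math.sqrt(max(zn, eps)) for x, zn in zip(r, z)]


# Toy impulse response: alternating-sign samples under an exponential envelope.
r = [(-1) ** n * 0.9 ** n for n in range(200)]
p = remove_attenuation(r)
```

For this toy input the magnitude of `p` stays essentially constant over time, whereas the magnitude of `r` decays. Near the end of a finite record the remaining energy shrinks to a single sample, so the last values of `p` are dominated by edge effects.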

In the attenuation removal processing of S12, as shown in (A) of FIG. 4, the calculation of the attenuation-removed impulse response waveform may be cut off at time Ta, at which the attenuation curve Z has fallen by a predetermined level X from the level of the direct sound component I. The predetermined level X may be determined based on the dynamic range of the mobile phone 3 (the device that reproduces the speech); for example, X is set to about 30 dB to 40 dB. This configuration allows the processing to be matched to the dynamic range of the reproducing device. When the attenuation removal processing ends, the flow proceeds to calculation processing (S14: calculation step).
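The cutoff time Ta can be located as sketched below. The sketch assumes that the initial value of the backward-integrated curve stands in for the direct-sound level, which is a simplification of the source's wording; `truncation_index` is a hypothetical name.

```python
import math


def truncation_index(r, drop_db):
    """Return the first sample index at which the backward-integrated energy
    curve has fallen by drop_db (in dB) relative to its initial value."""
    z = [0.0] * len(r)
    acc = 0.0
    for n in range(len(r) - 1, -1, -1):
        acc += r[n] * r[n]
        z[n] = acc
    if not z or z[0] == 0.0:
        return 0
    for n, zn in enumerate(z):
        if zn == 0.0 or 10.0 * math.log10(zn / z[0]) <= -drop_db:
            return n
    return len(r)


# Toy exponential decay; keep only the part above a 30 dB drop.
r = [0.9 ** n for n in range(400)]
n_cut = truncation_index(r, drop_db=30.0)
r_truncated = r[:n_cut]
```

Only `r_truncated` would then be passed on to the attenuation removal, so that samples below the assumed dynamic range of the reproducing device are never processed.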

In the calculation processing of S14, the STI is calculated using the attenuation-removed impulse response waveform calculated in S12. The STI is calculated by a known method.
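The source does not specify which known method is used. As one hedged illustration, the sketch below derives a modulation transfer function from the squared impulse response and converts it to a single-band index with the usual clipping to ±15 dB. It omits octave-band filtering, band weighting, the speech spectrum, background noise, and masking corrections, so it is a simplification of a full STI calculation; the function names are hypothetical.

```python
import math

# The 14 modulation frequencies (Hz) conventionally used for STI.
MODULATION_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                       3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_transfer(h, fs, f_mod):
    """|Fourier transform of h^2 at f_mod| divided by the total energy of h."""
    energy = [x * x for x in h]
    total = sum(energy)
    if total == 0.0:
        raise ValueError("impulse response has no energy")
    w = 2.0 * math.pi * f_mod / fs
    re = sum(e * math.cos(w * n) for n, e in enumerate(energy))
    im = sum(e * math.sin(w * n) for n, e in enumerate(energy))
    return math.hypot(re, im) / total


def single_band_sti(h, fs):
    """Average transmission index over the modulation frequencies (one band)."""
    indices = []
    for f_mod in MODULATION_FREQS_HZ:
        m = min(modulation_transfer(h, fs, f_mod), 1.0)
        if m >= 1.0:
            snr_db = 15.0
        elif m <= 0.0:
            snr_db = -15.0
        else:
            snr_db = 10.0 * math.log10(m / (1.0 - m))
        snr_db = max(-15.0, min(15.0, snr_db))  # clip apparent SNR to +/-15 dB
        indices.append((snr_db + 15.0) / 30.0)
    return sum(indices) / len(indices)
```

In the method described here, `h` would be the attenuation-removed impulse response waveform rather than the raw impulse response. A single ideal pulse gives an index of 1, and longer decays give lower values.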

A speech acquisition step of acquiring the speech spectrum of the speech generated by the sound source in the first space may be provided before the calculation processing of S14. The speech spectrum acquired in this step may be, for example, a measured value of the speech generated by the sound source in the first space or a simulated value. In the calculation step, when the STI is calculated from the attenuation-removed impulse response waveform calculated in the attenuation removal step and the speech spectrum acquired in the speech acquisition step, band-pass processing that passes the waveform components within the transmission band of the speech signal may be applied to the attenuation-removed impulse response waveform or to the speech spectrum, and the STI may be calculated from the band-passed components. The transmission band of the communication system 6a is, for example, 300 Hz to 3.2 kHz. If the STI is calculated with information from outside this band included, the accuracy with which the STI predicts listening difficulty may fall. Components outside the transmission band of the speech signal (for example, 300 Hz to 3.2 kHz) can therefore be filtered out. When data for at least one of the impulse response, the speech spectrum, and the background noise level is absent in a given band, the STI for that band cannot be calculated. Consequently, applying band-pass processing to at least one of the attenuation-removed impulse response waveform and the speech spectrum prevents data outside the transmission band from being used in the STI calculation, and an STI matched to the transmission band of the speech signal can be calculated. When the calculation processing ends, the processing shown in FIG. 3 ends.
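The band-pass processing can be illustrated with a brick-wall filter in the frequency domain. This is a minimal sketch using a direct DFT in pure Python, so it is suitable only for short signals. The source does not state how its filter is designed, and the function name and the brick-wall shape are assumptions.

```python
import cmath
import math


def bandpass_dft(signal, fs, f_lo, f_hi):
    """Zero every DFT bin whose frequency lies outside [f_lo, f_hi] and
    transform back. O(n^2), intended for illustration only."""
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * fs / n  # fold negative-frequency bins
        if not (f_lo <= freq <= f_hi):
            spectrum[k] = 0.0
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


# Toy signal: a 100 Hz component (outside the band) plus a 1 kHz component (inside).
fs = 8000
n = 80
mixed = [math.sin(2 * math.pi * 100 * t / fs) + math.sin(2 * math.pi * 1000 * t / fs)
         for t in range(n)]
filtered = bandpass_dft(mixed, fs, f_lo=300.0, f_hi=3200.0)
```

With the example band of 300 Hz to 3.2 kHz, the 100 Hz component is removed and the 1 kHz component passes unchanged. The same function could be applied to an attenuation-removed impulse response waveform before the STI calculation.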

As described above, in the evaluation value calculation method according to this embodiment, the impulse response waveform ((A) of FIG. 4) of the transmission path from the first loudspeaker 1a to the first microphone 3a in the speech space 5a is divided by the attenuation curve Z to calculate the attenuation-removed impulse response waveform ((B) of FIG. 4). In the calculated waveform, the structural features of the reflected sound component Ra that are inherent in the impulse response waveform but were weakened by the decay are emphasized. Using this attenuation-removed impulse response waveform as the input to the STI calculation allows the reflected-sound structure that affects listening difficulty for the listener 2 (sound receiving point) to be clearly reflected in the STI. The calculated STI can therefore serve as a highly valid scale of listening difficulty.

[Second Embodiment]
The spatial characteristic design method according to this embodiment designs the spatial characteristics of the speech space (first space) using the STI calculated by the evaluation value calculation method according to the first embodiment.

FIG. 5 is a flowchart of the spatial characteristic design method according to this embodiment. As shown in FIG. 5, target setting processing (S20) is performed first. In S20, a target is set for the degree of listening difficulty to be achieved for the building (structure) concerned. For example, listening difficulty may be classified into predetermined categories such as "not difficult to hear," "slightly difficult to hear," "fairly difficult to hear," and "very difficult to hear," and the target may be set using these categories. The building (structure) concerned is a structure that defines a speech space (first space) in its interior, and the volume of the speech space is, for example, about 2 m³ to 5 m³. Here it is assumed that the target is set so that listening difficulty lies at about the boundary between "not difficult" and "slightly difficult."

Next, a design target value acquisition process (S22) is performed. In S22, an STI obtained from the attenuation-removed impulse response waveform is acquired as the design target value. Specifically, relation information such as a graph or table that associates listening difficulty with the STI obtained from the attenuation-removed impulse response waveform is prepared in advance, and the STI corresponding to the listening difficulty target set in the target setting process (near the boundary between “not difficult” and “slightly difficult”) is acquired from that relation information.

Next, a space design process (S24: design step) is performed. In S24, the number, arrangement, and so on of the sheets of sound-absorbing material are designed to match the target STI. One possible procedure is to place a predetermined initial (or estimated) number of sheets according to a predetermined initial pattern and calculate the STI of the space with the sheets in place. The number and arrangement of the sheets are then changed gradually from the initial value or initial pattern so that the difference between the target STI and the measured STI becomes smaller, which yields a number and arrangement matched to the target STI. When the space design process ends, the process shown in FIG. 5 ends.
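The S22 and S24 flow described above can be sketched as a table lookup followed by an iterative search. The Python below is only an illustrative sketch under stated assumptions: the values in `RELATION_TABLE`, the `estimate_sti` callback, the toy STI model, and the one-sheet-at-a-time update rule are all hypothetical and are not taken from the specification, which leaves the adjustment strategy open. The arrangement of the sheets is not modelled here; only the sheet count is varied.

```python
from typing import Callable, Dict

# Hypothetical relation information (S22): listening-difficulty target -> STI.
# The numbers are placeholders, not values from the specification.
RELATION_TABLE: Dict[str, float] = {
    "not difficult / slightly difficult boundary": 0.60,
    "slightly difficult / fairly difficult boundary": 0.45,
}


def design_sheet_count(
    target_sti: float,
    estimate_sti: Callable[[int], float],
    initial_count: int,
    max_count: int,
    tolerance: float = 0.01,
) -> int:
    """S24 sketch: change the sheet count one step at a time while the
    gap between the target STI and the estimated STI keeps shrinking."""
    count = initial_count
    gap = abs(target_sti - estimate_sti(count))
    while gap > tolerance:
        # Try one sheet fewer and one sheet more; keep the better neighbour.
        candidates = [c for c in (count - 1, count + 1) if 0 <= c <= max_count]
        if not candidates:
            break
        best = min(candidates, key=lambda c: abs(target_sti - estimate_sti(c)))
        best_gap = abs(target_sti - estimate_sti(best))
        if best_gap >= gap:  # no neighbour improves the match: stop
            break
        count, gap = best, best_gap
    return count


def toy_sti(count: int) -> float:
    """Toy stand-in for an STI calculation: more absorption, higher STI."""
    return 0.40 + 0.02 * count


target = RELATION_TABLE["not difficult / slightly difficult boundary"]
chosen = design_sheet_count(target, toy_sti, initial_count=4, max_count=20)
```

In practice `estimate_sti` would be replaced by the evaluation value calculation of the first embodiment applied to a measured or simulated impulse response of the space.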

As described above, the spatial characteristic design method according to the present embodiment makes it possible to design the spatial characteristics of the speech space 5a (first space) while appropriately evaluating the listening difficulty on the listener 2 side using the STI calculated in the first embodiment.

Each embodiment described above is an example of the evaluation value calculation method and the spatial characteristic design method according to the present invention. These methods are not limited to the embodiments; they may be modified, or applied to other uses, without departing from the gist set forth in each claim.

For example, FIG. 1 of the first embodiment shows an example in which the mobile phone booth 5 covers the head of the speaker 1, but the booth may instead be one that the whole body of the speaker 1 can enter.

The space design method according to the present invention is likewise not limited to the method shown in the flowchart of FIG. 5 of the second embodiment, and various methods can be applied. For example, the flowchart of FIG. 5 was described as deriving the target STI from the target listening difficulty and designing on the basis of that target STI. However, the design target value acquisition process of S22 may instead acquire, for example, a reference target value without using the result of S20. In that case, the target setting process of S20 in FIG. 5 need not be executed.

Examples carried out by the present inventor to demonstrate the above effects are described below.

[Attenuation-removed impulse response waveform]
The impulse response waveform of the speech space and the corresponding attenuation-removed impulse response waveform were calculated by geometric acoustic analysis. The volume of the speech space was set to 5 m³. The results are shown in FIG. 6. FIG. 6(A) is the calculated impulse response waveform, and FIG. 6(B) is the attenuation-removed impulse response waveform calculated from the waveform in FIG. 6(A). In the graphs of FIG. 6, the horizontal axis is time [s] and the vertical axis is intensity. Comparing FIG. 6(A) with FIG. 6(B) confirms that the attenuation-removed impulse response waveform emphasizes the characteristic reflected-sound structure of the impulse response waveform, particularly between 0.025 s and 0.125 s.
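The attenuation removal compared in FIG. 6 (dividing the impulse response by a Schroeder attenuation curve) can be illustrated with a short Python sketch. Several details here are assumptions rather than statements of the specification: the curve is taken as the backward-integrated energy normalized to its initial value, the division uses the square root of that curve as an amplitude envelope, the cut-off level `stop_db = 40` is arbitrary, the test signal is synthetic decaying noise, and the direct sound is not treated separately from the reflections.

```python
import math
import random
from typing import List


def schroeder_curve(h: List[float]) -> List[float]:
    """Normalized Schroeder decay curve: backward-integrated energy,
    E(t) = sum of h^2 from t to the end, divided by E(0)."""
    energy = [0.0] * len(h)
    running = 0.0
    for i in range(len(h) - 1, -1, -1):
        running += h[i] * h[i]
        energy[i] = running
    total = energy[0]
    return [e / total for e in energy]


def remove_attenuation(h: List[float], stop_db: float = 40.0) -> List[float]:
    """Divide h by the amplitude envelope sqrt(E(t)) and stop once the
    decay curve has fallen stop_db below its starting level.
    The sqrt envelope and stop_db = 40 are assumptions of this sketch."""
    curve = schroeder_curve(h)
    floor = 10.0 ** (-stop_db / 10.0)  # energy ratio equivalent to stop_db
    out: List[float] = []
    for sample, e in zip(h, curve):
        if e < floor or e <= 0.0:
            break
        out.append(sample / math.sqrt(e))
    return out


def rms(x: List[float]) -> float:
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# Synthetic impulse response: exponentially decaying noise, 8 kHz, 0.25 s.
random.seed(0)
fs = 8000
h = [random.gauss(0.0, 1.0) * math.exp(-30.0 * n / fs) for n in range(2000)]
flat = remove_attenuation(h)

# Compare the later half against the earlier half, before and after removal.
half = len(flat) // 2
ratio_before = rms(h[half:2 * half]) / rms(h[:half])
ratio_after = rms(flat[half:]) / rms(flat[:half])
```

For this synthetic response `ratio_before` is much smaller than 1 because the later samples have decayed, whereas `ratio_after` is close to 1, which is the sense in which late reflections are brought up to the level of early ones.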

[Effect of the evaluation value calculation method]
The impulse response waveform of the speech space 5a was calculated by simulation. Thirty-six impulse response waveforms were calculated with the volume V and the average sound absorption coefficient of the speech space 5a as parameters. The values combined were as follows.
Volume V: 1 m³, 5 m³, 18 m³, 74 m³, 294 m³, 1178 m³
Average sound absorption coefficient: 0.01, 0.02, 0.04, 0.08, 0.16, 0.32
Next, each of the 36 convolved impulse response waveforms was input through an attenuator to the first microphone 3a of the mobile phone 3, transmitted to the mobile phone 4 via the communication system, output from the second loudspeaker 4a of the mobile phone 4, and recorded. The recordings were played to subjects, who rated the listening difficulty.
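The 36 conditions are the full cross of the six volumes and six absorption coefficients listed above. As a rough illustration of how widely these conditions differ, the Python sketch below enumerates the grid and attaches a Sabine reverberation-time estimate to each condition. The Sabine estimate and the cubic room shape (surface area 6·V^(2/3)) are assumptions of this sketch; the specification states only that the waveforms were obtained by simulation and does not give the room geometry.

```python
from itertools import product
from typing import List, Tuple

VOLUMES_M3 = [1, 5, 18, 74, 294, 1178]
ABSORPTION = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32]


def sabine_rt60(volume_m3: float, mean_absorption: float) -> float:
    """Sabine estimate T60 = 0.161 * V / (alpha * S).
    Assumes a cubic room, so S = 6 * V^(2/3) (hypothetical geometry)."""
    surface_m2 = 6.0 * volume_m3 ** (2.0 / 3.0)
    return 0.161 * volume_m3 / (mean_absorption * surface_m2)


# Every (volume, absorption) pair: 6 x 6 = 36 conditions.
conditions: List[Tuple[int, float]] = list(product(VOLUMES_M3, ABSORPTION))
rt60_by_condition = {(v, a): sabine_rt60(v, a) for v, a in conditions}

for (v, a), t60 in rt60_by_condition.items():
    print(f"V = {v:5d} m^3, alpha = {a:.2f}, Sabine T60 ~ {t60:.2f} s")
```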

(Comparative Example 1)
The relationship between the STI obtained from the impulse response waveform of the speech space 5a and listening difficulty was plotted. The result is shown in FIG. 7(A), in which the horizontal axis is the STI and the vertical axis is listening difficulty. The vertical axis was scaled so that the lower boundary of “not difficult to hear” is 0 and the representative value of “very difficult to hear” is 1, with all of the categories “not difficult to hear”, “slightly difficult to hear”, “fairly difficult to hear”, and “very difficult to hear” falling within the range 0 to 1. In FIG. 7(A), the regression line of the plotted points is shown as solid line L1, the discrimination threshold as dash-dot line L2, and the 95% prediction interval as dashed line L3. The discrimination threshold is the range (±0.13) of Tukey's HSD (α = 0.05), that is, “the minimum psychological distance at which a clear difference in listening difficulty can be judged to exist”. As shown in FIG. 7(A), the regression line y = -1.86x + 2.07 has a relatively high coefficient of determination, R² = 0.86, indicating a strong negative correlation between STI and listening difficulty. However, the 95% prediction interval extends beyond the HSD range. The STI obtained from the impulse response waveform of the speech space 5a therefore cannot be said to be accurate enough to predict listening difficulty.

(Example 1)
The relationship between the STI calculated by the evaluation value calculation method according to the first embodiment and listening difficulty was plotted. The result is shown in FIG. 7(B). As in FIG. 7(A), the horizontal axis is the STI and the vertical axis is listening difficulty. In FIG. 7(B), the regression line of the plotted points is shown as solid line L4, the discrimination threshold as dash-dot line L5, and the 95% prediction interval as dashed line L6. As in FIG. 7(A), the discrimination threshold is the range (±0.13) of Tukey's HSD (α = 0.05). As shown in FIG. 7(B), the regression line y = -1.43x + 1.13 has a coefficient of determination of R² = 0.92, higher than in Comparative Example 1, indicating a very strong negative correlation between STI and listening difficulty. In addition, the 95% prediction interval is almost the same as the HSD range. This confirms that the STI calculated by the evaluation value calculation method according to the first embodiment is accurate enough to predict listening difficulty.
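The two regression lines reported above can be used in either direction: to predict listening difficulty from an STI, or to read off the STI that corresponds to a desired listening difficulty. The Python sketch below encodes both lines with the coefficients given in the text. The class and function names are hypothetical, and the target difficulty of 0.25 is an arbitrary illustrative value, since the text does not give numeric category boundaries on the 0 to 1 scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionLine:
    """Listening difficulty y as a linear function of STI x: y = slope*x + intercept."""
    slope: float
    intercept: float
    r_squared: float

    def difficulty(self, sti: float) -> float:
        """Predicted listening difficulty for a given STI."""
        return self.slope * sti + self.intercept

    def sti_for(self, difficulty: float) -> float:
        """STI at which the line predicts the given listening difficulty."""
        return (difficulty - self.intercept) / self.slope


# Coefficients as reported for FIG. 7(A) and FIG. 7(B).
COMPARATIVE_EXAMPLE_1 = RegressionLine(slope=-1.86, intercept=2.07, r_squared=0.86)
EXAMPLE_1 = RegressionLine(slope=-1.43, intercept=1.13, r_squared=0.92)

# Hypothetical target on the 0-1 difficulty scale (not a value from the text).
target_difficulty = 0.25
target_sti = EXAMPLE_1.sti_for(target_difficulty)  # (0.25 - 1.13) / -1.43

print(f"Target STI from the Example 1 line: {target_sti:.3f}")
```

Note that the x values of the two lines are different quantities: the Comparative Example 1 line uses the STI from the raw impulse response, while the Example 1 line uses the STI from the attenuation-removed impulse response, so STI values should not be carried from one line to the other.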

DESCRIPTION OF SYMBOLS: 1: speaker (talker); 2: listener; 3: mobile phone; 4: mobile phone (device that reproduces sound); 5: mobile phone booth; 6: base station; 1a: first loudspeaker (sound source); 2a: second microphone (sound receiving point); 3a: first microphone; 4a: second loudspeaker; 5a: speech space (first space); 6a: communication system; I: direct sound component; Ra: reflected sound component; Rb: attenuation-removed reflected sound component; Z: attenuation curve.

Claims

An evaluation value calculation method for calculating a voice transmission performance evaluation index that evaluates the listening difficulty of reproduced voice when sound generated from a sound source in a first space is collected by a microphone in the first space, the collected sound signal is encoded by a generation source encoding method, and voice is reproduced using the encoded signal, the method comprising:
an acquisition step of acquiring an impulse response waveform of a transmission path from the sound source to the microphone;
an attenuation removal step of calculating an attenuation-removed impulse response waveform by dividing the impulse response waveform acquired in the acquisition step by a Schroeder attenuation curve; and
a calculation step of calculating the voice transmission performance evaluation index using the attenuation-removed impulse response waveform calculated in the attenuation removal step.

The evaluation value calculation method according to claim 1, wherein, in the attenuation removal step, the calculation of the attenuation-removed impulse response waveform is terminated at the time at which the attenuation curve has decayed by a predetermined volume from the volume of the direct sound, and the predetermined volume is determined on the basis of the dynamic range of the device that reproduces the voice.

The evaluation value calculation method according to claim 1, further comprising a voice acquisition step of acquiring a voice spectrum of the voice generated from the sound source in the first space,
wherein, in the calculation step, when the voice transmission performance evaluation index is calculated using the attenuation-removed impulse response waveform calculated in the attenuation removal step and the voice spectrum acquired in the voice acquisition step, band-pass processing that passes the waveform components in the transmission band of the voice signal is applied to the attenuation-removed impulse response waveform calculated in the attenuation removal step or to the voice spectrum acquired in the voice acquisition step, and the voice transmission performance evaluation index is calculated using the waveform components that have undergone the band-pass processing.

A spatial characteristic design method that uses the voice transmission performance evaluation index calculated by the evaluation value calculation method according to claim 1, the method comprising a design step of designing the spatial characteristics of the first space using the voice transmission performance evaluation index.