JP2010213330A

JP2010213330A - Measuring method, measuring device, and program

Info

Publication number: JP2010213330A
Application number: JP2010103400A
Authority: JP
Inventors: Kohei Asada; 宏平浅田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-03-17
Filing date: 2010-04-28
Publication date: 2010-09-24
Anticipated expiration: 2028-06-16
Also published as: JP4618334B2; JP5035386B2; JP2008289173A

Abstract

PROBLEM TO BE SOLVED: To eliminate the discomfort a user experiences when listening to a measurement sound for acoustic field correction, and to add entertainment value by giving the measurement sound a musical element.
SOLUTION: A first measuring procedure includes: a procedure for causing a speaker to output a required sound element obtained based on a fundamental sound component; a procedure for sampling the sound element emitted from the speaker via a spatial transmission path; and a setting procedure for determining the characteristics to be set for the signal output to the speaker, based on an analysis result obtained by performing a predetermined frequency analysis process on the sampled audio signal. A subsequent second measuring procedure includes: a procedure for applying the characteristics determined in the setting procedure to the required sound element obtained based on the fundamental sound component and outputting the resulting sound element signal to the speaker; a procedure for sampling the sound element signal emitted from the speaker via the spatial transmission path; and a measuring procedure for obtaining a required measurement result, based on an analysis result obtained by performing a predetermined frequency analysis process on the sampled audio signal.
COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a measuring device that performs acoustic measurement, for example for acoustic correction, to a method for the same, and to a program executed by such a measuring device.

For example, when audio signals reproduced by a multi-channel audio system are output from a plurality of speakers for listening, the balance and quality of the sound change with the listening environment, such as the structure of the listening room and the listener's position relative to the speakers, so the sound field (acoustics) perceived by the listener differs. This means that, depending on the state of the listening environment, a listener at the listening position may not be able to perceive an appropriate sound field.

Incidentally, this problem is particularly pronounced in an environment such as the interior of an automobile. In a car interior the listener's position is largely limited to the seat positions, so the distances to the speakers are unbalanced, and the resulting differences in the arrival time of the sound from the speakers greatly disturb the balance of the sound field. In addition, because a car interior is relatively small and almost sealed, reflected sounds combine in a complex manner before reaching the listener, which also disturbs the sound field. Furthermore, because of restrictions on where speakers can be mounted, speakers are seldom arranged so that their sound reaches the listener's ears directly, and the resulting change in sound quality also has a large effect on the sound field.

Therefore, it is known to perform acoustic correction so that, in the listening environment where the audio system is actually used, the listener can hear a good sound field that is as close as possible to the original audio source. For this acoustic correction, for example, the delay time of the audio signal to be output from each speaker is adjusted so as to correct the differences in the time at which the sound reaches the listener's ears, and required signal processing such as equalization is applied so as to correct changes in sound quality and listening level at the point where the sound reaches the listener's ears.

To perform such acoustic correction efficiently, it is preferable that the adjustment be carried out automatically by a device rather than by the user (listener) relying solely on his or her hearing.
That is, an acoustic correction device first measures the acoustic characteristics of the listening environment and, based on the measurement result, sets signal processing parameters for acoustic correction in the audio output system of the audio system. If an audio signal processed in accordance with the parameters set in this way is output from the speakers, the audio source can be heard in a good sound field corrected to suit the listening environment, without the user having to perform any sound field adjustment operation.

The measurement of acoustic characteristics described above is performed, for example, as follows.
First, a microphone is placed in the listening space at a listening position that corresponds to the position of the listener's ears. The acoustic correction device then causes a speaker to output a measurement sound, the microphone picks up this measurement sound, and the resulting audio signal is sampled. Based on the result of performing, for example, frequency analysis processing on the sampled audio, the acoustic correction device obtains the signal processing parameters for acoustic correction as described above.

JP 2001-346299 A

However, pink noise or the like is generally used as the measurement sound for such measurement. As a result, the user hears a noise sound while the measurement is in progress. Noise is by no means a pleasant kind of sound to listen to, so from the user's point of view this is undesirable.

A measurement method comprising a first measurement procedure and a second measurement procedure following the first measurement procedure. The first measurement procedure executes: a first output procedure of causing a plurality of required phonemes, obtained on the basis of mutually different fundamental sound components, to be output to respective separate speakers so that their output periods overlap one another; a first sound collection procedure of collecting, via a plurality of spatial transmission paths, the plurality of phonemes emitted from the separate speakers to obtain an audio signal; and a setting procedure of making a setting corresponding to the sound pressure level for the signal to be output to each of the separate speakers, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected in the first sound collection procedure. The second measurement procedure executes: a second output procedure of applying, to a plurality of required phonemes obtained on the basis of mutually different fundamental sound components, a characteristic corresponding to the sound pressure level set in the setting procedure, and causing the resulting plurality of phoneme signals to be output to the separate speakers; a second sound collection procedure of collecting, via the plurality of spatial transmission paths, the plurality of phoneme signals emitted from the separate speakers to obtain an audio signal; and a measurement procedure of obtaining a measurement result for a required measurement item for each of the plurality of spatial transmission paths, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected in the second sound collection procedure. The phonemes are obtained on the basis of a fundamental sound component that is a sine wave for which an integer number of cycles fits exactly within a predetermined number of samples N expressed as a power of 2. Each phoneme output in the first and second output procedures is output as a signal formed by combining arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of the fundamental sound component to which the predetermined integer number of cycles applies. In at least one of the first output procedure and the second output procedure, after a required phoneme is output, the next required phoneme is output at a required timing, and the phonemes output include a phoneme based on a specific frequency component set as one reference frequency and a phoneme based on a specific frequency component whose frequency can be another pitch of a predetermined musical scale when the reference frequency is taken as one pitch of that scale.

Further, a measuring device comprises: first output means for causing a plurality of required phonemes, obtained on the basis of mutually different fundamental sound components, to be output to respective separate speakers so that their output periods overlap one another; first sound collection means for collecting, via a plurality of spatial transmission paths, the plurality of phonemes emitted from the separate speakers to obtain an audio signal; setting means for making a setting corresponding to the sound pressure level for the signal to be output to each of the separate speakers, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected by the first sound collection means; second output means for applying, to a plurality of required phonemes obtained on the basis of mutually different fundamental sound components, a characteristic corresponding to the sound pressure level set by the setting means, and causing the resulting plurality of phoneme signals to be output to the separate speakers; second sound collection means for collecting, via the plurality of spatial transmission paths, the plurality of phoneme signals emitted from the separate speakers to obtain an audio signal; and measurement means for obtaining a measurement result for a required measurement item for each of the plurality of spatial transmission paths, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected by the second sound collection means. The phonemes are obtained on the basis of a fundamental sound component that is a sine wave for which an integer number of cycles fits exactly within a predetermined number of samples N expressed as a power of 2. Each phoneme output by the first and second output means is output as a signal formed by combining arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of the fundamental sound component to which the predetermined integer number of cycles applies. In at least one of the first output means and the second output means, after a required phoneme is output, the next required phoneme is output at a required timing, and the phonemes output include a phoneme based on a specific frequency component set as one reference frequency and a phoneme based on a specific frequency component whose frequency can be another pitch of a predetermined musical scale when the reference frequency is taken as one pitch of that scale.

Further, a program causes a computer to execute a measurement procedure comprising a first measurement procedure and a second measurement procedure following the first measurement procedure. The first measurement procedure executes: a first output procedure of causing a plurality of required phonemes, obtained on the basis of mutually different fundamental sound components, to be output to respective separate speakers so that their output periods overlap one another; a first sound collection procedure of collecting, via a plurality of spatial transmission paths, the plurality of phonemes emitted from the separate speakers to obtain an audio signal; and a setting procedure of making a setting corresponding to the sound pressure level for the signal to be output to each of the separate speakers, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected in the first sound collection procedure. The second measurement procedure executes: a second output procedure of applying, to a plurality of required phonemes obtained on the basis of mutually different fundamental sound components, a characteristic corresponding to the sound pressure level set in the setting procedure, and causing the resulting plurality of phoneme signals to be output to the separate speakers; a second sound collection procedure of collecting, via the plurality of spatial transmission paths, the plurality of phoneme signals emitted from the separate speakers to obtain an audio signal; and a measurement procedure of obtaining a measurement result for a required measurement item for each of the plurality of spatial transmission paths, based on an analysis result obtained by performing predetermined frequency analysis processing on the audio signal collected in the second sound collection procedure. The phonemes are obtained on the basis of a fundamental sound component that is a sine wave for which an integer number of cycles fits exactly within a predetermined number of samples N expressed as a power of 2. Each phoneme output in the first and second output procedures is output as a signal formed by combining arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of the fundamental sound component to which the predetermined integer number of cycles applies. In at least one of the first output procedure and the second output procedure, the program causes the computer to output the next required phoneme at a required timing after a required phoneme has been output, and to output, among the phonemes, a phoneme based on a specific frequency component set as one reference frequency and a phoneme based on a specific frequency component whose frequency can be another pitch of a predetermined musical scale when the reference frequency is taken as one pitch of that scale.

Accordingly, with the present invention, unlike pink noise for example, a sound with an audibly perceptible pitch is heard as the measurement sound, so the user does not experience discomfort. Moreover, even though such a sound is used as the measurement sound, the frequency analysis processing for obtaining the measurement result becomes simpler, for example because window function processing becomes unnecessary, which in turn makes it possible to simplify the program or to limit growth in hardware circuit scale. This also yields more reliable analysis results, so that, for example, acoustic correction using these analysis results gives better results.

FIG. 1 is an explanatory diagram showing the basic concept of the phonemes that serve as elements of the measurement sound in an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the basic concept of the phoneme formation method and of the selection of phonemes suited to a measurement sound melody.
FIG. 3 is a diagram showing the frequency characteristics of phonemes selected based on the concept shown in FIG. 2.
FIG. 4 is an explanatory diagram showing the concept, as actually adopted in this embodiment, of the phoneme formation method and of the selection of phonemes suited to a measurement sound melody.
FIG. 5 is a timing chart showing the basic sequence of measurement sound (phoneme) output and sampling in the embodiment.
FIG. 6 is a diagram showing an example of a frequency analysis result for a response signal in the embodiment.
FIG. 7 is a diagram showing a practical example of an output pattern of the measurement sound melody in the embodiment.
FIG. 8 is a flowchart showing an example procedure for phoneme generation and output processing, and for analysis and measurement processing, according to the output pattern of the measurement sound melody shown in FIG. 7.
FIG. 9 is a block diagram showing a configuration example of an entire system comprising the acoustic correction system of the embodiment and an AV system.
FIG. 10 is a block diagram showing a configuration example of the acoustic correction system of the embodiment.
FIG. 11 is a block diagram showing an example of the actual signal output form of the measurement sound processing unit in the preparatory measurement processing block.
FIG. 12 is a block diagram showing the phoneme generation process corresponding to one phoneme in the measurement sound processing unit in the preparatory measurement processing block.
FIG. 13 is a diagram showing an example of the structure of sequence data.
FIG. 14 is a block diagram showing the processing operation executed by the control unit (microcomputer) for the preparatory measurement.

Embodiments of the present invention will be described below.
In this embodiment, the measuring device according to the present invention is described using as an example a case in which it is incorporated in an acoustic correction device that corrects the sound field reproduced by a multi-channel audio system. That is, the present invention is applied to a measuring device that, for the purpose of acoustic correction, measures the acoustic characteristics of the listening environment in which the audio system is used.
The acoustic correction device of this embodiment is not built into an audio system from the outset but can be added to an existing audio system as a so-called retrofit. In other words, there is no particular restriction on the audio systems to which the acoustic correction device of this embodiment can be connected, provided they conform to a certain standard.
Because the audio system connected to the acoustic correction device is therefore not known in advance, in this embodiment the multi-channel format supported by the audio system cannot be identified in advance either.
For this reason, the acoustic correction device of this embodiment performs a preparatory measurement before the main measurement. The preparatory measurement serves mainly to identify the channel configuration (speaker configuration) of the audio system that is actually connected. The signal level to be output from the speaker of each channel during the main measurement is also determined according to the result of this preparatory measurement. Then, based on the measurement result obtained in the main measurement, the required signal processing parameters are changed and set so that sound field correction is performed.
The measurement sound of this embodiment described below is the one used in the preparatory measurement.

First, the basic concept of the measurement sound used in this embodiment is described with reference to FIG. 1.
In this embodiment, a basic sine wave is defined as shown in FIG. 1(a) in order to obtain the measurement sound. This basic sine wave is a specific sine wave that satisfies the following condition: a variable N indicating the number of samples is set to a predetermined value expressed as a power of 2 (2ⁿ, where n is a natural number), and exactly one cycle fits within this number of samples N.
The number of samples N in the present invention is not particularly limited as long as it is a power of 2, but in the following description of this embodiment N = 4096, which is 2 to the 12th power (n = 12).
The sampling frequency Fs is 48 kHz. The frequency of the basic sine wave defined in this embodiment is therefore 48000 / 4096 ≈ 11.72 Hz. Although 11.72 Hz is only an approximation, for convenience the following description sometimes treats 48000 / 4096 as equal to 11.72 Hz.
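The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the names `N`, `FS`, and `order_frequency` are chosen here for the example and do not come from the specification.

```python
# Values taken from the embodiment: N = 4096 samples, Fs = 48 kHz.
N = 2 ** 12      # number of samples per block (a power of two)
FS = 48_000      # sampling frequency in Hz


def order_frequency(m: int, n: int = N, fs: int = FS) -> float:
    """Frequency in Hz of a sine wave with exactly m cycles in n samples."""
    return m * fs / n


basic = order_frequency(1)
print(basic)            # exact frequency of the basic sine wave
print(round(basic, 2))  # the rounded figure used in the text
```

The exact value differs slightly from the rounded 11.72 Hz, which is why the text calls the latter an approximation.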

In this embodiment, other sine waves are obtained from the basic sine wave defined above in the following manner.
Let the 4096 sample points corresponding to the number of samples N (= 4096) of the basic sine wave be t0 to t4095 in time order. A sine wave is then generated by collecting 4096 samples from the sample points t0 to t4095 of the basic sine wave in the order [t0, tm, t2m, t3m, ...], wrapping around to t0 whenever the index passes t4095.
If m = 1, the sample points are collected as [t0, t1, t2, t3, ...], which gives the basic sine wave itself. If m = 2, the sample points are collected as [t0, t2, t4, t6, ...], which, as shown in FIG. 1(b), gives a sine wave with twice the frequency of the basic sine wave, that is, a sine wave for which exactly two cycles fit within 4096 samples.
Similarly, if m = 3 and the sample points are collected as [t0, t3, t6, t9, ...], the result, as shown in FIG. 1(c), is a sine wave with three times the frequency of the basic sine wave, for which exactly three cycles fit within 4096 samples.
If m = 4 and the sample points are collected as [t0, t4, t8, t12, ...], the result, as shown in FIG. 1(d), is a sine wave with four times the frequency of the basic sine wave, for which exactly four cycles fit within 4096 samples.
In this way, by changing the value of the variable m (m is an integer) and collecting sample points as [t0, tm, t2m, t3m, ...], sine waves for which m cycles fit within N (= 4096) samples can be created from the basic sine wave.
Hereinafter, a sine wave for which m cycles fit within N (= 4096) samples is called an "m-th order sine wave". The basic sine wave, for which m = 1, is thus the first-order sine wave. In this embodiment the basic sine wave (first-order sine wave, m = 1) is 11.72 Hz, so the second-order sine wave is 11.72 × 2 = 23.44 Hz, the third-order sine wave is 11.72 × 3 = 35.16 Hz, and in general the frequency of the m-th order sine wave is 11.72 Hz × m.
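The sample-collection rule [t0, tm, t2m, ...] with wrap-around amounts to a strided lookup into a one-cycle sine table. The following is a minimal sketch of that idea. The names `BASIC`, `mth_order_sine`, and `direct_sine` are illustrative assumptions, not identifiers from the specification.

```python
import math

N = 4096  # samples per block, as in the embodiment

# One cycle of the basic sine wave: sample points t0 .. t(N-1).
BASIC = [math.sin(2 * math.pi * k / N) for k in range(N)]


def mth_order_sine(m: int) -> list[float]:
    """Collect N samples [t0, tm, t2m, ...] from the table, wrapping past t(N-1)."""
    return [BASIC[(k * m) % N] for k in range(N)]


def direct_sine(m: int) -> list[float]:
    """Reference: a sine wave with exactly m cycles in N samples, computed directly."""
    return [math.sin(2 * math.pi * m * k / N) for k in range(N)]


# The strided lookup and the direct formula agree to within rounding error.
for m in (1, 2, 3, 4):
    worst = max(abs(a - b) for a, b in zip(mth_order_sine(m), direct_sine(m)))
    print(m, worst)
```

One appeal of the table approach is that every order is produced from the same N stored values, so no additional sine evaluations are needed once the table exists.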

As is well known, when a DSP (Digital Signal Processor), CPU (Central Processing Unit), or the like is made to create input/output buffers for an input/output interface or to execute computations such as an FFT (Fast Fourier Transform), it is preferable for the data being processed to have a number of samples expressed as a power of 2. This is why the number of samples N is set to a power of 2 as described above.

Suppose that frequency analysis such as an FFT is performed on the time series of the basic sine wave, which fits exactly within the number of samples N (= 4096) expressed as a power of 2, and its amplitude values are obtained. The result has a value only at 11.72 Hz, the frequency of that sine wave, and at all other frequencies is theoretically −∞ on a logarithmic axis. In other words, taking the 11.72 Hz component as the main lobe, no side lobes arise from the frequency component contained in the main-lobe signal.
The same holds for m-th order sine waves of second order and above, because, as can be seen from FIG. 1, these too are waveforms that fit exactly within the number of samples N in an integer number of cycles.
Because no side lobes arise in this way, there is no need to apply a window function other than a rectangular window, as would be done, for example, when performing an FFT on an unknown general signal sequence.
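The absence of side lobes for integer-cycle sine waves can be illustrated numerically. The sketch below uses a small textbook radix-2 FFT written for this example (it is not the analysis code of the embodiment) and compares a sine with exactly 3 cycles in the block against one with 3.5 cycles. All function names are illustrative assumptions.

```python
import cmath
import math

N = 4096  # block length, a power of two


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def magnitude_spectrum(x):
    """Magnitudes of bins 0 .. N/2, scaled so a unit sine at an interior bin reads about 1.0."""
    n = len(x)
    return [2 * abs(c) / n for c in fft(x)[: n // 2 + 1]]


def sine(cycles, n=N):
    """Unit-amplitude sine with the given number of cycles in n samples."""
    return [math.sin(2 * math.pi * cycles * k / n) for k in range(n)]


def leakage(spectrum, peak_bin):
    """Largest magnitude found outside the peak bin."""
    return max(v for i, v in enumerate(spectrum) if i != peak_bin)


on_bin = magnitude_spectrum(sine(3))     # exactly 3 cycles fit the block
off_bin = magnitude_spectrum(sine(3.5))  # 3.5 cycles: does not fit the block

print(on_bin[3], leakage(on_bin, 3))     # peak near 1.0, leakage at rounding-error level
print(leakage(off_bin, 3))               # substantial energy in neighbouring bins
```

No window is applied in either case, which corresponds to the rectangular window mentioned in the text. Only the integer-cycle wave keeps its energy in a single bin.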

On this basis, the present embodiment uses an audio signal in the form of a "phoneme", generated from the m-th order sine waves, as the sound source of the measurement sound (measurement sound source) for the preparatory measurement. That is, the phoneme audio signal is reproduced and output as a measurement sound from a speaker of the audio system. While the measurement sound is being output from the speaker, the audio signal picked up by a microphone is sampled as a response signal and analyzed in frequency by FFT. The number of samples N and the sampling frequency Fs used to sample the response signal are the same as for the m-th order sine waves: N = 4096 and Fs = 48 kHz.
With this procedure for outputting the measurement sound and for sampling and analyzing the collected sound, no side lobes corresponding to the frequency of the m-th order sine wave arise, as noted above, so the response at the frequencies of the signal components reproduced as the measurement sound can be measured very accurately. Moreover, if the frequency analysis yields amplitude at frequencies other than those of the measurement sound, it cannot be a side lobe of the m-th order sine wave, so it can be regarded as a measurement of the background noise level of the listening environment. In other words, even without any window-function processing, the frequency analysis result clearly separates the amplitude of the frequency components of the measurement sound from the amplitude of the other frequency components, which are regarded as background noise. The measurement result required of the preparatory measurement can then be obtained, for example, from a comparison of the amplitudes of the measurement sound and the background noise.

For the preparatory measurement, it would suffice to take each speaker (channel) that the audio system may drive in turn, output a phoneme consisting of one suitably chosen m-th order sine wave as the measurement sound, and sample and analyze it. However, because the measurement sound of the present embodiment is a sine wave, the human ear perceives it as having a definite pitch, unlike the sound of a reproduced signal such as pink noise. The present embodiment therefore goes a step further than simply emitting an m-th order sine-wave phoneme: phonemes (measurement sounds) derived from the m-th order sine waves are combined both in time and in pitch, and output in a form that a listener can recognize as a melody.
As a result, a user listening to the measurement sound hears something like a melody (a piece of music). The user is not left with the unpleasant impression that comes from simply being made to listen to pink noise, for example, and the measurement also becomes more entertaining.

To output a melodic measurement sound based on the m-th order sine waves, the present embodiment forms phonemes as follows.
As a basic idea, the phonemes used for the melodic measurement sound are obtained as shown in FIG. 2.
First, in FIG. 2, m = 9 to 19 is selected as the range of the variable m that designates the m-th order sine wave. This range was set in view of the frequencies at which a listener can easily hear a phoneme as a melody (musical tone) within the audible band, the number of pitches ultimately required (determined by the melody to be created and by the number of phonemes, pitch range, and so on appropriate for a measurement sound), and the processing capability of the device that actually generates the phonemes (measurement sounds). It is only an example.
The frequency f obtained from the m-th order sine wave is then defined as

f = (48000 / 4096) × m × 2^k   (Equation 1)

For each of the 9th- to 19th-order sine waves (m = 9 to 19), the frequency f at k = 1 is defined as the base tone (fundamental). As shown in FIG. 2, the base tone is therefore 210.94 Hz for the 9th-order sine wave (m = 9), 234.38 Hz for the 10th-order sine wave (m = 10), 257.81 Hz for the 11th-order sine wave (m = 11), and so on up to 421.88 Hz for the 18th-order sine wave (m = 18) and 445.31 Hz for the 19th-order sine wave (m = 19).
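As a quick check of Equation 1, the base-tone frequencies for m = 9 to 19 can be tabulated directly. This is a minimal sketch; the function name `tone_frequency` is hypothetical.

```python
FS = 48_000  # sampling frequency in Hz
N = 4096     # samples per block


def tone_frequency(m, k):
    """Equation 1: f = (FS / N) * m * 2**k, in Hz."""
    return FS / N * m * 2 ** k


# Base tones are the k = 1 frequencies for orders m = 9..19.
base_tones = {m: tone_frequency(m, 1) for m in range(9, 20)}
for m, f in base_tones.items():
    print(f"m = {m:2d}: {f:.2f} Hz")
```

Because 48000 / 4096 = 11.71875 is exactly representable in binary floating point, the computed values carry no rounding error until they are formatted for display.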

Each base tone defined in this way is further associated with the frequencies f obtained for values of 2 or more of the variable k (an integer), which serves as the harmonic order. Here, each base tone is associated with five frequencies f, for harmonic orders k = 2, 3, 4, 5, and 6. According to Equation 1, each of these is a harmonic lying above the base tone (k = 1) by the number of octaves given by the difference (k − 1) in harmonic order (hereinafter also called an octave harmonic). For example, relative to the base tone of the 9th-order sine wave (m = 9; 210.94 Hz), the octave harmonic of order k = 2 is at 421.88 Hz (twice the base-tone frequency), that of order k = 3 is at 843.75 Hz (four times), and so on up to that of order k = 6 at 6750.00 Hz (32 times). These lie one octave, two octaves, and so on up to five octaves above the base tone.

In the present embodiment, one phoneme is formed by setting an appropriate relationship between the level of the base tone (k = 1) and the levels of its octave harmonics (k = 2 to 6), and then combining those octave harmonics with the base tone.
Combining the octave-harmonic components with the base-tone component (k = 1) in a single phoneme means, first, that the timbre of the phoneme can be set through the level relationship among the components. This adds an element of timbre to the measurement sound played as a melody of phonemes, so the phoneme sequence output as the measurement sound becomes more musical.
In addition, when a phoneme made up of a base tone (k = 1) and its octave harmonics (k = 2 to 6) is analyzed in frequency, amplitudes are detected at six frequencies in total: that of the base tone and those of the five octave harmonics. Several frequencies are thus measured at once, which increases the number of measured frequencies within a given band and raises their density. Some speakers, for example, have a so-called dip, a characteristic in which the sound pressure level falls sharply in a particular frequency band. If the speaker happens to be of this kind and the frequency of the measurement sound falls inside the dip, the analysis does not observe sufficient amplitude and a reliable measurement result cannot be obtained. When different frequencies are combined simultaneously in the phoneme, as in the present embodiment, then even if one frequency component of the phoneme lies inside the dip, the components outside it are observed with sufficiently large amplitude, and the reliability of the measurement result is preserved.
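The six-frequency structure of one phoneme can be illustrated with a short sketch. It synthesizes the m = 9 phoneme from Equation 1 and reads back the amplitude at each component's DFT bin. The level table `LEVELS` (each octave harmonic at half the amplitude of the one below) is purely an assumed example, since the text does not state the level relationship it uses; the function names are likewise hypothetical.

```python
import cmath
import math

N = 4096  # samples per block

# Assumed linear amplitude for each harmonic order k (illustrative only).
LEVELS = {1: 1.0, 2: 0.5, 3: 0.25, 4: 0.125, 5: 0.0625, 6: 0.03125}


def synthesize_phoneme(m, levels, n=N):
    """Sum the base tone (k = 1) and its octave harmonics for order m."""
    block = [0.0] * n
    for k, amp in levels.items():
        cycles = m * 2 ** k  # whole number of periods in n samples (Equation 1)
        for i in range(n):
            block[i] += amp * math.sin(2 * math.pi * cycles * i / n)
    return block


def bin_amplitude(samples, k):
    """Amplitude at DFT bin k, scaled so a unit sine on that bin gives 1.0."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return 2 * abs(acc) / n


phoneme = synthesize_phoneme(9, LEVELS)
detected = {k: bin_amplitude(phoneme, 9 * 2 ** k) for k in LEVELS}
stray = bin_amplitude(phoneme, 19)  # a bin that carries no component
print(detected, stray)
```

Each component is recovered at its own bin with the level it was given, and a bin between components reads essentially zero, so a dip affecting one component leaves the other five measurable.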

For confirmation, each octave harmonic of harmonic order k ≥ 2 is, like the base tone (k = 1), a waveform that fits into the N samples with an integer number of periods. A phoneme made up of a base tone and its octave harmonics therefore still satisfies the rule of being "a waveform for which an integer number of periods fits into the N samples".
Also, while the base tone is an essential frequency component of a phoneme, a phoneme need not include all five octave harmonics for the harmonic orders 2 ≤ k ≤ 6 shown in FIG. 2.

In this case there are 11 different pitches available as phonemes, whose fundamental frequencies are the base tones for orders m = 9 to 19 in FIG. 2. However, if the output sequence of phonemes used as the measurement sound is to be melodic, the pitches (frequencies) of the phonemes should differ by intervals that correspond to the scale of some temperament.
Consider, then, the case where 12-tone equal temperament is adopted, and note that the base tone for m = 19 is 445.31 Hz. If A = 445 Hz is defined as the reference for the scale of absolute pitch names, the base tone for order m = 19, at 445.313 Hz, differs from it only slightly, so this base tone may be treated as the note A.

If the base-tone frequency of 445.313 Hz for order m = 19 is taken as the note A, the base tones that can be treated as notes belonging to this scale are as follows.
Base tone for order m = 10 (234.38 Hz) → A#
Base tone for order m = 12 (281.25 Hz) → C#
Base tone for order m = 15 (351.56 Hz) → F
Base tone for order m = 16 (375.00 Hz) → F#
Base tone for order m = 17 (398.44 Hz) → G
Base tone for order m = 18 (421.88 Hz) → G#

With 445.313 Hz regarded as A, the equal-tempered approximation frequencies shown in FIG. 2 are 235.896 Hz for A#, 280.529 Hz for C#, 353.445 Hz for F, 374.462 Hz for F#, 396.728 Hz for G, and 420.319 Hz for G#. The base tones for orders m = 10, 12, 15, 16, 17, and 18 are each close to the equal-tempered approximation frequency of A#, C#, F, F#, G, and G#, respectively, and can therefore be regarded as those notes.
Accordingly, in the case of FIG. 2, the phoneme obtained by combining the base tone for order m = 10 (234.38 Hz) with its octave harmonics is used as A#. Likewise, the phoneme based on the base tone for order m = 12 (281.25 Hz) is used as C#, that for order m = 15 (351.56 Hz) as F, that for order m = 16 (375.00 Hz) as F#, that for order m = 17 (398.44 Hz) as G, that for order m = 18 (421.88 Hz) as G#, and that for order m = 19 (445.31 Hz) as A.
It has also been confirmed in practice that, for the purpose of outputting the measurement sound as a melody, the scale formed by the phonemes selected in this way does not sound unnatural.
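The selection above can be reproduced by measuring how far each base tone lies from the nearest note of a 12-tone equal-tempered scale whose A is 445.3125 Hz. The sketch below is one way to do this and is not taken from the source. In particular, the 15-cent acceptance threshold `TOLERANCE_CENTS` is an assumption chosen for illustration, and the search is limited to m = 10 to 19, the roughly one-octave range listed in the text (m = 9 is simply the m = 18 tone one octave lower).

```python
import math

FS, N = 48_000, 4096
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def base_tone(m):
    """Base tone of order m: Equation 1 with k = 1."""
    return FS / N * m * 2


def nearest_note(freq, ref_a):
    """Return (pitch name, deviation in cents) on an equal-tempered scale
    whose note A has frequency ref_a."""
    semitones = 12 * math.log2(freq / ref_a)
    nearest = round(semitones)
    return NOTE_NAMES[nearest % 12], 100 * (semitones - nearest)


REF_A = base_tone(19)    # 445.3125 Hz, treated as A in the text
TOLERANCE_CENTS = 15     # assumed threshold, not stated in the source

usable = {}
for m in range(10, 20):
    name, cents = nearest_note(base_tone(m), REF_A)
    if abs(cents) <= TOLERANCE_CENTS:
        usable[m] = name
print(usable)
```

Under these assumptions the orders that pass are 10, 12, 15, 16, 17, 18, and 19, matching the seven pitch names in the text; the rejected orders (11, 13, 14) are all more than 25 cents from any note.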

FIG. 3 shows the frequency characteristics of the phonemes corresponding to the seven pitch names A#, C#, F, F#, G, G#, and A selected by the method described with reference to FIG. 2. As the figure shows, the measurement targets provided by these phonemes are 42 (= 7 × 6) frequencies distributed almost uniformly over the band from the lowest component, the base tone (k = 1) of pitch name A# (given as 235.896 Hz, its equal-tempered approximation; Equation 1 yields 234.38 Hz), to the highest component, the octave harmonic (k = 6) of pitch name A at 14250.00 Hz. This means that the frequency range under measurement contains as many measured frequencies as are needed, and that they are not concentrated in any part of the band. A stable and highly reliable measurement result is therefore obtained even in the presence of, for example, the speaker dip described earlier.
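The count and spread of the measured frequencies can be verified by enumerating them from Equation 1. This is a minimal sketch; `SELECTED_ORDERS` simply restates the seven orders chosen in the text, and the largest ratio between neighbouring frequencies is used here as one possible indicator of how evenly they cover a logarithmic frequency axis.

```python
FS, N = 48_000, 4096
SELECTED_ORDERS = [10, 12, 15, 16, 17, 18, 19]  # A#, C#, F, F#, G, G#, A

# Every component of every phoneme: k = 1 (base tone) through k = 6.
frequencies = sorted(FS / N * m * 2 ** k
                     for m in SELECTED_ORDERS
                     for k in range(1, 7))

largest_step = max(hi / lo for lo, hi in zip(frequencies, frequencies[1:]))
print(len(frequencies), frequencies[0], frequencies[-1], largest_step)
```

Computed this way, the 42 frequencies are all distinct, run from the A# base tone to the k = 6 harmonic of A, and no two neighbours are more than a major third (ratio 1.25) apart.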

The way phonemes are chosen in the present embodiment is based on the method described with reference to FIG. 2. However, if the description of FIG. 2 is followed as it stands, the notes available for forming a scale are, as stated above, only the seven notes A#, C#, F, F#, G, G#, and A of the 12-tone equal-tempered scale, within a range of roughly one octave. Given that a melody is to be created from a sequence of phonemes as the measurement sound, it is preferable to have as many usable notes as possible.

In the present embodiment, therefore, the phonemes usable for the melody of the measurement sound are actually determined as shown in FIG. 4, building on the method described with reference to FIG. 2.
Here, a sine wave with half the frequency (twice the period) of the fundamental sine wave shown in FIG. 1 is first defined as a virtual fundamental sine wave. The virtual base tones shown in FIG. 4 are then defined as the m-th order sine waves of this virtual fundamental sine wave.
In this case, the frequency f obtained from the m-th order sine wave is expressed as

f = (48000 / 4096) × m × 2^(k−1)   (Equation 2)

and the virtual base tone has the frequency f obtained by substituting k = 0 for each m-th order sine wave. As before, the frequency obtained by substituting k = 1 is the base tone. That is, with k = 0 the factor 2^(k−1) in Equation 2 becomes 1/2, so the virtual base tone has half the frequency of the corresponding k = 1 sine wave.
Here, 26 virtual-base-tone frequencies are taken as candidates, ranging from 105.469 Hz for m = 18 to 251.953 Hz for m = 43.
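Equation 2 and the candidate range can be checked in a few lines. This is a minimal sketch with a hypothetical function name, `tone_frequency_eq2`.

```python
FS, N = 48_000, 4096


def tone_frequency_eq2(m, k):
    """Equation 2: f = (FS / N) * m * 2**(k - 1), in Hz."""
    return FS / N * m * 2 ** (k - 1)


# Virtual base tones (k = 0) and real base tones (k = 1) for m = 18..43.
virtual_base = {m: tone_frequency_eq2(m, 0) for m in range(18, 44)}
real_base = {m: tone_frequency_eq2(m, 1) for m in virtual_base}

print(len(virtual_base), virtual_base[18], virtual_base[43], real_base[38])
```

The 26 candidates span 105.46875 Hz to 251.953125 Hz, and each real base tone sits exactly one octave above its virtual base tone.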

In this case, the frequencies for k = 1, 2, 3, 4, 5, and 6 are associated with each virtual base tone (k = 0) as its octave harmonics.

As described above, a virtual base tone is an m-th order sine wave of a virtual sine wave that has twice the wavelength (half the frequency) of the original fundamental sine wave shown in FIG. 1(a). An odd-order virtual base tone (m odd) therefore does not fit into the N samples with an integer number of periods. Moreover, although the k = 0 virtual base tone is defined as being derived from a virtual sine wave with twice the wavelength of the original fundamental sine wave, the actual generation process does not use waveform data for this virtual sine wave, so the virtual base tone cannot in practice be generated from the fundamental sine wave. For these reasons, in the present embodiment the virtual base tone itself is excluded from the components of the actual phonemes.
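The reason an odd-order virtual base tone is unusable as a measurement component can be seen numerically: with a half-integer number of periods in the block, energy leaks into bins that should be empty. The sketch below is illustrative only and uses hypothetical helper names; it compares the m = 19 virtual base tone (k = 0) with the m = 19 base tone (k = 1).

```python
import cmath
import math

N = 4096  # samples per block


def cycles_per_block(m, k):
    """Periods of the order-m tone that fit in one N-sample block (Equation 2)."""
    return m * 2 ** (k - 1)


def leakage(cycles, probe_bin, n=N):
    """Amplitude seen at DFT bin `probe_bin` for a unit sine that has
    `cycles` periods across n samples (rectangular window)."""
    acc = sum(math.sin(2 * math.pi * cycles * i / n)
              * cmath.exp(-2j * math.pi * probe_bin * i / n)
              for i in range(n))
    return 2 * abs(acc) / n


virtual_cycles = cycles_per_block(19, 0)   # 9.5 periods: not an integer
real_cycles = cycles_per_block(19, 1)      # 19 periods: an integer

# Probe a bin that neither tone occupies.
print(leakage(virtual_cycles, 25), leakage(real_cycles, 25))
```

The k = 1 tone leaves the probed bin empty to within rounding error, while the k = 0 tone produces a clearly non-zero reading there, which would be indistinguishable from background noise in the measurement.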

The components that can be obtained as real sounds for the phoneme of each sine-wave order m are therefore the octave harmonics from k = 1 upward. Accordingly, the actual base tone of a phoneme is the octave harmonic with the lowest value among k = 1 to 6, namely k = 1.
Compare the list of base tones formed by these k = 1 octave harmonics with the list of k = 1 base tones shown in FIG. 2. Because the list in FIG. 4 is built on virtual base tones at half the frequency of the original fundamental sine wave, it contains not only the frequencies of the k = 1 base tones of FIG. 2 but also the frequencies lying midway between them. In other words, the number of base tones in a given frequency range is roughly doubled.

In this case, noting that the base tone for m = 38 is 445.31 Hz, A = 445 Hz is again defined as the reference for the scale of absolute pitch names. By comparing the base-tone (k = 1) frequencies shown in FIG. 4 with the equal-tempered approximation frequencies for the A = 445 Hz reference, the base-tone frequencies can be associated with the notes shown in the "corresponding approximate absolute pitch name" column. That is, the following assignments can be made.
Base tone for order m = 19 (222.656 Hz) → A
Base tone for order m = 20 (235.896 Hz) → A#
Base tone for order m = 21 (249.923 Hz) → B
Base tone for order m = 24 (280.529 Hz) → C#
Base tone for order m = 27 (314.883 Hz) → D#
Base tone for order m = 30 (353.445 Hz) → F
Base tone for order m = 32 (374.462 Hz) → F#
Base tone for order m = 34 (396.728 Hz) → G
Base tone for order m = 36 (420.319 Hz) → G#
Base tone for order m = 38 (445.313 Hz) → A
Base tone for order m = 40 (466.164 Hz) → A#
Base tone for order m = 42 (493.883 Hz) → B
The frequencies in parentheses are reproduced as given. Apart from m = 19 and m = 38, they do not equal the values that Equation 2 yields at k = 1 (for example, 234.375 Hz for m = 20) and appear to be equal-tempered approximation frequencies.
By assuming virtual base tones in this way, the base tones one octave above them provide twelve usable notes of the 12-tone equal-tempered scale, from low to high: A, A#, B, C#, D#, F, F#, G, G#, A, A#, and B. Compared with the basic method of FIG. 2, the number of phoneme pitches available for creating a melody is thus increased.
For confirmation, in this case too, one phoneme is generated for each of the twelve notes by combining the octave harmonics k = 2 to 6 with the k = 1 base tone, as described with reference to FIG. 4.
The virtual base tone here has been taken to be a sine wave at the frequency f of the m-th order sine wave obtained by substituting k = 0 into Equation 2. As a concept of the present invention, however, the virtual base tone is not limited to a sine wave at half the frequency of the m-th order sine wave of the fundamental sine wave, as shown in FIG. 4. The virtual base tone may have the frequency of the m-th order sine wave obtained by substituting k = 0 or any negative integer for the variable k. In other words, the fundamental (m = 1) of the virtual base tones has a frequency of 1/(2^P) (P being a natural number) of that of the fundamental sine wave (specific frequency component) shown in FIG. 1(a).
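To see how closely the twelve listed orders follow equal temperament, the deviation of each k = 1 base tone (computed from Equation 2) from the nearest note can be expressed in cents. The following is a sketch under the assumption that A is exactly the m = 38 base tone (445.3125 Hz); the table `LISTED` restates the order-to-note assignments from the text, and the function names are hypothetical.

```python
import math

FS, N = 48_000, 4096
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Order m -> pitch name, as assigned in the text for FIG. 4.
LISTED = {19: "A", 20: "A#", 21: "B", 24: "C#", 27: "D#", 30: "F",
          32: "F#", 34: "G", 36: "G#", 38: "A", 40: "A#", 42: "B"}


def base_tone_eq2(m):
    """Base tone of order m: Equation 2 with k = 1."""
    return FS / N * m


REF_A = base_tone_eq2(38)  # 445.3125 Hz, assumed to be exactly A

deviation_cents = {}
nearest_name = {}
for m in LISTED:
    semitones = 12 * math.log2(base_tone_eq2(m) / REF_A)
    nearest = round(semitones)
    nearest_name[m] = NOTE_NAMES[nearest % 12]
    deviation_cents[m] = 100 * (semitones - nearest)

for m, name in LISTED.items():
    print(f"m = {m}: {name:2s} {deviation_cents[m]:+6.1f} cents")
```

Under this assumption every listed order lands on the note the text assigns to it. Most are within about 11 cents of equal temperament, while the two B notes (m = 21 and m = 42) come out roughly 27 cents flat.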

FIG. 5 schematically shows an example of the basic procedure for performing a measurement with the phonemes selected as melody elements of the measurement sound as described above.
FIG. 5(a) shows the measurement sound output sequence, that is, the timing at which phoneme signals are supplied to the audio signal output system so that the phonemes are output from the speaker as the measurement sound.
In this example, the phoneme corresponding to pitch F is first output twice in succession as the measurement sound, in period t0 to t3 and then period t3 to t6. Because one phoneme consists of sine-wave frequency components that fit into the N time-series samples with an integer number of periods, the output period of one phoneme (period t0 to t3, or period t3 to t6) also corresponds to N samples.
After the output of the pitch F phoneme ends at time t6, the phoneme corresponding to pitch A# is likewise output twice in succession, in period t6 to t9 and then period t9 to t12.
That is, a phoneme of one pitch is output by looping its N-sample signal twice.
Since the present embodiment uses N = 4096 samples and a sampling frequency Fs = 48 kHz, the time length corresponding to N samples is
4096 / 48000 ≈ 0.085 (seconds).
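The output sequence of FIG. 5(a) can be sketched as a simple schedule in units of samples. This is an illustration only: the function name `output_schedule` is hypothetical, and the sample offsets assume that t0 corresponds to sample 0 and that phonemes follow one another without gaps, as the figure description suggests.

```python
FS, N = 48_000, 4096

block_seconds = N / FS  # duration of one N-sample block


def output_schedule(pitches, repeats=2, n=N):
    """Return (pitch, start_sample, end_sample) for each phoneme, with each
    n-sample block looped `repeats` times back to back."""
    schedule = []
    start = 0
    for pitch in pitches:
        end = start + repeats * n
        schedule.append((pitch, start, end))
        start = end
    return schedule


schedule = output_schedule(["F", "A#"])
print(f"{block_seconds:.4f} s per block", schedule)
```

With two loops per phoneme, each pitch occupies 8192 samples (about 0.17 seconds), so the F phoneme spans t0 to t6 and the A# phoneme spans t6 to t12.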

The phoneme sound output from the speaker into the space in this way reaches a microphone placed at a suitable sound collection position at the timing shown in FIG. 5(b), and the arriving sound is picked up by the microphone.
As a comparison of the sound collection timing in FIG. 5(b) with the measurement sound output sequence in FIG. 5(a) shows, collection at the microphone of the phoneme output from time t0 begins at time t1, after a delay time Td has elapsed. The delay time Td consists of, for example, the so-called system delay, from when the phoneme signal is input to the audio signal output system until the sound is emitted from the speaker, and the spatial propagation delay, from when the sound leaves the speaker until it reaches the microphone, which depends on the positional relationship (distance) between the speaker and the microphone.
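The two contributions to Td can be combined in a small calculation. This is a sketch under stated assumptions: the speed of sound (343 m/s, roughly room temperature) and the example figures for system delay and distance are illustrative values, not taken from the source, and `delay_time` is a hypothetical helper.

```python
FS = 48_000             # sampling frequency in Hz
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def delay_time(system_delay_s, distance_m, fs=FS):
    """Return Td in seconds and in whole samples: system delay plus the
    propagation time from speaker to microphone."""
    td = system_delay_s + distance_m / SPEED_OF_SOUND
    return td, round(td * fs)


# Example: 5 ms of system delay and a microphone 3.43 m from the speaker.
td_seconds, td_samples = delay_time(0.005, 3.43)
print(td_seconds, td_samples)
```

In this example the propagation delay is 10 ms, so Td is 15 ms, or 720 samples at 48 kHz.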

In this case, the phoneme corresponding to pitch F is collected in period t1 to t7, as shown in FIG. 5(b). The length of this collection period t1 to t7 corresponds to the output period t0 to t6 of the pitch F phoneme. The collection period t1 to t7 divides into two equal parts, period t1 to t4 and period t4 to t7, each of which corresponds to N samples.
The phoneme corresponding to pitch A# is collected in period t7 to t13. This period can likewise be regarded as divided into two equal parts, period t7 to t10 and period t10 to t13, each corresponding to N samples.

To perform a measurement on the audio signal picked up by the microphone, that signal must be sampled to obtain a response signal. The sampling timing is shown in FIG. 5(c).
First, for the pitch F phoneme, which is output as two consecutive blocks of N samples in period t0 to t6, sampling starts at time t2, which is shifted by a sample delay time Tdrs from time t0, the time at which output of the pitch F phoneme begins. The sampling that starts at time t2 ends at time t5, when the time corresponding to N samples has elapsed. That is, N samples are taken. The period t2 to t5 falls within the period t1 to t7 in which the pitch F phoneme is collected. Sampling in period t2 to t5 therefore yields N samples of sampling data whose measurement target is the pitch F phoneme.
The next sampling proceeds in the same way as for pitch F: it starts at time t8, which is shifted by the sample delay time Tdrs from time t6, the time at which output of the pitch A# phoneme begins, and ends at time t11 after N samples have been taken. Sampling in period t8 to t11 therefore yields N samples of sampling data whose measurement target is the pitch A# phoneme output in period t6 to t12.

Here, the sample delay time Tdrs in FIG. 5 is the time from the point at which output of a phoneme starts to the start of the sampling period that yields sampling data with that phoneme as the measurement target. It determines the timing of the sampling period.
Tdrs should be set so that the sampling period reliably captures only the phoneme to be measured. Taking the pitch-F phoneme in FIG. 5 as an example, the sampling period t2 to t5 should fall reliably within the period t1 to t7, so that only the pitch-F phoneme is sampled and nothing outside the measurement target is sampled, such as the interval before time t1 in which no measurement sound is present, or the pitch-A# phoneme picked up after time t7. In this case, the sampling period t8 to t11 for the pitch-A# phoneme is determined by a sample delay time Tdrs of the same length as for the pitch-F phoneme, so that only the pitch-A# phoneme, which is obtained as the picked-up sound signal in the period t7 to t13, is captured as the measurement target.
In practice, Tdrs can be set by assuming the environment in which the acoustic correction apparatus of this embodiment is used, estimating the delay time Td expected in that environment, and basing Tdrs on that estimate. For example, if the acoustic correction apparatus is intended for an in-vehicle audio system, a single delay time Td can be derived from a typical car interior.
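The timing relationship described above can be sketched in a few lines. This is a minimal illustration, not the apparatus's implementation: the function names and the values chosen for Td and Tdrs (in samples) are hypothetical.

```python
def sampling_window(output_start, tdrs, n):
    """Return (start, end) sample indices of the sampling period."""
    start = output_start + tdrs
    return start, start + n


def window_fits(window, output_start, td, output_len):
    """True if the window lies inside the period in which the phoneme is picked up."""
    heard_start = output_start + td          # t1 for the first phoneme
    heard_end = heard_start + output_len     # t7 for the first phoneme
    return heard_start <= window[0] and window[1] <= heard_end


N = 4096
OUTPUT_LEN = 2 * N   # each phoneme is output as two consecutive N-sample blocks
TD = 300             # hypothetical acoustic delay Td, in samples
TDRS = 1000          # hypothetical sample delay time Tdrs, in samples

first = sampling_window(0, TDRS, N)            # pitch F: t2 to t5
second = sampling_window(OUTPUT_LEN, TDRS, N)  # pitch A#: t8 to t11
```

With these assumed values both windows sit inside their pickup periods, while a Tdrs of 0 (window starts before the sound arrives) or 5000 (window runs into the next phoneme) would fail the check.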

Note that the audio signal sampled in, for example, the sampling period t2 to t5 spans time t4, the junction between the two consecutive N-sample blocks, and so contains parts of both the first and the second block. Because exactly N samples are taken, however, the sampling data contains only frequency components that complete an integer number of cycles in N samples. That is, a frequency analysis result with no side lobes around the main lobe is obtained. Incidentally, even when N samples are taken, side lobes do occur if a phoneme that is not the measurement target is sampled (in FIG. 5, for example, if the sampling period t2 to t5 included time t7, so that the first part sampled the pitch-F phoneme and the latter part sampled the pitch-A# phoneme).
It also follows that the output period of a phoneme must be longer than the corresponding sampling period. In this embodiment, N time-series samples are the minimum unit of both the phoneme output period and the sampling period. To satisfy the above relationship, when the sampling period is N × a samples (a is a natural number), the corresponding phoneme output period is set to N × (a + b) samples (b is a natural number of 1 or more).
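The side-lobe behavior can be checked numerically. The sketch below uses a naive DFT from the Python standard library and a small N = 64 as a stand-in for the embodiment's N = 4096. The bin numbers (4 and 6 cycles per N samples) are arbitrary illustrative choices, not values from the embodiment.

```python
import cmath
import math


def dft_magnitudes(x):
    """Naive DFT magnitude spectrum for bins 0..N/2, scaled so a unit sine reads 1.0."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / (n / 2)
        for k in range(n // 2 + 1)
    ]


N = 64  # small stand-in for N = 4096

# Window containing one phoneme only: 4 full cycles in N samples.
clean_signal = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
clean = dft_magnitudes(clean_signal)

# Window straddling two phonemes: 4 cycles/N in the first half, 6 cycles/N in the second.
mixed_signal = [
    math.sin(2 * math.pi * (4 if t < N // 2 else 6) * t / N) for t in range(N)
]
mixed = dft_magnitudes(mixed_signal)

# Largest amplitude found outside the bins the phonemes themselves occupy.
leak_clean = max(m for k, m in enumerate(clean) if k != 4)
leak_mixed = max(m for k, m in enumerate(mixed) if k not in (4, 6))
```

Here `leak_clean` is at the level of floating-point rounding, while `leak_mixed` is a substantial fraction of the signal amplitude, which is the side-lobe effect the text warns about.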

FIG. 6 schematically shows an example of the band characteristics obtained by FFT frequency analysis of the response signal sampled by the procedure of FIG. 5. It shows the result for a single-tone measurement target, that is, a phoneme corresponding to only one pitch.
When a measurement target sound consisting of a single phoneme is picked up, sampled, and analyzed by FFT, amplitude values are obtained, as illustrated, for the base sound (k = 1) and for the second-order (k = 2), third-order (k = 3), fourth-order (k = 4), fifth-order (k = 5), and sixth-order (k = 6) octave harmonics.

In this embodiment, a measurement sound whose phonemes are sine waves completing an integer number of cycles in N samples is output and picked up, and the picked-up audio signal is likewise sampled over N samples. As follows from the description so far, if the sampling data were an ideal audio signal containing only the phoneme, the FFT result would have values only at the measurement target frequencies forming the phoneme, as main lobes, with no side lobes.

In an actual FFT result, however, amplitudes are also detected at frequencies around the measurement target frequencies (the base sound and each octave harmonic), as shown in FIG. 6. Since FFT analysis of a phoneme-only signal should show no amplitude at frequencies other than those forming the phoneme, the amplitudes at other frequencies can be regarded as the so-called background noise of the measurement environment. As noted above, this embodiment obtains such an analysis result without applying window function processing.

Based on the analysis result of FIG. 6, this embodiment obtains, for example, the ratio between the level at each measurement target frequency and the level of the background noise at its neighboring frequencies. That is, the S/N ratio is obtained with the amplitude at the measurement target frequency as the signal (S) and the amplitude of the background noise as the noise (N).
The method of calculating the S/N ratio is not particularly limited as long as it is based on these two amplitudes. Here, the noise level compared against the measurement target frequency is the largest background-noise amplitude among the neighboring frequencies. Taking the base sound in FIG. 6 as an example, suppose its amplitude is L1 and the neighboring background noise has amplitude L2a on the lower-frequency side and L2, which is higher than L2a, on the higher-frequency side. L2 is then adopted as the background-noise amplitude, and the S/N ratio is obtained by, for example, calculating L2/L1.
The S/N ratio is calculated in the same way for each octave harmonic other than the base sound. This yields S/N ratio information for six target frequency bands, corresponding to the base sound and the second- to sixth-order harmonics.
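As a rough sketch of this calculation, the function below picks the largest neighboring amplitude as L2 and forms the quotient L2/L1 given in the text. The function name, the width of the neighborhood, and the example spectrum are assumptions made for illustration only.

```python
def noise_to_signal(spectrum, target_bin, neighbourhood=2):
    """Return (L1, L2, L2/L1) for one measurement target frequency bin.

    L1 is the amplitude at the target bin. L2 is the largest amplitude among
    the bins within `neighbourhood` bins on either side (hypothetical width).
    """
    l1 = spectrum[target_bin]
    lo = max(0, target_bin - neighbourhood)
    hi = min(len(spectrum), target_bin + neighbourhood + 1)
    l2 = max(spectrum[k] for k in range(lo, hi) if k != target_bin)
    return l1, l2, l2 / l1


# Hypothetical amplitude spectrum: target at bin 3, louder noise on the high side.
example = [0.01, 0.02, 0.03, 1.0, 0.05, 0.02, 0.01]
l1, l2, ratio = noise_to_signal(example, target_bin=3)
```

Calling the same function once per target bin (base sound and each harmonic) would give the six per-band values the text describes.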

Other methods of obtaining the S/N ratio (noise evaluation) are also conceivable. One is to weight the amplitude value at each target frequency according to a Log (exponential function) correspondence and then compare it with the amplitude at the noise frequencies. The weighting coefficient may be adapted to each target frequency and changed according to a predetermined rule.
Alternatively, the amplitudes of the noise at the frequencies neighboring the target frequency may be averaged, and the S/N ratio calculated from this average and the amplitude at the target frequency.
It is also conceivable, in calculating the S/N ratio, to compare amplitudes on a linear axis rather than as dB values.
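Two of these alternatives, averaging the neighboring noise and comparing in dB rather than linearly, might look as follows. The helper names, the neighborhood width, and the example spectrum are hypothetical, and the 20·log10 conversion is the usual amplitude-to-dB convention rather than something the text specifies.

```python
import math


def mean_neighbour_noise(spectrum, target_bin, neighbourhood=2):
    """Average amplitude of the bins around the target bin (target excluded)."""
    lo = max(0, target_bin - neighbourhood)
    hi = min(len(spectrum), target_bin + neighbourhood + 1)
    neighbours = [spectrum[k] for k in range(lo, hi) if k != target_bin]
    return sum(neighbours) / len(neighbours)


def to_db(amplitude):
    """Convert a linear amplitude to decibels."""
    return 20 * math.log10(amplitude)


example = [0.01, 0.02, 0.03, 1.0, 0.05, 0.02, 0.01]
signal = example[3]
noise = mean_neighbour_noise(example, target_bin=3)

linear_ratio = signal / noise            # comparison on a linear axis
db_gap = to_db(signal) - to_db(noise)    # the same comparison in dB
```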

With the method described with reference to FIG. 4, phonemes corresponding to a total of 12 pitches are available for outputting the measurement sound as a melody. To create a melody from the measurement sound (a measurement sound melody), phonemes of any desired pitches are selected from these 12 and combined.

FIG. 7 shows an example output pattern of phonemes when a measurement sound melody is created using, as selection candidates, the phonemes of the 12 pitches selected by the method described with reference to FIG. 4.
One unit of the measurement sound melody output period shown in FIG. 7 is divided, in time order, into a primary analysis mode, a secondary analysis mode, and a non-analysis mode. As in the case described with reference to FIG. 5, one phoneme output period Ta consists of two consecutive blocks of N samples. With N = 4096 samples and a sampling frequency Fs = 48 kHz, as in this embodiment, the length of the output period Ta is
4096/48000 × 2 = 0.17 (seconds).
The sampling timing (sampling period) for this melody output also takes N samples, as described with reference to FIG. 5, and is determined by the sample delay time Tdrs chosen as described there. That is, the sampling timing is set so that only the phoneme output in each output period Ta is sampled, and the phonemes output in the preceding and following output periods Ta are not.
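The arithmetic for Ta can be reproduced directly. The snippet also derives two quantities that follow from the same numbers: the spacing between FFT bins, and the length of one melody unit under the structure described for FIG. 7 (three modes of four output periods each). The variable names are illustrative.

```python
N = 4096                 # samples per block
FS = 48_000              # sampling frequency in Hz
BLOCKS_PER_PHONEME = 2   # each phoneme is output as two consecutive blocks

ta_seconds = BLOCKS_PER_PHONEME * N / FS   # length of one output period Ta
bin_spacing_hz = FS / N                    # frequency resolution of an N-point FFT
melody_seconds = 3 * 4 * ta_seconds        # 3 modes x 4 output periods each
```

`ta_seconds` evaluates to about 0.1707, which rounds to the 0.17 seconds quoted above.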

In FIG. 7, each period in which a phoneme is output is also labeled with the speaker channel under measurement that is selected to output it. The speaker channels here are the center channel (C), front left channel (L), front right channel (R), left surround channel (Ls), right surround channel (Rs), left back surround channel (Bsl), and right back surround channel (Bsr), seven channels in all. That is, the largest channel configuration that the acoustic correction apparatus of this embodiment can actually handle is a seven-channel audio system of this kind.

In the measurement sound output sequence shown in FIG. 7, the primary analysis mode consists of four consecutive output periods Ta. In the first output period Ta, only the phoneme of pitch G# is output, from the center channel (C). In the second output period Ta, the phoneme of pitch F is output from the front left channel (L) and the phoneme of pitch G# from the front right channel (R). In the third output period Ta, the phoneme of pitch C# is output from the left surround channel (Ls) and the phoneme of pitch F# from the right surround channel (Rs). In the fourth and final output period Ta, the phoneme of pitch C# is output from the left back surround channel (Bsl) and the phoneme of pitch G# from the right back surround channel (Bsr).

The secondary analysis mode likewise consists of four consecutive output periods Ta, and in each output period a phoneme of a specific pitch is output from a specific speaker channel, as illustrated.
In this case the non-analysis mode also consists of four consecutive output periods Ta, and in each output period a phoneme of a specific pitch is output from a specific speaker channel, as illustrated.

With the output sequence shown in FIG. 7, a measurement sound (phoneme) of some pitch is output from the speaker of each of the seven channels in both the primary analysis mode and the secondary analysis mode. Within the range of channel configurations that the acoustic correction apparatus of this embodiment supports, every speaker is therefore measured, without omission, in both modes.
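The coverage property can be expressed as a small check over the channel assignment of the four output periods of the primary analysis mode. The data structure and function name are illustrative assumptions. Only the channel grouping per output period is taken from the description of FIG. 7.

```python
ALL_CHANNELS = {"C", "L", "R", "Ls", "Rs", "Bsl", "Bsr"}

# Channels driven in each of the four output periods Ta of the primary analysis mode.
PRIMARY_SLOTS = [["C"], ["L", "R"], ["Ls", "Rs"], ["Bsl", "Bsr"]]


def covered_channels(slots):
    """Set of channels that output a phoneme in at least one slot."""
    return {channel for slot in slots for channel in slot}


def measured_exactly_once(slots):
    """True if no channel appears in more than one slot."""
    flat = [channel for slot in slots for channel in slot]
    return len(flat) == len(set(flat))
```

A schedule for a smaller speaker layout could be validated the same way by shrinking `ALL_CHANNELS` and the slot lists.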

In some output periods Ta, phonemes of different pitches are output from a plurality of speakers, so that what is heard in the space is a chord. In this embodiment, the phonemes are thus combined in both the time direction and the scale direction to obtain the required output pattern, which makes the output of the measurement sound musical.
Even when the phonemes of the measurement sound are output as a chord, picking up the sound and analyzing it by FFT yields the amplitudes of the frequency components (base sound and octave harmonics) of each phoneme forming the chord, so the measurement processing is not hindered.
Moreover, having periods in which a chord is output makes the melody formed by the measurement sound more musical and more entertaining for the user.
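That a chord does not hinder the analysis can be illustrated with two sine waves on different integer-cycle bins. The sketch uses a naive DFT, a small N = 64, and hypothetical per-speaker gains. Real phonemes would also carry octave harmonics, which are omitted here.

```python
import cmath
import math


def dft_magnitudes(x):
    """Naive DFT magnitude spectrum for bins 0..N/2, scaled so a unit sine reads 1.0."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / (n / 2)
        for k in range(n // 2 + 1)
    ]


N = 64                      # small stand-in for N = 4096
GAIN_A, GAIN_B = 1.0, 0.5   # hypothetical levels arriving from two speakers

speaker_a = [GAIN_A * math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
speaker_b = [GAIN_B * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

# The microphone picks up the sum of both speakers.
picked_up = [a + b for a, b in zip(speaker_a, speaker_b)]
spectrum = dft_magnitudes(picked_up)
```

Bins 4 and 5 of `spectrum` return the two gains separately, so each speaker's level can still be read off even though the sounds overlap in time.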

In the primary analysis mode, the level of the phoneme to be output from each speaker in the secondary analysis mode is determined from the frequency analysis (FFT) result for the phoneme output from that speaker. The secondary analysis mode can thus output the measurement sound (phoneme) from each speaker at a level appropriate for the preparatory measurement. In the secondary analysis mode, too, the phonemes to be measured are output from each speaker as shown in FIG. 7 and analyzed by FFT, and the result of the preparatory measurement is obtained from these analysis results.

In obtaining measurement results from the processing of the primary and secondary analysis modes, the S/N ratio calculated from the amplitude at the measurement target frequency and the amplitude of the background noise at the surrounding frequencies, as described with reference to FIG. 6, can be used.
Based on the S/N ratio information, various determinations and settings such as the following can be made as measurement results.
First, by using together the S/N ratios for the individual frequency components forming the phoneme output from a speaker, the reproduction frequency band characteristic of that speaker can be estimated. Because the output sound pressure level for a given input level varies with the speaker's diameter, the speaker diameter can also be estimated. And, naturally, if a phoneme is output to a speaker with sufficient gain but the S/N ratio analyzed from its response signal is below a certain value, with a signal (S) level so small that almost nothing is obtained, it can be determined that the speaker is not connected. That is, the audio channel configuration of the audio system can be determined.
This embodiment is described as applied to the preparatory measurement that precedes the main measurement. The level (gain) of the measurement sound appropriate for obtaining a more accurate frequency response in the main measurement can be estimated and set (in the main measurement, the measurement sound is not necessarily the phoneme-based measurement sound of this embodiment). As processing in the primary analysis mode, the information can also be used to set the synthesis balance of the frequency components and the output level (gain) of the phonemes to be output from each speaker in the secondary analysis mode.
Further, if the S/N ratio is below a certain value because, for example, the noise amplitude is very large, it can be determined that the environment does not allow a meaningful measurement. In response to such a determination, the apparatus might, for example, suspend the measurement and output a message, by display or other means, prompting the user to improve the listening environment.
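Two of these determinations, a missing speaker and an unusably noisy environment, could be sketched as a simple decision function. All three thresholds are hypothetical placeholders, since the text only says "a certain value", and the ratio is taken here as signal amplitude over noise amplitude.

```python
def judge(signal, noise, min_signal=0.01, max_noise=0.5, min_snr=2.0):
    """Classify one channel's measurement from its signal and noise amplitudes.

    The thresholds are illustrative assumptions, not values from the embodiment.
    """
    snr = signal / noise if noise > 0 else float("inf")
    if snr < min_snr and noise >= max_noise:
        # Noise dominates: a meaningful measurement is not possible here.
        return "environment too noisy"
    if snr < min_snr and signal < min_signal:
        # Almost no signal despite driving the speaker: treat it as absent.
        return "speaker not connected"
    return "ok"


results = {
    "C": judge(signal=1.0, noise=0.05),
    "Bsl": judge(signal=0.001, noise=0.02),
}
```

Collecting the per-channel verdicts, as in `results`, gives a picture of the channel configuration in the sense described above.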

In FIG. 7, in the non-analysis mode that follows the secondary analysis mode, phonemes of pitch G# are output from the three speakers of the center channel (C), front left channel (L), and front right channel (R) over four consecutive output periods Ta, as illustrated. At the same time, phonemes of pitch F are output from the speakers of the left surround channel (Ls) and right surround channel (Rs), and phonemes of pitch C# from the speakers of the left back surround channel (Bsl) and right back surround channel (Bsr).
In the non-analysis mode, the response signals for these phonemes are not sampled. That is, no frequency analysis or measurement based on the phonemes output at this time is performed.
Over the measurement sound melody output period formed by the primary analysis mode, secondary analysis mode, and non-analysis mode in succession, the sound from the seven speaker channels is melodic: as the phoneme output pattern of FIG. 7 shows, the pitch changes with the output period Ta as the shortest note. In the non-analysis mode, the triad C#, F, G# is output like a whole note, giving the melody a sense of ending. The non-analysis mode is thus not used for measurement. It outputs phonemes to enhance the musical quality of the measurement sound melody. It follows that, in this embodiment, not every phoneme output from the speakers needs to be sampled and analyzed as a response signal.

FIG. 8 is a flowchart showing the flow of processing for the preparatory measurement according to the measurement sound melody output sequence shown in FIG. 7.
In this procedure, background noise is first checked in step S101. For this check, no phoneme is output, and the audio signal picked up by the microphone is sampled and analyzed by FFT. Looking at the amplitude of the background noise shows whether background noise is present. In an ordinary listening environment, background noise is never entirely absent. So if the check in step S101 finds no background noise, it can be presumed that the sound pickup microphone is not connected to the acoustic correction apparatus. In practice, when the check in step S101 finds no background noise, the apparatus outputs a message, by display, voice, or the like, prompting the user to connect the microphone. If the check finds that background noise is present, the microphone is connected, and the procedure advances to step S102 and the steps that follow.
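The microphone check of step S101 reduces to asking whether the captured signal contains anything at all. The sketch below simplifies the text's FFT-based check to a time-domain peak test. The function name and the silence floor are hypothetical.

```python
def microphone_seems_connected(samples, silence_floor=1e-4):
    """Return True if the capture shows any background noise above the floor.

    A capture that stays at or below `silence_floor` (hypothetical value) is
    treated as evidence that no microphone is connected.
    """
    peak = max(abs(s) for s in samples)
    return peak > silence_floor


disconnected_capture = [0.0] * 16                 # nothing picked up at all
room_capture = [0.0, 0.003, -0.002, 0.001]        # faint background noise
```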

Step S102 corresponds to the first output period Ta of the primary analysis mode. That is, it outputs the phoneme of pitch G# from the center channel (C) speaker. To do so, a phoneme of pitch G# with N (= 4096) samples is first generated. The generated phoneme is then output continuously by looping it twice. The audio signal of the pitch-G# phoneme is thereby reproduced for a length of time corresponding to twice N samples, that is, the output period Ta.
Step S103 performs the measurement processing of the primary analysis mode on the phoneme output in step S102. Sampling is performed at the timing at which the sample delay time Tdrs has elapsed from the start of phoneme output in step S102, to obtain a response signal. FFT is performed on this response signal, the S/N ratio is calculated as described with reference to FIG. 6, and the required determination or setting is made on the basis of the S/N ratio. In other words, the measurement processing for the primary analysis is performed and its measurement result is obtained. For example, since the response signal obtained in step S103 comes from the center channel (C) speaker, the gain of the audio signal is set according to the sound pressure level of the measurement sound to be output from the center channel (C) speaker in the subsequent secondary analysis mode.
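The generate-then-loop step of S102 can be sketched as follows. A single sine wave stands in for the phoneme, and the choice of 35 cycles per N samples is a hypothetical example, not the embodiment's value for G#. The point is that an integer number of cycles makes the looped blocks join without a discontinuity.

```python
import math


def make_phoneme(cycles, n):
    """One N-sample block of a sine completing `cycles` whole periods."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


def loop(block, times=2):
    """Repeat the block back to back to fill one output period Ta."""
    return block * times


N = 4096
block = make_phoneme(cycles=35, n=N)   # 35 cycles per block: illustrative choice
output = loop(block)

# Sample-to-sample step across the seam versus the step at the start of a block.
seam_step = output[N] - output[N - 1]
normal_step = output[1] - output[0]
```

`seam_step` and `normal_step` agree to within rounding, which confirms that the second block continues the waveform exactly where the first one ends.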

Step S104 corresponds to the second output period Ta of the primary analysis mode. As in step S102, two phonemes (N samples each) corresponding to the pitches F and G# are generated, and each is looped twice and output from the front left channel (L) and the front right channel (R), respectively.
In step S105, as in step S103, the phonemes output in step S104 are sampled, and the measurement processing of the primary analysis mode is executed to obtain the measurement result.

Step S106 corresponds to the third output period Ta of the primary analysis mode. As in step S102, two phonemes (N samples each) corresponding to the pitches C# and F are generated, and each is looped twice and output from the left surround channel (Ls) and the right surround channel (Rs), respectively.
In step S107, as in step S103, the phonemes output in step S106 are sampled, and the measurement processing of the primary analysis mode is executed to obtain the measurement result.

Step S108 corresponds to the fourth and final output period Ta of the primary analysis mode. As in step S102, two phonemes (N samples each) corresponding to the pitches C# and G# are generated, and each is looped twice and output from the left back surround channel (Bsl) and the right back surround channel (Bsr), respectively.
In step S109, as in step S103, the phonemes output in step S108 are sampled, and the measurement processing of the primary analysis mode is executed to obtain the measurement result.
By the procedure up to step S109, the measurement results of the primary analysis mode have been obtained for each of the seven audio channels. At this stage, for example, the gain of the audio signal to be output from the speaker of each audio channel in the subsequent secondary analysis mode has already been set.

The subsequent steps S110 to S117 correspond here to the secondary analysis mode. Step S110, the first step of the secondary analysis mode, corresponds to the first output period Ta of that mode. Here again, in accordance with step S102, a phoneme of N samples corresponding to the pitch A# is generated and output twice in succession.
In the next step S111, in accordance with step S103, the phoneme output in step S110 is sampled to obtain a response signal, an FFT is performed on it, and the measurement process for the secondary analysis mode is executed using the FFT analysis result. In this case as well, the measurement process uses the S/N ratio calculated from the amplitude value of the target frequency obtained by the FFT and the amplitude value of a frequency regarded as background noise. As a measurement result, for example, it is first determined whether the speaker that was supposed to output the phoneme (measurement sound) is present (in step S111, this is the center channel speaker). If the speaker is determined to be present, the sound pressure level to be output from the center channel speaker during the main measurement, that is, the signal level of the measurement sound, is set. This setting also takes into account determination results such as whether clipping has occurred in the audio signal output from the speaker.
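The S/N-based presence check described for step S111 can be illustrated with a minimal sketch. It is not the patent's implementation: the single-bin DFT stands in for the FFT, and the function names, the choice of noise bins, and the 20 dB threshold are all assumptions made for illustration.

```python
import cmath
import math

def dft_bin_amplitude(samples, bin_index):
    """Amplitude of one DFT bin (a single-bin stand-in for a full FFT)."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * bin_index * i / n)
              for i, s in enumerate(samples))
    return abs(acc) * 2.0 / n

def snr_db(samples, target_bin, noise_bins):
    """Target-bin amplitude over the mean noise-bin amplitude, in dB."""
    floor = 1e-12  # guard against log10(0) and division by zero
    signal = max(dft_bin_amplitude(samples, target_bin), floor)
    noise = sum(dft_bin_amplitude(samples, b) for b in noise_bins) / len(noise_bins)
    return 20.0 * math.log10(signal / max(noise, floor))

def speaker_present(samples, target_bin, noise_bins, threshold_db=20.0):
    """Hypothetical rule: the speaker counts as present if the S/N ratio
    clears threshold_db (the threshold value is an assumption)."""
    return snr_db(samples, target_bin, noise_bins) >= threshold_db

N = 256
tone = [0.5 * math.sin(2 * math.pi * 8 * i / N) for i in range(N)]  # response at bin 8
silence = [0.0] * N                                                 # no speaker connected
```

In this sketch, a response containing the target tone clears the threshold, while an all-zero response does not, which corresponds to the "speaker absent" outcome.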

The subsequent step S112 corresponds to the second output period Ta in the secondary analysis mode. In accordance with the description of step S102, two phonemes (number of samples N) corresponding to the pitches D# and A# are generated, each is looped twice, and they are output from the front left channel (L) and the front right channel (R).
In step S113, in accordance with step S103, the phonemes output in step S112 are sampled, and the measurement process for the secondary analysis mode is executed to obtain a measurement result.

Step S114 corresponds to the third output period Ta in the secondary analysis mode. In accordance with step S102, two phonemes (number of samples N) corresponding to the pitches F# and D# are generated, each is looped twice, and they are output from the left surround channel (Ls) and the right surround channel (Rs).
In step S115, in accordance with step S103, the phonemes output in step S114 are sampled, and the measurement process for the secondary analysis mode is executed to obtain a measurement result.

Step S116 corresponds to the last (fourth) output period Ta of the secondary analysis mode. In accordance with step S102, two phonemes (number of samples N) corresponding to the pitches G and A# are generated, each is looped twice, and they are output from the left surround channel (Ls) and the right surround channel (Rs).
In step S117, in accordance with step S103, the phonemes output in step S116 are sampled, and the measurement process for the secondary analysis mode is executed to obtain a measurement result.
Once the procedure up to this point is complete, the measurement sound output, the acquisition of response signals by sampling, and the FFT analysis for the secondary analysis mode have all been carried out. As a result, for example, the presence or absence of each of the seven channel speakers (that is, the audio channel configuration of the audio system) has been determined, and the output level of the measurement sound for each speaker during the main measurement has also been set.

When the measurement sound output sequence of FIG. 7 is followed, the secondary analysis mode is followed by the procedure of step S118, which corresponds to the non-analysis mode. That is, phonemes corresponding to the pitches G#, F, and C# are generated. The phoneme corresponding to the pitch G# is output from the center channel (C), the front left channel (L), and the front right channel (R). The phoneme corresponding to the pitch F# is output from the left surround channel (Ls) and the right surround channel (Rs). The phoneme corresponding to the pitch C# is output from the left back surround channel (Bsl) and the right back surround channel (Bsr). The phonemes for these pitches are output simultaneously at the timing corresponding to the output period Ta. As can also be seen from FIG. 7, the twice-consecutive output of N samples is repeated four times, corresponding to four consecutive output periods Ta.

Following the measurement sound output for the non-analysis mode in step S118, a comprehensive determination process is performed, shown as step S119, based on the analysis and measurement results obtained so far. The analysis and measurement up to this point have been performed individually for each phoneme output in an output period Ta. Therefore, even if an error has arisen in the measurement result of a certain channel for some reason, it may not be possible to identify that error from the analysis and measurement results of that channel alone.
In step S119, therefore, the analysis and measurement results obtained so far are compared and cross-referenced as a whole to determine whether such a local error has occurred. Alternatively, the parameters set for the individual channels may be reset to more suitable values, taking the balance among them into account.
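The source does not specify how the cross-channel comparison of step S119 is carried out. As one possible illustration only, the sketch below flags channels whose measured level differs strongly from the median across channels. The function name, the use of the median, the 12 dB tolerance, and the sample values are all hypothetical.

```python
from statistics import median

def flag_outlier_channels(levels_db, tolerance_db=12.0):
    """Return the channels whose level deviates from the median of all
    channels by more than tolerance_db (an assumed criterion)."""
    mid = median(levels_db.values())
    return sorted(ch for ch, lv in levels_db.items() if abs(lv - mid) > tolerance_db)

# Hypothetical per-channel measurement levels in dB.
levels = {"C": -20.0, "L": -21.0, "R": -19.5, "Ls": -22.0, "Rs": -45.0}
suspect = flag_outlier_channels(levels)
```

A result that looks plausible in isolation (here, the Rs level) becomes suspicious only when it is compared against the other channels, which is the point the description makes about local errors.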

FIG. 9 shows a configuration example of an entire system consisting of the sound correction apparatus of the present embodiment and an audio system connected to it. As described above, the sound correction apparatus of the present embodiment is a so-called retrofit kit and, within a certain range of conditions, can be used with a variety of compatible models. This figure takes as an example the case in which the audio system connectable to the sound correction apparatus 2 of the present embodiment is part of an AV (Audio Video) system capable of video reproduction as well as audio reproduction.

First, the AV system 1 in this case comprises a media playback unit 11, a video display device 12, a power amplifier unit 13, and speakers 14.
The media playback unit 11 plays back, for example, a medium on which data is recorded as video/audio content, and reproduces and outputs a video signal and an audio signal. Here, the media playback unit 11 is assumed to output digital video and audio signals.
The type, format, and so on of the media played back by the media playback unit 11 are not particularly limited; at present, for example, a DVD (Digital Versatile Disc) can be considered. When the media playback unit 11 is configured to support DVD, it reads the data recorded on the loaded DVD as video/audio content and obtains, for example, the video data and audio data that are to be reproduced and output simultaneously. In the current DVD format, the video data and audio data are compression-encoded according to a method compliant with the DVD standard, so a decoding process is applied to the compression-encoded video data and audio data. The digital video signal and digital audio signal obtained by this decoding process are then output with their playback times synchronized.

The media playback unit 11 may also have a so-called multimedia configuration that can play back, for example, audio CDs in addition to DVDs. It may also be configured as a standalone television tuner that receives and demodulates television broadcasts and outputs a video signal and an audio signal. Alternatively, it may combine the function of a television tuner with the function of playing back packaged media.

When the media playback unit 11 supports multiple audio channels, it outputs the reproduced audio signals over a plurality of signal lines, one for each audio channel.
For example, when the media playback unit 11 supports the seven channels illustrated in FIG. 7, namely the center channel (C), front left channel (L), front right channel (R), left surround channel (Ls), right surround channel (Rs), left back surround channel (Bsl), and right back surround channel (Bsr), it outputs audio signals over seven lines, one for each of these channels.

Considering the AV system 1 on its own, the video signal output from the media playback unit 11 is input to the video display device 12, and the audio signal is input to the power amplifier unit 13.
The video display device 12 displays an image based on the input video signal. The display device actually used as the video display device 12 is not particularly limited; at present, for example, various display devices such as a CRT (Cathode Ray Tube), LCD (Liquid Crystal Display), or PDP (Plasma Display Panel) can be used.

The power amplifier unit 13 amplifies the input audio signal and outputs a drive signal for driving a speaker. The power amplifier unit 13 in this case includes a plurality of power amplifier circuit systems corresponding to the audio channel configuration supported by the AV system 1. Each of these power amplifier circuits amplifies the audio signal of its channel and outputs a drive signal to the speaker 14 corresponding to that channel. Accordingly, a plurality of speakers 14 are also provided, according to the audio channel configuration supported by the AV system 1. For example, when the AV system 1 supports the seven channels described above, the power amplifier unit 13 includes seven power amplifier circuit systems. Seven speakers 14 are likewise provided, one for each channel, and each is placed at an appropriate position in the listening environment.
The drive signal obtained by amplifying each channel's audio signal in the power amplifier unit 13 is supplied to the speaker 14 of the appropriate channel, and the speaker 14 outputs the sound of that channel into the space. The sound of the content is thereby reproduced and output so as to form a sound field corresponding to the multi-channel configuration. For the sake of confirmation, the sound reproduced and output from the speakers in this way remains synchronized with the image displayed on the video display device 12 according to the video signal (so-called lip sync).

In the AV system 1 itself, the media playback unit 11, the video display device 12, the power amplifier unit 13, and the speakers 14 may, for example, be separate units forming a component AV system, or at least two of these parts may be integrated in a unit-type device configuration.

When the sound correction apparatus 2 of the present embodiment is retrofitted to such an AV system 1, it is connected, as illustrated, so that the audio signal output from the media playback unit 11 is input to it. As shown in FIG. 7, for example, the sound correction apparatus 2 supports up to seven channels: the center channel (C), front left channel (L), front right channel (R), left surround channel (Ls), right surround channel (Rs), left back surround channel (Bsl), and right back surround channel (Bsr). To handle up to these seven channels, it has, for example, seven audio signal input terminals. An actual AV system usually has a subwoofer channel in addition to these seven audio channels, but it is omitted here to keep the description simple.
When, for example, the AV system 1 supports only the L and R stereo channels, the L and R audio signals output from the media playback unit 11 need only be connected to the input terminals for the front left channel (L) and the front right channel (R) among the input terminals for the seven audio channels.
The sound correction apparatus 2 is also provided with audio signal output terminals capable of outputting audio signals for up to the seven channels. The audio signal outputs of the sound correction apparatus 2 are connected to the audio signal input terminals for the corresponding channels in the power amplifier unit 13.

As described above, when the audio information read from the media has been compression-encoded, for example, the media playback unit 11 decodes it and outputs it as a digital audio signal. This is because the audio signal information handled by the sound correction apparatus 2 of the present embodiment should be an audio signal in decoded form, with any compression coding already undone. Consequently, the sound correction apparatus 2 of the present embodiment does not need an encoder/decoder for compression-encoded audio signals.
Likewise, the measurement sound that the sound correction apparatus 2 outputs to the power amplifier unit 13 need only be generated as a signal in the decoded format, so reproducing the measurement sound does not require encoder/decoder processing for compression coding or the like either.

The sound correction apparatus 2 in this case can also input and output a video signal. The video signal system is connected so that the video signal output from the media playback unit 11 is input to the sound correction apparatus 2 and then output to the video display device 12.
In the sound correction apparatus 2, as with the audio signal, the video signal is also processed as a digital video signal in the format after compression coding.

The sound correction apparatus 2, to which the video signal and the audio signal are input in this way, broadly comprises a frame buffer 21, a sound field correction/measurement function unit 22, a control unit 23, and a memory unit 24.
The sound field correction/measurement function unit 22 has two functions. One is a measurement function that measures the listening environment in order to set the sound field control parameter values needed for sound field correction. While this measurement function is being executed, a measurement sound signal is output to the power amplifier unit 13 as necessary so that the measurement sound is output from the appropriate audio channel.
The other function applies the required signal processing to the audio signal of each channel input from the media playback unit 11, according to the sound field control parameter values set from the results of the measurement function, and outputs the result to the power amplifier unit 13. The sound field formed by the content sound output from the speakers is thereby corrected so that it is favorable at the appropriate listening position.

Performing signal processing for sound field control as described above means that the audio signal input from the media playback unit 11 passes through a DSP inside the sound correction apparatus 2. Because the audio signal passes through the DSP, a time lag arises relative to the playback time of the video signal that is likewise output from the media playback unit 11. The frame buffer 21 is provided to eliminate this time lag and achieve so-called lip sync. That is, the control unit 23 performs control so that the video signal input from the media playback unit 11 is written to the frame buffer 21, for example frame by frame, held there temporarily, and then output to the video display device 12. As a result, the sound correction apparatus 2 outputs a video signal and an audio signal whose playback times are properly synchronized, with the time lag eliminated.
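The role of the frame buffer 21 in compensating for the audio delay can be sketched as a fixed-length delay line. This is an illustrative model only: the class name, the `push` method, and the idea of expressing the DSP latency as a whole number of frames are assumptions, not details given in the source.

```python
from collections import deque

class FrameDelay:
    """Hold video frames for a fixed number of frame periods before
    releasing them, so that video lags by the same amount as the audio."""

    def __init__(self, delay_frames):
        # Pre-fill with None so the first `delay_frames` outputs are empty.
        self._queue = deque([None] * delay_frames)

    def push(self, frame):
        """Write one incoming frame and return the frame that is now due."""
        self._queue.append(frame)
        return self._queue.popleft()

# Hypothetical case: the audio path is two frame periods late.
delay = FrameDelay(2)
outputs = [delay.push(f) for f in ["f0", "f1", "f2", "f3"]]
```

With a two-frame delay, frame `f0` is released when `f2` arrives, which matches the write, hold, then output behaviour described for the control unit 23.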

In addition to the write/read control of the frame buffer 21 described above, the control unit 23 controls the various functional parts of the sound correction apparatus 2 and executes various processes.
The memory unit 24 in this case includes, for example, a nonvolatile memory element, and is written to and read from under the control of the control unit 23. In the present embodiment, one essential item of information stored in the memory unit 24 is the waveform data of the basic sine wave (see FIG. 1(a)) used to generate the measurement sound as phonemes. The other is sequence data, which is structured as control information for outputting a measurement sound melody as a predetermined sequence pattern of phonemes, as shown in FIG. 7 for example.
In practice, required information other than the sequence data, such as various setting information to be referenced by the control unit 23, may also be written to and stored in the memory unit 24.

The microphone 25 is supplied with the sound correction apparatus 2 and is connected to it in order to pick up the measurement sound output from the speakers 14 when the sound correction apparatus 2 performs measurement.

FIG. 10 shows an example of the internal configuration of the sound field correction/measurement function unit 22. As shown in this figure, the sound field correction/measurement function unit 22 broadly comprises a microphone amplifier 101, a main measurement processing block 103, a preparation measurement processing block 106, and a sound field correction processing block 110. The sound field correction processing block 110 performs the processing for sound field correction, whereas the microphone amplifier 101, the main measurement processing block 103, and the preparation measurement processing block 106 are the parts that execute measurement processing. Based on the results of this measurement processing, the values of the various parameters required for the sound field correction processing in the sound field correction processing block 110 are changed and set.
Switches 102 and 109 are provided to switch the measurement mode between main measurement and preparation measurement, and a switch 120 is provided to switch between the measurement mode and the sound field correction mode. Each of the switches 102, 109, and 120 is switched so that either terminal Tm2 or terminal Tm3 is connected to terminal Tm1. The control unit 23 controls this switching operation.

In describing the sound field correction/measurement function unit 22 shown in FIG. 10, the operation in the preparation measurement mode is described first.
In the preparation measurement mode, the control unit 23 first sets the switch 120 so that terminal Tm2 is connected to terminal Tm1. For both switches 102 and 109, terminal Tm3 is connected to terminal Tm1. This forms the signal path in the sound field correction/measurement function unit 22 that corresponds to the preparation measurement mode.

As illustrated, the preparation measurement processing block 106 includes an analysis processing unit 107 and a measurement sound processing unit 108. As shown in FIG. 11, for example, the measurement sound processing unit 108 receives the waveform data of the basic sine wave, generates a phoneme corresponding to a predetermined pitch, and outputs it in audio signal form as the measurement sound for preparation measurement.
The phoneme generation processing in the measurement sound processing unit 108 follows, for example, the phoneme formation method shown in FIG. 4. As can also be understood from the description given with reference to FIG. 7, in the present embodiment the measurement sound can be output separately for each of the multiple audio channels. For simplicity of illustration, FIG. 10 shows a single signal output line from the measurement sound processing unit 108, but in practice, as shown in FIG. 11, there is a measurement sound signal output line for each of the seven channels.
Which pitch's frequency the measurement sound processing unit 108 generates as a phoneme, and which channel's signal line the generated phoneme is output from, follow the control content described in the sequence data.
The waveform data of the basic sine wave is read from the memory unit 24 at the required timing under the control of the control unit 23 and input to the measurement sound processing unit 108. The sequence data is not input directly to the measurement sound processing unit 108. Instead, the control unit 23 first reads the sequence data from the memory unit 24 and interprets it, and then instructs the measurement sound processing unit 108 as to the pitch (frequency) of the phoneme to be generated and the audio channel from which it is to be output.
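The division of labour described above, in which the control unit 23 interprets the sequence data and issues pitch and channel instructions, can be sketched as follows. The source does not define the layout of the sequence data, so the list-of-dictionaries structure and the `interpret` function are hypothetical; the pitch-to-channel assignment is the one stated for the output of step S118.

```python
# Hypothetical encoding of one sequence step (the non-analysis mode, step S118).
SEQUENCE_STEP_S118 = [
    {"pitch": "G#", "channels": ["C", "L", "R"]},
    {"pitch": "F#", "channels": ["Ls", "Rs"]},
    {"pitch": "C#", "channels": ["Bsl", "Bsr"]},
]

def interpret(step):
    """Expand one sequence step into (channel, pitch) instructions, in the
    form the control unit might pass to the measurement sound processor."""
    return [(channel, entry["pitch"])
            for entry in step
            for channel in entry["channels"]]

instructions = interpret(SEQUENCE_STEP_S118)
```

Each resulting pair tells the measurement sound processing side which pitch to generate and which channel's output line to use, without that side ever reading the sequence data itself.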

The processing for forming one phoneme in the measurement sound processing unit 108 can be represented by the block configuration shown in FIG. 12.
The measurement sound processing unit 108 first receives the waveform data of the basic sine wave and, in the m-th order sine wave processing 201, generates an m-th order sine wave of the required order m. This sine wave is the base sound of a phoneme whose frequency corresponds to the designated pitch. The frequency of the m-th order sine wave generated in this way is expressed, for example, by (Equation 2). Which value is chosen as the order m, that is, which frequency is set as the base sound, follows the control of the control unit 23 based on the content of the sequence data.

The waveform data of the basic sine wave (basic waveform component data) that the m-th order sine wave processing 201 uses to generate the m-th order sine wave may be one cycle of waveform data, as shown in FIG. 1(a), but at a minimum one quarter cycle of waveform data is sufficient. Because the waveform is a sine wave, a complete one-cycle sine waveform can easily be formed by simple calculation from at least a quarter cycle. Using this minimum quarter cycle of waveform data reduces the amount of basic waveform component data accordingly and saves capacity in the memory unit 24.
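The "simple calculation" that rebuilds a full cycle from a quarter cycle relies on the mirror and sign symmetries of the sine function. The following is a minimal sketch under the assumption that the stored quarter includes both end points (sample 0 and the peak) and that the cycle length is a multiple of four; the function names are illustrative.

```python
import math

def quarter_table(n):
    """First quarter of an n-sample sine cycle, peak included (n/4 + 1 values)."""
    return [math.sin(2 * math.pi * i / n) for i in range(n // 4 + 1)]

def full_cycle(quarter, n):
    """Rebuild all n samples of one cycle from the stored quarter cycle."""
    q = n // 4
    out = []
    for i in range(n):
        if i <= q:                 # first quarter: stored values as-is
            out.append(quarter[i])
        elif i <= 2 * q:           # second quarter: mirror of the first
            out.append(quarter[2 * q - i])
        elif i <= 3 * q:           # third quarter: negated first quarter
            out.append(-quarter[i - 2 * q])
        else:                      # fourth quarter: negated mirror
            out.append(-quarter[4 * q - i])
    return out

N = 64  # hypothetical cycle length
rebuilt = full_cycle(quarter_table(N), N)
```

Only N/4 + 1 values are stored instead of N, which illustrates the memory saving mentioned for the memory unit 24.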

As understood from the description so far, the m-th order sine wave generated by the m-th order sine wave generation processing 201 serves as the base sound of a phoneme, that is, the component with octave order k = 1. The waveform data of this m-th order sine wave is branched and passed to both the level adjustment processing 203-1 and the octave harmonic generation processing 202.
The octave harmonic generation processing 202 takes the m-th order sine wave received from the m-th order sine wave generation processing 201 as the base sound and multiplies its frequency by predetermined factors (2, 4, 8, 16, and 32), thereby generating, in this case, five octave harmonics with octave orders k = 2, 3, 4, 5, and 6. The concept shown in FIG. 1 may be applied for this multiplication. That is, starting from the sine wave of the base sound, thinned-out sampling (decimation) is performed according to the order of each octave harmonic, as shown in FIGS. 1(b) and 1(d).
The octave harmonics with octave orders k = 2, 3, 4, 5, and 6 are passed to the level adjustment processing 203-2, 203-3, 203-4, 203-5, and 203-6, respectively.
In this way, the six level adjustment processes 203-1 to 203-6 receive, as m-th order sine waves, the base sound (k = 1) and the octave harmonics corresponding to octave orders k = 2 to 6, respectively.
Each of the level adjustment processes 203-1 to 203-6 then sets the required amplitude value for the base sound or octave harmonic it has received. The amplitude values set by the level adjustment processes 203-1 to 203-6 may be fixed values determined in advance, or may be varied under the control of the control unit 23.
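The thinned-out sampling used to derive the base sound and its octave harmonics can be sketched as follows. This is a minimal sketch under stated assumptions: the sample count `N`, the order `m`, and the helper name `thin` are hypothetical and do not appear in this specification.

```python
import math

N = 4096  # assumed number of samples in the stored one-cycle basic sine wave
BASIC = [math.sin(2.0 * math.pi * i / N) for i in range(N)]

def thin(wave, factor):
    """Read every `factor`-th sample of `wave`, wrapping around at the end.

    The result has the same length as `wave` but contains `factor` times
    as many periods, which corresponds to frequency multiplication.
    """
    n = len(wave)
    return [wave[(i * factor) % n] for i in range(n)]

m = 32                                  # assumed order of the base sound
base_sound = thin(BASIC, m)             # octave order k = 1
# Octave orders k = 2..6 are obtained by thinning the base sound by 2, 4, 8, 16, 32.
octave_harmonics = {k: thin(base_sound, 2 ** (k - 1)) for k in range(2, 7)}
```

Because every component is read from the same table with an integer step, each one completes an integer number of periods within N samples.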

The base sound and octave harmonics whose levels have been adjusted by the level adjustment processes 203-1 to 203-6 are combined by the synthesis processing 204 and output as one phoneme (audio signal waveform). The phoneme obtained by the synthesis processing 204 has a timbre determined by the amplitude balance between the base sound and the octave harmonics, which in turn reflects the level adjustment results of the level adjustment processes 203-1 to 203-6.
A phoneme generated according to the process shown in FIG. 12 corresponds to, for example, N samples. Therefore, to output a phoneme over the output period Ta as shown in FIG. 7, for example, the measurement sound processing unit 108 outputs the phoneme generated by the process of FIG. 12 twice in succession.
In addition, the measurement sound processing unit 108 can generate phonemes of different pitches simultaneously, for example by executing the process shown in FIG. 12 in parallel. The audio signal of a phoneme generated by the process shown in FIG. 12 can be output as a measurement sound signal from the output lines corresponding to one or more appropriate audio channels.
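The level adjustment and synthesis steps can be sketched as a weighted sum of the base sound and its octave harmonics. This is an illustrative sketch only: the sample count `N`, the order `m = 32`, and the amplitude values in `levels` are hypothetical, and the sine waves are computed directly rather than read from a stored table.

```python
import math

N = 4096  # assumed number of samples per phoneme

def sine(cycles, n=N):
    """Sine wave that completes exactly `cycles` periods in n samples."""
    return [math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]

def synthesize_phoneme(m, levels, n=N):
    """Sum the base sound (k = 1) and octave harmonics (k = 2..6).

    levels[k - 1] is the amplitude applied to octave order k.
    """
    out = [0.0] * n
    for k, level in enumerate(levels, start=1):
        component = sine(m * 2 ** (k - 1), n)
        for i in range(n):
            out[i] += level * component[i]
    return out

levels = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]  # hypothetical amplitude balance
phoneme = synthesize_phoneme(32, levels)

# To fill an output period of 2N samples, the same phoneme is emitted twice in a row.
output_period = phoneme + phoneme
```

Since every component fits an integer number of periods into N samples, the two copies join without a discontinuity at the boundary.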

In FIG. 10, the measurement sound signal composed of phonemes and output from the measurement sound processing unit 108 of the preparation measurement processing block 106 is input to the power amplifier unit 13 via the switch 109 (terminal Tm3 → Tm1) and then the switch 120 (Tm2 → Tm1). The power amplifier unit 13 shown in FIG. 9 amplifies the input audio signal of the measurement sound and outputs it from the speaker 14.
As understood from the description so far, when the measurement sound processing unit 108 outputs audio signals of measurement sounds (phonemes) on a plurality of channels simultaneously, the power amplifier unit 13 amplifies each of these channels and outputs it from the speaker 14 of the corresponding channel.
As a result, the measurement sound is emitted as actual sound from the speaker 14 into the surrounding space.

For the main measurement and the preparation measurement, the microphone 25 for picking up the measurement sound is connected to the acoustic correction device 2, as also shown in FIG. 9. The audio signal from the microphone 25 connected to the acoustic correction device 2 is input to the microphone amplifier 101 in the sound field correction / measurement function unit 22, as shown in FIG. 10.
The microphone 25 is installed so that sound is picked up at the listening position where the best corrected sound field is desired in the listening environment. For example, if the system shown in FIG. 9 is an in-vehicle device and the user wants an appropriate sound field when listening from the driver's seat, the microphone 25 is installed so that sound is picked up at approximately the position of the user's ears when seated in the driver's seat.

Suppose that, in the preparation measurement mode described above, the measurement sound is output from the speaker 14 in response to the measurement sound signal being output from the measurement sound processing unit 108. The microphone 25 then picks up the ambient sound, including this measurement sound. The audio signal of the picked-up sound is amplified by the microphone amplifier 101 and input to the analysis processing unit 107 of the preparation measurement processing block 106 via terminals Tm1 → Tm3 of the switch 102.
The analysis processing unit 107 samples the input audio signal, for example at the timing described earlier with reference to FIG. 5, to obtain a response signal, and performs frequency analysis on it, for example by FFT. The control unit 23, for example, takes in this frequency analysis result and obtains the required measurement result based on it, for example as described with reference to FIG. 8.
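The reason an m-th order sine wave is convenient for this frequency analysis can be sketched as follows. This is a minimal sketch under stated assumptions: a direct single-bin DFT stands in for the FFT mentioned in the text, and the values of `N`, `m`, and the amplitude 0.5 are hypothetical.

```python
import cmath
import math

def dft_bin(signal, b):
    """Return DFT bin b of `signal`, normalized by the signal length."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * b * i / n)
               for i, x in enumerate(signal)) / n

N = 512   # assumed number of samples in the response signal
m = 8     # assumed order: the sine completes exactly m periods in N samples

# Idealized response signal: the measurement tone, attenuated to amplitude 0.5.
response = [0.5 * math.sin(2.0 * math.pi * m * i / N) for i in range(N)]

# For a real sine that fits an integer number of periods, the amplitude is
# twice the magnitude of the normalized bin at index m.
amplitude_at_m = 2.0 * abs(dft_bin(response, m))
amplitude_next = 2.0 * abs(dft_bin(response, m + 1))
```

Because the tone fits an integer number of periods into the analysis window, its energy lands in bin m and does not leak into neighboring bins, so no window function is needed in this idealized case.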

In the main measurement mode, the control unit 23 keeps terminals Tm1 and Tm2 of the switch 120 connected, which maintains the measurement mode, and switches both of the switches 102 and 109 so that terminal Tm2 is connected to terminal Tm1. As a result, the sound field correction / measurement function unit 22 forms the signal path corresponding to the main measurement mode within the measurement mode.

In measurement in the main measurement mode, the main measurement processing block 103 operates in place of the preparation measurement processing block 106. The main measurement processing block 103 also includes an analysis processing unit 104 and a measurement sound processing unit 105. During the main measurement, the measurement sound processing unit 105 generates the required signal waveform and outputs it as the measurement sound. Note that the main measurement also uses measurement sounds other than the phoneme-based measurement sounds used in the preparation measurement.
At this time, the level of the measurement sound output from the speaker of each channel follows the setting determined from the result of the preceding preparation measurement. Furthermore, because the preparation measurement has also determined which speakers are present (the channel configuration), no measurement sound is output to a channel whose speaker has been judged absent from the AV system. This reduces the processing load on the measurement sound processing unit 105. The measurement sound level setting and the measurement sound output setting according to the channel configuration are made by the control unit 23 controlling the measurement sound processing unit 105 according to the preparation measurement result.

When the measurement sound signal is output from the measurement sound processing unit 105 of the main measurement processing block 103 in this way, the microphone 25 picks up the ambient sound including the measurement sound, as in the preparation measurement, and the picked-up signal is input from the microphone amplifier 101 to the analysis processing unit 104 via terminals Tm1 → Tm2 of the switch 102.

The analysis processing unit 104 likewise samples the input audio signal at the required timing relative to the measurement sound output to obtain a response signal, and performs frequency analysis on it, for example by FFT. The control unit 23, for example, then takes in this frequency analysis result and obtains the required measurement result for the main measurement. That is, it determines, for example, the setting values of the parameters required for sound field correction.
Here, the analysis processing unit 104 of the main measurement processing block 103 and the analysis processing unit 107 of the preparation measurement processing block 106 share a common function in that both perform frequency analysis, for example by FFT. Moreover, the main measurement processing and the preparation measurement processing are never executed concurrently. The analysis processing units 104 and 107 may therefore be combined into a single unit shared by the main measurement processing and the preparation measurement processing.

Next, to enter the sound field correction mode, the switch 120 is set so that terminal Tm3 is connected to terminal Tm1. Because the switches 102 and 109 exist only to switch between the main measurement mode and the preparation measurement mode within the measurement mode, their terminal states do not matter at this time.

In the sound field correction mode, a source audio signal is input to the sound field correction processing block 110. The source audio signal here is the audio signal reproduced and output by the media playback unit 11 and, as described earlier, may consist of multichannel audio signals of up to seven channels. The sound field correction processing block 110 in this case includes a delay processing unit 111, an equalizer unit 112, and a gain adjustment unit 113, each of which is configured to process up to seven channels of audio signals independently.

In the sound field correction processing block 110, the delay processing unit 111 is configured to delay the input audio signal of each channel by a different delay time before outputting it. The delay processing unit 111 corrects the disturbance of the sound field caused by differences in arrival time at the listening position, which arise because the speakers are at different distances from the listening position.
The equalizer unit 112 can set an arbitrary equalizer characteristic independently for the input audio signal of each channel before outputting it. The equalizer unit 112 corrects variations in sound quality caused by the relationship between the speaker position and the listening position, the state of obstacles between the speaker and the listening position, and variation in the reproduction characteristics of the speakers.
The gain adjustment unit 113 can set a gain independently for the input audio signal of each channel before outputting it. The gain adjustment unit 113 corrects channel-to-channel variations in volume caused by the positional relationship between the speaker and the listening position, the state of obstacles between the speaker and the listening position, the distance between the speaker and the listening position, and so on.
The sound field correction processing block 110 having these signal processing functions is configured, for example, as a DSP for audio signals.

As the result of the main measurement described above, the control unit 23 has obtained information such as the relationship among the audio channels in arrival time at the listening position, the change in sound quality of each audio channel by the time its sound reaches the listening position, and the variation in level among the channels.
As sound field correction parameters, for example, based on the information on the arrival-time differences among the audio channels at the listening position, the control unit 23 sets a delay time for each audio channel in the delay processing unit 111 so that these time differences are eliminated.
Based on the information on the change in sound quality of each audio channel at the listening position, it sets an equalizer characteristic for each audio channel in the equalizer unit 112 so that this change is compensated. Based on the information on the variation in level among the audio channels at the listening position, it sets a gain for each audio channel in the gain adjustment unit 113 so that this variation is eliminated.
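One possible way to turn the measured arrival times and levels into delay and gain settings is sketched below. This specification does not state the rule used; aligning every channel to the latest arrival and attenuating every channel to the quietest level are assumptions made purely for illustration, as are the function name, channel names, and numeric values.

```python
def correction_parameters(arrival_ms, level_db):
    """Derive per-channel delay and gain so that arrivals and levels match.

    arrival_ms: measured arrival time at the listening position per channel.
    level_db:   measured level at the listening position per channel.
    """
    latest = max(arrival_ms.values())
    quietest = min(level_db.values())
    # Delay earlier channels until they coincide with the latest arrival.
    delays = {ch: latest - t for ch, t in arrival_ms.items()}
    # Attenuate louder channels down to the quietest one (gain <= 0 dB).
    gains = {ch: quietest - lv for ch, lv in level_db.items()}
    return delays, gains

# Hypothetical measurement results for three channels.
arrival_ms = {"L": 5.0, "R": 7.5, "C": 6.0}
level_db = {"L": -3.0, "R": -6.0, "C": -4.5}
delays, gains = correction_parameters(arrival_ms, level_db)
```

Choosing the latest arrival and the quietest level as references keeps every delay non-negative and every gain at or below 0 dB, which avoids requiring negative delay or added amplification.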

The source audio signal input to the sound field correction processing block 110 is processed by the delay processing unit 111, the equalizer unit 112, and the gain adjustment unit 113, whose parameters have been set as described above, and is then amplified by the power amplifier unit 13 and output from the speaker 14 as actual sound. When heard at the appropriate listening position, for example, the sound field formed by the sound output in this way is better than it was before correction.

The structure of the sequence data is illustrated in FIG. 13. Note that the data structure shown in this figure is merely an example.
The sequence data shown in this figure is formed by concatenating events. One event is the data corresponding to one phoneme. Each event stores, for example, information on the sounding time, the base sound, the harmonic structure, the channel, and the analysis mode.
The sounding time information defines the output timing of the phoneme to which the event corresponds. It specifies how many times in succession the N samples of the phoneme are to be output and at what point in time the phoneme is to be output. The output timing may be defined, for example, by taking the start of phoneme output for the measurement sound melody as the origin (0) and specifying the timing as a cumulative sample count from this origin. In this case, the finest resolution of the phoneme output timing is the time corresponding to one period of the sampling frequency.
The base sound information specifies the value of the order m of the m-th order sine wave to be used as the base sound.

The harmonic structure information specifies, for each octave harmonic with octave order k = 2 to 6, the balance of its amplitude relative to the base sound. This determines the timbre of the phoneme. The balance of the octave harmonic amplitudes need not be chosen for timbre alone. It may also be chosen, for example, so that good measurement results suited to the measurement conditions are obtained.
In the first analysis mode, the phoneme is generated according to this harmonic structure information. In the second analysis mode, however, the harmonic structure may be changed adaptively, for example according to the measurement result of the first analysis mode, so that a better measurement result is obtained at that stage.

The channel information specifies the audio channel from which the phoneme is to be output. Considering, for example, that phonemes of the same pitch may be output from a plurality of channels at the same time, it is preferable to define this channel information so that a plurality of channels can be specified. Then a single event can control the simultaneous output of phonemes of the same pitch from a plurality of channels, without creating a separate event for each channel.

The analysis mode information specifies the analysis mode to which the phoneme corresponds. In the example shown in FIGS. 7 and 8, it indicates which of the first analysis mode, the second analysis mode, and the non-analysis mode applies. According to the mode indicated by this information, the control unit 23 decides whether the sound obtained by outputting this phoneme should be analyzed and, if so, obtains the measurement result corresponding to either the first or the second analysis as specified. The analysis mode information may also include information specifying the sample delay time Tdrs.
When the control unit 23 controls the preparation measurement processing block based on such sequence data, phonemes are output with the pitch and output timing described in the sequence data. As a result, the measurement sound is output as a melody, for example as described with reference to FIG. 7.
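The event structure described above could be represented as follows. This is a hypothetical sketch: the field names, types, enumeration values, and the two example events are all assumptions, and the specification itself states that the structure of FIG. 13 is only an example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

class AnalysisMode(Enum):
    NON_ANALYSIS = 0
    FIRST = 1
    SECOND = 2

@dataclass
class Event:
    """One event of the sequence data, corresponding to one phoneme."""
    start_sample: int                    # output timing, in samples from the melody start
    repeats: int                         # how many times the N-sample phoneme is output in a row
    base_order: int                      # order m of the sine wave used as the base sound
    harmonic_levels: Tuple[float, ...]   # amplitudes of k = 2..6 relative to the base sound
    channels: Tuple[str, ...]            # one event may name several output channels
    analysis_mode: AnalysisMode

def phoneme_end(event: Event, n: int) -> int:
    """Sample index at which the phoneme of `event` stops sounding."""
    return event.start_sample + event.repeats * n

melody: List[Event] = [
    Event(0, 2, 32, (0.5, 0.25, 0.125, 0.0625, 0.03125), ("L", "R"), AnalysisMode.FIRST),
    Event(8192, 2, 48, (0.5, 0.25, 0.125, 0.0625, 0.03125), ("C",), AnalysisMode.NON_ANALYSIS),
]
```

Allowing `channels` to hold several names reflects the point made above that one event can drive the same pitch on multiple channels at once.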

FIG. 14 is a flowchart showing the control processing for the preparation measurement executed by the control unit 23.
First, in step S201, the control unit 23 reads the required sequence data from the memory unit 24. From this point on, the control unit 23 can interpret and process the contents of the sequence data it has read.

In the next step S202, the background noise is checked. This processing implements the same operation as step S101 shown earlier in FIG. 8. Step S203 and the subsequent steps are the processing performed when the background noise check has determined that the microphone 25 is connected.

Step S203 and the subsequent steps process the events as the sequence data is interpreted.
First, in step S203, the sounding time information of the events not yet processed is referenced to determine whether any phoneme whose output has not yet started has reached its output start timing. If the result is negative, that is, no phoneme has reached its output start timing, step S204 is skipped and the processing proceeds to step S205. If the result is affirmative, that is, a phoneme has reached its output start timing, step S204 is executed.

In step S204, for the phoneme that step S203 determined should be output, the base sound and harmonic structure information described in the corresponding event is referenced and the phoneme is actually generated. The generated phoneme is then output with the N samples repeated the number of times given by the sounding time information described in that event. The channel on which the phoneme is output as an audio signal is determined according to the channel information described in the same event.

Each time output of a phoneme is started by step S204, a corresponding sampling processing event arises at the timing given by the sample delay time Tdrs. In step S205, it is determined whether any of the sampling processing events arising in this way has reached its start timing. If the result is that no sampling processing event has reached its start timing, steps S206 and S207 are skipped and the processing moves to step S208. If, on the other hand, the result in step S205 is affirmative, that is, a sampling processing event has reached its start timing, the processing proceeds to step S206.

In step S206, the audio signal picked up by the microphone 25 is sampled for a predetermined number of samples at the timing given by the sample delay time Tdrs. In the embodiment described so far, N samples are taken. In the next step S207, the response signal obtained by the sampling in step S206 is subjected to frequency analysis by FFT according to the analysis mode specified by the event corresponding to the phoneme, and the analysis result is used to obtain the measurement result corresponding to that analysis mode.

In step S208, it is determined whether the sequence has ended, that is, whether the event processing for the sequence data read in step S201 and the corresponding sampling and analysis processing have all finished. If the result is negative, the processing returns to step S203. If the result is affirmative, the processing proceeds to step S209.
In step S209, an overall judgment process similar to that of step S119 in FIG. 8 is executed.
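The loop formed by steps S203 to S209 can be sketched as a small scheduler. This is a simplified illustration, not the control processing of FIG. 14 itself: the event representation `(start_sample, analyze)`, the function name, and the decision to schedule sampling only for events flagged for analysis are all assumptions, and actual phoneme generation, sampling, and FFT are replaced by log entries.

```python
def run_sequence(events, tdrs):
    """Walk through the sequence and log what happens at which sample time.

    events: list of (start_sample, analyze) tuples, one per phoneme.
    tdrs:   sample delay time between phoneme output start and sampling start.
    """
    pending = sorted(events)   # phonemes whose output has not started yet
    sampling = []              # start times of scheduled sampling events
    log = []
    while pending or sampling:                      # S208: loop until all is processed
        # Advance to the next moment at which something becomes due.
        due = [pending[0][0]] if pending else []
        due += sampling[:1]
        t = min(due)
        if pending and pending[0][0] <= t:          # S203: phoneme due?
            start, analyze = pending.pop(0)
            log.append((t, "output phoneme"))       # S204
            if analyze:
                sampling.append(start + tdrs)
        if sampling and sampling[0] <= t:           # S205: sampling due?
            sampling.pop(0)
            log.append((t, "sample and analyze"))   # S206, S207
    log.append((None, "overall judgment"))          # S209
    return log

# Hypothetical sequence: three phonemes, the middle one in non-analysis mode.
log = run_sequence([(0, True), (8192, False), (16384, True)], tdrs=4096)
```

The sketch jumps directly from one due time to the next instead of polling every sample, which is a simplification of the polling loop implied by the flowchart.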

In the present embodiment, the melody that results as the measurement sound melody is determined by the sequence data. In the simplest form, a single set of sequence data is stored in advance in the memory unit 24, and the measurement sound melody is output based on it. Alternatively, a plurality of sets of sequence data may be stored in the memory unit 24, and one may be selected for use according to a selection operation by the user or a predetermined condition in the preparation measurement.
The sequence data need not be limited to presets already stored in the memory unit 24 at factory shipment. For example, sequence data may be acquired from outside and stored in (downloaded to) the memory unit 24 after the acoustic correction device 2 has reached the user.
For the portions of the sequence data that output measurement sounds in the non-analysis mode, editing may also be allowed so that the melody, the timbre of the phonemes, the speakers that output the phonemes, and so on can be changed freely according to user operations. This further enhances the entertainment value. However, if the output of phonemes corresponding to an analysis mode were changed carelessly, meaningful measurement might no longer be possible, so it is preferable not to allow the user to edit the output sequence of measurement sounds corresponding to the analysis modes.

In the above embodiment, basic waveform component data is held, and all the required phonemes are generated from it. In this case, because the only source needed to obtain the required phonemes is one set of basic waveform component data, there is the advantage that the storage capacity of storage areas in the acoustic correction device, such as the memory unit 24, is not strained. However, if such a storage area has capacity to spare, the waveform data of all the phonemes needed to create the measurement sound melody may be created and held in advance as sound source data, and this sound source data may be read from the storage area and reproduced when the measurement sound melody is output.

According to the concept shown in FIGS. 2 and 4, only phonemes that can form a musical scale are adopted as elements of the measurement sound melody. However, a phoneme that does not fit the scale is still based on an m-th order sine wave that fits an integer number of periods into N samples, so it can still be a measurement target frequency, and there is no problem in using it in the measurement sound melody. On the contrary, using such off-scale phonemes in the measurement sound melody can make the melody more effective musically, so they may be used actively.
In the non-analysis mode, no frequency analysis is performed on the response signal, so there is no need to output a measurement sound based on an m-th order sine wave that fits an integer number of periods into N samples. If waveforms other than those based on the m-th order sine wave are used in the non-analysis mode, the series of measurement sound outputs can form a melody with a wider variety of timbres, which further enhances its musicality and entertainment value. For example, if sampled sounds of real musical instruments are used as such waveforms, the measurement sound melody becomes more musical.

In the embodiment, a sufficiently meaningful measurement is possible with a single omnidirectional monaural microphone as the microphone 25 that picks up the measurement sound. However, more reliable measurement results can be obtained by, for example, placing a plurality of microphones at appropriate positions, or by using a stereo microphone or a plurality of binaural microphones.

In the acoustic correction device 2 of the present embodiment shown in FIG. 10, for example, the operations of the measurement sound processing unit 108 and the analysis processing unit 107 that the preparation measurement processing block 106 is responsible for may be implemented in hardware. These operations are the phoneme generation processing, the control processing that builds the measurement sound melody (outputting the generated phonemes at timings according to the sequence data), and the processing that samples the collected audio signal at the required timing and performs the FFT. Alternatively, the acoustic correction device 2 may include a microcomputer, and these operations may be realized as processing that the CPU executes according to a program. Mapped onto FIG. 10, this means that the control unit 23 is configured as a microcomputer, and the functional block shown as the preparation measurement processing block 106 is in practice software processing executed by the CPU in the control unit 23.
The functions of the main measurement processing block 103 and the sound field correction processing block 110 may likewise be implemented in either hardware or software.

In the description of the embodiments so far, the measurement sound based on the m-th order sine wave has been described as something to be used in the preparation measurement for sound field correction. Depending on the measurement environment and conditions of the main measurement, however, it can also be used in the main measurement without any problem. Nor is the purpose of the measurement limited to sound field correction, as long as sound in the human audible frequency band is suitable as the measurement target.
In the above embodiment, the FFT is used for the frequency analysis of the response signal of the measurement sound based on the m-th order sine wave, but other frequency analysis methods, such as the DFT (Discrete Fourier Transform), may also be adopted.
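As one possible example of such an alternative, the sketch below evaluates a single DFT bin with the Goertzel recurrence. The Goertzel algorithm is not named in the source; it is swapped in here only as a well-known way to compute individual DFT bins when just a few target frequencies matter. The frame length `N = 64`, the order `m = 5`, and the function name `goertzel_power` are assumptions for illustration.

```python
import math


def goertzel_power(x, k):
    """Return |X[k]|^2 for one DFT bin using the Goertzel recurrence."""
    n_samples = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n_samples)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


N = 64   # assumed frame length (a power of 2)
m = 5    # order of the measurement sine: m periods per N samples

# Idealized response: the measurement sine itself, with no room effects.
response = [math.sin(2.0 * math.pi * m * n / N) for n in range(N)]

power_at_m = goertzel_power(response, m)          # close to (N / 2) ** 2
power_at_other = goertzel_power(response, m + 1)  # close to 0
```

When only a handful of bins are of interest, a per-bin method of this kind costs O(N) per bin, whereas an FFT computes all bins at O(N log N); which is preferable depends on how many target frequencies are analyzed per frame.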

1: AV system; 2: acoustic correction device; 11: media playback unit; 12: video display device; 13: power amplifier unit; 14: speaker; 21: frame buffer; 22: sound field correction/measurement function unit; 23: control unit; 24: memory unit; 25: microphone; 101: microphone amplifier; 102, 120: switches; 103: main measurement processing block; 104, 107: analysis processing units; 105, 108: measurement sound processing units; 106: preparation measurement processing block; 110: sound field correction processing block; 111: delay processing unit; 112: equalizer unit; 113: gain adjustment unit; 201: m-th order sine wave generation processing; 202: octave harmonic generation processing; 203-1 to 203-6: level adjustment processing; 204: synthesis processing

Claims

A measurement method comprising a first measurement procedure and a second measurement procedure that follows the first measurement procedure, wherein
the first measurement procedure executes:
a first output procedure of outputting a plurality of required phonemes, obtained based on mutually different fundamental sound components, to separate speakers such that their output periods overlap;
a first sound collection procedure of obtaining an audio signal by collecting the plurality of phonemes emitted from the separate speakers via a plurality of spatial transmission paths; and
a setting procedure of setting, for the signal output to each of the separate speakers, a characteristic according to sound pressure level, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected in the first sound collection procedure,
the second measurement procedure executes:
a second output procedure of applying the characteristic according to sound pressure level set in the setting procedure to a plurality of required phonemes obtained based on mutually different fundamental sound components, and outputting the resulting plurality of phoneme signals to the separate speakers;
a second sound collection procedure of obtaining an audio signal by collecting the plurality of phoneme signals emitted from the separate speakers via the plurality of spatial transmission paths; and
a measurement procedure of obtaining, for each of the plurality of spatial transmission paths, a measurement result for a required measurement item, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected in the second sound collection procedure,
the phoneme is obtained based on a fundamental sound component that is a sine wave fitting an integer number of periods into a predetermined number of samples N expressed as a power of 2,
each phoneme output in the first output procedure and the second output procedure is output as a signal formed by synthesizing arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of that of the fundamental sound component fitting a predetermined integer number of periods, and
at least one of the first output procedure and the second output procedure outputs a required phoneme and then outputs the next required phoneme at a required timing, and outputs, from among the phonemes, a phoneme having a specific frequency component set as one reference frequency and a phoneme having a specific frequency component whose frequency can be another pitch in a predetermined scale when the reference frequency is taken as one pitch of that scale.

The measurement method according to claim 1, wherein each phoneme output in at least one of the first output procedure and the second output procedure is output using basic waveform component data that covers a predetermined span of at least 1/4 period of a sine wave fitting exactly one period into a predetermined number of samples N expressed as a power of 2.

The measurement method according to claim 1, wherein at least one of the first sound collection procedure and the second sound collection procedure executes a sampling procedure that samples, at a predetermined timing, the audio signal emitted from the speaker and collected, with the number of samples N as the minimum sampling unit.

The measurement method according to claim 1, wherein at least one of the first output procedure and the second output procedure outputs the next required phoneme at a required timing after outputting the required phoneme.

The specified phoneme is output at a specified output start timing based on control information specifying a phoneme output pattern in at least one of the first output procedure and the second output procedure. The measuring method as described in.

The measurement method according to claim 1, wherein at least one of the first output procedure and the second output procedure outputs, from among the phonemes, a phoneme having a specific frequency component set as one reference frequency and a phoneme having a specific frequency component whose frequency can be another pitch in a predetermined scale when the reference frequency is taken as one pitch of that scale.

The measurement method according to claim 1, further comprising, prior to the first measurement procedure, a determination procedure of determining the presence or absence of background noise,
wherein the first measurement procedure is executed when the background noise check in the determination procedure determines that background noise exists.

A measuring device comprising:
first output means for outputting a plurality of required phonemes, obtained based on mutually different fundamental sound components, to separate speakers such that their output periods overlap;
first sound collection means for obtaining an audio signal by collecting the plurality of phonemes emitted from the separate speakers via a plurality of spatial transmission paths;
setting means for setting, for the signal output to each of the separate speakers, a characteristic according to sound pressure level, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected by the first sound collection means;
second output means for applying the characteristic according to sound pressure level set by the setting means to a plurality of required phonemes obtained based on mutually different fundamental sound components, and outputting the resulting plurality of phoneme signals to the separate speakers;
second sound collection means for obtaining an audio signal by collecting the plurality of phoneme signals emitted from the separate speakers via the plurality of spatial transmission paths; and
measuring means for obtaining, for each of the plurality of spatial transmission paths, a measurement result for a required measurement item, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected by the second sound collection means,
wherein the phoneme is obtained based on a fundamental sound component that is a sine wave fitting an integer number of periods into a predetermined number of samples N expressed as a power of 2,
each phoneme output by the first output means and the second output means is output as a signal formed by synthesizing arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of that of the fundamental sound component fitting a predetermined integer number of periods, and
at least one of the first output means and the second output means outputs a required phoneme and then outputs the next required phoneme at a required timing, and outputs, from among the phonemes, a phoneme having a specific frequency component set as one reference frequency and a phoneme having a specific frequency component whose frequency can be another pitch in a predetermined scale when the reference frequency is taken as one pitch of that scale.

A program that causes a computer to execute a measurement procedure comprising a first measurement procedure and a second measurement procedure that follows the first measurement procedure, wherein
the first measurement procedure executes:
a first output procedure of outputting a plurality of required phonemes, obtained based on mutually different fundamental sound components, to separate speakers such that their output periods overlap;
a first sound collection procedure of obtaining an audio signal by collecting the plurality of phonemes emitted from the separate speakers via a plurality of spatial transmission paths; and
a setting procedure of setting, for the signal output to each of the separate speakers, a characteristic according to sound pressure level, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected in the first sound collection procedure,
the second measurement procedure executes:
a second output procedure of applying the characteristic according to sound pressure level set in the setting procedure to a plurality of required phonemes obtained based on mutually different fundamental sound components, and outputting the resulting plurality of phoneme signals to the separate speakers;
a second sound collection procedure of obtaining an audio signal by collecting the plurality of phoneme signals emitted from the separate speakers via the plurality of spatial transmission paths; and
a measurement procedure of obtaining, for each of the plurality of spatial transmission paths, a measurement result for a required measurement item, based on an analysis result obtained by executing a predetermined frequency analysis process on the audio signal collected in the second sound collection procedure,
the phoneme is obtained based on a fundamental sound component that is a sine wave fitting an integer number of periods into a predetermined number of samples N expressed as a power of 2,
each phoneme output in the first output procedure and the second output procedure is output as a signal formed by synthesizing arbitrary harmonic components selected from a plurality of harmonic components whose frequencies lie a predetermined number of octaves above a virtual fundamental sound component, the virtual fundamental sound component being a frequency component whose frequency is 1/(2^P) (P is a natural number) of that of the fundamental sound component fitting a predetermined integer number of periods, and
at least one of the first output procedure and the second output procedure outputs a required phoneme and then outputs the next required phoneme at a required timing, and outputs, from among the phonemes, a phoneme having a specific frequency component set as one reference frequency and a phoneme having a specific frequency component whose frequency can be another pitch in a predetermined scale when the reference frequency is taken as one pitch of that scale.