
JP2015163909A - Acoustic reproduction device, acoustic reproduction method, and acoustic reproduction program

Info

Publication number: JP2015163909A
Application number: JP2014039269A
Authority: JP
Inventors: Motoshi Sumioka; Nami Osada; Kazuo Sasaki
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2014-02-28
Filing date: 2014-02-28
Publication date: 2015-09-10

Abstract

PROBLEM TO BE SOLVED: To provide an acoustic reproduction device, an acoustic reproduction method, and an acoustic reproduction program that make it possible to accurately recognize the vertical direction of a sound even when decoding acoustic data that has been lossily compressed at a high compression rate, where high-frequency components tend to be lost. SOLUTION: Provided is an acoustic reproduction device for reproducing acoustic data transmitted from an acoustic broadcasting device, the acoustic reproduction device comprising: lacking frequency analysis means for analyzing the frequency component region of the acoustic data that is lacking due to acoustic compression by the acoustic broadcasting device; sound pressure analysis means for analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the lacking frequency component region and is lower than that lowest frequency band; and high frequency component addition means for adding a frequency component of a sound pressure equivalent to said sound pressure to the lacking frequency component region.

Description

本発明は、圧縮のために欠落した音響データの高周波成分を補って音響再生する音響再生装置、音響再生方法及び音響再生プログラムに関する。 The present invention relates to a sound reproducing device, a sound reproducing method, and a sound reproducing program that reproduce sound by supplementing high-frequency components of sound data that are missing due to compression.

音響信号を圧縮する音響符号化技術は、音響信号の伝送及び蓄積において極めて重要な技術である。 An acoustic coding technique for compressing an acoustic signal is an extremely important technique in transmission and storage of an acoustic signal.

例えば、ある地点の周囲の音響環境を、限られた数の仮想スピーカで集約して別の地点で再現する音響ＡＲ（Augmented Reality：拡張現実）システムがある。 For example, there is an acoustic AR (Augmented Reality) system in which an acoustic environment around a certain point is aggregated by a limited number of virtual speakers and reproduced at another point.

FIG. 1 is a diagram for explaining the outline of the acoustic AR system.
As shown in FIG. 1, in an acoustic AR system having a three-dimensional sound broadcasting device 3 and a three-dimensional sound reproducing device 5 connected via a network 4, a plurality of microphones 35-1, 35-2, 35-3, 35-4, 35-5, 35-6, 35-7, and 35-8 installed in different directions at a certain point 1 each acquire the acoustic signals generated from a plurality of sound sources 2-1, 2-2, and 2-3.

三次元音響放送装置３は、音響入力部３１、音源管理部３２、集約部３３、及び送信部３４を備える。音響入力部３１は、マイク３５−１乃至３５−８で取得した音響信号をそれぞれ三次元音響放送装置３に入力する。音源管理部３２は、音響入力部３１で入力した音響信号を管理する。集約部３３は、複数のマイク３５−１乃至３５−８で取得した音響信号を集約する。そして、送信部３４は、集約した音響データをネットワーク４経由で三次元音響再生装置５に送信する。 The three-dimensional audio broadcasting device 3 includes an audio input unit 31, a sound source management unit 32, an aggregation unit 33, and a transmission unit 34. The sound input unit 31 inputs sound signals acquired by the microphones 35-1 to 35-8 to the three-dimensional sound broadcasting device 3. The sound source management unit 32 manages the sound signal input by the sound input unit 31. The aggregating unit 33 aggregates the acoustic signals acquired by the plurality of microphones 35-1 to 35-8. Then, the transmission unit 34 transmits the collected acoustic data to the three-dimensional sound reproduction device 5 via the network 4.

三次元音響再生装置５は、受信部５１及びＨＲＴＦ（Head-Related Transfer Function：頭部伝達関数）処理部５２を備える。頭部姿勢センサ５３は、利用者６の頭部の姿勢を検出する。イヤホン５４は、右スピーカ５４Ｒ及び左スピーカ５４Ｌを備え、音響を出力する。 The three-dimensional sound reproduction device 5 includes a receiving unit 51 and an HRTF (Head-Related Transfer Function) processing unit 52. The head posture sensor 53 detects the posture of the head of the user 6. The earphone 54 includes a right speaker 54R and a left speaker 54L, and outputs sound.

The receiving unit 51 receives the acoustic data transmitted from the three-dimensional sound broadcasting device 3 via the network 4. The HRTF processing unit 52 performs processing using head-related transfer functions: it applies the frequency characteristic for each angle to the sound of each channel of the received acoustic data and aggregates (sums) the results into the two left and right speakers (the right speaker 54R and the left speaker 54L).

With such an acoustic AR system, the user 6 wearing the earphone 54 hears the acoustic signals generated from the plurality of sound sources 2-1, 2-2, and 2-3 as if they were emitted from a plurality of virtual speakers 7-1, 7-2, 7-3, 7-4, 7-5, 7-6, 7-7, and 7-8 virtually installed in different directions around the user 6.

As described above, the acoustic AR system streams the acoustic data that the three-dimensional sound broadcasting device 3 has aggregated into a plurality of channels to the three-dimensional sound reproducing device 5 installed at another point. For 44.1 kHz, 16-bit data, for example, this streaming requires a communication bandwidth of about 6 Mbps. It is therefore desirable to compress the acoustic signal, for example to about 512 kbps, before transmission. In this case, the compression rate is about 8%.
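
The bandwidth figures above can be checked with a short calculation. The sketch below assumes eight channels (one per microphone 35-1 to 35-8), which the text does not state explicitly; the function names are illustrative only.

```python
def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_rate(compressed_bps, raw_bps):
    """Compressed size as a fraction of the original size."""
    return compressed_bps / raw_bps


# Assumption: 8 channels, matching the 8 microphones in FIG. 1.
raw = raw_bitrate_bps(44_100, 16, 8)          # 5,644,800 bps, i.e. "about 6 Mbps"
exact = compression_rate(512_000, raw)        # rate against the exact raw figure
rounded = compression_rate(512_000, 6_000_000)  # rate against the rounded 6 Mbps
print(f"raw = {raw} bps, rate = {exact:.3f} (exact), {rounded:.3f} (rounded)")
```

Against the rounded 6 Mbps figure the rate is about 8.5%, consistent with the "about 8%" stated in the text; against the exact raw figure it is about 9%.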

現在の可逆圧縮技術では、圧縮率は７０％程度である。他方、不可逆圧縮であれば、圧縮率４％程度まで圧縮することが可能である。ところが、例えばＭＰ３（MPEG（Moving Picture Experts Group）1 Audio Layer-3）形式や、ＡＡＣ（Advanced Audio Coding）形式による不可逆圧縮では、音響データに含まれる高周波成分を取り除くように圧縮して符号化するため、復号した音響データは、高周波成分が欠落してしまう。 In the current lossless compression technique, the compression rate is about 70%. On the other hand, in the case of irreversible compression, the compression rate can be reduced to about 4%. However, in irreversible compression using, for example, MP3 (MPEG (Moving Picture Experts Group) 1 Audio Layer-3) format or AAC (Advanced Audio Coding) format, compression is performed so as to remove high-frequency components contained in audio data. For this reason, the decoded acoustic data lacks high frequency components.

Humans determine the position of a sound source (the sound image) by processing in the brain the frequency characteristics and phase difference of the sound arriving at the left and right ears. This is called sound image localization. To obtain clear sound image localization, humans need relatively high frequency components, for example 8 kHz or higher. In other words, the sound image cannot be perceived clearly from low frequency components alone, as is the case when acoustic data compressed by lossy compression is decoded and reproduced. Thus, when the maximum frequency is lowered to save the amount of information to be communicated, or for other reasons, correct sound image localization may not be obtained. In particular, the loss of the high-frequency components that are key to judging the vertical direction degrades the sense of frontal localization. Specifically, a sound in front is heard above its actual position. To achieve both high compression and a sense of localization in such a case, it is conceivable to extend the sound range toward the high-frequency side.

FIG. 2 is a diagram for explaining an example of sound range expansion.
When a high-pass filter is applied to the original input signal shown in FIG. 2(a) (corresponding to the above-described acoustic data lacking high-frequency components), the high-frequency component within the input signal is extracted as shown in FIG. 2(b). Next, frequency-multiplying the extracted high-frequency component yields the expanded high-frequency component shown in FIG. 2(c). Combining the original input signal of FIG. 2(a) with the expanded high-frequency component of FIG. 2(c) then yields the combined component shown in FIG. 2(d) (see, for example, Patent Document 1).

A technique has also been disclosed in which a high-frequency component contained in an acoustic signal is extracted as a feature component, and the acoustic signal and the extracted feature component are supplied to an acoustic output unit so that the sound image of the extracted feature component is localized closer to the listener than the sound image of the acoustic signal (see, for example, Patent Document 2).

Patent Document 1: JP 2003-134596 A
Patent Document 2: JP 2011-049862 A
Non-Patent Document 1: Acoustic Science Series 2 "Spatial Acoustics", Kazuhiro Iida and Masayuki Morimoto, edited by The Acoustical Society of Japan, Corona Publishing

図３は、従来技術の問題点を示す図であり、図４は、周波数特性と音像の定位感との関係を示す図である。 FIG. 3 is a diagram illustrating a problem of the prior art, and FIG. 4 is a diagram illustrating a relationship between frequency characteristics and a sense of localization of a sound image.

As shown in FIG. 3, when the sound range of the acoustic data is expanded and used in place of the missing high-frequency components, there is no problem if the frequency characteristic of the sound is flat (FIG. 3(a)). However, if the sound has a downward-sloping characteristic in which the level decreases as the frequency increases, as is common for natural sounds (FIG. 3(b)), an artificial depression (notch) is created.
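
The notch can be reproduced with a toy model. The sketch below is not taken from the patent: it assumes that band expansion fills each missing bin k with the level of bin k // 2 (a crude stand-in for frequency multiplication), and it treats a bin that is lower than both neighbours as a notch. All names are hypothetical.

```python
def extend_by_doubling(levels_db, cutoff_bin):
    """Fill bins [cutoff_bin, 2 * cutoff_bin) with the level of bin k // 2.

    Assumption: this stands in for the frequency-multiplied high band of FIG. 2.
    """
    extended = list(levels_db[:cutoff_bin])
    for k in range(cutoff_bin, 2 * cutoff_bin):
        extended.append(levels_db[k // 2])
    return extended


def notch_bins(levels_db):
    """Indices of bins that are strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(levels_db) - 1)
        if levels_db[i] < levels_db[i - 1] and levels_db[i] < levels_db[i + 1]
    ]


flat = [0.0] * 8                         # flat characteristic, as in FIG. 3(a)
sloping = [-float(k) for k in range(8)]  # level falls 1 dB per bin, as in FIG. 3(b)

print(notch_bins(extend_by_doubling(flat, 8)))     # no notch for a flat spectrum
print(notch_bins(extend_by_doubling(sloping, 8)))  # a notch just below the cutoff
```

In the sloping case the copied band starts at a higher level than the last original bin, so that last bin becomes a local dip, which is the artificial notch described above.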

他方、図４に示すように、人間は周波数特性の窪みの位置を検知して音の上下方向を認知していることが知られている（例えば、非特許文献１を参照。）。 On the other hand, as shown in FIG. 4, it is known that a human recognizes the vertical direction of sound by detecting the position of the depression of the frequency characteristic (see, for example, Non-Patent Document 1).

However, an artificial depression such as that shown in FIG. 3 has the problem that it prevents a human from accurately recognizing the vertical direction of the sound.

１つの側面では、本発明は、高周波成分が欠落してしまう圧縮率の高い不可逆圧縮した音響データを復号した場合であっても、音の上下方向を正確に認知することが可能な音響再生装置、音響再生方法及び音響再生プログラムを提供することを目的とする。 In one aspect, the present invention provides a sound reproducing apparatus that can accurately recognize the vertical direction of sound even when irreversibly compressed acoustic data with a high compression rate at which high-frequency components are lost is decoded. An object of the present invention is to provide a sound reproduction method and a sound reproduction program.

In one proposal, a sound reproduction device reproduces sound data transmitted from a sound broadcast device and comprises: missing frequency analysis means for analyzing a frequency component region of the sound data that is missing due to sound compression in the sound broadcast device; sound pressure analysis means for analyzing the sound pressure of a frequency band that is adjacent to the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and high frequency component addition means for adding a frequency component with a sound pressure corresponding to that sound pressure to the missing frequency component region.

実施の形態によれば、圧縮率の高い不可逆圧縮した音響データの復号により欠落した高周波成分の領域に、欠落しなかった領域のエッジ部分と同様の音圧を付加するので、高圧縮された音響データであっても、音像の正確な前方定位感を発生させることができるようになる。 According to the embodiment, since the sound pressure similar to the edge portion of the region that was not lost is added to the region of the high-frequency component that is lost due to the decoding of the irreversibly compressed acoustic data with a high compression rate, the highly compressed sound Even in the case of data, it is possible to generate an accurate forward localization feeling of a sound image.

FIG. 1 is a diagram for explaining the outline of an acoustic AR system.
FIG. 2 is a diagram for explaining an example of sound range expansion.
FIG. 3 is a diagram showing a problem of the prior art.
FIG. 4 is a diagram showing the relationship between frequency characteristics and the sense of localization of a sound image.
FIG. 5 is a diagram showing the hardware configuration of the three-dimensional sound broadcasting apparatus and the three-dimensional sound reproduction apparatus according to the present embodiment.
FIG. 6 is a diagram showing an outline of the present embodiment.
FIG. 7 is a diagram showing functional blocks in the present embodiment.
FIG. 8 is a diagram showing an example of coefficient data.
FIG. 9 is a first diagram for explaining the HRTF processing.
FIG. 10 is a second diagram for explaining the HRTF processing.
FIG. 11 is a third diagram for explaining the HRTF processing.
FIG. 12 is a flowchart showing the flow of the missing frequency analysis processing.
FIG. 13 is a diagram showing the relationship between the frequency of a sound and how it is perceived.
FIG. 14 is a diagram showing the relationship between sound pressure level and frequency for pure tones that a person with normal hearing perceives as equally loud.
FIG. 15 is a flowchart showing the flow of the sound pressure analysis processing.
FIG. 16 is a flowchart showing the flow of the high-frequency component addition processing.
FIG. 17 is a configuration diagram of an information processing apparatus.

Hereinafter, embodiments will be described in detail with reference to the drawings.
FIG. 5 is a diagram illustrating the hardware configuration of the three-dimensional sound broadcasting apparatus and the three-dimensional sound reproduction apparatus according to the present embodiment.

図５において、本実施の形態に係る音響ＡＲシステムは、ネットワーク４を介して接続された三次元音響放送装置１００及び三次元音響再生装置２００を備える。 In FIG. 5, the acoustic AR system according to the present embodiment includes a three-dimensional sound broadcasting apparatus 100 and a three-dimensional sound reproduction apparatus 200 connected via a network 4.

三次元音響放送装置１００は、ＣＰＵ（Central Processing Unit：中央処理装置）８１、記憶装置８２、通信ＩＦ（InterFace：インターフェース）８３、及び音響入力装置８４を備える。記憶装置８２は、例えばハードディスクメモリ、ＲＯＭ（Read Only Memory：読み出し専用記憶装置）、及びＲＡＭ（Random Access Memory：即時呼び出し記憶装置）等である。これらのハードウェア各部は、バスを介して相互に接続されている。 The three-dimensional audio broadcasting apparatus 100 includes a CPU (Central Processing Unit) 81, a storage device 82, a communication IF (InterFace) 83, and an audio input device 84. The storage device 82 is, for example, a hard disk memory, a ROM (Read Only Memory), a RAM (Random Access Memory), or the like. These hardware units are connected to each other via a bus.

ＣＰＵ８１は、三次元音響放送装置１００のハードウェア各部を制御するプロセッサであり、ハードディスクメモリに格納された各ソフトウェアプログラムをＲＡＭにロードして実行する。ＲＯＭは、例えば不揮発性の半導体メモリであり、三次元音響放送装置１００の起動時にＣＰＵ８１が実行するＢＩＯＳ（Basic Input/Output System：基本入出力システム）、ファームウェア等を格納している。ＲＡＭは、例えばＳＲＡＭ（Static RAM）、ＤＲＡＭ（Dynamic RAM）等であり、ＣＰＵ８１が実行する処理の過程で必要なデータ等を一時的に格納する。ハードディスクメモリは、三次元音響放送装置１００に内蔵された、又は外部に接続された、大容量の情報を格納することが可能な補助記憶装置である。 The CPU 81 is a processor that controls each part of the hardware of the three-dimensional sound broadcasting apparatus 100, and loads each software program stored in the hard disk memory to the RAM for execution. The ROM is, for example, a non-volatile semiconductor memory, and stores a BIOS (Basic Input / Output System) executed by the CPU 81 when the 3D sound broadcasting apparatus 100 is activated, firmware, and the like. The RAM is, for example, SRAM (Static RAM), DRAM (Dynamic RAM), and the like, and temporarily stores data and the like necessary in the course of processing executed by the CPU 81. The hard disk memory is an auxiliary storage device that is built in the three-dimensional sound broadcasting apparatus 100 or connected to the outside and can store a large amount of information.

通信ＩＦ８３は、有線若しくは無線通信のモデム又はＬＡＮ（Local Area Network）カード等であり、インターネット等のネットワーク４に接続されている。 The communication IF 83 is a wired or wireless communication modem, a LAN (Local Area Network) card, or the like, and is connected to the network 4 such as the Internet.

音響入力装置８４は、複数のマイク３５−ｎで取得した音響信号をそれぞれ三次元音響放送装置１００に入力し、バスを介してＣＰＵ８１に伝送する。そして、ＣＰＵ８１は、三次元音響放送装置１００が入力した音響信号に基づいて音響放送処理を実行する他、記憶装置８２に格納された音響信号に基づいて音響放送処理を実行する。 The sound input device 84 inputs the sound signals acquired by the plurality of microphones 35-n to the three-dimensional sound broadcasting device 100 and transmits them to the CPU 81 via the bus. Then, the CPU 81 executes the acoustic broadcast process based on the acoustic signal stored in the storage device 82 in addition to executing the acoustic broadcast process based on the acoustic signal input by the three-dimensional acoustic broadcast apparatus 100.

三次元音響再生装置２００は、ＣＰＵ９１、記憶装置９２、通信ＩＦ９３、及び音響出力装置９４を備える。記憶装置９２は、例えばハードディスクメモリ、ＲＯＭ、及びＲＡＭ等である。これらのハードウェア各部は、バスを介して相互に接続されている。 The three-dimensional sound reproduction device 200 includes a CPU 91, a storage device 92, a communication IF 93, and a sound output device 94. The storage device 92 is, for example, a hard disk memory, a ROM, a RAM, or the like. These hardware units are connected to each other via a bus.

ＣＰＵ９１は、三次元音響再生装置２００のハードウェア各部を制御するプロセッサであり、ハードディスクメモリに格納された各ソフトウェアプログラムをＲＡＭにロードして実行する。ＲＯＭは、例えば不揮発性の半導体メモリであり、三次元音響再生装置２００の起動時にＣＰＵ９１が実行するＢＩＯＳ、ファームウェア等を格納している。ＲＡＭは、例えばＳＲＡＭ、ＤＲＡＭ等であり、ＣＰＵ９１が実行する処理の過程で必要なデータ等を一時的に格納する。ハードディスクメモリは、三次元音響再生装置２００に内蔵された、又は外部に接続された、大容量の情報を格納することが可能な補助記憶装置である。 The CPU 91 is a processor that controls each part of the hardware of the three-dimensional sound reproduction apparatus 200, and loads each software program stored in the hard disk memory to the RAM for execution. The ROM is, for example, a nonvolatile semiconductor memory, and stores a BIOS, firmware, and the like that are executed by the CPU 91 when the three-dimensional sound reproduction device 200 is activated. The RAM is, for example, SRAM, DRAM, and the like, and temporarily stores data and the like necessary in the course of processing executed by the CPU 91. The hard disk memory is an auxiliary storage device that is built in the three-dimensional sound reproduction device 200 or is connected to the outside and can store a large amount of information.

通信ＩＦ９３は、有線若しくは無線通信のモデム又はＬＡＮカード等であり、インターネット等のネットワーク４に接続されている。また、通信ＩＦ９３は、利用者の頭部の姿勢を検出する頭部姿勢センサ５３とも接続されている。頭部姿勢センサ５３は、地磁気センサや角速度センサを用いることで実現する場合もある。 The communication IF 93 is a wired or wireless communication modem, a LAN card, or the like, and is connected to the network 4 such as the Internet. The communication IF 93 is also connected to a head posture sensor 53 that detects the posture of the user's head. The head posture sensor 53 may be realized by using a geomagnetic sensor or an angular velocity sensor.

音響出力装置９４は、ＣＰＵ９１が実行する音響再生処理に係る処理結果をイヤホン５４から出力する。 The sound output device 94 outputs the processing result related to the sound reproduction processing executed by the CPU 91 from the earphone 54.

本実施の形態の三次元音響再生装置２００は、ＣＰＵ９１が音響再生プログラムを実行することで機能する。音響再生プログラムの実行内容である音響再生処理については、詳細を後述する。 The three-dimensional sound reproduction apparatus 200 according to the present embodiment functions when the CPU 91 executes a sound reproduction program. Details of the sound reproduction process, which is the execution content of the sound reproduction program, will be described later.

FIG. 6 is a diagram showing an outline of the present embodiment.
The three-dimensional sound broadcasting apparatus 100 according to the present embodiment compresses the sound signal of the original sound at a high rate by lossy compression. Here, the original sound has a strong sense of frontal localization of the sound image (FIG. 6(a)).

音響信号の不可逆圧縮は、高周波成分を取り除くように圧縮して符号化するため、伝送データ量が少なくなる半面、圧縮後の音の前方定位感は低くなる（図６（ｂ））。 Since the irreversible compression of the acoustic signal is compressed and encoded so as to remove high frequency components, the amount of transmitted data is reduced, but the forward localization feeling of the compressed sound is reduced (FIG. 6B).

そこで、本実施の形態に係る三次元音響再生装置２００は、音像の前方定位感を得るためだけであれば、元の音を完全には再現する必要がないことに着目し、欠落した高周波領域に所定の周波数分布、例えばホワイトノイズを付加する。高周波領域にホワイトノイズを付加しても、どういう音であるのかというような音の了解への影響は小さいが、音像の前方定位感は高くなる（図６（ｃ））。 Therefore, the three-dimensional sound reproduction apparatus 200 according to the present embodiment pays attention to the fact that the original sound does not need to be completely reproduced if only to obtain a sense of front localization of the sound image. A predetermined frequency distribution, for example, white noise is added. Even if white noise is added to the high-frequency region, the effect on sound intelligibility, such as what kind of sound it is, is small, but the sense of localization of the sound image increases (FIG. 6 (c)).

As described above, in the present embodiment, even if the amount of communication is reduced by cutting high-frequency components in the three-dimensional sound broadcasting apparatus 100, the sense of frontal localization of the sound image can be restored by adding high-frequency components in the three-dimensional sound reproduction apparatus 200.

図７は、本実施の形態における機能ブロックを示す図であり、図８は、係数データの例を示す図である。 FIG. 7 is a diagram showing functional blocks in the present embodiment, and FIG. 8 is a diagram showing an example of coefficient data.

図７において、三次元音響放送装置１００は、音響入力部３１、ＤＦＴ（Discrete Fourier Transform：離散フーリエ変換）部１０１、削減パラメータＤＢ（DataBase：データベース）１０２、高周波成分削減部１０３、係数圧縮部１０４、及び送信部３４を備える。 In FIG. 7, the three-dimensional sound broadcasting apparatus 100 includes an acoustic input unit 31, a DFT (Discrete Fourier Transform) unit 101, a reduction parameter DB (DataBase) 102, a high frequency component reduction unit 103, and a coefficient compression unit 104. And a transmission unit 34.

The sound input unit 31 inputs the sound signals acquired by the plurality of microphones 35-1 to 35-8 (not shown; see FIG. 5) to the three-dimensional sound broadcasting apparatus 100. The DFT unit 101 applies a discrete Fourier transform, for example by a fast Fourier transform (FFT), to the sound signal of each channel and outputs the spectrum of the sound signal. For example, as illustrated in FIG. 8, the discrete Fourier transform is performed 1024 points at a time on a sound signal sampled at 16 bits and 44.1 kHz.
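
A minimal sketch of this framing and transform step follows, using only the Python standard library. It assumes non-overlapping 1024-sample frames and a plain recursive radix-2 FFT; the patent does not specify windowing, overlap, or the FFT variant, and the function names are hypothetical.

```python
import cmath
import math

FRAME_SIZE = 1024        # points per transform, as in the text
SAMPLE_RATE_HZ = 44_100  # sampling rate, as in the text


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frames(samples, size=FRAME_SIZE):
    """Yield consecutive non-overlapping frames (assumption: no overlap)."""
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]


def bin_frequency_hz(k, size=FRAME_SIZE, rate=SAMPLE_RATE_HZ):
    """Centre frequency of DFT bin k."""
    return k * rate / size


# Usage: a pure tone placed exactly on bin 100 (about 4.3 kHz).
tone = [math.sin(2 * math.pi * 100 * n / FRAME_SIZE) for n in range(2 * FRAME_SIZE)]
spectra = [fft(frame) for frame in frames(tone)]
peak = max(range(FRAME_SIZE // 2), key=lambda k: abs(spectra[0][k]))
print(peak, bin_frequency_hz(peak))
```

Each bin of a 1024-point transform at 44.1 kHz spans roughly 43 Hz, which sets the frequency resolution available to the later analysis steps.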

削減パラメータＤＢ１０２は、ＤＦＴ部１０１が離散フーリエ変換を行って出力した音響信号のスペクトルについて、高周波成分削減部１０３が削減する周波数成分のパラメータを格納する。高周波成分削減部１０３は、削減パラメータＤＢ１０２を参照してＤＦＴ部１０１が出力した音響信号のスペクトルのうち高周波成分を削減する。係数圧縮部１０４は、高周波成分が削減された音響信号の係数データを圧縮する。そして、送信部３４は、圧縮した音響データをネットワーク４経由で三次元音響再生装置２００に送信する。 The reduction parameter DB 102 stores parameters of frequency components to be reduced by the high frequency component reduction unit 103 for the spectrum of the acoustic signal output by the DFT unit 101 performing discrete Fourier transform. The high frequency component reduction unit 103 refers to the reduction parameter DB 102 and reduces high frequency components in the spectrum of the acoustic signal output from the DFT unit 101. The coefficient compression unit 104 compresses the coefficient data of the acoustic signal from which the high frequency component has been reduced. Then, the transmission unit 34 transmits the compressed acoustic data to the three-dimensional sound reproduction device 200 via the network 4.

また、図７において、三次元音響再生装置２００は、受信部５１、係数復元部２０１、欠落周波数解析部２０２、前方定位関連情報ＤＢ２０３、音圧解析部２０４、高周波成分付加部２０５、逆ＤＦＴ部２０６、センサ値受信部２０７、及びＨＲＴＦ処理部２０８を備える。 In FIG. 7, the three-dimensional sound reproduction apparatus 200 includes a receiving unit 51, a coefficient restoring unit 201, a missing frequency analyzing unit 202, a front localization related information DB 203, a sound pressure analyzing unit 204, a high frequency component adding unit 205, and an inverse DFT unit. 206, a sensor value receiving unit 207, and an HRTF processing unit 208.

受信部５１は、三次元音響放送装置１００からネットワーク４経由で送信された音響データを受信する。係数復元部２０１は、圧縮された係数データを復号する。 The receiving unit 51 receives acoustic data transmitted from the three-dimensional acoustic broadcast apparatus 100 via the network 4. The coefficient restoration unit 201 decodes the compressed coefficient data.

欠落周波数解析部２０２は、欠落周波数解析処理を実行することにより、欠落した（削減された）周波数領域を求める（図７（１））。欠落周波数解析処理の処理内容の詳細については、図１２を用いて後述する。 The missing frequency analysis unit 202 performs missing frequency analysis processing to obtain missing (reduced) frequency regions (FIG. 7 (1)). Details of the processing content of the missing frequency analysis processing will be described later with reference to FIG.

The sound pressure analysis unit 204 obtains the edge sound pressure by executing sound pressure analysis processing (FIG. 7(2)). That is, it obtains the average volume in the frequency band that is adjacent to the lowest frequency band of the missing frequency component region analyzed by the missing frequency analysis unit 202 and is lower than that lowest frequency band. Details of the sound pressure analysis processing will be described later with reference to FIG. 15.
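
As a rough illustration of the edge sound pressure, the sketch below averages the coefficient magnitudes in a few bins just below the start of the missing region. The width of that band (`edge_bins`) is an assumption, since this passage does not state it, and the function name is hypothetical.

```python
def edge_sound_pressure(magnitudes, missing_start, edge_bins=8):
    """Average magnitude of the bins immediately below the missing region.

    magnitudes    -- per-bin coefficient magnitudes for one channel
    missing_start -- index of the first missing bin
    edge_bins     -- assumed width of the edge band (not given in the text)
    """
    low = max(0, missing_start - edge_bins)
    band = magnitudes[low:missing_start]
    if not band:
        return 0.0
    return sum(band) / len(band)


# Usage: bins 5 to 7 are missing; the edge band is bins 3 and 4.
example = [10.0, 8.0, 6.0, 4.0, 2.0, 0.0, 0.0, 0.0]
print(edge_sound_pressure(example, missing_start=5, edge_bins=2))
```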

The high-frequency component adding unit 205 executes high-frequency component addition processing to add acoustic data having a frequency distribution with the same sound pressure as the obtained edge sound pressure, for example white noise, to the missing frequency region (FIG. 7(3)). Details of the high-frequency component addition processing will be described later with reference to FIG. 16.
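
One way to picture this step is to add, in each missing bin, a component whose magnitude equals the edge sound pressure and whose phase is random, which approximates white noise within that band. This is a sketch under that assumption, working on a one-sided spectrum; a real implementation would also need to keep the full spectrum conjugate-symmetric before the inverse DFT. The function name and parameters are hypothetical.

```python
import cmath
import math
import random


def add_high_frequency(spectrum, missing_start, missing_end, edge_pressure, rng=None):
    """Return a copy of spectrum with noise-like components added to the missing bins.

    Bins missing_start..missing_end (inclusive) each receive a component of
    magnitude edge_pressure with a uniformly random phase (assumption).
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    out = list(spectrum)
    for k in range(missing_start, missing_end + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        out[k] = out[k] + cmath.rect(edge_pressure, phase)
    return out


# Usage: bins 2 and 3 were removed by compression; fill them at level 1.5.
decoded = [1 + 0j, 2 + 0j, 0j, 0j]
filled = add_high_frequency(decoded, 2, 3, edge_pressure=1.5)
print([round(abs(c), 3) for c in filled])
```

Because the added level matches the level just below the cutoff, the spectrum continues without a step at the boundary, so no artificial notch of the kind shown in FIG. 3 is introduced.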

前方定位関連情報ＤＢ２０３は、欠落周波数解析部２０２によって求めた欠落周波数の区間、及び音圧解析部２０４によって求めたエッジ音圧をチャンネル毎に格納する。 The forward localization related information DB 203 stores the missing frequency section obtained by the missing frequency analysis unit 202 and the edge sound pressure obtained by the sound pressure analysis unit 204 for each channel.

逆ＤＦＴ部２０６は、高周波数領域に所定の周波数分布が付加されたスペクトルに逆離散フーリエ変換を行って、各チャンネル毎の音響信号として出力する。センサ値受信部２０７は、頭部姿勢センサ５３が検出した利用者の頭部の姿勢を示す値を受信する。ＨＲＴＦ処理部２０８は、センサ値受信部２０７が頭部姿勢センサ５３から受信したセンサ値を用いて、逆ＤＦＴ部２０６が出力した音響信号をＨＲＴＦ処理し、イヤホン５４に出力する。 The inverse DFT unit 206 performs inverse discrete Fourier transform on the spectrum in which a predetermined frequency distribution is added to the high frequency region, and outputs the result as an acoustic signal for each channel. The sensor value receiving unit 207 receives a value indicating the posture of the user's head detected by the head posture sensor 53. The HRTF processing unit 208 performs HRTF processing on the acoustic signal output from the inverse DFT unit 206 using the sensor value received from the head position sensor 53 by the sensor value receiving unit 207, and outputs it to the earphone 54.

FIGS. 9, 10, and 11 are diagrams for explaining the HRTF processing.
For example, as shown in FIG. 9(b), taking the front of the user 6 as the reference, the 0-degree frequency characteristic (see FIG. 9(a)) is applied to the sound of channel 0, the 315-degree frequency characteristic to the sound of channel 7, the 270-degree frequency characteristic to the sound of channel 6, and the 225-degree frequency characteristic to the sound of channel 5, and the results are summed and aggregated into the left speaker 54L. Similarly, the 45-degree frequency characteristic is applied to the sound of channel 1, the 90-degree frequency characteristic to the sound of channel 2, the 135-degree frequency characteristic to the sound of channel 3, and the 180-degree frequency characteristic to the sound of channel 4, and the results are summed and aggregated into the right speaker 54R. Here, applying a frequency characteristic to a sound means, as shown in FIG. 10, multiplying the coefficient data of the sound (FIG. 10(a)) by the HRTF characteristic (FIG. 10(b)) to obtain new coefficient data of the sound (FIG. 10(c)).
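
The per-channel multiplication and summation described above can be sketched as follows. The channel-to-angle tables follow the text; the data layout (dictionaries of per-bin lists) and the function names are assumptions made for illustration.

```python
# Channel -> HRTF angle in degrees, as listed in the text.
LEFT_EAR_ANGLES = {0: 0, 7: 315, 6: 270, 5: 225}
RIGHT_EAR_ANGLES = {1: 45, 2: 90, 3: 135, 4: 180}


def apply_hrtf(coefficients, hrtf):
    """Multiply coefficient data by an HRTF characteristic, bin by bin (FIG. 10)."""
    return [c * h for c, h in zip(coefficients, hrtf)]


def mix_ear(channels, hrtf_by_angle, angle_by_channel):
    """Apply the per-angle HRTF to each listed channel and sum the results."""
    n_bins = len(next(iter(channels.values())))
    total = [0.0] * n_bins
    for channel, angle in angle_by_channel.items():
        shaped = apply_hrtf(channels[channel], hrtf_by_angle[angle])
        total = [t + s for t, s in zip(total, shaped)]
    return total


# Usage with made-up two-bin data: every channel and every HRTF is identical.
channels = {ch: [1.0, 2.0] for ch in range(8)}
hrtf_by_angle = {angle: [0.5, 0.25] for angle in range(0, 360, 45)}
left = mix_ear(channels, hrtf_by_angle, LEFT_EAR_ANGLES)
right = mix_ear(channels, hrtf_by_angle, RIGHT_EAR_ANGLES)
print(left, right)
```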

また、ＨＲＴＦ処理部５２は、図１１に示すように、頭部姿勢センサ５３で検出した利用者６の頭部の姿勢に基づいて、利用者６が向いている方向に対応させて出力する音を変換する。ここで用いるＨＲＴＦは、音像の定位方向に対応するように予め決定されている。イヤホン５４は、ＨＲＴＦ処理部５２で変換した音響信号を右スピーカ５４Ｒと左スピーカ５４Ｌから出力する。 Further, as shown in FIG. 11, the HRTF processing unit 52 outputs sound corresponding to the direction in which the user 6 is facing based on the posture of the head of the user 6 detected by the head posture sensor 53. Convert. The HRTF used here is determined in advance so as to correspond to the localization direction of the sound image. The earphone 54 outputs the acoustic signal converted by the HRTF processing unit 52 from the right speaker 54R and the left speaker 54L.

FIG. 12 is a flowchart showing the flow of the missing frequency analysis processing.
First, in step S1201, the missing frequency analysis unit 202 receives as input the coefficient data decoded by the coefficient restoration unit 201.

次に、全てのチャンネルについて欠落周波数解析処理を実行するために、ステップＳ１２０２において、未処理のチャンネルがあるか否かを判断する。全てのチャンネルについて欠落周波数解析処理が終了していれば（ステップＳ１２０２：Ｎ）、本欠落周波数解析処理を終了する。 Next, in order to execute the missing frequency analysis process for all channels, it is determined in step S1202 whether or not there is an unprocessed channel. If the missing frequency analysis process has been completed for all channels (step S1202: N), this missing frequency analysis process is terminated.

他方、未処理のチャンネルがあれば（ステップＳ１２０２：Ｙ）、ステップＳ１２０３において、未処理のチャンネルから１つを選び、周波数ｆ１〜ｆ１０２４に対応する係数データｐ１〜ｐ１０２４のうち、所定の閾値未満（ｐ<threshould）となる最大の周波数ｆ_maxを求める。 On the other hand, if there is an unprocessed channel (step S1202: Y), one of the unprocessed channels is selected in step S1203, and the coefficient data p1 to p1024 corresponding to the frequencies f1 to f1024 is less than a predetermined threshold ( The maximum frequency f _max that satisfies p <threshould) is obtained.

In step S1204, it is determined whether f_max is 16 kHz or less. If it is not (step S1204: N), then in step S1205 [0, 0] is stored in the forward localization related information DB 203 to indicate that there is no missing section.

On the other hand, if f_max is 16 kHz or less (step S1204: Y), it is determined in step S1206 whether f_max is 4 kHz or more. If it is not (step S1206: N), then in step S1207 [4k, f1024] is stored in the forward localization related information DB 203 as no missing section.

On the other hand, if f_max is 4 kHz or more (step S1206: Y), that is, if f_max is 4 kHz or more and 16 kHz or less, then in step S1208 the section [f_max, f1024] is stored in the forward localization related information DB 203 as the missing frequency band. The process then returns to step S1202.
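The branching in steps S1204 to S1208 can be summarized in a short sketch. It only mirrors the decisions stated above; the function name `classify_missing_band` and the parameter `f_top_hz` (standing in for the frequency f1024, whose value in hertz the text does not give) are hypothetical.

```python
# Hypothetical sketch of steps S1204-S1208; names are assumptions.
def classify_missing_band(f_max_hz, f_top_hz):
    """Return the [low, high] pair stored per channel in the DB 203."""
    if f_max_hz > 16_000:
        # S1204: N -> S1205, no missing section
        return (0.0, 0.0)
    if f_max_hz < 4_000:
        # S1206: N -> S1207, the band starts at 4 kHz
        return (4_000.0, float(f_top_hz))
    # S1208: 4 kHz <= f_max <= 16 kHz
    return (float(f_max_hz), float(f_top_hz))


if __name__ == "__main__":
    for f in (17_000, 3_000, 8_000):
        print(f, classify_missing_band(f, f_top_hz=22_050))
```

Both comparisons are inclusive at 4 kHz and 16 kHz, matching the "or more" and "or less" wording of the flowchart description.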

FIG. 13 is a diagram showing the relationship between the frequency of a sound and how it is perceived.
In step S1208 of FIG. 12, a missing frequency band is stored in the forward localization related information DB 203 only when f_max is 4 kHz or more and 16 kHz or less because, as shown in FIG. 13, frequencies in this range affect the sense of sound localization.

FIG. 14 is a diagram showing the relationship between frequency and the sound pressure level of pure tones that a person with normal hearing perceives as equally loud.

The number attached to each curve is the loudness level, expressed in the unit phon, which takes the same value as the sound pressure level (dB) of a 1 kHz pure tone. Each equal-loudness curve connects the sound pressure levels, at each frequency, that sound as loud as that 1 kHz pure tone.

As shown in FIG. 14, the higher the frequency, the lower the sound pressure level is perceived to be. In particular, sound around 4 kHz is perceived as quiet.

FIG. 15 is a flowchart showing the flow of the sound pressure analysis process.
First, so that the sound pressure analysis process is executed for every channel, it is determined in step S1501 whether any unprocessed channel remains. If the process has been completed for all channels (step S1501: N), the sound pressure analysis process ends.

On the other hand, if there is an unprocessed channel (step S1501: Y), then in step S1502 one unprocessed channel is selected, and the minimum value f_min of the missing frequency band is acquired from the forward localization related information DB 203.

In step S1503, the average sound pressure in the section [f_min - Δf, f_min] is obtained and stored in the edge sound pressure column of the forward localization related information DB 203. The process then returns to step S1501. Here, Δf is a predetermined value that can be set in advance. When f_min is 0, the edge sound pressure is set to 0.
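The edge sound pressure computation of step S1503 can be sketched as follows. This is an illustration under stated assumptions: the function name `edge_sound_pressure` is hypothetical, the sound pressure data is assumed to be a list of per-frequency levels aligned with a list of frequencies, and returning 0 when no frequency falls inside the section is an added safeguard that the text does not mention.

```python
# Hypothetical sketch of step S1503; names and data layout are assumptions.
def edge_sound_pressure(freqs_hz, levels, f_min_hz, delta_f_hz):
    """Average level over the closed section [f_min - delta_f, f_min]."""
    if f_min_hz == 0:
        # No missing section was recorded, so the edge sound pressure is 0.
        return 0.0
    band = [
        lv
        for f, lv in zip(freqs_hz, levels)
        if f_min_hz - delta_f_hz <= f <= f_min_hz
    ]
    # Assumption: fall back to 0 when the section contains no frequency bin.
    return sum(band) / len(band) if band else 0.0


if __name__ == "__main__":
    freqs = [1000, 2000, 3000, 4000, 5000]
    levels = [10.0, 20.0, 30.0, 40.0, 0.0]
    print(edge_sound_pressure(freqs, levels, f_min_hz=4000, delta_f_hz=1000))
```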

FIG. 16 is a flowchart showing the flow of the high-frequency component addition process.
First, so that the high-frequency component addition process is executed for every channel, it is determined in step S1601 whether any unprocessed channel remains. If the process has been completed for all channels (step S1601: N), the high-frequency component addition process ends.

On the other hand, if there is an unprocessed channel (step S1601: Y), then in step S1602 one unprocessed channel is selected, and the missing frequency band and the edge sound pressure are acquired from the forward localization related information DB 203.

In step S1603, frequency characteristic data prepared in advance, such as white noise, is inserted as the sound pressure data for the section [f_min, f_max], scaled to the same level as the edge sound pressure. The process then returns to step S1601.
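A minimal sketch of step S1603 is given below. It assumes that the prepared frequency characteristic data is a list of per-frequency levels (`profile`) and that "the same level" means the mean of the inserted values equals the edge sound pressure; both are assumptions, as is the function name `add_high_band`. A flat profile stands in for white noise.

```python
# Hypothetical sketch of step S1603; names and scaling rule are assumptions.
def add_high_band(freqs_hz, levels, band, edge_level, profile):
    """Fill the missing band with the prepared profile scaled to edge_level."""
    lo, hi = band
    idx = [i for i, f in enumerate(freqs_hz) if lo <= f <= hi]
    if not idx or edge_level == 0:
        # Nothing to fill: no bin in the band, or edge sound pressure of 0.
        return list(levels)
    mean_profile = sum(profile[i] for i in idx) / len(idx)
    gain = edge_level / mean_profile if mean_profile else 0.0
    out = list(levels)
    for i in idx:
        out[i] = profile[i] * gain
    return out


if __name__ == "__main__":
    freqs = [1000, 2000, 3000, 4000]
    levels = [5.0, 5.0, 0.0, 0.0]
    flat = [1.0, 1.0, 1.0, 1.0]  # flat spectrum standing in for white noise
    print(add_high_band(freqs, levels, (3000, 4000), 5.0, flat))
```

The input list is copied and never modified in place, so the decoded coefficient data stays available for comparison.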

The present embodiment has been described in detail above with reference to the drawings, but it can also be modified as follows.

For example, as Modification 1, instead of white noise as the high-frequency component to be added, it is possible to use a component obtained by frequency-multiplying the low-frequency components and then equalizing the result to flatten its frequency characteristic.

As Modification 2, when the edge sound pressure obtained in the sound pressure analysis process by the sound pressure analysis unit 204 is very small compared with the average sound pressure, the sound pressure near the edge may be raised smoothly, for example to about 80% of the average, before the high-frequency component addition process is executed by the high-frequency component addition unit 205.

These variants can also be executed selectively: normally the present embodiment is executed; Modification 1 is executed when better sound quality is wanted even at the cost of a weaker sense of localization; and Modification 2 is executed when a stronger sense of localization is wanted even at the cost of sound quality.

FIG. 17 is a configuration diagram of an information processing apparatus.
The three-dimensional sound reproduction device 200 described above can be realized using, for example, an information processing apparatus (computer) as shown in FIG. 17. The information processing apparatus of FIG. 17 includes a CPU 1701, a memory 1702, an input device 1703, an output device 1704, an external recording device 1705, a medium driving device 1706, and a network connection device 1707. These are connected to one another by a bus 1708.

The memory 1702 is, for example, a semiconductor memory such as a ROM (Read Only Memory), a RAM (Random Access Memory), or a flash memory, and stores the program and data used for the sound reproduction process executed by the three-dimensional sound reproduction device 200. For example, the CPU 1701 performs the sound reproduction process described above by executing the program using the memory 1702.

The input device 1703 is, for example, a keyboard or a pointing device, and is used for inputting instructions and information from an operator. The output device 1704 is, for example, a display device, a printer, or a speaker, and is used for outputting inquiries to the operator and processing results.

The external recording device 1705 is, for example, a magnetic disk device, an optical disk device, a magneto-optical disk device, or a tape device. The external recording device 1705 also includes a hard disk drive. The information processing apparatus can store the program and data in the external recording device 1705 and load them into the memory 1702 for use.

The medium driving device 1706 drives a portable recording medium 1709 and accesses its recorded contents. The portable recording medium 1709 is a memory device, a flexible disk, an optical disk, a magneto-optical disk, or the like. The portable recording medium 1709 also includes a Compact Disk Read Only Memory (CD-ROM), a Digital Versatile Disk (DVD), a Universal Serial Bus (USB) memory, and the like. The operator can store the program and data in the portable recording medium 1709 and load them into the memory 1702 for use.

Thus, the computer-readable recording media that store the program and data used for the sound reproduction process include physical (non-transitory) recording media such as the memory 1702, the external recording device 1705, and the portable recording medium 1709.

The network connection device 1707 is a communication interface that is connected to a communication network 1710 and performs the data conversion that accompanies communication. The information processing apparatus can receive the program and data from an external apparatus via the network connection device 1707 and load them into the memory 1702 for use.

Although the disclosed embodiment and its advantages have been described in detail, those skilled in the art can make various changes, additions, and omissions without departing from the scope of the present invention as clearly set forth in the claims.

The following notes are further disclosed with respect to the embodiment described with reference to the drawings.
(Appendix 1)
A sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device, the sound reproduction device comprising:
missing frequency analysis means for analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
sound pressure analysis means for analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
high-frequency component addition means for adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.
(Appendix 2)
The sound reproduction device according to Appendix 1, wherein the high-frequency component addition means adds, to the missing frequency component region, white noise having a sound pressure corresponding to said sound pressure.
(Appendix 3)
The sound reproduction device according to Appendix 1, wherein the high-frequency component addition means adds, to the missing frequency component region, a component obtained by frequency-multiplying and equalizing a frequency component of the frequency band lower than the lowest frequency band so that its frequency characteristic is flat.
(Appendix 4)
A sound reproduction program that causes a computer of a sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device to execute a process comprising:
analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.
(Appendix 5)
The sound reproduction program according to Appendix 4, wherein white noise having a sound pressure corresponding to said sound pressure is added to the missing frequency component region.
(Appendix 6)
The sound reproduction program according to Appendix 4, wherein a component obtained by frequency-multiplying and equalizing a frequency component of the frequency band lower than the lowest frequency band so that its frequency characteristic is flat is added to the missing frequency component region.
(Appendix 7)
A sound reproduction method executed by a computer of a sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device, the method comprising:
analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.
(Appendix 8)
The sound reproduction method according to Appendix 7, wherein white noise having a sound pressure corresponding to said sound pressure is added to the missing frequency component region.
(Appendix 9)
The sound reproduction method according to Appendix 7, wherein a component obtained by frequency-multiplying and equalizing a frequency component of the frequency band lower than the lowest frequency band so that its frequency characteristic is flat is added to the missing frequency component region.

1 Point
2-1, 2-2, 2-3 Sound source
3 Three-dimensional sound broadcasting device
4 Network
5 Three-dimensional sound reproduction device
6 User
7-1, 7-2, 7-3, 7-4, 7-5, 7-6, 7-7, 7-8 Virtual speaker
31 Acoustic input unit
32 Sound source management unit
33 Aggregation unit
34 Transmission unit
35-1, 35-2, 35-3, 35-4, 35-5, 35-6, 35-7, 35-8 Microphone
51 Reception unit
52 HRTF processing unit
53 Head posture sensor
54 Earphone
54R Right speaker
54L Left speaker
81 CPU
82 Storage device
83 Communication IF
84 Sound input device
91 CPU
92 Storage device
93 Communication IF
94 Sound output device
100 Three-dimensional sound broadcasting device
101 DFT unit
102 Reduction parameter DB
103 High-frequency component reduction unit
104 Coefficient compression unit
200 Three-dimensional sound reproduction device
201 Coefficient restoration unit
202 Missing frequency analysis unit
203 Forward localization related information DB
204 Sound pressure analysis unit
205 High-frequency component addition unit
206 Inverse DFT unit
207 Sensor value reception unit
208 HRTF processing unit
1701 CPU
1702 Memory
1703 Input device
1704 Output device
1705 External recording device
1706 Medium driving device
1707 Network connection device
1708 Bus
1709 Portable recording medium
1710 Communication network

Claims

A sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device, the sound reproduction device comprising:
missing frequency analysis means for analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
sound pressure analysis means for analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
high-frequency component addition means for adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.

The sound reproduction device according to claim 1, wherein the high-frequency component addition means adds, to the missing frequency component region, white noise having a sound pressure corresponding to said sound pressure.

The sound reproduction device according to claim 1, wherein the high-frequency component addition means adds, to the missing frequency component region, a component obtained by frequency-multiplying and equalizing a frequency component of the frequency band lower than the lowest frequency band so that its frequency characteristic is flat.

A sound reproduction program that causes a computer of a sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device to execute a process comprising:
analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.

A sound reproduction method executed by a computer of a sound reproduction device that reproduces acoustic data transmitted from a sound broadcasting device, the method comprising:
analyzing a frequency component region of the acoustic data that is missing due to acoustic compression in the sound broadcasting device;
analyzing the sound pressure of a frequency band that adjoins the lowest frequency band of the missing frequency component region and is lower than that lowest frequency band; and
adding, to the missing frequency component region, a frequency component having a sound pressure corresponding to said sound pressure.