JP2017220798A

JP2017220798A - Information processing device, sound processing method and sound processing program

Info

Publication number: JP2017220798A
Application number: JP2016113784A
Authority: JP
Inventors: 真志中尾; Shinji Nakao; 重清奥田; Shigekiyo Okuda
Original assignee: Koei Tecmo Holdings Co Ltd
Current assignee: Koei Tecmo Holdings Co Ltd
Priority date: 2016-06-07
Filing date: 2016-06-07
Publication date: 2017-12-14
Anticipated expiration: 2036-06-07
Also published as: JP6670685B2

Abstract

PROBLEM TO BE SOLVED: To provide sound that does not cause a user discomfort when content is provided. SOLUTION: An information processing device for processing sound that is output in prescribed content is provided. The information processing device includes: an encode processing unit that generates a center signal from the sum of a first signal and a second signal of stereo sound included in the sound to be output, and generates a peripheral signal from the difference between the first signal and the second signal; a filter processing unit that generates a corrected center signal by removing sound of a prescribed band from the generated center signal; and a decode processing unit that restores the first signal and the second signal on the basis of the generated corrected center signal and the peripheral signal. SELECTED DRAWING: Figure 5

Description

The present invention relates to an information processing apparatus, a sound processing method, and a sound processing program.

An audio device equipped with speakers or the like reproduces various kinds of sound, such as background music (BGM), dialogue, and sound effects. The kinds of sound that are output simultaneously in parallel have relative priorities, and the volume or the like may need to be adjusted accordingly. For example, when dialogue is output intermittently while BGM is output continuously, a ducking process is performed that lowers the BGM level according to the output level of the dialogue so that the dialogue is easier to hear (see, for example, Patent Document 1 and Patent Document 2).

One example of the ducking process is volume ducking. Because dialogue is the important part of the reproduced sound, volume ducking lowers the surrounding sounds other than the dialogue of interest (sound effects, BGM, and so on) while the dialogue is output, making the dialogue easier to hear. For example, the BGM volume is lowered when the dialogue starts and raised again when the dialogue ends.

In television broadcast content, pre-production is performed in which the reproduced audio is adjusted in advance so that the audio accompanying the video is easy to hear. In pre-production, the balance between the dialogue and the surrounding BGM, sound effects, and the like is adjusted before broadcasting.

Patent Document 1: JP 2015-201125 A
Patent Document 2: JP 2006-325207 A

However, volume ducking has the drawback that, when dialogue interruptions occur in succession, the volume is raised and lowered frequently and the sound becomes hard to listen to. Moreover, when the audio content changes according to the progress of play, as in a game, pre-production of the kind used for television broadcast content is not possible. It has therefore been difficult to output optimal audio in content that is processed interactively according to its progress, such as a game.

Accordingly, in one aspect, an object of the present invention is to provide sound that does not feel unnatural to the user when content is provided.

In one proposal, there is provided an information processing apparatus that processes sound to be output in predetermined content, the apparatus including: an encode processing unit that generates a center signal from the sum of a first signal and a second signal of stereo sound included in the sound to be output, and generates a peripheral signal from the difference between the first signal and the second signal; a filter processing unit that generates a corrected center signal by removing sound in a predetermined band from the generated center signal; and a decode processing unit that restores the first signal and the second signal based on the generated corrected center signal and the peripheral signal.

According to one aspect, sound that does not feel unnatural to the user can be provided when content is provided.

A diagram showing an example of processing performed by a compressor.
A diagram showing an example of a ducking method.
A diagram showing an example of the functional configuration of an information processing apparatus according to an embodiment.
A diagram showing an example of the hardware configuration of the information processing apparatus according to an embodiment.
A diagram showing an example of the hardware configuration of an MS ducking unit according to an embodiment.
A diagram showing an example of the functional configuration of the MS ducking unit according to an embodiment.
A diagram showing an example of the functional configuration of a pan ducking unit according to an embodiment.
A diagram for explaining volume control by pan ducking according to an embodiment.
A flowchart showing an example of sound processing according to an embodiment.
A flowchart showing an example of MS ducking processing according to an embodiment.
A flowchart showing an example of pan ducking processing according to an embodiment.
A diagram showing an example of control according to the pan of surround sound according to an embodiment.

Embodiments of the present invention are described below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.

[Compressor]
The compressor 5, which compresses the loudness of sound, is described with reference to FIG. 1. As shown in FIG. 1(a), the compressor 5 receives a sound signal and, according to the signal strength (amplitude peak or the like) of the input waveform, applies the gain (volume) adjustment shown in FIG. 1(b) to compress the loudness of the sound.

For example, when the parameters are a threshold and a compression ratio, the gain is adjusted so that, when the signal strength of the input signal exceeds the threshold, the compression ratio is applied to the excess. With a threshold of 0.3 and a compression ratio of 1/2, a signal strength of 0.5 exceeds the threshold by 0.2; to halve this excess to 0.1, the target level is 0.4, so the signal is attenuated to 80% (0.4 / 0.5).
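The arithmetic in this example can be sketched in Python as follows. This is a minimal illustration rather than the compressor 5 itself; the function name `compressor_gain` and its parameter names are assumptions introduced here.

```python
def compressor_gain(level: float, threshold: float, ratio: float) -> float:
    """Return the linear gain that compresses `level` above `threshold`.

    `ratio` is the fraction of the excess that is kept (1/2 in the text).
    All names are hypothetical; the source gives only the arithmetic.
    """
    if level <= threshold:
        return 1.0  # at or below the threshold: no gain change
    excess = level - threshold
    target = threshold + excess * ratio  # compressed output level
    return target / level


# Worked example from the text: threshold 0.3, ratio 1/2, level 0.5.
gain = compressor_gain(0.5, 0.3, 0.5)
print(round(gain, 3))  # 0.8
```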

[Ducking]
Ducking is the processing of sound so that, when the main audio plays, the volume of other sounds is reduced to make the main audio stand out. One example is volume ducking, which lowers the volume of BGM, sound effects, and the like by a fixed amount while audio such as conversation is being reproduced. In the ducking method shown in FIG. 2(a), the volume of the target sound is lowered by a fixed amount when playback of the conversation audio data starts, and is restored when playback of the conversation audio data ends.

FIG. 2(b) shows a configuration example of automatic volume ducking using the compressor 5. A side-chain compressor, for example, can be used as the compressor 5. The compressor 5 receives the sound signal to be ducked, such as BGM, and the conversation audio signal that the listener should hear, calculates the gain to apply from the conversation audio signal, and applies the calculated gain compression to the sound to be ducked. As a result, in a radio broadcast for example, the BGM volume is automatically lowered according to the loudness of the speaker's voice while the speaker is talking, making the speech easier to hear. The input used for the gain calculation is called the side-chain input. There are also ducking methods that improve on compressor-type volume ducking without using the side-chain compressor 5.
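A side-chain arrangement of this kind might look like the following Python sketch. It is an illustration under stated assumptions, not the configuration of FIG. 2(b): the function name `sidechain_duck`, the default threshold and ratio, and the use of the instantaneous absolute value as the level detector (with no attack or release smoothing) are all hypothetical.

```python
def sidechain_duck(target, sidechain, threshold=0.3, ratio=0.5):
    """Attenuate `target` (e.g. BGM) using a gain derived from `sidechain` (dialogue).

    Hypothetical sketch: the level detector is the instantaneous absolute
    value, with no attack/release smoothing.
    """
    ducked = []
    for bgm_sample, voice_sample in zip(target, sidechain):
        level = abs(voice_sample)  # gain is computed from the side-chain input
        if level > threshold:
            gain = (threshold + (level - threshold) * ratio) / level
        else:
            gain = 1.0
        ducked.append(bgm_sample * gain)  # ...but applied to the target sound
    return ducked


bgm = [0.4, 0.4, 0.4, 0.4]
voice = [0.0, 0.5, 1.0, 0.1]  # dialogue is loud in the middle two samples
print([round(x, 3) for x in sidechain_duck(bgm, voice)])  # [0.4, 0.32, 0.26, 0.4]
```

The BGM level follows the loudness of the voice, which is also why repeated interruptions make the volume move up and down audibly.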

However, volume ducking has the drawback that, when conversation audio interrupts repeatedly, the volume is raised and lowered frequently and the sound becomes hard to listen to. Human hearing, on the other hand, analyzes differences such as the distribution of frequency components and direction-dependent phase differences, and can separate and recognize individual sound sources within a mixture of surrounding sounds. In other words, even when many sounds overlap, the human auditory system can separate and hear sounds whose positions are apart.

For example, to make conversation audio reproduced from the center direction clearly distinguishable, it suffices to process only the center-direction component of the sounds other than the conversation audio, so that those sounds can be perceived separately from the conversation audio. That is, if the sounds other than the conversation audio that come from the same direction as the conversation audio can be reduced, the conversation audio becomes easier to hear without changing the overall volume. Compared with the volume ducking described above, the change in volume is less noticeable, important sounds are easier to hear, and content such as a game can be provided without impairing the sense of immersion.

Accordingly, the information processing apparatus according to the embodiment described below performs MS ducking. MS ducking reduces the Mid component and increases the Side component of multichannel sound such as stereo or surround sound, thereby canceling the sound at the center and spreading the multichannel sound to the left and right. This keeps the change in sound pressure during ducking to a minimum and makes it possible to provide easily audible sound at a relatively low processing cost.

The information processing apparatus according to the embodiment described below also performs pan ducking. Pan ducking assumes that conversation audio is reproduced at the center, that is, at a pan of 0, and lowers the volume of a monaural sound produced as 3D sound the closer its pan is to 0, so that only sounds whose direction overlaps that of the conversation audio are reduced.

Pan means the position from which a sound is heard. In general, a sound source (an object that emits sound) having a position (direction) expresses its pan (localization, that is, a spatial position) by varying the proportion of a monaural waveform signal sent to each of several speakers. The information processing apparatus 10 that performs MS ducking and pan ducking is described below.
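As an illustration of expressing pan by the proportion of a monaural signal sent to each speaker, the Python sketch below distributes a mono signal to two channels. The constant-power pan law, the pan range of -1.0 to 1.0, and the names `pan_gains` and `pan_mono` are assumptions made for this example; the source does not specify a pan law.

```python
import math


def pan_gains(pan: float) -> tuple[float, float]:
    """Return (left, right) speaker gains for `pan` in [-1.0, 1.0].

    -1.0 is fully left, 0.0 is center, +1.0 is fully right. The constant-power
    law used here is an assumption, not taken from the source.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


def pan_mono(samples, pan):
    """Distribute a monaural signal to left and right channels."""
    left_gain, right_gain = pan_gains(pan)
    left = [s * left_gain for s in samples]
    right = [s * right_gain for s in samples]
    return left, right
```

With this law a pan of 0 sends the signal equally to both speakers, which corresponds to the center position assumed for conversation audio.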

[Information processing apparatus]
First, an example of the functional configuration of the information processing apparatus 10 according to an embodiment of the present invention is described with reference to FIG. 3. The information processing apparatus 10 according to the embodiment can be any electronic device capable of outputting sound, such as an audio device, a game device, a personal computer, a tablet device, a smartphone, a portable music player, or a wearable display device such as an HMD (Head Mount Display).

This embodiment is described using a game device as an example of the information processing apparatus 10. When outputting a character's conversation (dialogue) according to the situation during game execution, the information processing apparatus 10 uses the ducking method according to this embodiment to duck sounds other than the conversation: stereo or surround sound such as BGM and sound effects, and monaural sound such as 3D sound. The information processing apparatus 10 is not limited to a game device and may be any device that processes sound output in predetermined content. The predetermined content includes games, television dramas, and the like that output sound such as a heroine's conversation.

The information processing apparatus 10 includes a reception unit 11, a game execution unit 12, an MS ducking unit 13, a pan ducking unit 14, a sound processing unit 15, a graphic processing unit 16, a communication unit 17, a display unit 18, a sound output unit 19, and a storage unit 20.

The reception unit 11 receives the player's operations on the game. Operations that the player performs with an input device such as a controller include, for example, operations for moving characters that appear in the game, such as heroes and heroines, and operations for starting or stopping the game. The game execution unit 12 executes the desired game by causing the CPU of the information processing apparatus 10 (see FIG. 4) to execute the game processing program 121 stored in the storage unit 20.

The MS ducking unit 13 ducks stereo or surround sound by MS processing. The MS ducking unit 13 is applicable to sound with two or more channels: sound with two channels is stereo sound, and sound with three or more channels is surround sound. The MS ducking unit 13 therefore cannot be applied to monaural sound. It can be applied to multichannel sound after a plurality of sounds have been mixed. While conversation audio is being reproduced, the MS ducking unit 13 applies the following MS ducking to the sound to be ducked.
(1) The MS ducking unit 13 performs MS encoding on the input L (Left) and R (Right) signals, generating an M (Mid) signal and an S (Side) signal from the L and R signals.
(2) The MS ducking unit 13 applies filtering to the M signal. Because the sound to be heard clearly is a human voice, the filtering cuts a designated band, which includes the band of the human voice, from the BGM and sound effects.
(3) The MS ducking unit 13 restores the L and R signals through MS decoding. The sound formed by the restored L and R signals is one in which the predetermined band is cut more strongly the closer a component is to the center direction.
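One way to see why step (3) yields a sound whose designated band is cut more strongly toward the center is to follow a single in-band tone through the three steps. The Python sketch below does this under several assumptions that are not from the source: an ideal filter that removes the in-band part of the Mid signal completely, a linear pan law, sum and difference encoding, and a decode step that halves the recombined signals. The name `in_band_residual` is hypothetical.

```python
def in_band_residual(pan: float) -> float:
    """Fraction of in-band energy left after ideal MS ducking.

    `pan` is in [-1, 1] (0 = center). Assumes a linear pan law and an ideal
    filter that zeroes the in-band part of the Mid signal.
    """
    left = (1.0 - pan) / 2.0   # gain of the tone in the L channel
    right = (1.0 + pan) / 2.0  # gain of the tone in the R channel

    # (1) MS encode
    mid = left + right
    side = left - right
    # (2) ideal filter: the tone lies inside the cut band, so Mid is removed
    mid_filtered = 0.0 * mid
    # (3) MS decode
    left_out = (mid_filtered + side) / 2.0
    right_out = (mid_filtered - side) / 2.0

    energy_in = left ** 2 + right ** 2
    energy_out = left_out ** 2 + right_out ** 2
    return energy_out / energy_in


for pan in (0.0, 0.5, 1.0):
    print(pan, round(in_band_residual(pan), 3))
```

Under these assumptions a centered tone is removed entirely, while a tone panned fully to one side keeps half of its energy, so the cut weakens as the source moves away from the center.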

The pan ducking unit 14 performs ducking as part of the panning process. In this embodiment in particular, the pan ducking unit 14 performs ducking while adjusting the left-right pan. The pan ducking unit 14 is applicable to ducking of monaural sound. It calculates the left-right pan from the position information of where an in-game sound effect is emitted, and applies that left-right pan when the sound is reproduced. The pan is the direction from which the sound is heard.

For 3D sound, the pan ducking unit 14 calculates the front-back pan together with the left-right pan. In the MS ducking process according to this embodiment, however, the front-back pan is not directly involved in ducking. While conversation audio is being reproduced, the pan ducking unit 14 lowers the volume of the sound to be ducked the closer its left-right pan value is to the center.
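The behavior described here, in which volume falls as the left-right pan approaches the center, can be sketched in Python as below. The linear curve, the `min_gain` value of 0.3, the pan range of -1.0 to 1.0, and the function name `pan_ducking_gain` are all assumptions for illustration; this passage does not give the actual volume curve.

```python
def pan_ducking_gain(pan: float, dialogue_active: bool, min_gain: float = 0.3) -> float:
    """Volume multiplier for a monaural sound at left-right `pan` in [-1, 1].

    Hypothetical linear curve: `min_gain` at the center (pan 0), rising to 1.0
    at either edge. Front-back pan is deliberately ignored.
    """
    if not dialogue_active:
        return 1.0  # no ducking unless conversation audio is playing
    distance_from_center = min(abs(pan), 1.0)
    return min_gain + (1.0 - min_gain) * distance_from_center
```

A sound at pan 0 would be reduced to `min_gain` while dialogue plays, and a sound at either edge would be left at full volume.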

The pan ducking unit 14 does not perform filtering. Filtering is impractical when ducking monaural signals: unlike stereo or surround sound, a monaural sound is not a mix of many signals, so filtering would require as many filters as there are monaural sounds.

The sound processing unit 15 combines the ducked sound with the conversation audio, and the sound output unit 19 outputs the combined sound. The sound output unit 19 can thus output audio such as the conversation of a game character in an easily audible form, together with the sound effects and BGM that accompany the progress of the game. With left-right panning, dialogue does not need to be panned left or right even though it is monaural, and it is output with its pan at the center. The information processing apparatus 10 according to this embodiment executes sound processing, including the ducking process, by causing the CPU (see FIG. 4) to execute the sound processing program 122 stored in the storage unit 20.

The communication unit 17 communicates and carries voice calls with other game devices, portable devices, and the like. When a drawing command corresponding to the progress of the game is issued, the graphic processing unit 16 outputs a video signal to the display unit 18. The display unit 18 displays game images as the game progresses.

FIG. 3 is a block diagram that focuses on functions; each unit shown as a functional block can be realized by hardware alone, by software alone, or by a combination of hardware and software.

[Hardware configuration of the information processing apparatus]
Next, an example of the hardware configuration of the information processing apparatus 10 according to an embodiment of the present invention is described with reference to FIG. 4. The information processing apparatus 10 according to this embodiment includes a CPU (Central Processing Unit) 21, a ROM (Read Only Memory) 22, a RAM (Random Access Memory) 23, and an HDD (Hard Disk Drive) 24. It also includes a graphic card 25, an external I/F (interface) 26, a communication I/F 27, an input I/F 28, a display 29, a speaker 30, an encoder/decoder 12, and a filter 13. These components are connected to one another by a bus.

The ROM 22 is a nonvolatile semiconductor memory that retains its contents even when the power is turned off, and stores programs and data. The RAM 23 is a volatile semiconductor memory that temporarily holds programs and data.

The HDD 24 is a nonvolatile storage device that stores programs and data. The stored programs include basic software that controls the information processing apparatus 10 as a whole, and application software. The HDD 24 may also store various databases. In this embodiment, the HDD 24 stores various programs such as the game processing program 121 and the sound processing program 122.

The CPU 21 reads programs and data from the ROM 22 or the HDD 24 into the RAM 23 and executes the various processes described above, thereby controlling the information processing apparatus 10 as a whole and realizing the functions it provides.

The external I/F 26 is an interface that connects the information processing apparatus 10 to external devices, such as a recording medium 26a. Through the external I/F 26, the information processing apparatus 10 can read from and write to the recording medium 26a. Examples of the recording medium 26a include a CD (Compact Disk), a DVD (Digital Versatile Disk), an SD memory card, and a USB (Universal Serial Bus) memory.

For example, a recording medium 26a storing programs such as the game processing program 121 and the sound processing program 122 can be mounted in the information processing apparatus 10. These programs are read through the external I/F 26 and loaded into the RAM 23.

The CPU 21 processes the programs loaded into the RAM 23 and instructs the graphic card 25 to output images according to the progress of the game. Following these instructions, the graphic card 25 performs image processing for the game to be shown on screen and draws the game image on the display 29. One frame of the image output from the graphic card 25 lasts, for example, 1/60 to 1/30 second. The graphic card 25 draws one image per frame, so 30 to 60 frame images are drawn per second.

The display 29 may incorporate a touch panel, which allows input operations to be performed by touching the screen without using the controller 1. In this case, input information on the touch position detected by the touch panel is stored in the RAM 23, and the CPU 21 executes various calculations based on the input information stored in the RAM 23.

The communication I/F 27 is an interface that connects the information processing apparatus 10 to a network. The communication I/F 27 may also have a function of communicating and carrying voice calls with other game devices and portable devices via a communication unit having an antenna.

The input I/F 28 is an interface that connects to the controller 1. By operating the operation buttons 2 and the direction keys 3 of the controller 1, the user can control the game and make a character perform predetermined actions. The input I/F 28 stores in the RAM 23 input information based on the input operations that the user performs with the controller. Based on the input information stored in the RAM 23, the CPU 21 executes various calculations relating to the actions of the game characters.

The functions of the reception unit 11 and the game execution unit 12 can be realized by processing that the game processing program 121 installed in the information processing apparatus 10 causes the CPU 21 to execute. The functions of the pan ducking unit 14 and the sound processing unit 15 can be realized by processing that the sound processing program 122 installed in the information processing apparatus 10 causes the CPU 21 to execute.

The function of the graphic processing unit 16 can be realized by, for example, the graphic card 25; that of the communication unit 17 by, for example, the communication I/F 27; that of the display unit 18 by, for example, the display 29; and that of the sound output unit 19 by, for example, the speaker 30.

In the MS ducking process, the encoder/decoder 12 executes encoding, which generates the M and S signals from the L and R signals, and decoding, which restores the L and R signals from the S signal and the M' signal obtained by processing the M signal.

The filter 13 is a filter that removes a signal in a predetermined band from the M signal.
(MS ducking)
FIG. 5 shows an example of the hardware configuration of the MS ducking unit 13, and FIG. 6 shows an example of its functional configuration. The MS ducking unit 13 ducks stereo or surround sound such as BGM and sound effects. Stereo and surround sound are themselves collections of many sounds, unlike the case of pan ducking described later, in which a position is given to a monaural sound. Because surround sound is included in stereo sound, the following description covers the case where stereo sound is input.

The MS encoder 12a shown in FIG. 5 receives the L and R signals of the stereo sound, performs encoding, and generates and outputs the M and S signals. Specifically, the MS encoder 12a generates the M signal by equation (1) and the S signal by equation (2) below.

M = L + R ... (1)
S = L - R ... (2)
The L signal is an example of the first signal of the stereo sound, and the R signal is an example of the second signal. Other examples of the first and second signals include upper and lower signals output from a surround system, upper-right and lower-left signals, and upper-left and lower-right signals.
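Equations (1) and (2) can be written as a short Python sketch. The function names `ms_encode` and `ms_decode` are hypothetical, and the decode formulas L = (M + S) / 2 and R = (M - S) / 2 are derived here by solving equations (1) and (2) for L and R; they are not quoted from this passage.

```python
def ms_encode(left, right):
    """Equations (1) and (2): M = L + R, S = L - R, sample by sample."""
    mid = [lv + rv for lv, rv in zip(left, right)]
    side = [lv - rv for lv, rv in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse of ms_encode: L = (M + S) / 2, R = (M - S) / 2.

    These decode formulas follow algebraically from equations (1) and (2);
    they are not quoted from the source.
    """
    left = [(m + s) / 2.0 for m, s in zip(mid, side)]
    right = [(m - s) / 2.0 for m, s in zip(mid, side)]
    return left, right


left, right = [0.5, 0.25, -0.5], [0.5, -0.25, 0.0]
mid, side = ms_encode(left, right)
print(mid, side)             # [1.0, 0.0, -0.5] [0.0, 0.5, -0.5]
print(ms_decode(mid, side))  # ([0.5, 0.25, -0.5], [0.5, -0.25, 0.0])
```

The first sample is identical in both channels, so it appears only in M; the second is equal and opposite, so it appears only in S.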

The M signal, and the M′ signal obtained by processing it, are examples of the central (center) signal of the stereo sound, and the S signal is an example of the peripheral (side) signal of the stereo sound. The M′ signal is an example of a corrected central signal obtained by processing the M signal.

The M signal is generated from the sum of the L signal and the R signal, and the S signal is generated from the difference between the L signal and the R signal. In other words, the M signal is generated by adding the left (L signal) and right (R signal) sounds, and is a signal in which the central (Middle) sound component is dominant. The S signal is generated by subtracting one of the L and R sounds from the other, and is a signal in which the lateral (Side) sound component away from the center is dominant. That is, the M signal contains more of the central component of the stereo sound than of the horizontal (or vertical) component, and the S signal contains more of the horizontal (or vertical) component than of the central component. Dividing the L and R signals into the M and S signals in this way allows predetermined processing to be applied to the M signal alone or to the S signal alone.
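As a minimal illustrative sketch (not part of the disclosed embodiment), the encoding of equations (1) and (2) can be written in Python as follows. The function name `ms_encode` and the sample values are hypothetical; signals are assumed to be plain lists of floating-point samples of equal length.

```python
def ms_encode(left, right):
    """Return (mid, side) per equations (1) and (2): M = L + R, S = L - R."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


# A sound placed exactly in the center is identical in both channels,
# so it appears only in M and cancels out of S.
center_l = [0.5, -0.25, 0.125]
center_r = [0.5, -0.25, 0.125]
mid, side = ms_encode(center_l, center_r)

# A sound present only in the left channel appears in both M and S.
hard_l = [0.5, -0.25, 0.125]
hard_r = [0.0, 0.0, 0.0]
mid2, side2 = ms_encode(hard_l, hard_r)
```

The first case shows why filtering only the M signal mainly affects centered material, while sounds away from the center are partly carried by the S signal and so are only partly affected.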

The human voice lies in the band of 300 Hz to 3 kHz. The M signal output from the MS encoder 12a is therefore passed through a low-pass filter (hereinafter "LPF 13a") to obtain a signal in the band of 300 Hz and below. Separately, the M signal is passed through a high-pass filter (hereinafter "HPF 13b") to obtain a signal in the band of 3 kHz and above. The M′ signal, which is the signal obtained by combining the 300 Hz-and-below signal output from the LPF 13a with the 3 kHz-and-above signal output from the HPF 13b, is input to the MS decoder 12b. The M′ signal is a signal from which the 300 Hz to 3 kHz band, that is, the band of the input stereo sound (L and R signals) that contains the human voice, has been removed. In other words, the M′ signal is a sound (BGM, sound effects, and the like) in which the human voice band of the input sound has been hollowed out. When ducking is not in operation, the filter processing is bypassed. In that case, the M′ signal is the M signal itself, and it is input to the MS decoder 12b.
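The parallel LPF/HPF arrangement described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the specification does not state the filter type or order, so second-order Butterworth biquad sections and a 48 kHz sample rate are assumed here, and all function names are hypothetical. With such gentle slopes the 300 Hz to 3 kHz band is attenuated rather than removed completely; steeper filters would hollow it out more fully.

```python
import math


def biquad_coeffs(kind, cutoff_hz, sample_rate, q=1 / math.sqrt(2)):
    """Second-order low-pass or high-pass coefficients (bilinear biquad form)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    if kind == "low":
        b = [(1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2]
    elif kind == "high":
        b = [(1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2]
    else:
        raise ValueError("kind must be 'low' or 'high'")
    a = [1 + alpha, -2 * cos_w0, 1 - alpha]
    return b, a


def biquad(samples, b, a):
    """Apply one biquad section in direct form I."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = (b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def hollow_out_mid(mid, sample_rate=48000, low_hz=300.0, high_hz=3000.0,
                   ducking=True):
    """Return M': LPF output plus HPF output, or M itself when not ducking."""
    if not ducking:
        return list(mid)
    low = biquad(mid, *biquad_coeffs("low", low_hz, sample_rate))
    high = biquad(mid, *biquad_coeffs("high", high_hz, sample_rate))
    return [lo + hi for lo, hi in zip(low, high)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def passed_ratio(freq_hz, sample_rate=48000, n=9600):
    """Output/input RMS for a pure tone, measured after the start-up transient."""
    x = [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    y = hollow_out_mid(x, sample_rate)
    half = n // 2
    return rms(y[half:]) / rms(x[half:])
```

Calling `passed_ratio(1000)` gives a value well below 1 (a tone inside the voice band is strongly attenuated), whereas `passed_ratio(50)` and `passed_ratio(10000)` stay close to 1.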

The MS decoder 12b receives the M′ signal and the S signal, performs the decoding process, and restores the L signal and the R signal. As a result, an L signal and an R signal from which the sound in the band containing the human voice has been removed are output. Specifically, the MS decoder 12b generates the L signal by the following equation (3) and the R signal by equation (4).

L = (M′ + S) ÷ 2 ... (3)
R = (M′ − S) ÷ 2 ... (4)
The sound processing unit 15 synthesizes predetermined audio (dialogue and the like) with the restored L and R signals. The sound output unit 19 outputs the synthesized sound. No other sound is present in the 300 Hz to 3 kHz band (which includes the band of audio such as dialogue) of the restored L and R signals. In the synthesized stereo sound, the dialogue is therefore easy to hear and is balanced with the surrounding sounds. Consequently, in the information processing apparatus 10, which operates in response to interactive operations by the user, sound in which important audio such as dialogue is easy to hear and which is balanced with surrounding sounds such as BGM is generated in real time and can be provided to the user as the sound output in the content.
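The decoding of equations (3) and (4) and the subsequent mixing can be sketched as below. This is only an illustration: the function names are hypothetical, and adding a mono dialogue signal equally to both channels is an assumed mixing rule that the specification does not spell out.

```python
def ms_encode(left, right):
    """Equations (1) and (2): M = L + R, S = L - R."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid_corrected, side):
    """Equations (3) and (4): L = (M' + S) / 2, R = (M' - S) / 2."""
    left = [(m + s) / 2 for m, s in zip(mid_corrected, side)]
    right = [(m - s) / 2 for m, s in zip(mid_corrected, side)]
    return left, right


def mix_dialogue(left, right, dialogue):
    """Add a mono dialogue signal equally to both channels (assumed rule)."""
    return ([l + d for l, d in zip(left, dialogue)],
            [r + d for r, d in zip(right, dialogue)])


left_in = [0.5, 0.25, -0.75]
right_in = [0.25, -0.5, 0.0]
mid, side = ms_encode(left_in, right_in)

# Ducking not in operation: M' is M itself, so decoding restores L and R.
left_same, right_same = ms_decode(mid, side)

# Extreme case: everything removed from M (M' = 0) leaves only S/2 and -S/2.
left_side_only, right_side_only = ms_decode([0.0] * len(mid), side)

# Dialogue is then mixed onto the decoded channels.
out_left, out_right = mix_dialogue(left_side_only, right_side_only,
                                   [0.25, 0.25, 0.25])
```

The division by 2 in equations (3) and (4) compensates for the unscaled sum and difference in equations (1) and (2), which is why the round trip with M′ = M returns the original L and R exactly.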

In the present embodiment, some sound effects that are important to the progress of the game, such as the sound of a gun when the main character fires it in an important scene, should not be cut even though they lie outside the 300 Hz to 3 kHz band. In such a case, the M signal may be filtered using a filter that does not remove the band of the sound that should be kept.

The MS ducking unit 13 includes the encoding processing unit 32, the filter processing unit 33, and the decoding processing unit 34 shown in FIG. 6. The encoding processing unit 32 generates the M signal by adding the L signal and the R signal of the stereo sound included in the sound output in the predetermined content, and generates the S signal by subtracting the R signal from the L signal.

The filter processing unit 33 generates the M′ signal by removing sound in a predetermined band from the generated M signal. The decoding processing unit 34 restores the L signal and the R signal based on the generated M′ signal and the S signal.

The function of the encoding processing unit 32 can be realized by the MS encoder 12a. The function of the filter processing unit 33 can be realized by the LPF 13a and the HPF 13b. The function of the decoding processing unit 34 can be realized by the MS decoder 12b.
(Pan ducking)
An example of the functional configuration of the pan ducking unit 14 is shown in FIG. 7. The pan ducking unit 14 has the functions of a pan calculation unit 35 and a volume correction unit 36. The pan calculation unit 35 pans the input monaural sound to the left and right. The volume correction unit 36 corrects the volume so that it becomes smaller as the pan approaches the center. The sound processing unit 15 synthesizes predetermined audio (dialogue and the like) with the corrected monaural sound. The sound output unit 19 outputs the synthesized sound.

The pan calculation unit 35 calculates the panning volume based on the following equations. Here, the pan ranges from −1.0 to 1.0, and the center of the pan is 0.0.
The left and right pan volumes can be calculated using the following equations (5) and (6).
Left pan volume = cos(π × (pan + 1) ÷ 4) ... (5)
Right pan volume = sin(π × (pan + 1) ÷ 4) ... (6)
Accordingly, as in the graph of FIG. 8(a), which shows the relationship between the left-right pan and the volume when ducking is not in operation, when the sound is panned left or right with the center direction of the pan taken as 0, the sound output unit 19 outputs the right volume shown in FIG. 8(a) from the right speaker and the left volume shown in FIG. 8(a) from the left speaker. That is, when ducking is not in operation, the volume values resulting from the panning calculation are applied as they are and the sound is output. When the pan is in the center direction, that is, 0, the left volume and the right volume are both 0.707.
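Equations (5) and (6) can be checked with a short Python sketch. The function name `pan_volumes` is hypothetical; the formulas themselves are taken directly from the text.

```python
import math


def pan_volumes(pan):
    """Return (left, right) volumes per equations (5) and (6).

    pan ranges from -1.0 (full left) through 0.0 (center) to 1.0 (full right).
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be in the range -1.0 to 1.0")
    angle = math.pi * (pan + 1.0) / 4.0
    return math.cos(angle), math.sin(angle)


left_c, right_c = pan_volumes(0.0)      # both approximately 0.707
left_l, right_l = pan_volumes(-1.0)     # full left: (1.0, 0.0)
```

Because cos² + sin² = 1, the sum of the squared left and right volumes is 1 for every pan value, which corresponds to the value 0.707 (about 1/√2) at the center.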

During the ducking operation, the volume correction unit 36 corrects the volume so that it becomes smaller as the pan approaches the center. The volume correction depends on the formula applied. For example, the volume correction unit 36 may reduce the volume by multiplying the absolute values of the right volume and the left volume of the left-right pan. Alternatively, for example, the volume correction unit 36 may reduce the volume by subtracting the smaller of the right volume and the left volume of the left-right pan from both the left and right volumes.

FIG. 8(b) and FIG. 8(c) are graphs showing the relationship between the left-right pan and the volume during the ducking operation. FIG. 8(b) shows an example of weak ducking, and FIG. 8(c) shows an example of strong ducking. With the strong ducking of FIG. 8(c), the volume when the pan is at or near 0 (the center) is smaller than with the weak ducking of FIG. 8(b). During the ducking operation of FIG. 8(b), the left volume and the right volume when the pan is in the center direction, that is, 0, are smaller than 0.707. During the ducking operation of FIG. 8(c), the left volume and the right volume when the pan is in the center direction are 0.
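The subtraction-based correction mentioned above can be sketched as follows. The `strength` parameter is a hypothetical addition, not something the specification defines: it scales how much of the smaller volume is subtracted, so that 1.0 gives zero volume at the center (as described for FIG. 8(c)) and a smaller value gives a center volume that is below 0.707 but not zero (as described for FIG. 8(b)). Whether the curves in the figures were produced this way is not stated.

```python
import math


def pan_volumes(pan):
    """Equations (5) and (6): uncorrected left and right volumes."""
    angle = math.pi * (pan + 1.0) / 4.0
    return math.cos(angle), math.sin(angle)


def duck_pan_volumes(pan, strength=1.0):
    """Subtract `strength` times the smaller volume from both volumes.

    `strength` (0.0 to 1.0) is a hypothetical control for weak/strong ducking.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in the range 0.0 to 1.0")
    left, right = pan_volumes(pan)
    cut = strength * min(left, right)
    return left - cut, right - cut


strong_center = duck_pan_volumes(0.0, strength=1.0)   # approximately (0, 0)
weak_center = duck_pan_volumes(0.0, strength=0.5)     # approximately (0.354, 0.354)
hard_left = duck_pan_volumes(-1.0, strength=1.0)      # unchanged: (1.0, 0.0)
```

With this rule a sound panned fully to one side is left untouched, and the reduction grows smoothly as the pan moves toward the center.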

When the pan is at the center, the sound is heard as a center sound. In the present embodiment, the volume is reduced or set to 0 when the pan is at or near the center. This allows BGM and sound effects to be reduced or set to 0 when the pan is at or near the center, hollowing out the human voice band. As a result, by synthesizing a human voice with the signal in which the human voice band has been hollowed out, a signal can be generated in which the dialogue is easy to hear and which is balanced with the surrounding sounds.

The premises under which sound processing starts are briefly described. When there is a waveform of sound material to be played during the game, the sound processing unit 15 determines how to play it (volume, pan, and so on), processes each waveform based on those parameters, and synthesizes the sound in a mixer. The synthesized sound is finally output from the speaker 30. For example, sound effects with a strong interactive element are processed with parameters such as volume and pan changed in real time before being fed to the mixer. Conversely, BGM and the like, whose waveforms have already been processed in pre-production, are fed to the mixer almost as they are or with only the volume changed.

Because the pan ducking process is performed at the waveform processing stage, its application is limited to sound effects and the like. The MS ducking process is performed on the multi-channel sound signal after the target sounds have been combined in the mixer.

[Sound processing method]
Next, an example of sound processing using the information processing apparatus 10 according to the present embodiment is described with reference to FIG. 9. When this process starts, the sound processing unit 15 determines whether the input sound is monaural (step S10). If it determines that the sound is not monaural, the sound processing unit 15 determines whether the sound is an MS ducking target (step S12). If the sound is determined to be an MS ducking target, the MS ducking processing unit 13 executes the MS ducking process (step S14), and the process proceeds to step S16. If it is determined in step S12 that the sound is not an MS ducking target, the process proceeds directly to step S16. The MS ducking process is described later with reference to the flowchart of FIG. 10.

If the sound processing unit 15 determines in step S10 that the sound is monaural, it determines whether the sound is a pan ducking target (step S18). If the sound is determined to be a pan ducking target, the pan ducking processing unit 14 executes the pan ducking process (step S20), and the process proceeds to step S16. If it is determined in step S18 that the sound is not a pan ducking target, the process proceeds directly to step S16. The pan ducking process is described later with reference to the flowchart of FIG. 11.

In step S16, the sound processing unit 15 performs mix processing that synthesizes the dialogue with the processed surrounding sounds. Next, the sound processing unit 15 outputs the synthesized sound (step S22), and this process ends.
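The branching of FIG. 9 (steps S10, S12, and S18) can be summarized in a small Python sketch. The function name and the string return values are hypothetical; how a sound is flagged as a ducking target is not specified in the text and is represented here by a boolean argument.

```python
def choose_ducking(num_channels, is_ducking_target):
    """Return which ducking process the FIG. 9 flow selects for one input sound."""
    if num_channels == 1:                                # S10: monaural sound?
        return "pan" if is_ducking_target else "none"    # S18 -> S20, or skip
    return "ms" if is_ducking_target else "none"         # S12 -> S14, or skip


# Whatever the result, the flow then continues to the mix processing (S16)
# and the output of the synthesized sound (S22).
mono_effect = choose_ducking(1, True)      # "pan"
stereo_bgm = choose_ducking(2, True)       # "ms"
untouched = choose_ducking(2, False)       # "none"
```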

In this way, the audio signal after the MS ducking process or the pan ducking process is synthesized, together with the dialogue, in the mix processing (bus routing).
(MS ducking)
The MS ducking process called in step S14 of FIG. 9 is described with reference to FIG. 10. When the process of FIG. 10 starts, the reception unit 11 receives the L signal and the R signal of the stereo sound (step S40). Next, the encoding processing unit 32 encodes the L and R signals to generate the M signal and the S signal (step S42). The encoding processing unit 32 generates the M signal by the following equation (1) and the S signal by equation (2).

M = L + R ... (1)
S = L − R ... (2)
Next, the filter processing unit 33 filters the M signal using the LPF 13a, which passes components of 300 Hz and below (step S44). The filter processing unit 33 also filters the M signal using the HPF 13b, which passes components of 3 kHz and above (step S44).

Next, the filter processing unit 33 combines the filtered signals to generate the M′ signal (step S46). The decoding processing unit 34 then decodes the M′ signal and the S signal by the following equations (3) and (4) to restore the L signal and the R signal (step S48), and this process ends.

L = (M′ + S) ÷ 2 ... (3)
R = (M′ − S) ÷ 2 ... (4)
(Pan ducking)
Next, the pan ducking process called in step S20 of FIG. 9 is described with reference to FIG. 11. When the process of FIG. 11 starts, the reception unit 11 receives the monaural sound signal (step S50). Next, the pan calculation unit 35 calculates the volume according to the pan (step S52).

The pan calculation unit 35 determines whether the pan has approached the center (step S54). If it determines that the pan has not approached the center, the pan calculation unit 35 leaves the calculated volume as it is (step S56), and the process proceeds to step S60. If it determines that the pan has approached the center, the pan calculation unit 35 makes the volume smaller than the calculated volume (step S58), and this process ends.

As described above, the sound processing according to the present embodiment allows MS ducking and pan ducking to be performed in parallel. For example, stereo sound and surround sound are processed by the MS ducking method and monaural sound is processed by the pan ducking method, so that the human voice band (the 300 Hz to 3 kHz band) is hollowed out.

In MS ducking, the MS ducking unit 13 removes the 300 Hz to 3 kHz band of the stereo sound and synthesizes audio such as dialogue with the resulting sound. In the synthesized stereo sound, the dialogue is therefore easy to hear and is balanced with the surrounding sounds. Consequently, in the information processing apparatus 10, which operates in response to interactive operations by the user, sound in which important audio such as dialogue is easy to hear and which is balanced with surrounding sounds such as BGM can be generated in real time.

In pan ducking, the pan ducking unit 14 corrects the volume so that BGM and sound effects become quieter when the pan is at or near the center, and audio such as dialogue is output from the center of the pan, so that sound in which the dialogue is easy to hear and which is balanced with the surrounding sounds can be generated in real time. The pan may be regarded as approaching the center when, for example, it enters the region within 1/3 of the center of the left-right pan, or when it enters the region within 1/4 of the center of the left-right pan.
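The threshold-based flow of FIG. 11 (steps S52 to S58) can be sketched as below, under two labeled assumptions. First, "within 1/3 of the center" is interpreted here as |pan| ≤ 1/3 on the −1 to 1 pan scale, which is one possible reading. Second, the amount of reduction is not given in the text, so a hypothetical fixed `reduction` factor is used.

```python
import math


def pan_volumes(pan):
    """Equations (5) and (6): volume according to the pan (step S52)."""
    angle = math.pi * (pan + 1.0) / 4.0
    return math.cos(angle), math.sin(angle)


def pan_ducked_volumes(pan, center_fraction=1 / 3, reduction=0.5):
    """Reduce the volume only when the pan is near the center.

    center_fraction and reduction are hypothetical parameters.
    """
    left, right = pan_volumes(pan)
    if abs(pan) > center_fraction:               # S54: not near the center
        return left, right                       # S56: keep the calculated volume
    return left * reduction, right * reduction   # S58: make the volume smaller


far = pan_ducked_volumes(1.0)     # same as pan_volumes(1.0)
near = pan_ducked_volumes(0.0)    # about half of 0.707 on each side
```

A fixed factor produces a step in volume at the threshold; a correction that varies continuously with the pan avoids that step.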

In particular, pan ducking tends not to alter the sound from the original. MS ducking is therefore preferably used for BGM that has no sound at the center, and pan ducking is preferably used for 3D sound (sound with a specified 3D coordinate position). Both process the sound into a signal from which the band of the speaker's voice has been removed, making it easier for the speaker's voice to come through. This makes it possible to provide sound that does not feel unnatural to the user when the content is provided. In addition, compared with volume ducking, the ducking process according to the present embodiment lets the center sound (that is, sound important to the progress of the game, such as dialogue) come through better without changing the overall sense of volume.
(Surround sound processing)
In surround processing, which processes surround sound, a front-rear volume ratio based on the front-rear pan is calculated along with the left-right pan, and the volume values of all surround channels are determined by multiplying them together. Interior processing is then applied to calculate the left and right volumes. Interior processing reduces the left-right and front-rear volume differences when the distance between the sound source position and the listener position falls below a predetermined interior radius.

The case of 5.1ch surround sound is briefly supplemented with reference to FIG. 12. When surround sound is processed, there are six signal channels: the front left speaker A, the front right speaker B, the front center speaker C, the LFE (Low Frequency Effect) speaker D, the surround left speaker E, and the surround right speaker F.

Because MS processing (Mid/Side processing) operates on a stereo waveform signal (that is, an L signal and an R signal), the sounds output from the front stereo pair and the surround stereo pair of speakers are each processed separately. The front center and the LFE have no left and right, but they can be regarded as carrying only the M signal of Mid/Side, so only the filter processing is applied to them. In summary, the MS ducking process for a 5.1ch surround signal is as follows.

Input the front left and front right signals → MS ducking process → output the signals to the front left and front right speakers A and B
Input the surround left and surround right signals → MS ducking process → output the signals to the surround left and surround right speakers E and F
Input the front center signal → filter processing only → output the signal to the front center speaker C
Input the LFE signal → filter processing only → output the signal to the LFE speaker D
Before describing surround sound, which is the mainstay of game sound, an example of stereo sound is given. For stereo sound, the destination speakers are the left and right speakers A and B, and the left-right direction is expressed by changing a value called pan (which represents the left-right ratio and ranges from −1 to 1). The center of the pan is 0.0.
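Returning to the 5.1ch routing listed above, the four paths can be sketched as follows. This is an illustration only: the channel keys (`FL`, `FR`, `FC`, `LFE`, `SL`, `SR`) and function names are hypothetical, and `band_filter` stands in for whatever filter removes the predetermined band (it is passed in as a function so the sketch stays independent of any particular filter design).

```python
def ms_duck_pair(left, right, band_filter):
    """MS ducking of one stereo pair: encode, filter M only, decode."""
    mid = [l + r for l, r in zip(left, right)]            # equation (1)
    side = [l - r for l, r in zip(left, right)]           # equation (2)
    mid_c = band_filter(mid)                              # M -> M'
    out_l = [(m + s) / 2 for m, s in zip(mid_c, side)]    # equation (3)
    out_r = [(m - s) / 2 for m, s in zip(mid_c, side)]    # equation (4)
    return out_l, out_r


def duck_5_1(frame, band_filter):
    """Apply the 5.1ch routing to a dict of per-channel sample lists."""
    out = {}
    out["FL"], out["FR"] = ms_duck_pair(frame["FL"], frame["FR"], band_filter)
    out["SL"], out["SR"] = ms_duck_pair(frame["SL"], frame["SR"], band_filter)
    out["FC"] = band_filter(frame["FC"])     # filter processing only
    out["LFE"] = band_filter(frame["LFE"])   # filter processing only
    return out


frame = {"FL": [0.5], "FR": [0.25], "FC": [1.0],
         "LFE": [0.5], "SL": [0.0], "SR": [-0.5]}

# With a pass-through filter (ducking not in operation) nothing changes.
unchanged = duck_5_1(frame, lambda x: list(x))

# With a filter that removes everything, only the side content survives.
side_only = duck_5_1(frame, lambda x: [0.0] * len(x))
```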

When a monaural sound is played with the formulas below (equations (5) and (6) given earlier) for a given pan, the perceived volume is the same whatever the value of the pan. This assumes a stereo listening environment.
Left speaker volume (left pan volume) = cos(π × (pan + 1) ÷ 4) ... (5)
Right speaker volume (right pan volume) = sin(π × (pan + 1) ÷ 4) ... (6)
When this is extended to the surround sound used in most games, the basic idea is the same as for the stereo sound above, and the concept of a front-rear pan is added to the left-right pan, as shown in FIG. 12. These two pans are determined by the direction of the sound source position (a direction within 360 degrees on the horizontal plane).

When the direction of the sound source position is between the front speakers, the left-right pan takes a value in the range −1 to 1 according to that proportion, and the front-rear pan is fixed at −1.
When the direction of the sound source position is between the side speakers, the left-right pan takes a value in the range −1 to 1 according to that proportion, and the front-rear pan is fixed at 1.
When the direction of the sound source position is between the front left and the side left, the left-right pan is fixed at −1, and the front-rear pan takes a value in the range −1 to 1 according to that proportion.
When the direction of the sound source position is between the front right and the side right, the left-right pan is fixed at 1, and the front-rear pan takes a value in the range −1 to 1 according to that proportion.

The left-right volume ratio and the front-rear volume ratio are given by the following equations, in the same way as for stereo sound.
Left pan volume = cos(π × (left-right pan + 1) ÷ 4)
Right pan volume = sin(π × (left-right pan + 1) ÷ 4)
Front pan volume = cos(π × (front-rear pan + 1) ÷ 4)
Rear pan volume = sin(π × (front-rear pan + 1) ÷ 4)
From these results, the volume ratio of each speaker is determined as follows.
Front left speaker volume = left pan volume × front pan volume
Front right speaker volume = right pan volume × front pan volume
Surround left speaker volume = left pan volume × rear pan volume
Surround right speaker volume = right pan volume × rear pan volume
The pan ducking according to the present embodiment applies a correction to the above formulas for the left pan volume and the right pan volume; apart from that, no volume correction is performed, and the sound is output from each speaker as surround sound.
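The four speaker volumes above can be computed with the following Python sketch (uncorrected, that is, without pan ducking). The function name and dictionary keys are hypothetical; the formulas are those given in the text.

```python
import math


def surround_volumes(lr_pan, fb_pan):
    """Return the four speaker volumes from the left-right and front-rear pans.

    Both pans range from -1 to 1; fb_pan = -1 is fully front, 1 is fully rear.
    """
    lr_angle = math.pi * (lr_pan + 1.0) / 4.0
    fb_angle = math.pi * (fb_pan + 1.0) / 4.0
    left, right = math.cos(lr_angle), math.sin(lr_angle)
    front, rear = math.cos(fb_angle), math.sin(fb_angle)
    return {
        "front_left": left * front,
        "front_right": right * front,
        "surround_left": left * rear,
        "surround_right": right * rear,
    }


front_left_only = surround_volumes(-1.0, -1.0)   # 1.0 on front left, 0.0 elsewhere
somewhere = surround_volumes(0.3, -0.4)
total_power = sum(v * v for v in somewhere.values())   # approximately 1
```

Since each pan contributes a cos/sin pair whose squares sum to 1, the squared volumes of the four speakers also sum to 1 for any pair of pan values, extending the constant-volume property of the stereo case.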

The information processing apparatus, sound processing method, and sound processing program have been described above by way of the embodiment, but the information processing apparatus, sound processing method, and sound processing program according to the present invention are not limited to the above embodiment, and various modifications and improvements are possible within the scope of the present invention. Where there are a plurality of the above embodiments and modifications, they can be combined to the extent that they do not conflict.

10: Information processing apparatus
11: Reception unit
12: Game execution unit
12a: MS encoder
12b: MS decoder
13a: LPF
13b: HPF
13: MS ducking unit
14: Pan ducking unit
15: Sound processing unit
16: Graphic processing unit
17: Communication unit
18: Display unit
19: Sound output unit
20: Storage unit
21: CPU
22: ROM
23: RAM
24: HDD
25: Graphic card
26: External I/F
26a: Storage medium
27: Communication I/F
28: Input I/F
28a: Memory card
29: Display
30: Speaker
32: Encoding processing unit
33: Filter processing unit
34: Decoding processing unit
35: Pan calculation unit
36: Volume correction unit
121: Game processing program
122: Sound processing program

Claims

An information processing apparatus for processing sound output in predetermined content, comprising:
an encoding processing unit that generates a central signal from the sum of a first signal and a second signal of stereo sound included in the sound to be output, and generates a peripheral signal from the difference between the first signal and the second signal;
a filter processing unit that generates a corrected central signal by removing sound in a predetermined band from the generated central signal; and
a decoding processing unit that restores the first signal and the second signal based on the generated corrected central signal and the peripheral signal.

The information processing apparatus according to claim 1, wherein the predetermined band is a band including the band of audio important to the content.

The information processing apparatus according to claim 1 or 2, further comprising:
a sound processing unit that synthesizes audio important to the content with the restored first signal and second signal; and
a sound output unit that outputs the synthesized sound.

The information processing apparatus according to any one of claims 1 to 3, further comprising:
a pan calculation unit that calculates a volume according to a left-right pan indicating the left-right ratio of a monaural sound included in the sound to be output; and
a volume correction unit that corrects the volume so that it becomes smaller than the calculated volume as the left-right pan approaches the center.

The information processing apparatus according to claim 4, further comprising:
a sound processing unit that synthesizes audio important to the content with the corrected monaural sound; and
a sound output unit that outputs the synthesized sound.

A sound processing method in which a computer executes processing of sound output in predetermined content, the method comprising:
generating a central signal from the sum of a first signal and a second signal of stereo sound included in the sound to be output, and generating a peripheral signal from the difference between the first signal and the second signal;
generating a corrected central signal by removing sound in a predetermined band from the generated central signal; and
restoring the first signal and the second signal based on the generated corrected central signal and the peripheral signal.

A sound processing program that causes a computer to execute processing of sound output in predetermined content, the processing comprising:
generating a central signal from the sum of a first signal and a second signal of stereo sound included in the sound to be output, and generating a peripheral signal from the difference between the first signal and the second signal;
generating a corrected central signal by removing sound in a predetermined band from the generated central signal; and
restoring the first signal and the second signal based on the generated corrected central signal and the peripheral signal.