JP2013142795A

JP2013142795A - Conversation protection system and conversation protection method

Info

Publication number: JP2013142795A
Application number: JP2012003244A
Authority: JP
Inventors: Atsuhisa Sugawara; 敦寿菅原; Yoshihiro Irie; 佳洋入江; Yojiro Kamise; 陽二郎神瀬
Original assignee: Glory Ltd
Current assignee: Glory Ltd
Priority date: 2012-01-11
Filing date: 2012-01-11
Publication date: 2013-07-22
Anticipated expiration: 2032-01-11
Also published as: JP5925493B2

Abstract

PROBLEM TO BE SOLVED: To prevent a third party from overhearing the content of a conversation by reproducing sound, while reducing the third party's discomfort with the reproduced sound. SOLUTION: A conversation protection system comprises: a microphone that collects conversational voice; a storage unit that stores attention sound effect data, whose time-axis waveform shows the sound pressure level rising and then decaying within a few seconds, and base sound effect data, whose time-axis waveform shows a sound pressure level that changes more gradually than that of the attention sound effect; a speaker that reproduces the attention sound effect and the base sound effect toward third parties other than the parties to the conversation; and a control unit that reproduces the base sound effect at least while the sound pressure level of the conversational voice collected by the microphone exceeds a first threshold, and reproduces the attention sound effect each time the sound pressure level of the conversational voice exceeds a second threshold.

Description

The present invention relates to a conversation protection system and a conversation protection method for preventing the content of a conversation from being overheard by third parties other than the parties to the conversation.

Conventionally, banks, hospitals, and similar facilities have used various systems to prevent conversational voice from leaking and being overheard by third parties. For example, a conversation between a bank employee and a customer, or a conversation between a patient and a receptionist, doctor, or pharmacist in a hospital, may contain personal information that the speakers do not want third parties to hear. For this reason, another sound is reproduced toward the third parties to make the conversational voice difficult for them to hear.

For example, Patent Document 1 discloses a technique for protecting the content of conversations by installing a sound-absorbing screen as a partition between groups conversing in a public place such as a station or an airport, and then playing BGM (background music). Patent Document 2 discloses a technique, relating to BGM reproduction, that collects the conversational voice in an adjacent room and adjusts the BGM volume according to the volume of that voice.

However, owing to the so-called cocktail party effect, the human ear is capable of selective listening, that is, of picking out a specific sound from among others. For this reason, to make the conversational voice harder to hear, a masking sound generated from the conversational voice itself is sometimes used instead of BGM unrelated to the conversation.

For example, Patent Document 3 discloses a technique that generates a sound in antiphase with the frequency spectrum of the conversational voice and reproduces it as a masking sound only while the conversation is in progress. Patent Document 4 discloses a technique that uses, as the masking sound, a hearing-proof sound generated from the envelope and fine structure extracted from the frequency spectrum of the conversational voice. A hearing-proof sound is a sound for preventing the content of a conversation from being overheard; it is a type of masking sound that, when reproduced over the conversational voice, destroys the phonological quality of that voice.

Patent Document 1: Japanese Translation of PCT Application Publication No. 2011-528445
Patent Document 2: JP 2007-256606 A
Patent Document 3: JP 2010-19935 A
Patent Document 4: Japanese Patent No. 4761506

However, with the above prior art, third parties may find the BGM or masking sound used to protect the conversation unpleasant or unnatural. For example, when music is reproduced as BGM and its volume is raised, in view of the cocktail party effect, so that the conversational voice cannot be heard, the loud sound may be unpleasant.

When a masking sound is reproduced, listeners may find a sound with artificially generated frequency characteristics unnatural. Because a masking sound is generated to match the characteristics of the conversational voice, it can make that voice hard to hear at a lower volume than a sound unrelated to the conversation, such as BGM. However, a third party who hears an unfamiliar masking sound that is never encountered in daily life may find it unnatural even when the volume is not high. Moreover, while feeling this unease, the third party may concentrate on listening to the unfamiliar masking sound, which increases the unease further.

The present invention has been made to solve the above problems of the prior art. Its object is to provide a conversation protection system and a conversation protection method that reproduce a sound to protect the content of a conversation from being overheard by third parties, without causing the third parties to find that sound unnatural or unpleasant.

To solve the above problems and achieve the object, the present invention is a conversation protection system comprising: a microphone that collects conversational voice; a storage unit that stores an attention sound effect, whose time-axis waveform shows the sound pressure level rising and then decaying within a few seconds, and a base sound effect, whose time-axis waveform shows a sound pressure level that changes more gradually than that of the attention sound effect; a speaker that reproduces one or both of the attention sound effect and the base sound effect toward third parties other than the parties to the conversation; and a control unit that performs one or both of the following controls: control that causes the speaker to reproduce the base sound effect at least while the sound pressure level of the conversational voice collected by the microphone exceeds a first threshold, and control that reproduces the attention sound effect each time the sound pressure level of the conversational voice exceeds a second threshold.
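The two-threshold control described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `EffectController`, the action strings, and the threshold values in the usage note are all hypothetical, and stopping the base sound effect as soon as the level drops is a simplification (a practical system would presumably add a hold time so that short pauses in speech do not interrupt playback).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EffectController:
    """Hypothetical controller: decides which sound effects to play."""
    base_threshold_db: float        # first threshold (assumed value set by caller)
    attention_threshold_db: float   # second threshold (assumed value set by caller)
    base_playing: bool = False
    above_attention: bool = False   # True while the level is above the second threshold

    def update(self, level_db: float) -> List[str]:
        """Return playback actions for one measured sound pressure level."""
        actions = []
        # Base sound effect: keep playing while the level exceeds the first threshold.
        if level_db > self.base_threshold_db:
            if not self.base_playing:
                self.base_playing = True
                actions.append("start_base")
        elif self.base_playing:
            # Simplification: stop immediately once the level is no longer above it.
            self.base_playing = False
            actions.append("stop_base")
        # Attention sound effect: trigger once per upward crossing of the second threshold.
        above = level_db > self.attention_threshold_db
        if above and not self.above_attention:
            actions.append("play_attention")
        self.above_attention = above
        return actions
```

For instance, with assumed thresholds of 50 dB and 65 dB, feeding the levels 55, 70, 72, 60, 66 yields one `start_base` followed by two separate `play_attention` actions, because the level crosses the second threshold twice.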

Further, in the above invention, the control unit stops the reproduction by the speaker of one or both of the base sound effect and the attention sound effect when the sound pressure level of the conversational voice falls below a predetermined threshold.

Further, in the above invention, the control unit controls the reproduction of the attention sound effect so that, each time the attention sound effect is reproduced, at least one of the following changes: the number of sounds heard within a predetermined time, the timing at which the sounds are heard, the timbre of the sounds, and the pitch of the sounds.

Further, in the above invention, the control unit generates the attention sound effect using a sine wave.
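As a rough illustration of generating an attention sound effect from a sine wave, the sketch below shapes a single sine tone with an envelope that rises quickly and then decays within a few seconds. The function name `attention_tone`, the linear attack, the exponential decay, and all default parameter values are assumptions made for this example; the source does not specify the envelope shape.

```python
import math


def attention_tone(freq_hz=880.0, duration_s=2.0, attack_s=0.01,
                   decay_rate=3.0, sample_rate=8000):
    """Return samples of a sine tone whose level rises and then decays.

    All parameters are illustrative assumptions, not values from the source.
    """
    samples = []
    n = int(duration_s * sample_rate)
    for i in range(n):
        t = i / sample_rate
        if t < attack_s:
            env = t / attack_s                            # linear rise to full level
        else:
            env = math.exp(-decay_rate * (t - attack_s))  # exponential decay
        samples.append(env * math.sin(2.0 * math.pi * freq_hz * t))
    return samples
```

Varying `freq_hz` between calls changes the pitch, and summing several such tones with different start offsets changes the number and timing of the sounds heard.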

Further, in the above invention, the system further comprises a masking sound generation unit that generates, based on the frequency characteristics of the voice collected by the microphone, a masking sound that masks the voice and makes it difficult to hear, and the control unit reproduces the masking sound generated by the masking sound generation unit.

Further, in the above invention, the masking sound generation unit extracts a spectral envelope and a spectral fine structure from the voice collected by the microphone; sets an inversion axis, extending in the frequency direction, about which the spectral envelope is to be flipped vertically; inverts the spectral envelope about the inversion axis to generate a deformed spectral envelope; and generates, as the masking sound, a hearing-proof sound that combines the deformed spectral envelope with the spectral fine structure.

Further, in the above invention, a plurality of the attention sound effects are stored in the storage unit, and the control unit reproduces an attention sound effect selected at random from the storage unit.

Further, in the above invention, combinations of a plurality of attention sound effects, set on the basis of the timbre and pitch of each attention sound effect, are stored in the storage unit as a setting table, and the control unit reproduces attention sound effects based on a combination selected at random from the setting table.

Further, in the above invention, the control unit randomly changes the volume at which each attention sound effect is reproduced.
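The random selection from a setting table and the random volume change described above might look like the following sketch. The sound identifiers, instruments, pitches, table contents, and the volume range are all invented for illustration; the source's actual tables are shown in its figures and are not reproduced here.

```python
import random

# Hypothetical attention sound effects: id -> (timbre, pitch).
ATTENTION_SOUNDS = {
    "A1": ("piano", "C5"),
    "A2": ("piano", "E5"),
    "A3": ("harp", "G5"),
    "A4": ("bell", "C6"),
}

# Hypothetical setting table: combinations assumed to sound natural in sequence.
SETTING_TABLE = [
    ("A1", "A2"),
    ("A1", "A3"),
    ("A2", "A3", "A4"),
]


def pick_combination(rng, min_gain=0.6, max_gain=1.0):
    """Pick one combination at random and assign each sound a random gain."""
    combo = rng.choice(SETTING_TABLE)
    return [(sound_id, rng.uniform(min_gain, max_gain)) for sound_id in combo]
```

Passing a seeded `random.Random` instance keeps the example reproducible in tests; a deployed system would more likely use an unseeded generator so that listeners cannot learn the pattern.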

Further, in the above invention, the attention sound effect is the sound of a musical instrument.

The present invention is also a conversation protection method comprising: a sound collecting step of collecting conversational voice; and a sound effect reproducing step that includes one or both of a base sound effect reproducing step and an attention sound effect reproducing step. The base sound effect reproducing step reproduces a base sound effect, whose time-axis waveform shows a gradually changing sound pressure level, at least while the sound pressure level of the conversational voice collected in the sound collecting step exceeds a first threshold. The attention sound effect reproducing step reproduces an attention sound effect, whose time-axis waveform shows the sound pressure level rising and then decaying within a few seconds, when the sound pressure level of the conversational voice collected in the sound collecting step exceeds a second threshold.

Further, in the above invention, the method further comprises a sound effect stopping step of stopping the reproduction of one or both of the base sound effect and the attention sound effect when the sound pressure level of the conversational voice collected in the sound collecting step falls below a predetermined threshold.

According to the present invention, at least the base sound effect is reproduced while the sound pressure of the conversational voice to be protected exceeds a predetermined threshold, and the attention sound effect is additionally reproduced at timings that vary with the volume of the conversational voice, which gives the listener an impression of randomness. Furthermore, because the attention sound effect is a short sound that decays within a few seconds, it readily draws the attention of third parties. This prevents third parties who have become accustomed to the sound from picking out the conversational voice by selective listening through the cocktail party effect, and so enhances the conversation protection effect.

Further, according to the present invention, the base sound effect and the attention sound effect can be stopped when no conversation is taking place, so the system is also suitable for use in quiet environments.

Further, according to the present invention, at least one of the number of sounds heard within a predetermined time in a single reproduction of the attention sound effect, the timing at which the sounds are heard, the timbre of the sounds, and the pitch of the sounds is changed. This gives third parties an impression of randomness and prevents them from becoming accustomed to the sound.

Further, according to the present invention, the attention sound effect can be generated using a sine wave, so attention sound effects that give the listener an impression of randomness can be freely generated and used.

Further, according to the present invention, the conversation can be protected by reproducing a masking sound generated from the frequency characteristics of the conversational voice. Even when the masking sound is an unnatural sound never heard in daily life, the third party's attention can be directed to the base sound effect and the attention sound effect, which reduces the unnatural impression of the masking sound.

Further, according to the present invention, the conversation can be protected more effectively by reproducing a hearing-proof sound that destroys the phonological quality of the conversational voice. In addition, the base sound effect and the attention sound effect reduce the unnatural impression of the hearing-proof sound.

Further, according to the present invention, various attention sound effect data differing in timbre, pitch, and so on are stored in the storage unit, and a sound selected at random from among them is reproduced. This gives the listener an impression of randomness and sustains both the conversation protection effect and the effect of reducing the unnatural impression of masking sounds such as the hearing-proof sound.

Further, according to the present invention, combinations of attention sound effects that do not sound unnatural even when reproduced in succession are set in a setting table on the basis of timbre and similar properties, and the attention sound effects are reproduced according to this setting. In addition to protecting the conversation and reducing the unnatural impression of the masking sound, this also reduces any unnatural impression of the attention sound effects themselves and gives a pleasant impression.

Further, according to the present invention, in addition to changing the timbre and similar properties of the attention sound effect, the volume at which the attention sound effect is reproduced can also be changed. This gives a stronger impression of randomness and further prevents third parties from becoming accustomed to the sound and exercising the cocktail party effect.

FIG. 1 is a diagram illustrating an example of use of the conversation protection system according to the present invention.
FIG. 2 is a diagram illustrating an installation example of the conversation protection system according to the present embodiment.
FIG. 3 is a block diagram schematically showing the functional configuration of the conversation protection system according to the present embodiment.
FIG. 4 is an explanatory diagram showing an example of analysis of conversational voice according to the present embodiment.
FIG. 5 is a diagram illustrating the reproduction timing of the base sound effect and the attention sound effect according to the present embodiment.
FIG. 6 is a diagram showing an example of the amplitude waveform of the attention sound effect according to the present embodiment.
FIG. 7 is a diagram showing an example of the attention sound effects stored in the storage unit according to the present embodiment.
FIG. 8 is a diagram showing an example of a setting table, stored in the storage unit according to the present embodiment, in which combinations of attention sound effects are set.
FIG. 9 is a diagram illustrating a method of reproducing a plurality of attention sound effects according to the present embodiment.
FIG. 10 is a flowchart illustrating a method of reproducing the base sound effect according to the present embodiment.
FIG. 11 is a flowchart illustrating a method of reproducing the attention sound effect according to the present embodiment.
FIG. 12 is a diagram illustrating a method of generating the attention sound effect according to the present embodiment.

Preferred embodiments of the conversation protection system and the conversation protection method according to the present invention are described in detail below with reference to the accompanying drawings. The conversation protection system is used to protect the personal information and privacy contained in conversations by preventing third parties from overhearing conversations held at, for example, financial institutions such as banks and medical institutions such as hospitals and pharmacies.

FIG. 1 is a diagram illustrating an example of use of the conversation protection system. In this example, the system prevents the voices of parties 1 and 2, who are conversing in a booth, from being overheard by a third party 3. The third party 3 is, for example, a person sitting in a waiting seat 52 outside a partition 51 that separates the booth, or a person in an adjacent booth.

The conversation protection system includes an audio processing device 10, a microphone 20 (hereinafter "mic 20"), and a speaker 30. The mic 20 is installed, for example, on a table 50 at which the conversation parties 1 and 2 sit in the booth, and is used to collect the conversational voice to be protected. The audio processing device 10 generates a masking sound based on the voice collected by the mic 20 and reproduces this masking sound, together with the sound effects described later, through the speaker 30. The speaker 30 is installed so as to reproduce sound toward the third party 3 outside the booth.

FIG. 2 is a diagram illustrating an installation example of the conversation protection system as viewed from above. As shown, in addition to the audio processing device 10, the mic 20, and the speaker 30, the conversation protection system may include an output sound operation unit 40 for controlling the sound output from the speaker 30. The output sound operation unit 40 can be used, for example, to start and stop reproduction of the sound output from the speaker 30 and to control the reproduction volume. For simplicity, FIG. 2 shows only the system protecting the conversation of one booth; when there are a plurality of booths, an audio processing device 10, a mic 20, a speaker 30, and an output sound operation unit 40 are installed for each booth to protect its conversational voice.

FIG. 3 is a block diagram schematically showing the functional configuration of the conversation protection system. The audio processing device 10 is described in detail with reference to FIG. 3. The audio processing device 10 includes: an input sound analysis unit 11 that receives the conversational voice collected by the mic 20; a hearing-proof sound generation unit (masking sound generation unit) 12 that generates a hearing-proof sound based on the conversational voice; a sound effect control unit 13 that controls the sound effects reproduced separately from the hearing-proof sound; a storage unit 14 that stores the sound data used as sound effects and the settings for controlling their reproduction; and an output sound control unit 15 that controls reproduction of the hearing-proof sound and the sound effects through the speaker 30 according to the conversational voice. The input sound analysis unit 11, the hearing-proof sound generation unit 12, the sound effect control unit 13, and the output sound control unit 15 are implemented by a DSP (digital signal processor). The audio processing device 10 may be built from dedicated hardware including the DSP, or may be built using a computer.

FIG. 3 shows only the components needed to explain the conversation protection system. The audio processing device 10 also has, for example, A/D and D/A converters and amplifiers for processing the input signal from the mic 20 and the output signal to the speaker 30. It may also have a communication interface for wired or wireless communication with external devices.

The input sound analysis unit 11 analyzes the frequency characteristics and volume of the conversational voice to be protected, which is input from the mic 20 and digitized by the A/D converter. For example, when the conversational voice has the amplitude waveform shown in the upper part of FIG. 4, the unit analyzes this waveform in real time and generates the sound pressure waveform shown in the lower part of the figure. The sound pressure levels that make up this waveform are used by the hearing-proof sound generation unit 12 in generating the hearing-proof sound and by the sound effect control unit 13 in controlling the reproduction timing of the sound effects.
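One common way to derive a sound pressure waveform from an amplitude waveform is to compute the RMS level of successive short frames in decibels. The sketch below assumes that approach; the source does not state how the input sound analysis unit 11 computes the level, and the function name `frame_levels_db`, the frame length, and the reference value are hypothetical. The result is a level relative to `ref`, not a calibrated sound pressure level.

```python
import math


def frame_levels_db(samples, frame_len=256, ref=1.0, floor_db=-120.0):
    """Return one RMS level in dB (relative to ref) per non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Guard against log10(0) for completely silent frames.
        levels.append(20.0 * math.log10(rms / ref) if rms > 0 else floor_db)
    return levels
```

Each returned value can then be compared against the thresholds that decide when the sound effects are reproduced.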

The hearing-proof sound generation unit 12 generates a hearing-proof sound that, when reproduced from the speaker 30 along with the conversational voice, destroys the phonological quality of that voice. The hearing-proof sound is a type of masking sound generated from features of the envelope and fine structure extracted from the spectrum representing the frequency characteristics of the conversational voice. Specifically, the unit extracts the spectral envelope and the spectral fine structure from the speech spectrum obtained from the conversational voice, sets an inversion axis extending in the frequency direction about which the spectral envelope is to be flipped vertically, and inverts the spectral envelope about that axis to generate a deformed spectral envelope. It then combines the deformed spectral envelope with the spectral fine structure to generate a deformed spectrum, which is used as the hearing-proof sound. Because the hearing-proof sound can be generated by the prior art disclosed in Japanese Patent No. 4761506, a detailed description is omitted here.
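The envelope inversion can be pictured with the toy sketch below, which works on a log-magnitude spectrum given as a list of dB values. It is not the method of Japanese Patent No. 4761506, whose details the source does not give. The moving-average envelope estimate, the default choice of the envelope mean as the inversion axis, and the names `moving_average` and `invert_envelope` are all assumptions made for illustration.

```python
def moving_average(values, half_width):
    """Smooth a sequence with a centered moving average (edges use fewer points)."""
    out = []
    n = len(values)
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def invert_envelope(log_spectrum_db, half_width=4, axis_db=None):
    """Flip the spectral envelope about a horizontal axis, keeping fine structure."""
    # Assumed envelope estimate: smoothed log spectrum.
    envelope = moving_average(log_spectrum_db, half_width)
    # Fine structure: what remains after removing the envelope.
    fine = [s - e for s, e in zip(log_spectrum_db, envelope)]
    if axis_db is None:
        # Assumed inversion axis: the mean level of the envelope.
        axis_db = sum(envelope) / len(envelope)
    # Reflect each envelope value about the axis: peaks become valleys.
    deformed_envelope = [2.0 * axis_db - e for e in envelope]
    # Recombine the deformed envelope with the unchanged fine structure.
    return [d + f for d, f in zip(deformed_envelope, fine)]
```

In this toy form, a spectrum that is strong at low frequencies and weak at high frequencies comes out weak at low frequencies and strong at high frequencies, while the local ripple is preserved.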

The sound effect control unit 13 controls the combination of sound effect data and the reproduction volume to generate the sound effects reproduced from the speaker 30 along with the conversational voice. In this embodiment, a sound effect is a sound reproduced both to make the conversational voice difficult for the third party 3 to hear and to reduce the unnatural impression of the hearing-proof sound. Whereas the hearing-proof sound has frequency characteristics based on the features of the conversational voice, a sound effect has frequency characteristics unrelated to that voice. And whereas the hearing-proof sound is generated from the conversational voice, a sound effect can use sounds prepared in advance, such as tunes or musical instrument sounds. Because the hearing-proof sound has artificially manipulated frequency characteristics, it may sound unnatural to a listener; the sound effects do not give this impression and, depending on the type of sound, may instead give a pleasant one.

Two types of sound effect are used: the base sound effect and the attention sound effect. The base sound effect is reproduced continuously, without interruption, for as long as the conversational voice continues, and the attention sound effect is reproduced each time the sound pressure of the conversational voice exceeds a predetermined threshold. Base sound effect data 14b, used as the base sound effect, and attention sound effect data 14a, used as the attention sound effect, are stored in the storage unit 14, which consists of ordinary memory such as flash memory. The sound effect data use, for example, a PCM (pulse code modulation) sound source as their data format; a compressed format such as MP3 may also be used. The storage unit 14 may use another storage device, such as a hard disk, as long as it can store a plurality of sound effect data.

A tune that gives a quiet impression is preferable as the base sound effect. Specifically, a sound source whose sound pressure level changes gradually along the time axis is used. For example, a plurality of music box tunes are stored in advance in the storage unit 14 as the base sound effect data 14b. The base sound effect corresponds to the BGM used in conventional devices.

The content of the base sound effect is not particularly limited as long as it is several seconds to several tens of seconds long, contains no sounds that would startle or annoy the third party 3 when played, joins smoothly from the end of one tune to the start of the next when the same tune or different tunes are played back to back, and does not become unpleasant when its volume is changed. For example, it may be a tune played on percussion or string instruments, or a sound based on a babbling brook, waves, or birdsong.

The user can select the sound to play from among the multiple sets of base sound effect data 14b stored in the storage unit 14. When the base sound effect data 14b to be played has been selected in advance, the selection is stored in the storage unit 14 as part of the sound effect output condition 14c. The sound effect control unit 13 selects the base sound effect data 14b by referring to the settings in the sound effect output condition 14c, and the output sound control unit 15 plays the selected base sound effect from the speaker 30. The output sound operation unit 40 may have a control for selecting the base sound effect data 14b, which the user operates to make the selection. Alternatively, the base sound effect data 14b may be selected based on a signal received from an external device, such as a remote control, through the communication function of the audio processing device 10.

The attention sound effect is preferably a sound that attracts attention easily without startling the listener. For example, a short sound whose sound pressure level rises and then decays within a few seconds is used. Specifically, the sound of striking a percussion instrument such as a glockenspiel or xylophone, plucking a string instrument such as a guitar or harp, or ringing a bell can be used. Sounds of various timbres and pitches are stored in advance in the storage unit 14 as attention sound effect data 14a.

The timbre, pitch, and other properties of the attention sound effect are not particularly limited as long as it is a short sound that does not startle or annoy a third party when played, does not feel out of place when superimposed on the base sound effect, and can be heard easily without being buried in the base sound effect. For example, it may be the sound of a folk instrument such as a kalimba, a sound generated from sine waves, birdsong, or the chirping of insects. Data stored in advance in the storage unit 14 may be used, or the sound may be generated by combining multiple sine waves to which a window function has been applied.
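
The last option, combining windowed sine waves, can be sketched as follows. This is only an illustration under stated assumptions: the source does not name a window function, so a Hann window is assumed here, and the function names, the 8 kHz sample rate, and the example frequencies are hypothetical.

```python
import math

def windowed_sine(freq_hz, duration_s, sample_rate=8000):
    """Return one sine tone shaped by a Hann window (assumed window type)."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        # Hann window: rises from 0, peaks in the middle, decays back to 0.
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        samples.append(window * math.sin(2 * math.pi * freq_hz * i / sample_rate))
    return samples

def synthesize_attention_tone(freqs_hz, duration_s=0.5, sample_rate=8000):
    """Mix several windowed sines and scale so the result stays within [-1, 1]."""
    parts = [windowed_sine(f, duration_s, sample_rate) for f in freqs_hz]
    scale = 1.0 / len(parts)
    return [scale * sum(column) for column in zip(*parts)]

# Hypothetical example: two partials a fifth apart.
tone = synthesize_attention_tone([880.0, 1320.0])
```

The window gives each tone the rise-then-decay envelope that the description asks of an attention sound effect, without any stored sample data.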

The sound effect control unit 13 randomly selects the attention sound effect data 14a from the data stored in the storage unit 14. The attention sound effect data 14a is played over the base sound effect data 14b at the moment the volume of the conversation voice exceeds a predetermined threshold.

In addition to the base sound effect data 14b and the attention sound effect data 14a, the storage unit 14 stores the sound effect output condition 14c. The sound effect output condition 14c includes setting information on the base sound effect data 14b selected by the user of the conversation protection system, playback conditions for the base sound effect data 14b, and selection and playback conditions for the attention sound effect data 14a.

The output sound control unit 15 has a D/A converter and an amplifier and plays the anti-eavesdropping sound generated by the anti-eavesdropping sound generation unit 12 from the speaker 30. The anti-eavesdropping sound is played over the conversation voice so as to break down its phonemic character, so that the content of the conversation cannot be made out. The output sound control unit 15 also plays the sound effect selected by the sound effect control unit 13 from the speaker 30 at the volume set by the sound effect control unit 13.

The playback timing of the sound effects is controlled based on the sound pressure level of the conversation voice, which the input sound analysis unit 11 generates as shown in FIG. 4. The threshold B for the base sound effect and the threshold A for the attention sound effect, which are used to control the playback timing, are included in the sound effect output condition 14c stored in the storage unit 14.

FIG. 5 illustrates the playback timing of the sound effects. As shown below the sound pressure waveform in FIG. 5, playback of the base sound effect starts when the sound pressure level of the conversation voice exceeds the threshold B and stops when the level falls below the threshold B. The base sound effect is selected based on the settings in the sound effect output condition 14c and is played from the speaker 30 by the output sound control unit 15 at a predetermined volume level.

The base sound effect is thus played only while the sound pressure level of the conversation voice exceeds the threshold B and is stopped when there is no conversation. Even when the conversation protection system is installed in a quiet environment, the base sound effect is played at the preset volume only when the conversation needs protecting, and playback stops when there is no conversation, so the environment stays quiet.

The way base sound effect playback is controlled is not limited to stopping playback every time the sound pressure level of the conversation voice falls below the threshold B. Playback may instead be stopped only when the level has stayed below the threshold for a preset period. That is, playback may continue when the conversation is interrupted only briefly. If the base sound effect is to be played at all times as BGM, playback may continue until a stop operation is performed on the output sound operation unit 40. Alternatively, rather than starting and stopping playback, the base sound effect may be played at all times and its volume controlled according to the sound pressure level of the conversation voice to achieve the playback pattern shown in FIG. 5. These playback control modes are selected by changing the settings in the sound effect output condition 14c.
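
The hold-time variant can be sketched as below. This is a minimal illustration rather than the patent's implementation: the function name, the idea of processing one sound pressure value per frame, and the `hold_frames` parameter are assumptions introduced here. A level equal to the threshold is treated as not exceeding it.

```python
def base_playing_states(levels, threshold_b, hold_frames=0):
    """Return, per frame, whether the base sound effect is playing.

    Playback starts when the level exceeds threshold_b and stops only after
    the level has stayed at or below threshold_b for more than hold_frames
    consecutive frames. hold_frames=0 stops on the first such frame.
    """
    playing = False
    frames_below = 0
    states = []
    for level in levels:
        if level > threshold_b:
            playing = True
            frames_below = 0
        elif playing:
            frames_below += 1
            if frames_below > hold_frames:
                playing = False
        states.append(playing)
    return states

# Hypothetical per-frame levels with a short pause and a long pause.
levels = [0, 5, 5, 0, 0, 5, 0, 0, 0, 0]
with_hold = base_playing_states(levels, threshold_b=3, hold_frames=2)
without_hold = base_playing_states(levels, threshold_b=3, hold_frames=0)
```

With a hold of two frames the short pause in the middle is bridged, while the longer pause at the end still stops playback.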

Furthermore, the threshold used when controlling playback so that the base sound effect becomes audible to the third party 3 and the threshold used when controlling playback so that it becomes inaudible may be set to different values. For example, the threshold for deciding to start playback of the base sound effect may differ from the threshold for deciding to stop it.
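
Separate start and stop thresholds amount to hysteresis, which can be sketched as follows. The source only says that the two values may differ; the assumption that the start threshold is the higher one, and all names and values below, are hypothetical.

```python
def hysteresis_states(levels, start_threshold, stop_threshold):
    """Return, per frame, whether the base sound effect is playing.

    Playback starts when the level rises above start_threshold and stops
    when it falls below stop_threshold (assumed lower than start_threshold).
    """
    playing = False
    states = []
    for level in levels:
        if not playing and level > start_threshold:
            playing = True
        elif playing and level < stop_threshold:
            playing = False
        states.append(playing)
    return states

states = hysteresis_states([1, 6, 4, 3, 1, 4, 6], start_threshold=5, stop_threshold=2)
```

Levels between the two thresholds leave the current state unchanged, which avoids rapid toggling when the conversation level hovers near a single threshold.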

As shown above the sound pressure waveform in FIG. 5, once the sound pressure level of the conversation voice has exceeded the threshold B and base sound effect playback has started, an attention sound effect is played when the level goes on to exceed the threshold A. The sound effect control unit 13 randomly selects the attention sound effect from the multiple sets of sound data stored in the storage unit 14 and also randomly sets the playback volume level. Only the range of the volume level is set in advance, and the level is chosen randomly within that range.

An attention sound effect is played each time the sound pressure of the conversation voice exceeds the threshold A. Because that sound pressure varies irregularly, the attention sound effects are played at random timings, as shown in FIG. 5. FIG. 5 shows the attention sound effects in two rows because, depending on when the sound pressure exceeds the threshold A, the next attention sound effect may start before the previous one has finished playing.

FIG. 5 shows the case where both the base sound effect and the attention sound effect are played, but this embodiment is not limited to that case. By setting the thresholds A and B appropriately, the system can also be controlled to play only the base sound effect or only the attention sound effect.

A short sound that readily attracts the listener's attention, such as a bell, is used as the attention sound effect. Playing it at the moment the volume of the conversation voice exceeds the threshold A draws the third party's attention to the attention sound effect and thereby makes the conversation voice harder to follow.

Although not shown in FIG. 5, the anti-eavesdropping sound generated by the anti-eavesdropping sound generation unit 12 is also played as a masking sound in step with the conversation voice. For example, like the base sound effect in FIG. 5, it is played while the level exceeds the threshold B. The anti-eavesdropping sound can feel unnatural. However, while it is playing, attention sound effects are played at the random timings at which the conversation voice exceeds the threshold A, so the third party's attention is drawn to them. As a result, the unnatural impression of the anti-eavesdropping sound is reduced.

In addition, because the attention sound effects are played with varying timbre, pitch, volume, and timing, they do not sound monotonous in the way a repeated similar sound would. Their random character keeps the third party from getting used to them. The system can therefore keep attracting the third party's attention and maintain both the effect of protecting the conversation voice and the effect of reducing the unnatural impression of the anti-eavesdropping sound.

Because the attention sound effect is a short sound played only when the sound pressure of the conversation voice exceeds the threshold A, there may be a gap between the end of one attention sound effect and the start of the next. If only attention sound effects were played, without the base sound effect, the anti-eavesdropping sound would stand out in these gaps and could feel unnatural. For this reason, the threshold B is set at a lower sound pressure level than the threshold A, the base sound effect is played while the level exceeds the threshold B, and playback is controlled so that the base sound effect is heard in the gap between two attention sound effects. This prevents the anti-eavesdropping sound from standing out in the gaps.

The base sound effect has little fluctuation in volume. If only the base sound effect were played, without attention sound effects, the listener would get used to it, the anti-eavesdropping sound would stand out more, and it could feel unnatural. For this reason, attention sound effects that attract the listener's attention are played in addition to the base sound effect, so that the sound effects as a whole give a random impression. This prevents the anti-eavesdropping sound from standing out as the listener gets used to the base sound effect.

Using the two types of sound effect together thus effectively reduces the unnatural impression of the anti-eavesdropping sound. Because the attention and base sound effects also act as masking sounds, the content of the conversation is harder to make out than when only the anti-eavesdropping sound is used.

Next, the attention sound effect data is described in detail. The attention sound effect masks the conversation voice and diverts the attention of the third party 3 from the conversation voice to the attention sound effect. If the third party 3 gets used to the sound, it attracts less attention, so playback of the attention sound effects is controlled to give a random impression and prevent habituation.

As shown in FIG. 6, attention sound effects with various time-axis waveforms are used. Each is a sound in a timbre such as bell, xylophone, or glockenspiel whose sound pressure level rises and then decays within, for example, 2 seconds. One sound, or two or more sounds, may be played within those 2 seconds.

Taking attention sound effects with a glockenspiel timbre as an example: depending on the attention sound effect selected, the glockenspiel may be heard only once in the 2 seconds, or it may be heard several times at the same pitch or at different pitches. Even when two notes are heard, the timing of each note may be the same or different depending on the attention sound effect selected. In other words, attention sound effects of the same 2-second length differ from one another in at least one of timbre, number of notes, pitch of each note, and timing of each note.

FIG. 7 shows an example of the attention sound effect data 14a stored in the storage unit 14. Multiple sounds of various timbres are stored as attention sound effect data 14a, and the sound effect control unit 13 randomly selects the data to play from among them.

Suppose, for example, that the sound effect control unit 13 randomly selects three sets of data from the attention sound effect data 14a with a playback time of 2 seconds. The three sets are then played over 6 seconds. As shown in FIG. 6, however, the sounds contained in each set of attention sound effect data 14a differ. Even with three 2-second sets selected, therefore, the number of notes heard in the 6 seconds is a random number of three or more. The timbre, the timing of each note, and the pitch of each note also differ depending on the data selected. In addition, each selected set is played at a randomly varied volume level.

Randomly selecting the sounds to play from multiple sets of attention sound effect data 14a containing a variety of sounds, and playing them at varying volume levels, thus produces sound that gives a random impression. As a result, the third party 3 does not get used to the attention sound effects, and they continue to attract the listener's attention.

The conversation protection system can give the listener not only a random impression but also a pleasant one. When attention sound effect data 14a is selected and played at random, some sequences of consecutively played data can feel unnatural: for example, when the pitch changes abruptly from one sound to the next, when the pitches form a dissonance, or when the timbres do not go well together. For this reason, the sound effect output condition 14c in the storage unit 14 holds a table that defines combinations of attention sound effect data 14a. The combinations in this setting table are chosen with regard to, for example, combinations of timbres, changes in pitch, and pitch relationships that form consonances.

FIG. 8 shows an example of the setting table for the attention sound effect data 14a. The table predefines combinations of timbres of attention sound effect data 14a that sound pleasant when played in succession.

When the setting table is used, the sound effect control unit 13 randomly selects one of the combinations defined in it and then, according to the selected combination, randomly selects attention sound effect data 14a of each timbre. The sound effect control unit 13 also sets the volume level at which each set of attention sound effect data 14a is played. The level is set within a predetermined range chosen so that the sound is not unpleasant. Like the setting table, the volume level setting conditions are stored in the storage unit 14 as part of the sound effect output condition 14c.

Suppose, for example, that the selected combination is No. 1 in the setting table of FIG. 8. According to the table, two sets of attention sound effect data 14a with a bell timbre and one set with a xylophone timbre are randomly selected from the attention sound effect data 14a shown in FIG. 7. If the randomly selected sets are Bell B, Bell A, and Xylophone B, they are played in that order, as shown in FIG. 9(a). Suppose also that the volume level is set to vary between -20% and +20% of a predetermined level and that the randomly chosen levels are 90%, 100%, and 110%. Then, as shown in FIG. 9(a), Bell B is played at 90% of the predetermined volume level, Bell A at 100%, and Xylophone B at 110%.
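
The two-stage random selection described above can be sketched as follows. The data here is hypothetical except where the text names it: row No. 1 (bell, bell, xylophone) and the clip names Bell A, Bell B, and Xylophone B come from the description, while the other rows, the other clip names, and the function name are assumptions for illustration only.

```python
import random

# Clips available per timbre (stand-in for the data of FIG. 7).
LIBRARY = {
    "bell": ["Bell A", "Bell B", "Bell C"],
    "xylophone": ["Xylophone A", "Xylophone B"],
    "glockenspiel": ["Glockenspiel A", "Glockenspiel B"],
}

# Timbre combinations that sound pleasant in succession (stand-in for FIG. 8).
SETTING_TABLE = [
    ("bell", "bell", "xylophone"),                   # No. 1, as in the description
    ("xylophone", "glockenspiel", "xylophone"),      # hypothetical row
]

def choose_sequence(rng, table, library, spread=0.2):
    """Pick a timbre combination, then a clip and a relative volume per timbre.

    Volumes are drawn uniformly from [1 - spread, 1 + spread], i.e. within
    +/-20% of the predetermined level when spread is 0.2.
    """
    timbres = rng.choice(table)
    return [
        (rng.choice(library[timbre]), rng.uniform(1.0 - spread, 1.0 + spread))
        for timbre in timbres
    ]

sequence = choose_sequence(random.Random(0), SETTING_TABLE[:1], LIBRARY)
```

Restricting the table to row No. 1, as in the last line, always yields two bell clips followed by one xylophone clip, each with its own randomly chosen volume.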

The method of playing multiple attention sound effects in succession is not limited to playing them one after another as in FIG. 9(a). They may also be played so that adjacent sounds partly overlap, as in FIG. 9(b). In that case, the amount of overlap, that is, the playback timing of each sound, is set randomly within a predetermined range.
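
One way to express the back-to-back and overlapping modes is to compute a start time for each clip. This is a sketch under assumptions: the function name, the `max_overlap_s` parameter, and the uniform distribution of the overlap are not taken from the source.

```python
import random

def schedule_start_times(durations_s, rng, max_overlap_s=0.0):
    """Return the start time of each clip in seconds.

    Each clip starts before the previous one ends by a random overlap drawn
    from [0, max_overlap_s]. With max_overlap_s=0 the clips play back to back.
    """
    starts = []
    current = 0.0
    for duration in durations_s:
        starts.append(current)
        overlap = rng.uniform(0.0, min(max_overlap_s, duration))
        current += duration - overlap
    return starts

back_to_back = schedule_start_times([2.0, 2.0, 2.0], random.Random(0))
overlapped = schedule_start_times([2.0, 2.0, 2.0], random.Random(0), max_overlap_s=0.5)
```

Three 2-second clips played back to back start at 0, 2, and 4 seconds. With a maximum overlap of 0.5 seconds, each gap between start times falls between 1.5 and 2 seconds.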

The selection of attention sound effects may also reflect the user's preferences. For example, the timbre combinations in the setting table of FIG. 8 may be set to suit the user, or the user may pick preferred attention sound effect data 14a from the data in FIG. 7 and random combinations of those may form the setting table. Alternatively, the setting table of FIG. 8 may be left as set without regard to the user's preferences, and the attention sound effect data 14a from which the sound effect control unit 13 can randomly select may be narrowed down in advance to suit the user.

Next, the playback processing for the base sound effect data 14b and the attention sound effect data 14a is described. FIG. 10 is a flowchart of the processing for playing the base sound effect data 14b, and FIG. 11 is a flowchart of the processing for playing the attention sound effect data 14a.

First, the processing for playing the base sound effect data 14b is described. The output sound control unit 15 monitors the sound pressure waveform that the input sound analysis unit 11 generates from the conversation voice picked up by the microphone 20 (step S1 and step S1; No).

When the sound pressure level obtained from the conversation voice exceeds the preset threshold B for the base sound effect (step S1; Yes), the base sound effect data 14b selected according to the settings in the sound effect output condition 14c is played (step S2). The output sound control unit 15 continues to monitor the sound pressure level (step S3 and step S3; No).

When the sound pressure level falls below the threshold B (step S3; Yes), playback of the base sound effect data 14b is stopped (step S4). The base sound effect is faded out as it stops so that the third party does not find the stop jarring. After stopping playback, the output sound control unit 15 continues to monitor the sound pressure level and to control the playback and stopping of the base sound effect data 14b. That is, as shown in FIG. 5, the base sound effect data 14b is played while the sound pressure level of the conversation voice exceeds the threshold B.

If attention sound effect data 14a is playing when playback of the base sound effect data 14b is stopped, it may be stopped in the same way. Also, instead of starting and stopping playback, the base sound effect data 14b may be played at all times with its volume controlled. Specifically, when the sound pressure level of the conversation voice exceeds the threshold B, the playback volume of the base sound effect data 14b is faded in and then held once it reaches a predetermined volume. When the level falls below the threshold B, the volume is faded out.
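
The fade-in and fade-out volume control can be sketched as a gain envelope that ramps toward 1 while the level exceeds the threshold B and toward 0 otherwise. The linear ramp, the per-frame `step`, and the function name are assumptions. The source does not specify the shape or speed of the fade.

```python
def fade_gains(levels, threshold_b, step=0.25):
    """Return the per-frame playback gain (0.0 to 1.0) of the base sound effect.

    The gain rises by `step` per frame while the level exceeds threshold_b,
    is held at 1.0 once reached, and falls by `step` per frame otherwise.
    """
    gain = 0.0
    gains = []
    for level in levels:
        if level > threshold_b:
            gain = min(1.0, gain + step)
        else:
            gain = max(0.0, gain - step)
        gains.append(gain)
    return gains

gains = fade_gains([5, 5, 5, 5, 5, 0, 0], threshold_b=3)
```

Multiplying the continuously playing base sound effect by this gain gives the same audible pattern as starting and stopping playback, but without abrupt transitions.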

Next, the processing for playing the attention sound effect data 14a is described. As with the base sound effect data 14b, the output sound control unit 15 monitors the sound pressure waveform obtained from the conversation voice (step S11 and step S11; No).

When the sound pressure level obtained from the conversation voice exceeds the preset threshold A for attention sound effects (step S11; Yes), the sound effect control unit 13 randomly selects attention sound effect data 14a (step S12). The sound effect control unit 13 also randomly sets, within a predetermined range, the volume level at which the attention sound effect data 14a is reproduced (step S13).

The output sound control unit 15 then reproduces the randomly selected attention sound effect data 14a from the speaker 30 at the randomly set volume (step S14). The output sound control unit 15 continues to monitor the sound pressure level while these processes are performed and, when the sound pressure level obtained from the conversation voice exceeds threshold A, reproduces the next attention sound effect data 14a. That is, as shown in FIG. 5, each time the sound pressure level of the conversation voice exceeds the predetermined threshold A, randomly selected attention sound effect data 14a is reproduced at a randomly set volume level.
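Steps S11 to S14 can be sketched in Python as follows. This is an illustrative reading, not the embodiment itself: "each time the level exceeds threshold A" is interpreted here as each upward crossing of the threshold, and the function name `attention_events`, the clip names, and the volume range are hypothetical.

```python
import random


def attention_events(levels_db, threshold_a, clips, volume_range=(0.5, 1.0), rng=None):
    """Return (frame, clip, volume) for each upward crossing of threshold A.

    clips        : identifiers of stored attention sound effect data (hypothetical)
    volume_range : assumed "predetermined range" for the random volume level
    rng          : optional random.Random instance for reproducible runs
    """
    rng = rng or random.Random()
    events = []
    was_above = False
    for frame, level in enumerate(levels_db):
        is_above = level > threshold_a
        if is_above and not was_above:
            clip = rng.choice(clips)              # step S12: random selection
            volume = rng.uniform(*volume_range)   # step S13: random volume
            events.append((frame, clip, volume))  # step S14: reproduce
        was_above = is_above
    return events


if __name__ == "__main__":
    levels = [50, 72, 75, 60, 71, 50, 80]
    for event in attention_events(levels, 70, ["clip_a", "clip_b", "clip_c"]):
        print(event)
```

Tracking the previous frame's state prevents one long loud passage from triggering a new effect on every frame; whether the device behaves this way is not stated in the text.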

Although this embodiment has described a mode that uses the setting table for the attention sound effect data 14a shown in FIG. 8, the number of setting tables is not limited to one, and a plurality of setting tables may be used.

For example, a plurality of setting tables, each containing many attention sound effects of a specific timbre, may be prepared, and a setting table may be selected according to the conversation voice. Specifically, according to the sound pressure level of the conversation voice, a table containing many attention sound effects that give a quiet impression, such as a xylophone, is used when the sound pressure level is low, and a table containing many attention sound effects with a timbre that gives a strong impression, such as a glockenspiel, is used when the sound pressure level is high.

The input sound analysis unit 11 may also determine voice quality and gender when analyzing the conversation voice, and a setting table may be selected based on the result of that determination. For example, setting tables are prepared in advance so that, according to voice quality and gender, attention sound effect data 14a is selected that is highly effective both at masking the conversation voice and at reducing the unnatural impression of the anti-eavesdropping sound. Specifically, according to the voice quality of the conversation voice, a setting table containing many attention sound effects such as a glockenspiel is used for a high female voice, and a setting table containing many attention sound effects such as a xylophone is used for a low male voice.

In this way, if setting tables are used so that the attention sound effect data 14a is still selected at random but its timbre and pitch are matched to characteristics of the conversation voice such as voice quality, the masking effect on the conversation voice and the effect of reducing the unnatural impression of the anti-eavesdropping sound can both be further enhanced.
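The combination of table selection and random selection can be sketched in Python as below. All specifics are assumptions for illustration: the table names, the 3:1 weights, the use of an estimated voice pitch as the voice-quality feature, and the 165 Hz split point do not come from the embodiment.

```python
import random

# Hypothetical setting tables: each maps a timbre to a relative weight.
SETTING_TABLES = {
    "soft": {"xylophone": 3, "glockenspiel": 1},    # assumed for low voices
    "bright": {"xylophone": 1, "glockenspiel": 3},  # assumed for high voices
}


def choose_table(voice_pitch_hz, pitch_split_hz=165.0):
    """Select a setting table name from an estimated voice pitch (assumed feature)."""
    return "bright" if voice_pitch_hz >= pitch_split_hz else "soft"


def pick_timbre(voice_pitch_hz, rng):
    """Randomly pick a timbre, weighted by the table chosen for this voice."""
    table = SETTING_TABLES[choose_table(voice_pitch_hz)]
    names = list(table)
    weights = [table[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    print(choose_table(220.0), pick_timbre(220.0, rng))
    print(choose_table(110.0), pick_timbre(110.0, rng))
```

The selection remains random on every call, so the listener still receives a random impression, while the weights bias the timbre toward the one matched to the voice.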

The volume level at which the attention sound effect data 14a is reproduced may also be varied according to the conversation voice. Likewise, the selection of the base sound effect data 14b and its volume level during reproduction may be set according to the conversation voice.

This embodiment has described a mode in which combinations of attention sound effects are set in the setting table with timbre taken into account. In addition, combinations of attention sound effects may be set in advance according to the sound or music used as the base sound effect. Specifically, a combination of attention sound effects based on birdsong is created as a setting table, and this setting table is associated with a base sound effect consisting of the sound of a murmuring stream. Then, when the sound of a murmuring stream is selected as the base sound effect data 14b, birdsong can be reproduced as the attention sound effect data 14a. Setting the system to use attention sound effects that suit the base sound effect in this way gives the listener a pleasant impression.

This embodiment has also described, as shown in FIG. 6, a mode in which three items are selected from attention sound effect data 14a of the same reproduction length and reproduced. However, the embodiment is not limited to this, and the number of items of attention sound effect data 14a selected and the length of each item may differ.

Specifically, for example, the attention sound effect of length ta shown in FIG. 12(a) may be generated by combining two items of attention sound effect data 14a of the same length t1, as shown in FIG. 12(b), or by combining three items of attention sound effect data 14a of different lengths, as shown in FIG. 12(c). As long as the attention sound effect can attract attention while giving a random impression, so that the third party 3 does not become accustomed to the sound and its attention-drawing effect does not fade, the timbre, pitch, volume, reproduction timing, generation method, and the like are not limited to the examples described in this embodiment.

As described above, according to this embodiment, reproducing two kinds of sound, the attention sound effect data 14a and the base sound effect data 14b, according to the volume of the conversation voice to be protected makes it difficult for the third party 3 to hear the content of the conversation.

Furthermore, reproducing the attention sound effect data 14a and the base sound effect data 14b in addition to the anti-eavesdropping sound, which makes the conversation voice difficult to hear, protects the conversation effectively while reducing the unnatural impression of the anti-eavesdropping sound.

In addition, reproducing the attention sound effect data 14a while varying its timbre, pitch, volume, and the like gives the listener a random impression, which prevents the listener from becoming accustomed to the reproduced sound and thus prevents the effect of reducing the unnatural impression of the anti-eavesdropping sound from fading.

As described above, the present invention is useful for protecting the content of a conversation by reproducing sound so that the conversation voice cannot be overheard by a third party, while preventing the third party from finding the reproduced sound unnatural or unpleasant.

Description of Symbols

10 Sound processing device
11 Input sound analysis unit
12 Anti-eavesdropping sound generation unit
13 Sound effect control unit
14 Storage unit
14a Attention sound effect data
14b Base sound effect data
14c Sound effect output conditions
15 Output sound control unit
20 Microphone
30 Speaker
40 Output sound operation unit
50 Table
51 Partition
52 Waiting seats

Claims

A conversation protection system comprising:
a microphone that collects conversation voice;
a storage unit that stores an attention sound effect exhibiting a time-axis waveform that decays after the sound pressure level rises within a few seconds, and a base sound effect exhibiting a time-axis waveform whose sound pressure level changes more gradually than that of the attention sound effect;
a speaker that reproduces one or both of the attention sound effect and the base sound effect toward a third party other than the parties to the conversation; and
a control unit that performs one or both of control to reproduce the base sound effect through the speaker at least while the sound pressure level of the conversation voice collected by the microphone exceeds a first threshold, and control to reproduce the attention sound effect each time the sound pressure level of the conversation voice exceeds a second threshold.

The conversation protection system according to claim 1, wherein the control unit stops reproduction of one or both of the base sound effect and the attention sound effect by the speaker when the sound pressure level of the conversation voice falls below a predetermined threshold.

The conversation protection system according to claim 1, wherein the control unit controls reproduction of the attention sound effect so that, each time the attention sound effect is reproduced, at least one of the number of sounds heard within a predetermined time, the timing at which the sounds are heard, the timbre of the sounds, and the pitch of the sounds changes.

The conversation protection system according to claim 1, wherein the control unit generates the attention sound effect using a sine wave.

The conversation protection system according to claim 1, further comprising a masking sound generation unit that generates, based on frequency characteristics of the voice collected by the microphone, a masking sound that masks the voice and makes it difficult to hear,
wherein the control unit reproduces the masking sound generated by the masking sound generation unit.

The conversation protection system according to claim 5, wherein the masking sound generation unit extracts a spectral envelope and a spectral fine structure from the voice collected by the microphone, sets an inversion axis, extending in the frequency direction, about which the spectral envelope is flipped vertically, generates a modified spectral envelope by inverting the spectral envelope about the inversion axis, generates an anti-eavesdropping sound by combining the modified spectral envelope with the spectral fine structure, and uses the anti-eavesdropping sound as the masking sound.

The conversation protection system according to claim 1, wherein the storage unit stores a plurality of the attention sound effects, and
the control unit reproduces an attention sound effect selected at random from the storage unit.

The conversation protection system according to claim 7, wherein the storage unit stores, as a setting table, combinations of a plurality of attention sound effects set based on the timbre and pitch of each attention sound effect, and
the control unit reproduces attention sound effects based on a combination selected at random from the setting table.

The conversation protection system according to claim 1, wherein the control unit randomly changes the volume at which each attention sound effect is reproduced.

The conversation protection system according to claim 1, wherein the attention sound effect is a sound of a musical instrument.

A conversation protection method comprising:
a voice collection step of collecting conversation voice; and
a sound effect reproduction step including one or both of a base sound effect reproduction step of reproducing, while the sound pressure level of the conversation voice collected in the voice collection step exceeds a first threshold, a base sound effect exhibiting a time-axis waveform whose sound pressure level changes gradually, and an attention sound effect reproduction step of reproducing, when the sound pressure level of the conversation voice collected in the voice collection step exceeds a second threshold, an attention sound effect exhibiting a time-axis waveform that decays after the sound pressure level rises within a few seconds.

The conversation protection method according to claim 11, further comprising a sound effect stop step of stopping reproduction of one or both of the base sound effect and the attention sound effect when the sound pressure level of the conversation voice collected in the voice collection step falls below a predetermined threshold.