JP4618351B2

JP4618351B2 - Communication system, communication device, and program for transmitting state of remote place

Info

Publication number: JP4618351B2
Application number: JP2008223903A
Authority: JP
Inventors: 達也入山
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2008-09-01
Filing date: 2008-09-01
Publication date: 2011-01-26
Anticipated expiration: 2023-09-10
Also published as: JP2008301529A

Description

The present invention relates to a communication system, a communication device, and a program for conveying the state of a remote place.

At home or at the office, we usually go about our lives while sensing the presence of other family members or staff in the same room or in nearby rooms. That sense of presence plays an important role in everyday life, whether or not we are conscious of it. For example, in a company, a boss can gauge to some extent how busy a subordinate currently is from the signs of that subordinate working in the next room. Likewise, a family living with an elderly person can tell from the signs in the adjacent room, for example, whether something is wrong with that person.

In recent years, with the advancement and spread of communication technologies such as the Internet, an increasing number of companies allow employees to work from home. An increasing number of families also live apart from one another. In such situations, in the former example, it is difficult for the boss to gauge how busy a subordinate working from home is. In the latter example, a family living apart from an elderly person cannot learn of a problem right away even when something has gone wrong with that person.

One conceivable response is to install a microphone in the home of the employee working from home, or of the elderly person living apart from the family, and to transmit sound indicating the state of that home to the place where the boss or family is. The boss or family can then learn the state of the employee or elderly person in real time through sound. However, such sound-based monitoring is problematic because it infringes on the privacy of the person being monitored.

To overcome this problem, a communication system has been proposed that combines a first communication terminal, which detects a person's movement or the like with a speed sensor, a touch sensor, an infrared CCD sensor, or the like and transmits the detection result, with a second communication terminal, which receives the sensor's detection result from the first communication terminal and performs an operation such as emitting a preset sound according to the received result (see, for example, Patent Document 1).
Patent Document 1: JP 2002-314707 A

Among the prior-art communication systems described above, those using a speed sensor or an infrared CCD sensor suffer from the fact that such sensors are generally expensive and cover only a narrow range, so that monitoring an entire home, for example, requires placing a plurality of sensors at various locations in the home. Those using a touch sensor convey no information to the monitoring person unless the monitored person touches the touch sensor, and therefore cannot be used for the purpose of constantly monitoring the state of a remote place.

The present invention has been made in view of the above circumstances, and its object is to provide a communication system, a communication device, and a program that make it possible to convey the state of a remote place in real time, inexpensively and easily, without infringing on privacy.

To achieve the above object, the present invention provides a communication system comprising: a first communication device that is placed at a first location and includes sound information generation means for generating sound information based on a first sound collected at the first location, and first transmission means for transmitting the sound information generated by the sound information generation means; a second communication device that is placed at a second location different from the first location and includes at least first reception means for receiving the sound information from the first communication device via a communication line, and output means for generating and outputting a second sound based on the received sound information; and ambiguous sound conversion means that is provided in any one of the first communication device, the second communication device, and an information processing device provided in the communication line, and that converts the sound information generated by the sound information generation means into ambiguous sound information representing a different sound that does not convey the content of conversation included in the sound information, wherein the second sound output by the output means is the sound represented by the ambiguous sound information produced by the ambiguous sound conversion means.

According to the above communication system, when living sounds at the first location are conveyed to the second location, information such as conversation is not conveyed, so that, among other effects, the privacy of the people at the first location is protected.

The present invention further provides a communication device comprising: input means for inputting sound information; conversion means for converting the sound information input by the input means into ambiguous sound information representing a different sound that does not convey the content of conversation included in the sound information; and output means for outputting the ambiguous sound information produced by the conversion means, the communication device further comprising at least one of reception means for receiving sound information from another communication device and outputting it to the input means, and transmission means for transmitting the ambiguous sound information output by the output means to another communication device.

According to this communication device, living sounds are processed so that information such as conversation is not conveyed, so that, among other effects, the privacy of the people at the source of the living sounds is protected.

The above communication device may include instruction information input means for inputting instruction information indicating which part of the information carried by the sound information is to be dropped or concealed, and the generation means may generate the ambiguous sound information based on the instruction information input by the instruction information input means.

According to this communication device, the user can change which of the information contained in the living sounds is withheld from transmission.

The above communication device may include: storage means for storing reference sound information indicating a reference sound; calculation means for calculating an index indicating the degree of similarity between the physical features of the sound indicated by the sound information that the generation means uses to generate the ambiguous sound information and the physical features of the sound indicated by the reference sound information stored in the storage means; and determination means for determining, using the index calculated by the calculation means, whether the sound indicated by the sound information and the sound indicated by the reference sound information are similar, and the generation means may, based on the result of the determination by the determination means, perform at least one of starting generation of the ambiguous sound information, ending generation of the ambiguous sound information, and changing the method of processing the sound information.

According to this communication device, sound information is generated that withholds only limited information contained in the living sounds, such as the conversation of a specific person or specific words, or that conveys only such limited information.
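The similarity determination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the patent: the physical features are assumed to be short vectors of band energies, the index is assumed to be cosine similarity, and the names `cosine_similarity` and `is_similar`, the threshold 0.9, and all feature values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity index between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_similar(features, reference, threshold=0.9):
    """Determination step: compare the index against a hypothetical threshold."""
    return cosine_similarity(features, reference) >= threshold

# Hypothetical band-energy features; the values are made up for illustration.
reference = [0.1, 0.8, 0.6, 0.1]       # stored reference sound
incoming_a = [0.12, 0.75, 0.65, 0.08]  # close to the reference
incoming_b = [0.9, 0.1, 0.05, 0.7]     # clearly different

print(is_similar(incoming_a, reference), is_similar(incoming_b, reference))
```

The result of `is_similar` could then be used to start or stop generation of the ambiguous sound information, or to switch the processing method.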

In the above communication device, the generation means may generate the ambiguous sound information by performing filter processing that removes or attenuates specific frequency components contained in the sound indicated by the sound information.

According to this communication device, removing from the living sounds the components of a frequency band that carries information such as conversation yields living sounds processed so that such information is not conveyed.
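As a rough illustration of such filter processing, the sketch below removes a frequency band from a short frame by zeroing bins of a discrete Fourier transform. The patent does not prescribe this technique; the function name `remove_band`, the test tones, and the 8 kHz sampling rate are assumptions, and the 400 Hz to 3.5 kHz band is borrowed from the embodiment described later.

```python
import cmath
import math

def remove_band(samples, rate, low_hz, high_hz):
    """Zero every DFT bin whose frequency lies in [low_hz, high_hz]."""
    n = len(samples)
    # Naive O(n^2) forward DFT; acceptable for a short illustrative frame.
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * rate / n  # bins k and n-k mirror each other
        if low_hz <= freq <= high_hz:
            spectrum[k] = 0
    # Inverse DFT; the input is real, so only the real part is kept.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

rate, n = 8000, 160
rumble = [math.sin(2 * math.pi * 200 * t / rate) for t in range(n)]   # below the band
speech = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]  # inside the band
ambiguous = remove_band([a + b for a, b in zip(rumble, speech)], rate, 400, 3500)
```

In this example the 1 kHz tone standing in for speech is removed, while the 200 Hz component passes through unchanged.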

The above communication device may include pitch shifting means for shifting the pitch of the sound indicated by the ambiguous sound information generated by the generation means by a fixed amount.

According to this communication device, components of any frequency band contained in the living sounds can be transmitted and received even over, for example, a public telephone network, in which the frequency band of sound that can be carried is limited.
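The effect of moving the pitch can be sketched with simple resampling. This is a simplified stand-in, not the pitch shifting means of the patent: resampling also changes the duration, which a real pitch shifter would avoid. The function names are hypothetical, and the example assumes, for illustration only, a channel that carries roughly 300 Hz to 3.4 kHz (a commonly cited figure for the analog telephone voiceband, not stated in the source).

```python
import math

def shift_pitch(samples, factor):
    """Resample by linear interpolation; factor 0.5 halves every frequency.

    The duration changes by 1/factor, unlike a true pitch shifter.
    """
    out_len = int((len(samples) - 1) / factor) + 1
    out = []
    for i in range(out_len):
        pos = i * factor
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

def dominant_frequency(samples, rate):
    """Estimate the frequency of a single tone from upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate / len(samples)

rate = 16000
tone = [math.sin(2 * math.pi * 5000 * t / rate) for t in range(1600)]  # 5 kHz, above the assumed band
shifted = shift_pitch(tone, 0.5)  # lands near 2.5 kHz, inside the assumed band
```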

The above communication device may include instruction information input means for inputting instruction information indicating the start of generation of the ambiguous sound information, and the generation means may start generating the ambiguous sound information when the instruction information is input by the instruction information input means.

According to this communication device, the user can switch, whenever desired, from communicating ordinary living sounds to communicating sounds that do not convey specific information.

The above communication device may include instruction information input means for inputting instruction information indicating the end of generation of the ambiguous sound information, and the generation means may stop generating the ambiguous sound information when the instruction information is input by the instruction information input means.

According to this communication device, the user can switch, whenever desired, from communicating sounds that do not convey specific information to communicating ordinary living sounds.

The above communication device may include instruction information reception means for receiving the instruction information from another communication device via a communication line, and the instruction information input means may input the instruction information received by the instruction information reception means.

According to this communication device, even a user at a location different from the one where the sound information is processed can change which of the information contained in the living sounds is withheld from transmission.

The above communication device may include a sensor that detects a change in at least one of electricity, light, temperature, sound, and pressure and outputs instruction information when the change is detected, and the instruction information input means may input the instruction information output by the sensor.

According to this communication device, the sensor switches between communication of ordinary living sounds and communication of sounds that do not convey specific information, without manual operation by the user.

The above communication device may include a sensor that detects a change in at least one of electricity, light, temperature, sound, and pressure, and at least one of the reception means and the transmission means may, when a change is detected by the sensor, start or end reception of the sound information, or start or end transmission of the ambiguous sound information.

According to this communication device, communication is started or ended by the sensor, without manual operation by the user.

In the above communication device, the sound information that the generation means uses to generate the ambiguous sound information may be digital sound data obtained by sampling and quantizing sound, and the generation means may generate the ambiguous sound information by lowering the sampling frequency of the sound information.

According to this communication device, sound information that does not convey information such as conversation contained in the living sounds is easily generated.
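Lowering the sampling frequency can be sketched as block averaging. This is one simple possibility, not necessarily what the patent intends; the function name `reduce_sampling_rate` and the factor are hypothetical. After reduction by a factor of 8 from 8 kHz, for instance, the new rate is 1 kHz, so components above 500 Hz can no longer be represented.

```python
def reduce_sampling_rate(samples, factor):
    """Average each block of `factor` samples into one output sample.

    The averaging acts as a crude low-pass filter before the rate drops;
    a trailing partial block is discarded.
    """
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]

coarse = reduce_sampling_rate([0, 2, 4, 6, 8, 10, 12], 2)
print(coarse)
```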

In the above communication device, the sound information that the generation means uses to generate the ambiguous sound information may be digital sound data obtained by sampling and quantizing sound, and the generation means may generate the ambiguous sound information by lowering the number of quantization bits of the sound information.

This communication device likewise makes it easy to generate sound information that does not convey information such as conversation contained in the living sounds.
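Lowering the number of quantization bits can be sketched by discarding the low-order bits of each sample. The 16-bit input and 4-bit output below are assumptions chosen for illustration, and the function name `reduce_bit_depth` is hypothetical.

```python
def reduce_bit_depth(samples, from_bits=16, to_bits=4):
    """Keep only the top `to_bits` bits of each signed integer sample.

    Python's >> on negative integers rounds toward minus infinity, so
    negative samples are quantized consistently with positive ones.
    """
    shift = from_bits - to_bits
    return [(s >> shift) << shift for s in samples]

coarse = reduce_bit_depth([0, 4095, 4096, 32767, -1, -32768])
print(coarse)
```

With 4 bits only 16 distinct levels remain, which leaves the rough loudness contour but degrades fine detail.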

The above communication device may include measurement means for measuring at least one of the volume, pitch, and sound quality of the sound indicated by the sound information, and the generation means may generate the ambiguous sound information using the result of the measurement by the measurement means.

According to this communication device, the sound information is generated using only limited information contained in the living sounds, such as volume, pitch, or sound quality, so the resulting sound information does not convey the content of conversation or the like contained in the living sounds.
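One way to picture generation from measurements alone is to measure the volume of each short frame and synthesize a fixed tone whose loudness follows it. This is only a sketch under assumptions: the patent does not specify this synthesis, and the names `frame_rms` and `synthesize_from_volume`, the 220 Hz tone, and the frame length are hypothetical.

```python
import math

def frame_rms(samples, frame_len):
    """Measure the volume (RMS level) of each complete frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def synthesize_from_volume(levels, frame_len, rate, tone_hz=220.0):
    """Build a sine tone whose amplitude follows the measured levels.

    Only the volume contour of the original survives; its waveform,
    and therefore any speech content, is never copied to the output.
    """
    out = []
    for n, level in enumerate(levels):
        for i in range(frame_len):
            t = (n * frame_len + i) / rate
            out.append(level * math.sqrt(2) * math.sin(2 * math.pi * tone_hz * t))
    return out

rate, frame_len = 8000, 80
original = [0.0] * frame_len + [0.5] * frame_len  # silence, then activity
levels = frame_rms(original, frame_len)
ambiguous = synthesize_from_volume(levels, frame_len, rate)
```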

The present invention further provides a program that causes a computer to execute: input processing for inputting sound information; generation processing for generating ambiguous sound information by processing the sound information input in the input processing so as to drop or conceal part of the information it carries; and output processing for outputting the ambiguous sound information generated by the generation processing, the program further causing the computer to execute at least one of reception processing for receiving, from another communication device, the sound information used in the input processing, and transmission processing for transmitting the ambiguous sound information output in the output processing to another communication device.

According to the above program, a computer generates living sounds processed so that information such as conversation is not conveyed.

With the communication device and program according to the present invention, a user can inexpensively and easily build a system that conveys the state of a remote place in real time without infringing on the privacy of the people at that place.

[1. First Embodiment]
[1.1. Configuration of the communication system]
FIG. 1 is a diagram showing the configuration of a communication system 1 according to the first embodiment of the present invention. The main components of the communication system 1 are a microphone 11, a terminal device 12, and a telephone 13x placed in the home X of an employee A, and a telephone 13y, an amplifier 15, and a speaker 16 placed at a company Y where the boss B of the employee A works. In the following description, all components of the communication system 1 are assumed to be connected by wire, but some or all of them may be connected wirelessly.

The telephone 13x at the home X and the telephone 13y at the company Y are telephones with ordinary telephone functions for voice communication over a public telephone network 14. In addition to their handsets, the telephones 13x and 13y each have an audio input unit and an audio output unit as audio input/output means. The audio output unit outputs to an external device the audio signal received from the other party over the public telephone network 14, and the audio input unit receives an audio signal from an external device and sends it to the other party over the public telephone network 14. In the following description, "audio" is not limited to the human voice but broadly means sound in general.

At the company Y, the amplifier 15 is an ordinary audio amplifier having an amplification unit 151 and an operation unit 152. The amplification unit 151 adjusts the level of the audio signal output from the audio output unit of the telephone 13y and outputs it to the speaker 16. The operation unit 152 has a keypad or the like that accepts user operations. By operating the operation unit 152, the user can turn the power of the amplification unit 151 on and off and adjust its output level. The speaker 16 is an ordinary speaker that outputs the audio signal from the amplification unit 151 as sound.

At the home X, the microphone 11 is an ordinary microphone that converts sound into an analog audio signal (hereinafter, "audio signal") and outputs it. The terminal device 12 is the central component that realizes the features of the present invention in the communication system 1. It processes the audio signal input from the microphone 11, removes part of the information contained in that signal, and outputs the processed audio signal to the audio input unit of the telephone 13x. The terminal device 12 includes an audio processing unit 121 and an operation unit 122.

The audio processing unit 121 includes one or more elements such as the following filters.
(a) A low-pass filter 1211 with a cutoff frequency of 400 Hz.
(b) A pitch shifter 1212 that generates the second harmonic (doubles the frequency).
(c) A high-pass filter 1213 with a cutoff frequency of 3.5 kHz.
(d) A pitch shifter 1214 that generates the one-half harmonic (halves the frequency).
(e) A noise reduction filter 1215.
(f) An amplification unit 1216.
FIG. 1 shows a configuration example of the audio processing unit 121 that uses all of these elements. This is only an example, however, and does not preclude configurations that omit some of them.

The filters and other elements listed above may be implemented with analog circuits or with digital circuits. When the audio processing unit 121 is built from digital elements, it suffices to add to the terminal device 12 an A/D (analog-to-digital) converter that converts the audio signal output from the microphone 11 into a digital signal, and a D/A (digital-to-analog) converter that converts the digital signal obtained from the audio processing unit 121 into an analog audio signal and outputs it to the audio input unit of the telephone 13x.

In the audio processing unit 121 shown in FIG. 1, the low-pass filter 1211 and the high-pass filter 1213 together form means for obtaining, from the audio signal supplied by the microphone 11, a signal from which the frequency band of the human voice (400 Hz to 3.5 kHz) has been removed. Sound obtained in this way, by removing or concealing some information from the sound collected by the microphone 11, is hereinafter called "ambiguous audio", and an audio signal representing ambiguous audio is called an "ambiguous audio signal".

Of the audio signal output from the microphone 11, the pitch shifter 1212 doubles the frequency of the components at or below 400 Hz that have passed through the low-pass filter 1211 and outputs the result. The pitch shifter 1214 halves the frequency of the components at or above 3.5 kHz that have passed through the high-pass filter 1213 and outputs the result.
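The frequency mapping performed by the two filter and pitch-shifter branches can be summarized in a few lines. This is an illustrative sketch that assumes ideal (brick-wall) filters, which real filters only approximate; the function name `output_frequency` is hypothetical.

```python
def output_frequency(freq_hz):
    """Map an input component to its output frequency, or None if removed.

    Assumes ideal filters: 400 Hz low-pass followed by a x2 pitch shift,
    and 3.5 kHz high-pass followed by a x1/2 pitch shift.
    """
    if freq_hz <= 400:
        return freq_hz * 2   # low branch: 1211 then 1212
    if freq_hz >= 3500:
        return freq_hz / 2   # high branch: 1213 then 1214
    return None              # voice band, passed by neither branch

for f in (300, 1000, 6000):
    print(f, "->", output_frequency(f))
```

Under these assumptions, components at or below 400 Hz end up at or below 800 Hz, components at or above 3.5 kHz end up at or above 1.75 kHz, and the band in between is dropped.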

The output signals of the pitch shifters 1212 and 1214 are input to the noise reduction filter 1215. The noise reduction filter 1215 removes from its input the noise components that arise when the microphone 11 collects sound and the noise components made conspicuous by filtering in the low-pass filter 1211, the high-pass filter 1213, and the like. Passing through the noise reduction filter 1215 turns the output signals of the pitch shifters 1212 and 1214 into an audio signal representing sound that is less unpleasant for the listener.

The amplification unit 1216 amplifies the output signal of the noise reduction filter 1215 to an audio signal of a level appropriate to the dynamic range and other characteristics of the public telephone network 14, and outputs it.

The configuration described above is only an example. Any other filter or element, or any combination of them, can be used in the audio processing unit 121 as long as it processes the audio signal representing the sound collected by the microphone 11 in a way that prevents part of the information contained in that sound from being conveyed. For example, the audio processing unit 121 may include a delay that holds the input audio signal for about 100 ms before outputting it, and may take the difference between the delay output and the input audio signal so as to make the sound represented by the audio signal indistinct. As another example, the audio processing unit 121 may include a tone generator that generates an audio signal such as a sine wave, and may output the tone generator's signal after modulating it by reading it out in accordance with the input audio signal.
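The delay-and-difference variant mentioned above can be sketched as follows. This is a minimal illustration assuming digital samples and silence before the start of the signal; the function name `delay_difference` is hypothetical, and the sign of the difference is an arbitrary choice.

```python
def delay_difference(samples, rate, delay_s=0.1):
    """Return x[n] - x[n - d], where d is the delay expressed in samples.

    Samples before the start of the input are treated as silence.
    """
    d = int(round(delay_s * rate))
    return [
        s - (samples[i - d] if i >= d else 0.0)
        for i, s in enumerate(samples)
    ]

# A steady input cancels itself once the delayed copy arrives.
steady = [1.0] * 30
blurred = delay_difference(steady, rate=100)  # 0.1 s at 100 Hz is 10 samples
```

Subtracting a delayed copy superimposes each sound on an inverted echo of what was heard about 100 ms earlier, which smears successive syllables into one another.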

The operation unit 122 has a keypad and accepts user operations. By operating the operation unit 122, the user can change the parameters of the filters and other elements in the audio processing unit 121 and the output level of the amplification unit 1216, and can turn the power of the audio processing unit 121 on and off.

[1.2. Operation of the communication system]
To use the communication system 1, the employee A or the boss B dials the other party's telephone number on the telephone 13x or the telephone 13y and establishes a voice communication connection between the telephones 13x and 13y. The employee A then operates the operation unit 122 of the terminal device 12 to turn on the audio processing unit 121. As a result, the noises, voices, and other sounds at the home X collected by the microphone 11 (hereinafter, "living sounds") are processed by the audio processing unit 121, converted into an ambiguous audio signal, and output to the telephone 13x. While listening through the handset speaker of the telephone 13x to the ambiguous audio represented by the ambiguous audio signal being output to the telephone 13y, the employee A operates the operation unit 122 of the terminal device 12 to adjust the volume of the ambiguous audio. The telephone 13x transmits the ambiguous audio signal input from the terminal device 12 to the telephone 13y over the public telephone network 14.

The way the employee A checks the volume of the ambiguous audio is not limited to listening to it through the handset speaker of the telephone 13x. For example, the terminal device 12 may be provided with a display unit that shows an indicator or the like of the level of the ambiguous audio signal output from the terminal device 12.

At the company Y, the ambiguous audio signal transmitted from the telephone 13x as described above is received by the telephone 13y and output from the handset speaker of the telephone 13y. On confirming that the ambiguous audio is coming from the handset speaker, the boss B operates the operation unit 152 of the amplifier 15 to turn on the amplification unit 151. As a result, the ambiguous audio signal that the telephone 13y has received from the terminal device 12 is input to the amplifier 15, amplified, and output to the speaker 16, so that the boss B can also hear the ambiguous audio from the speaker 16. The boss B monitors the ambiguous audio coming from the speaker 16 and operates the operation unit 152 of the amplifier 15 to adjust its volume.

従業員Ａおよび上司Ｂは以上の操作を終了すると、それぞれ電話機１３ｘおよび電話機１３ｙの送受話器を置く。この場合、電話機１３ｘおよび電話機１３ｙには、それぞれ電源ＯＮ状態の端末装置１２およびアンプ１５が接続されているため、電話機１３ｘと電話機１３ｙとの間の音声通信接続は切断されない。従って、その後、従業員Ａもしくは上司Ｂが操作部１２２もしくは操作部１５２を操作して、端末装置１２もしくはアンプ１５の電源をＯＦＦするまでの間、自宅Ｘに配置されたマイク１１により集音される音声により生成される曖昧音声は、常時、会社Ｙに配置されたスピーカ１６から発音される。 When employee A and supervisor B have completed the above operations, they put down the handsets of the telephone set 13x and the telephone set 13y, respectively. Because the powered-on terminal device 12 and amplifier 15 are connected to the telephone set 13x and the telephone set 13y, respectively, the voice communication connection between the two telephone sets is not disconnected. Consequently, until employee A or supervisor B operates the operation unit 122 or the operation unit 152 to power off the terminal device 12 or the amplifier 15, the ambiguous voice generated from the sound collected by the microphone 11 at home X is continuously emitted from the speaker 16 at company Y.

以上のように、通信システム１によれば、上司Ｂは、離れた場所において従業員Ａの動作に伴い発せられる物音等をリアルタイムに耳にすることができる。その結果、上司Ｂは従業員Ａが今、何かの作業中であるか、睡眠中であるか等の様子を、大まかに知ることができる。しかしながら、端末装置１２が備える音声加工部１２１による音声信号の加工の結果、自宅Ｘにおける生活音に含まれる従業員Ａの声は、会社Ｙにおいては発音されないか、もしくは一部発音されたとしてもその声が示す会話の内容を伝達する程明瞭ではない。従って、従業員Ａのプライバシーが侵害されることはなく、また従業員Ａが誰かと会話をした場合であっても、上司Ｂがその従業員Ａと誰かとの間の会話によって仕事の邪魔をされる等の不都合がない。 As described above, the communication system 1 lets supervisor B hear, in real time and from a remote location, the sounds that accompany employee A's activity. Supervisor B can therefore tell roughly whether employee A is currently working on something, sleeping, and so on. However, as a result of the processing performed by the voice processing unit 121 of the terminal device 12, employee A's voice contained in the living sound at home X is either not emitted at company Y or, even if partly emitted, is not clear enough to convey the content of the conversation. Employee A's privacy is therefore not violated, and even when employee A talks with someone, supervisor B is not inconvenienced, for example by having his or her work disturbed by that conversation.

以下、上記のように、互いに離れた複数地点間において適度に不明瞭にされた音声を伝達する通信のことを「曖昧通信」と呼ぶ。曖昧通信を実現するために、ユーザが準備すべき装置のうち、端末装置１２以外の装置、すなわちマイク１１、電話機１３ｘ、電話機１３ｙ、アンプ１５およびスピーカ１６はいずれも通常の会社や家庭に既にあるか、容易に入手可能なものである。また、端末装置１２は簡単な構造のフィルタ等を組み合わせただけのものであるため、低費用で製造可能である。その結果、通信システム１のユーザは、低費用で曖昧通信を行うことができる。 Hereinafter, as described above, communication that transmits a sound that is moderately obscured between a plurality of points distant from each other is referred to as “ambiguous communication”. Of the devices to be prepared by the user to realize ambiguous communication, devices other than the terminal device 12, that is, the microphone 11, the telephone set 13x, the telephone set 13y, the amplifier 15, and the speaker 16 are all already in a normal company or home. Or are readily available. Further, since the terminal device 12 is simply a combination of a filter having a simple structure, it can be manufactured at low cost. As a result, the user of the communication system 1 can perform ambiguous communication at low cost.

なお、通信システム１は曖昧通信を行うために、音声通話に関する電話回線の接続を長時間必要とする。従って、通信システム１は、音声通話に関する電話回線の接続料金が月額固定料金等の定額制である公衆電話回線網を利用可能な場合において特に実用的である。 Note that the communication system 1 requires long-time connection of a telephone line related to a voice call in order to perform ambiguous communication. Therefore, the communication system 1 is particularly practical when a public telephone line network in which a telephone line connection charge for voice calls is a flat rate system such as a fixed monthly charge can be used.

［２．第２実施形態］ [2. Second Embodiment]
第２実施形態は、上述した第１実施形態と多くの点で類似しているため、以下、第２実施形態が第１実施形態と異なる点のみを説明する。図２は、第２実施形態における通信システム２の構成を示した図である。通信システム２においては、端末装置１２は会社Ｙにおいて電話機１３ｙとスピーカ１６との間に接続され、アンプ１５は、自宅Ｘにおいてマイク１１と電話機１３ｘとの間に接続されている。 Since the second embodiment is similar in many respects to the first embodiment described above, only the differences of the second embodiment from the first embodiment will be described below. FIG. 2 is a diagram illustrating the configuration of the communication system 2 according to the second embodiment. In the communication system 2, the terminal device 12 is connected between the telephone set 13y and the speaker 16 at company Y, and the amplifier 15 is connected between the microphone 11 and the telephone set 13x at home X.

通信システム２においては、電話機１３ｘから電話機１３ｙに対し、自宅Ｘにおける生活音をそのまま示す音声信号が出力される。電話機１３ｙにより受け取られた音声信号は、会社Ｙにおいて端末装置１２により加工され、曖昧音声信号に変換される。その結果、スピーカ１６から発せられる音声は、第１実施形態の場合と同様に曖昧音声となる。 In the communication system 2, an audio signal indicating the life sound at home X is output from the telephone set 13x to the telephone set 13y. The voice signal received by the telephone 13y is processed by the terminal device 12 in the company Y and converted into an ambiguous voice signal. As a result, the voice emitted from the speaker 16 becomes an ambiguous voice as in the case of the first embodiment.

第２実施形態によれば、曖昧通信の受信側にいる者は、端末装置１２により加工される前の音声信号が示す自宅Ｘの生活音を、電話機１３ｙの送受話器のスピーカからいつでも聞くことができる。従って、曖昧通信の受信側にいる者が、曖昧通信の送信側における何らかの異常を察した場合、すぐさま曖昧通信の送信側における状況を通常の明瞭な音声により確認することができる。第２実施形態は、例えば、一人暮らしのユーザが、会社等から不在中の自宅の様子をモニタしたい場合や、一人暮らしのお年寄りと離れて暮らす家族が、お年寄りの暮らす家の様子をモニタしたい場合などに特に有効である。 According to the second embodiment, a person on the receiving side of the ambiguous communication can listen at any time, through the handset speaker of the telephone set 13y, to the living sound of home X represented by the voice signal before it is processed by the terminal device 12. Therefore, when that person senses some abnormality on the transmitting side of the ambiguous communication, he or she can immediately check the situation there by means of ordinary clear sound. The second embodiment is particularly effective when, for example, a user who lives alone wants to monitor his or her unoccupied home from the office, or when a family living apart from an elderly relative who lives alone wants to monitor that relative's home.

［３．第３実施形態］ [3. Third Embodiment]
第３実施形態は、上述した第１実施形態と多くの点で類似しているため、以下、第３実施形態が第１実施形態と異なる点のみを説明する。図３は、第３実施形態における通信システム３の構成を示した図である。通信システム３においては、自宅Ｘにおけるマイク１１と電話機１３ｘの間と、会社Ｙにおける電話機１３ｙとスピーカ１６の間の両方に、端末装置１２が接続されている。以下、自宅Ｘおよび会社Ｙに配置されている端末装置１２を、それぞれ端末装置１２ｘおよび端末装置１２ｙと呼ぶ。 Since the third embodiment is similar in many respects to the first embodiment described above, only the differences of the third embodiment from the first embodiment will be described below. FIG. 3 is a diagram showing the configuration of the communication system 3 in the third embodiment. In the communication system 3, a terminal device 12 is connected both between the microphone 11 and the telephone set 13x at home X and between the telephone set 13y and the speaker 16 at company Y. Hereinafter, the terminal devices 12 arranged at home X and company Y are referred to as the terminal device 12x and the terminal device 12y, respectively.

端末装置１２ｘの音声加工部１２１は、例えば以下のようなフィルタ等が直列に接続された組合せを備えている。 The voice processing unit 121 of the terminal device 12x includes, for example, a combination of the following filters and the like connected in series.
（ａ）カットオフ周波数４００Ｈｚのローパスフィルタ１２１１。 (a) A low-pass filter 1211 having a cutoff frequency of 400 Hz.
（ｂ）２倍音を生成するピッチシフタ１２１２。 (b) A pitch shifter 1212 that generates the second harmonic (one octave up).
（ｃ）増幅部１２１６ｘ。 (c) An amplification unit 1216x.

また、端末装置１２ｙの音声加工部１２１は、例えば以下のようなフィルタ等が直列に接続された組合せを備えている。 The voice processing unit 121 of the terminal device 12y includes, for example, a combination of the following filters and the like connected in series.
（ｄ）１／２倍音を生成するピッチシフタ１２１４。 (d) A pitch shifter 1214 that generates a tone at half the frequency (one octave down).
（ｅ）ノイズ低減フィルタ１２１５。 (e) A noise reduction filter 1215.
（ｆ）増幅部１２１６ｙ。 (f) An amplification unit 1216y.

なお、上記のフィルタ等の構成は例示であり、端末装置１２ｘおよび端末装置１２ｙの音声加工部１２１は、それぞれ様々なフィルタ等が直列および並列に適宜接続された組合せを備えていてよい。 Note that the configuration of the above-described filter and the like is an example, and the audio processing unit 121 of the terminal device 12x and the terminal device 12y may include a combination in which various filters and the like are appropriately connected in series and in parallel.

第３実施形態において、自宅Ｘにおける端末装置１２ｘは、生活音を示す音声信号のうち、４００Ｈｚ以下の周波数帯に含まれるものを取り出し、その取り出した音声信号を１オクターブ、高音側にシフトさせ、電話機１３ｘから電話機１３ｙに送信させる。従って、例えば自宅Ｘにおいて発生した４００Ｈｚの音声は、８００Ｈｚに変換された音声信号として電話機１３ｙに送信される。このように高音側にシフトされた音声信号は、会社Ｙにおいて、端末装置１２ｙによって、１オクターブ、低音側にシフトされる。その結果、スピーカ１６から発音される音声は、自宅Ｘにおける生活音のうち、４００Ｈｚ以下の周波数帯に含まれる音声の一部を再現したものとなる。なお、ノイズ低減フィルタ１２１５が端末装置１２ｙに設けられているため、スピーカ１６から発音される音声はノイズの少ない音声となる。 In the third embodiment, the terminal device 12x at home X extracts, from the voice signal representing the living sound, the components in the frequency band of 400 Hz or less, shifts the extracted voice signal up by one octave, and has the telephone set 13x transmit it to the telephone set 13y. Thus, for example, a 400 Hz sound generated at home X is transmitted to the telephone set 13y as a voice signal converted to 800 Hz. At company Y, the terminal device 12y shifts this up-shifted voice signal down by one octave. As a result, the sound emitted from the speaker 16 reproduces part of the living sound at home X that lies in the frequency band of 400 Hz or less. Because the noise reduction filter 1215 is provided in the terminal device 12y, the sound emitted from the speaker 16 contains little noise.
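The frequency mapping described above can be illustrated with a short Python sketch. It is only an idealized model under stated assumptions: each sound component is treated as a single pure tone, the low-pass filter 1211 is treated as a brick-wall filter, and the function names are hypothetical and do not appear in the specification.

```python
# Idealized model of the third embodiment's frequency mapping.
# Assumptions (not from the specification): pure tones, brick-wall low-pass.
CUTOFF_HZ = 400.0   # cutoff of low-pass filter 1211 (value from the text)
UP_RATIO = 2.0      # pitch shifter 1212: one octave up
DOWN_RATIO = 0.5    # pitch shifter 1214: one octave down


def sender_12x(freq_hz):
    """Return the frequency sent on the line, or None if the tone is filtered out."""
    if freq_hz > CUTOFF_HZ:
        return None  # removed by the low-pass filter
    return freq_hz * UP_RATIO


def receiver_12y(line_freq_hz):
    """Shift a received tone back down by one octave."""
    return line_freq_hz * DOWN_RATIO


def end_to_end(freq_hz):
    """Frequency heard from speaker 16, or None if nothing is reproduced."""
    on_line = sender_12x(freq_hz)
    return None if on_line is None else receiver_12y(on_line)
```

In this model a 400 Hz tone travels over the line at 800 Hz and is heard again at 400 Hz, while a 1000 Hz tone is not reproduced at all.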

第３実施形態によれば、例えば、公衆電話回線網１４を通過可能な音声信号の周波数帯が限られている場合において、曖昧通信の受信側において、送信側における音高と同じ音高で、公衆電話回線網１４を通過不可能な周波数帯の音声信号を再現したり、ノイズ低減の効果を向上させることができる。また、端末装置１２ｘと端末装置１２ｙのそれぞれにおいて、カットオフ周波数やピッチシフトの幅等のパラメータを操作部１２２により変更可能な構成とすることにより、曖昧通信の送信側と受信側のそれぞれにおいて、公衆電話回線網１４に出力する曖昧音声信号およびスピーカ１６に出力する曖昧音声信号を、ユーザの好みに応じて調整することができる。 According to the third embodiment, when, for example, the frequency band of voice signals that can pass through the public telephone line network 14 is limited, the receiving side of the ambiguous communication can reproduce, at the same pitch as on the transmitting side, voice signals in a frequency band that cannot pass through the public telephone line network 14, and the noise reduction effect can be improved. Furthermore, by making parameters such as the cutoff frequency and the pitch shift amount changeable through the operation unit 122 in each of the terminal device 12x and the terminal device 12y, the ambiguous voice signal output to the public telephone line network 14 and the ambiguous voice signal output to the speaker 16 can be adjusted to the user's preference on the transmitting side and the receiving side, respectively.

［４．第４実施形態］ [4. Fourth Embodiment]
第４実施形態は、上述した第１実施形態と多くの点で類似しているため、以下、第４実施形態が第１実施形態と異なる点のみを説明する。図４は、第４実施形態における通信システム４の構成を示した図である。通信システム４においては、自宅Ｘに配置された端末装置１２にセンサ１７が接続されている。また、端末装置１２に備えられた音声加工部１２１は、マイク１１から入力される音声信号を、フィルタ等をバイパスして増幅部１２１６に出力するためのスイッチを備えている。 Since the fourth embodiment is similar in many respects to the first embodiment described above, only the differences of the fourth embodiment from the first embodiment will be described below. FIG. 4 is a diagram showing the configuration of the communication system 4 in the fourth embodiment. In the communication system 4, a sensor 17 is connected to the terminal device 12 arranged at home X. The voice processing unit 121 of the terminal device 12 includes a switch for outputting the voice signal input from the microphone 11 to the amplification unit 1216 while bypassing the filters and the like.

センサ１７は、自宅Ｘに生活者が居るか否かを検出するためのセンサであり、例えば以下のようなセンサである。 The sensor 17 is a sensor for detecting whether or not a resident is present at home X, and is, for example, one of the following sensors.
（ａ）フォトダイオード等を備え、自宅Ｘの室内の光の強さが所定値以上である間、信号を出力するセンサ。 (a) A sensor that includes a photodiode or the like and outputs a signal while the intensity of light in the room at home X is equal to or greater than a predetermined value.
（ｂ）電流計等を備え、自宅Ｘにおいて消費されている電流が所定値以上である間、信号を出力するセンサ。 (b) A sensor that includes an ammeter or the like and outputs a signal while the current consumed at home X is equal to or greater than a predetermined value.
（ｃ）ドアの鍵部に設置され、鍵が解除されている間、信号を出力するセンサ。 (c) A sensor that is installed in the lock of a door and outputs a signal while the door is unlocked.

なお、上記のセンサは例示であり、例えば自宅Ｘの室内の人や物の動きを検出している間、信号を出力するセンサ等、他の様々なセンサが利用可能である。また、センサ１７は、アナログ回路、デジタル回路のいずれを用いたものであってもよい。 The above-described sensor is an example, and various other sensors such as a sensor that outputs a signal while detecting the movement of a person or an object in the room at home X can be used. The sensor 17 may use either an analog circuit or a digital circuit.

音声加工部１２１はセンサ１７から信号を受け取っている間、マイク１１から入力される音声信号を曖昧音声信号に加工して電話機１３ｘに出力する。一方、音声加工部１２１はセンサ１７から信号を受け取っていない間、マイク１１から入力される音声信号を、フィルタ等をバイパスし増幅部１２１６のみを介して電話機１３ｘに出力する。 While receiving the signal from the sensor 17, the voice processing unit 121 processes the voice signal input from the microphone 11 into an ambiguous voice signal and outputs it to the telephone set 13x. On the other hand, while the audio processing unit 121 does not receive a signal from the sensor 17, the audio signal input from the microphone 11 is output to the telephone set 13 x only through the amplification unit 1216, bypassing the filter and the like.
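The switching behavior described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the voice processing unit 121: the function names are hypothetical, and the moving-average `smooth` function is merely a stand-in for whatever filters actually produce the ambiguous voice signal.

```python
def smooth(samples, width=3):
    """Stand-in obscuring filter (assumption): trailing moving average."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def process_121(samples, sensor_on, obscure=smooth, gain=1.0):
    """Route samples as in the fourth embodiment.

    Sensor signal present -> obscure the sound, then amplify.
    Sensor signal absent  -> bypass the filters and amplify only.
    """
    stage = obscure(samples) if sensor_on else list(samples)
    return [gain * s for s in stage]
```

With the sensor signal absent, the output is simply the amplified input, which corresponds to ordinary voice communication.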

第４実施形態によれば、例えば自宅Ｘの室内の照明が点灯されていない場合など、曖昧通信の送信側にユーザがいないと推察される状況においては、電話機１３ｘと電話機１３ｙとの間で通常の音声通信が行われる。その結果、ユーザは、通信の送信側に誰かがいる場合には曖昧通信により、また通信の送信側に誰もいない場合には通常の音声により、通信の送信側の様子をモニタすることができる。 According to the fourth embodiment, in a situation where it is inferred that no user is present on the transmitting side of the ambiguous communication, for example when the room lights at home X are not turned on, ordinary voice communication is performed between the telephone set 13x and the telephone set 13y. As a result, the user can monitor the transmitting side by ambiguous communication when someone is there and by ordinary sound when no one is there.

［５．第５実施形態］ [5. Fifth Embodiment]
第５実施形態は、上述した第１実施形態と多くの点で共通しているため、以下、第５実施形態が第１実施形態と異なる点のみを説明する。図５は、第５実施形態における通信システム５の構成を示した図である。通信システム５においては、自宅Ｘに配置された端末装置１２に計時部１２３が備えられている。また、自宅Ｘにはセンサ１７が配置され、センサ１７は電話機１３ｘに接続されている。センサ１７は端末装置１２において、計時部１２３を介して音声加工部１２１にも接続されている。さらに、電話機１３ｘはオートダイヤル機能を備えている。 Since the fifth embodiment has much in common with the first embodiment described above, only the differences of the fifth embodiment from the first embodiment will be described below. FIG. 5 is a diagram illustrating the configuration of the communication system 5 according to the fifth embodiment. In the communication system 5, a time measuring unit 123 is provided in the terminal device 12 arranged at home X. A sensor 17 is also arranged at home X and is connected to the telephone set 13x. Within the terminal device 12, the sensor 17 is also connected to the voice processing unit 121 via the time measuring unit 123. Furthermore, the telephone set 13x has an auto-dial function.

センサ１７は、上記の第４実施形態におけるものと同様のセンサである。電話機１３ｘは、センサ１７からの信号がＯＦＦの状態からＯＮの状態に変化すると、会社Ｙの電話番号を自動的にダイヤルする。その結果、電話機１３ｘと電話機１３ｙとの間の音声通信接続が確立される。その一方で、端末装置１２の音声加工部１２１は、計時部１２３を介してセンサ１７から受信する信号がＯＦＦの状態からＯＮの状態に変化すると、音声加工部１２１の電源をＯＮする。その結果、自宅Ｘと会社Ｙとの間で曖昧通信が開始される。 The sensor 17 is the same sensor as that in the fourth embodiment. When the signal from the sensor 17 changes from the OFF state to the ON state, the telephone set 13x automatically dials the telephone number of the company Y. As a result, a voice communication connection is established between the telephone set 13x and the telephone set 13y. On the other hand, the voice processing unit 121 of the terminal device 12 turns on the power of the voice processing unit 121 when the signal received from the sensor 17 via the time measuring unit 123 changes from the OFF state to the ON state. As a result, ambiguous communication is started between the home X and the company Y.

計時部１２３は、センサ１７から受け取る信号がＯＮからＯＦＦに変化すると、計時の値を０に初期化して計時を再開始する。計時部１２３は、計時の値が所定の値に達すると、音声加工部１２１に通信終了の信号を出力する。音声加工部１２１は、計時部１２３から通信終了の信号を受け取ると、音声加工部１２１の電源をＯＦＦする。その結果、電話機１３ｘと電話機１３ｙとの間で確立されていた音声通信接続が切断される。 When the signal received from the sensor 17 changes from ON to OFF, the time measuring unit 123 initializes the time value to 0 and restarts time measurement. When the time value reaches a predetermined value, the time measuring unit 123 outputs a communication end signal to the sound processing unit 121. When the voice processing unit 121 receives a communication end signal from the time measuring unit 123, the voice processing unit 121 turns off the power of the voice processing unit 121. As a result, the voice communication connection established between the telephone set 13x and the telephone set 13y is disconnected.
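The connect and disconnect logic of the fifth embodiment can be sketched as a small state machine. The Python below is an illustration under assumptions: the class and method names are hypothetical, time is supplied by the caller in seconds, and the rule that a new OFF-to-ON transition cancels a pending timeout is an assumption that the text does not state explicitly.

```python
class Timer123Sketch:
    """Hypothetical model of sensor 17, time measuring unit 123 and the link state."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # the "predetermined value" of the timer
        self.sensor_on = False       # last sensor reading seen
        self.connected = False       # True while ambiguous communication is up
        self.off_since = None        # time of the last ON -> OFF transition

    def update(self, sensor_on, now_s):
        """Feed one sensor reading taken at time now_s; return the link state."""
        if sensor_on and not self.sensor_on:
            # OFF -> ON: auto-dial and power on the voice processing unit.
            self.connected = True
            self.off_since = None    # assumption: cancels a pending timeout
        elif not sensor_on and self.sensor_on:
            # ON -> OFF: reset the timer to 0 and start counting.
            self.off_since = now_s
        self.sensor_on = sensor_on

        timed_out = (self.off_since is not None
                     and now_s - self.off_since >= self.timeout_s)
        if self.connected and timed_out:
            # Communication-end signal: power off and drop the connection.
            self.connected = False
            self.off_since = None
        return self.connected
```

Polling the sensor at regular intervals and passing each reading to `update` reproduces the behavior described: the link comes up when the sensor turns ON and goes down once the sensor has stayed OFF for the timeout.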

第５実施形態によれば、例えば自宅Ｘの室内の照明が消されてから所定の時間が経過するなど、従業員Ａが自宅Ｘにいないか就寝中であると推察される状況において、自動的に曖昧通信が切断される。その結果、公衆電話回線網１４を用いた通信料金が音声通信接続の接続時間に応じて増加するような場合、曖昧通信を行う必要がない時間に関する通信料金を節減することができる。 According to the fifth embodiment, the ambiguous communication is disconnected automatically in a situation where employee A is presumed to be away from home X or asleep, for example when a predetermined time has elapsed since the room lights at home X were turned off. As a result, when the charge for communication over the public telephone line network 14 increases with the connection time of the voice communication connection, the charges for periods in which ambiguous communication is unnecessary can be saved.

［６．第６実施形態］ [6. Sixth Embodiment]
第６実施形態は、上述した第１実施形態と多くの点で共通しているため、以下、第６実施形態が第１実施形態と異なる点のみを説明する。図６は、第６実施形態における通信システム６の構成を示した図である。通信システム６においては、自宅Ｘと会社Ｙの両方に、曖昧通信の送信側の構成要素であるマイク１１および端末装置１２と、曖昧通信の受信側の構成要素であるアンプ１５およびスピーカ１６が配置されている。以下、これらの構成要素の名称の後ろに「ｘ」または「ｙ」を付けて、自宅Ｘと会社Ｙに配置された同種の構成要素を区別する。 Since the sixth embodiment has much in common with the first embodiment described above, only the differences of the sixth embodiment from the first embodiment will be described below. FIG. 6 is a diagram showing the configuration of the communication system 6 in the sixth embodiment. In the communication system 6, the microphone 11 and the terminal device 12, which are the transmitting-side components of the ambiguous communication, and the amplifier 15 and the speaker 16, which are the receiving-side components, are arranged at both home X and company Y. Hereinafter, “x” or “y” is appended to the names of these components to distinguish components of the same type arranged at home X and company Y.

第６実施形態においては、自宅Ｘにおける生活音が端末装置１２ｘにおいて曖昧音声信号に変換され、会社Ｙにおいてスピーカ１６ｙから曖昧音声が発音される。また、会社Ｙにおける生活音が端末装置１２ｙにおいて曖昧音声信号に変換され、自宅Ｘにおいてスピーカ１６ｘから曖昧音声が発音される。その結果、自宅Ｘと会社Ｙとの間で、双方向の曖昧通信が実現される。 In the sixth embodiment, the living sound at home X is converted into an ambiguous voice signal at the terminal device 12x, and the ambiguous voice is pronounced at the company Y from the speaker 16y. Moreover, the life sound in the company Y is converted into an ambiguous voice signal in the terminal device 12y, and the ambiguous voice is pronounced from the speaker 16x in the home X. As a result, two-way ambiguous communication is realized between the home X and the company Y.

［７．第７実施形態］ [7. Seventh Embodiment]
第７実施形態は、上述した第６実施形態と多くの点で共通しているため、以下、第７実施形態が第６実施形態と異なる点のみを説明する。図７は、第７実施形態における通信システム７の構成を示した図である。通信システム７においては、自宅Ｘと会社Ｙの間に加え、自宅Ｘと自宅Ｚの間および会社Ｙと自宅Ｚの間において曖昧通信が行われる。例えば、自宅Ｚは従業員Ａと同様に在宅勤務をしている従業員Ｃの自宅である。 Since the seventh embodiment has much in common with the sixth embodiment described above, only the differences of the seventh embodiment from the sixth embodiment will be described below. FIG. 7 is a diagram showing the configuration of the communication system 7 in the seventh embodiment. In the communication system 7, ambiguous communication is performed not only between home X and company Y but also between home X and home Z and between company Y and home Z. For example, home Z is the home of employee C, who works from home like employee A.

自宅Ｘ、会社Ｙおよび自宅Ｚのそれぞれには、電話機１３が２台ずつ配置されている。また、自宅Ｘ、会社Ｙおよび自宅Ｚのそれぞれには、ミキサ１８が配置されている。以下、これらの構成要素の名称の後ろに「ｘ」、「ｙ」もしくは「ｚ」を付けて、自宅Ｘ、会社Ｙおよび自宅Ｚに配置された同種の構成要素を区別する。また、自宅Ｘ、会社Ｙおよび自宅Ｚのそれぞれに配置された２台の電話機１３を区別する目的で、「ｘ」、「ｙ」もしくは「ｚ」の後ろにさらに「１」もしくは「２」を付ける。 Two telephone sets 13 are arranged at each of home X, company Y, and home Z. A mixer 18 is also arranged at each of home X, company Y, and home Z. Hereinafter, “x”, “y”, or “z” is appended to the names of these components to distinguish components of the same type arranged at home X, company Y, and home Z. Further, to distinguish the two telephone sets 13 arranged at each location, “1” or “2” is appended after the “x”, “y”, or “z”.

電話機１３のそれぞれの音声入力部は端末装置１２に接続されている。また、電話機１３のそれぞれの音声出力部はミキサ１８に接続されている。ミキサ１８は、複数の音声入力部と１つの音声出力部を備え、音声入力部を介して入力される複数の音声信号を加算し、加算により得られる音声信号を音声出力部から出力する装置である。ミキサ１８の音声出力部には、アンプ１５が接続されている。 The voice input unit of each telephone set 13 is connected to the terminal device 12, and the voice output unit of each telephone set 13 is connected to the mixer 18. The mixer 18 is a device that has a plurality of voice input units and one voice output unit, adds together the voice signals input through the voice input units, and outputs the resulting voice signal from the voice output unit. The amplifier 15 is connected to the voice output unit of the mixer 18.

電話機１３のそれぞれは、任意に他の場所に配置された電話機１３との間で音声通信接続を確立することができる。以下、例として、電話機１３ｘ１と電話機１３ｚ２との間、電話機１３ｘ２と電話機１３ｙ１との間、電話機１３ｙ２と電話機１３ｚ１との間にそれぞれ音声通信接続が確立されるものとする。その場合、例えば自宅Ｘにおいては、電話機１３ｘ１の音声出力部からは自宅Ｚから受け取られる曖昧音声信号が出力され、電話機１３ｘ２の音声出力部からは会社Ｙから受け取られる曖昧音声信号が出力される。 Each of the telephones 13 can establish a voice communication connection with a telephone 13 that is arbitrarily located elsewhere. Hereinafter, as an example, it is assumed that voice communication connections are established between the telephone set 13x1 and the telephone set 13z2, between the telephone set 13x2 and the telephone set 13y1, and between the telephone set 13y2 and the telephone set 13z1, respectively. In this case, for example, at home X, an ambiguous audio signal received from home Z is output from the audio output unit of telephone 13x1, and an ambiguous audio signal received from company Y is output from the audio output unit of telephone 13x2.

このように２台の電話機１３ｘから出力される曖昧音声信号は、ミキサ１８ｘにより加算され、アンプ１５ｘに出力される。その結果、スピーカ１６ｘからは、自宅Ｚと会社Ｙの両方の様子を伝達する曖昧音声が発音される。同様に、会社Ｙにおいては自宅Ｘと自宅Ｚの両方の様子を伝達する曖昧音声が発音され、自宅Ｚにおいては自宅Ｘと会社Ｙの両方の様子を伝達する曖昧音声が発音される。 In this way, the ambiguous audio signals output from the two telephone sets 13x are added by the mixer 18x and output to the amplifier 15x. As a result, the speaker 16x produces ambiguous voices that convey the appearance of both the home Z and the company Y. Similarly, in company Y, ambiguous voices that convey the appearance of both home X and home Z are pronounced, and in home Z, ambiguous voices that convey the appearance of both home X and company Y are pronounced.
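The addition performed by the mixer 18 can be illustrated with a short Python sketch. The function name is hypothetical, samples are assumed to be floats normalized to the range -1.0 to 1.0, and the clipping step is an assumption added so that the sum stays within that range; the text only states that the signals are added.

```python
def mix_18(*signals, limit=1.0):
    """Add the given signals sample by sample (at least one signal is required).

    Signals are lists of floats; the result is truncated to the shortest input
    and clipped to [-limit, limit] (clipping is an assumption of this sketch).
    """
    length = min(len(s) for s in signals)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in signals)
        mixed.append(max(-limit, min(limit, total)))
    return mixed
```

For home X, the two inputs would be the ambiguous voice signals received from home Z and from company Y, and the result would be passed to the amplifier 15x.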

なお、第７実施形態において曖昧通信が行われる場所は３箇所に限られず、４箇所以上であってもよい。その場合、どの場所の組合せにおいて曖昧通信を行うかは任意に選択可能である。また、曖昧通信のそれぞれは双方向であっても単方向であってもよい。第７実施形態によれば、ユーザは、同時に複数の遠隔地の様子を曖昧音声によりモニタすることができる。 In the seventh embodiment, the place where ambiguous communication is performed is not limited to three places, and may be four or more places. In that case, it is possible to arbitrarily select at which combination of places the ambiguous communication is performed. Each of the ambiguous communications may be bidirectional or unidirectional. According to the seventh embodiment, the user can simultaneously monitor the state of a plurality of remote locations with ambiguous voice.

［８．第８実施形態］ [8. Eighth Embodiment]
［８．１．通信システムの構成］ [8.1. Configuration of communication system]
上述した第１実施形態〜第７実施形態においては、曖昧通信における送信側から受信側に伝達される音声情報は、主としてアナログ音声信号の形式を取っている。それに対し、以下に説明する第８実施形態においては、曖昧通信の送信側から受信側に伝達される音声情報は、主としてデジタル音声データの形式を取る。 In the first to seventh embodiments described above, the voice information transmitted from the transmitting side to the receiving side in the ambiguous communication mainly takes the form of an analog voice signal. In contrast, in the eighth embodiment described below, the voice information transmitted from the transmitting side to the receiving side mainly takes the form of digital voice data.

図８は、本発明の第２実施形態における通信システム８の構成を示した図である。通信システム８においては、まず、従業員Ａの自宅である自宅Ｘと会社Ｙのそれぞれに、端末装置２１、ＤＳＬ（ＤｉｇｉｔａｌＳｕｂｓｃｒｉｂｅｒＬｉｎｅ）モデム２２およびスプリッタ２３の組が１組ずつ配置され、記載の順序で接続されている。以下、自宅Ｘと会社Ｙに配置されている同種の構成要素を区別する必要がある場合には、構成要素名の後にそれぞれ「ｘ」もしくは「ｙ」を付す。また、スプリッタ２３には、ＤＳＬモデム２２に加え、電話機１３が接続されている。 FIG. 8 is a diagram showing the configuration of the communication system 8 in the second embodiment of the present invention. In the communication system 8, a set consisting of a terminal device 21, a DSL (Digital Subscriber Line) modem 22, and a splitter 23, connected in that order, is arranged at each of home X, which is employee A's home, and company Y. Hereinafter, when components of the same type arranged at home X and company Y need to be distinguished, “x” or “y” is appended to the component name. In addition to the DSL modem 22, a telephone set 13 is connected to the splitter 23.

端末装置２１はマイク、Ａ／Ｄコンバータ、Ｄ／Ａコンバータ、アンプ、スピーカ等を有し、マイクを介して得られる音声信号をＡ／Ｄコンバータを介して音声データに変換した後、その音声データに対して加工処理を行い、曖昧音声を示す音声データ（以下、「曖昧音声データ」と呼ぶ）を生成する装置である。また、端末装置２１は、データ通信ネットワークを介して他の通信機器との間で、パケットデータの送受信を行うことが可能である。 The terminal device 21 has a microphone, an A/D converter, a D/A converter, an amplifier, a speaker, and the like. It converts the voice signal obtained through the microphone into voice data with the A/D converter and then processes that voice data to generate voice data representing ambiguous voice (hereinafter referred to as “ambiguous voice data”). The terminal device 21 can also transmit and receive packet data to and from other communication devices via a data communication network.

端末装置２１は、汎用コンピュータに特定のプログラムに従った処理を行わせることによっても実現可能である。以下の説明においては、端末装置２１は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）、ＤＳＰ（ＤｉｇｉｔａｌＳｉｇｎａｌＰｒｏｃｅｓｓｏｒ）、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＨＤ（ＨａｒｄＤｉｓｋ）、表示部、操作部、ＮＷ（Ｎｅｔｗｏｒｋ）入出力部、マイク、Ａ／Ｄコンバータ、Ｄ／Ａコンバータ、アンプ、およびスピーカを有する汎用コンピュータに、通信システム８の端末装置用のプログラムを実行させることにより実現するものとする。 The terminal device 21 can also be realized by causing a general-purpose computer to perform processing according to a specific program. In the following description, the terminal device 21 includes a CPU (Central Processing Unit), a DSP (Digital Signal Processor), a ROM (Read Only Memory), a RAM (Random Access Memory), an HD (Hard Disk), a display unit, and an operation unit. It is realized by causing a general-purpose computer having an NW (Network) input / output unit, a microphone, an A / D converter, a D / A converter, an amplifier, and a speaker to execute a program for a terminal device of the communication system 8. .

なお、端末装置２１を実現するための汎用コンピュータは、マイク、アンプ、スピーカ等を備えない代わりに、音声信号入出力インタフェースを介してマイク、アンプ等と接続されていてもよい。また、端末装置２１を実現するための汎用コンピュータは、Ａ／Ｄコンバータ、Ｄ／Ａコンバータ等を備えない代わりに、音声データ入出力インタフェースを介して、Ａ／Ｄコンバータを内蔵したデジタルマイクや、Ｄ／Ａコンバータを内蔵したデジタルアンプ等と接続されていてもよい。 Note that the general-purpose computer for realizing the terminal device 21 may be connected to a microphone, an amplifier, and the like via an audio signal input / output interface instead of including a microphone, an amplifier, a speaker, and the like. In addition, the general-purpose computer for realizing the terminal device 21 does not include an A / D converter, a D / A converter, or the like, but a digital microphone incorporating an A / D converter via an audio data input / output interface, You may connect with the digital amplifier etc. which incorporated the D / A converter.

ＤＳＬモデム２２は、スプリッタ２３からアナログ信号を受け取り、受け取ったアナログ信号をデジタルデータに変換する装置である。スプリッタ２３は、公衆電話回線網１４を介してインターネット２７に接続されている。スプリッタ２３は、公衆電話回線網１４からアナログ信号を受け取り、受け取ったアナログ信号を低周波数帯の信号と高周波数帯の信号とに分離する。ここで、低周波数帯に含まれるアナログ信号は、音声を示している。また、高周波数帯に含まれるアナログ信号は、データを示している。スプリッタ２３は音声を示すアナログ信号を電話機１３に、データを示すアナログ信号をＤＳＬモデム２２に、それぞれ出力する。また、スプリッタ２３は、電話機１３から音声を示す低周波数帯のアナログ信号を受け取るとともに、ＤＳＬモデム２２からデータを示す高周波数帯のアナログ信号を受け取り、両者を加算して、公衆電話回線網１４に出力する。 The DSL modem 22 is a device that receives an analog signal from the splitter 23 and converts the received analog signal into digital data. The splitter 23 is connected to the Internet 27 via the public telephone line network 14. The splitter 23 receives an analog signal from the public telephone line network 14 and separates it into a low-frequency-band signal and a high-frequency-band signal. Here, the analog signal in the low frequency band represents voice, and the analog signal in the high frequency band represents data. The splitter 23 outputs the analog signal representing voice to the telephone set 13 and the analog signal representing data to the DSL modem 22. The splitter 23 also receives a low-frequency-band analog signal representing voice from the telephone set 13 and a high-frequency-band analog signal representing data from the DSL modem 22, adds the two, and outputs the sum to the public telephone line network 14.

インターネット２７は、インターネットプロトコルにより相互に接続された通信網群である。インターネット２７には、スプリッタ２３ｘと通信接続が可能な一般ゲートウェイサーバ２５ｘおよびＶｏＩＰ（ＶｏｉｃｅｏｖｅｒＩｎｔｅｒｎｅｔＰｒｏｔｏｃｏｌ）ゲートウェイサーバ２６ｘと、スプリッタ２３ｙと通信接続が可能な一般ゲートウェイサーバ２５ｙおよびＶｏＩＰゲートウェイサーバ２６ｙが含まれている。 The Internet 27 is a group of communication networks interconnected by the Internet Protocol. The Internet 27 includes a general gateway server 25x and a VoIP (Voice over Internet Protocol) gateway server 26x, which can establish a communication connection with the splitter 23x, and a general gateway server 25y and a VoIP gateway server 26y, which can establish a communication connection with the splitter 23y.

一般ゲートウェイサーバ２５およびＶｏＩＰゲートウェイサーバ２６は、端末装置２１がインターネット２７を介して他の通信機器とデータの送受信を行う際の通信プロトコルの変換およびデータの中継を行う装置である。一般ゲートウェイサーバ２５は、本実施形態においては、端末装置２１ｘと端末装置２１ｙとの間で音量の変更指示等の各種制御データが送受信される際に、それらの制御データを中継するゲートウェイサーバである。ＶｏＩＰゲートウェイサーバ２６は、端末装置２１ｘと端末装置２１ｙとの間で音声データを含むパケットデータが送受信される際に、それらの音声データを中継するゲートウェイサーバである。以下の説明において、ＶｏＩＰゲートウェイサーバ２６が中継するパケットデータに含まれる音声データは、例として、サンプリング周波数８ｋＨｚ、量子化ビット数８の非圧縮ＰＣＭ（ＰｕｌｓｅＣｏｄｅＭｏｄｕｌａｔｉｏｎ）データであるものとする。 The general gateway server 25 and the VoIP gateway server 26 are devices that convert communication protocols and relay data when the terminal device 21 exchanges data with other communication devices via the Internet 27. In this embodiment, the general gateway server 25 is a gateway server that relays various control data, such as volume change instructions, exchanged between the terminal device 21x and the terminal device 21y. The VoIP gateway server 26 is a gateway server that relays the voice data when packet data containing voice data is exchanged between the terminal device 21x and the terminal device 21y. In the following description, the voice data contained in the packet data relayed by the VoIP gateway server 26 is assumed, as an example, to be uncompressed PCM (Pulse Code Modulation) data with a sampling frequency of 8 kHz and 8 quantization bits.
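The data rate implied by these example PCM parameters can be worked out as follows. The sampling frequency and bit depth come from the text; the function names and the 20 ms frame duration used in the usage note are hypothetical and chosen only for illustration.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency from the text
BITS_PER_SAMPLE = 8     # quantization bits from the text


def pcm_bitrate_bps(rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Raw bit rate of mono uncompressed PCM, excluding packet headers."""
    return rate_hz * bits


def payload_bytes(frame_ms, rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Voice payload size in bytes for one packet covering frame_ms milliseconds."""
    return rate_hz * bits * frame_ms // (8 * 1000)
```

Under these parameters the voice data alone amounts to 64,000 bits per second, and a packet carrying an assumed 20 ms of sound would hold 160 bytes of voice data.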

なお、以下の説明においては、通信システム８は、上記のように、インターネットを介しＤＳＬ技術を用いてデジタルデータ通信をパケットデータとして送受信することにより、第２実施形態を実現するものとするが、通信システム８は他の形態のデジタルデータ通信を行う構成であってもよい。例えば、通信システム８は、端末装置２１ｘと端末装置２１ｙが、専用線により互いに通信接続されている構成であってもよい。さらに、以下の説明においては、通信システム８の構成要素間は全て有線接続されているものとするが、通信システム８の構成要素間の一部もしくは全てが無線接続されていてもよい。 In the following description, the communication system 8 realizes the second embodiment by transmitting and receiving digital data communication as packet data using the DSL technology via the Internet as described above. The communication system 8 may be configured to perform other forms of digital data communication. For example, the communication system 8 may have a configuration in which the terminal device 21x and the terminal device 21y are connected to each other via a dedicated line. Furthermore, in the following description, the components of the communication system 8 are all connected by wire, but some or all of the components of the communication system 8 may be wirelessly connected.

［８．２．通信システムの動作］ [8.2. Operation of communication system]
従業員Ａおよび上司Ｂは、通信システム８を利用するにあたり、端末装置２１ｘもしくは端末装置２１ｙを操作して、ＶｏＩＰゲートウェイサーバ２６ｘとＶｏＩＰゲートウェイサーバ２６ｙを介した音声通信接続を確立する。端末装置２１ｘと端末装置２１ｙの間にパケットデータの送受信による音声通信接続が確立される動作は、通常のＶｏＩＰ技術によるものであるので、その説明を省略する。 To use the communication system 8, employee A or supervisor B operates the terminal device 21x or the terminal device 21y to establish a voice communication connection via the VoIP gateway server 26x and the VoIP gateway server 26y. The operation by which a voice communication connection based on the exchange of packet data is established between the terminal device 21x and the terminal device 21y uses ordinary VoIP technology, and its description is therefore omitted.

続いて、従業員Ａは端末装置２１ｘの操作部を操作して、端末装置２１ｘからパケットデータ化されて送信される音声データ（以下、「送信音声データ」と呼ぶ）の加工処理の開始を端末装置２１ｘに指示する。端末装置２１ｘは、従業員Ａによる加工処理の開始の指示に従い、マイクおよびＡ／Ｄコンバータを介して得られる音声データに対し、例えば図１に示した音声加工部１２１が備えるフィルタ等の処理と同様の処理を施すことにより、曖昧音声データを生成する。ただし、端末装置２１ｘは、アナログ回路によるフィルタ等を用いる代わりに、ＤＳＰに所定のデータ処理を行わせることにより、ＩＩＲ（ＩｎｆｉｎｉｔｅＩｍｐｕｌｓｅＲｅｓｐｏｎｓｅ）型フィルタやＦＩＲ（ＦｉｎｉｔｅＩｍｐｕｌｓｅＲｅｓｐｏｎｓｅ）型フィルタを実現したり、ＰＣＭデータの値を増減することにより増幅部を実現したりする。 Subsequently, the employee A operates the operation unit of the terminal device 21x to instruct the terminal device 21x to start processing the voice data that is packetized and transmitted from the terminal device 21x (hereinafter referred to as “transmission voice data”). In accordance with this instruction, the terminal device 21x generates ambiguous voice data by applying, to the voice data obtained via the microphone and the A/D converter, processing similar to that of the filters and other elements of the voice processing unit 121 shown in FIG. 1. However, instead of using analog filter circuits and the like, the terminal device 21x has the DSP perform predetermined data processing to realize an IIR (Infinite Impulse Response) filter or an FIR (Finite Impulse Response) filter, and realizes the amplification unit by increasing or decreasing the values of the PCM data.
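The digital filtering and amplification described above can be illustrated with a minimal sketch. It is not the patent's implementation: the one-pole low-pass filter, the function names `lowpass_iir` and `apply_gain`, and the signed 8-bit sample range are assumptions chosen only to show how a DSP-style routine could stand in for an analog filter and amplifier.

```python
def lowpass_iir(samples, alpha):
    """One-pole IIR low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha is a hypothetical smoothing parameter in (0, 1]; smaller values
    remove more high-frequency content and make speech harder to understand.
    """
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def apply_gain(samples, gain, lo=-128, hi=127):
    """Amplify by scaling PCM values, clamping to an assumed signed 8-bit range."""
    return [max(lo, min(hi, int(round(s * gain)))) for s in samples]
```

For example, `apply_gain(lowpass_iir(frame, 0.1), 2.0)` would first blur a frame of samples and then raise its level.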

上記のような端末装置２１ｘの処理の結果、端末装置２１ｘから端末装置２１ｙに対し、曖昧音声データが送信され、端末装置２１ｙのスピーカから曖昧音声が発音される。その結果、第１実施形態と同様に、自宅Ｘを送信側、会社Ｙを受信側とする曖昧通信が実現される。第８実施形態によっても、一般の家庭や会社等が通常有している汎用コンピュータ等を用いて、遠隔地における様子を、遠隔地にいる者のプライバシーを侵害することなくモニタすることが可能となる。 As a result of this processing by the terminal device 21x, ambiguous voice data is transmitted from the terminal device 21x to the terminal device 21y, and the ambiguous voice is output from the speaker of the terminal device 21y. As in the first embodiment, ambiguous communication is thus realized with home X as the transmitting side and company Y as the receiving side. The eighth embodiment also makes it possible to monitor the situation at a remote place, without infringing on the privacy of the people there, by using a general-purpose computer or the like that ordinary households and companies usually have.

ところで、第８実施形態においては、上司Ｂが会社Ｙにおいて端末装置２１ｙを操作することにより、自宅Ｘに配置された端末装置２１ｘに対し、音声データの加工の開始および終了、もしくは音声データの加工に用いられる各種パラメータの変更を指示することができる。上司Ｂは、曖昧通信が行われている状態で、端末装置２１ｙの操作部を用いて、例えば音声データの加工の終了の指示を行う。端末装置２１ｙは、上司Ｂによる操作に応じて、音声データの加工の終了を指示するデータ（以下、「終了データ」と呼ぶ）を生成し、終了データを端末装置２１ｘに送信する。終了データは一般ゲートウェイサーバ２５を介して端末装置２１ｘに送信される。端末装置２１ｘは、終了データを受信すると、それまで行っていたフィルタ処理等による音声データの加工を中止し、その後は未加工の音声データを端末装置２１ｙに送信する。その結果、自宅Ｘを送信側、会社Ｙを受信側とする曖昧通信は中止され、自宅Ｘと会社Ｙとの間で通常の音声による通信が開始される。同様に、上司Ｂは端末装置２１ｘに対し、音声データの加工の開始や、音声データの加工に用いられるパラメータ、例えばローパスフィルタのカットオフ周波数の変更等の指示を行うことができる。 In the eighth embodiment, the supervisor B can operate the terminal device 21y at company Y to instruct the terminal device 21x located at home X to start or end the processing of voice data, or to change the various parameters used for that processing. For example, while ambiguous communication is in progress, the supervisor B uses the operation unit of the terminal device 21y to give an instruction to end the processing of the voice data. In response to this operation, the terminal device 21y generates data instructing the end of voice data processing (hereinafter referred to as “end data”) and transmits the end data to the terminal device 21x. The end data is transmitted to the terminal device 21x via the general gateway server 25. On receiving the end data, the terminal device 21x stops the filtering and other processing it had been applying to the voice data, and thereafter transmits unprocessed voice data to the terminal device 21y. As a result, the ambiguous communication with home X as the transmitting side and company Y as the receiving side ends, and ordinary voice communication between home X and company Y begins. Similarly, the supervisor B can instruct the terminal device 21x to start processing voice data or to change a parameter used for the processing, for example the cutoff frequency of the low-pass filter.

［８．３．変形例］ [8.3. Modified examples]
上記の第８実施形態は、本発明の技術的思想の範囲内において、以下に例示するように、様々に変形することができる。まず、第８実施形態の通信システム８に必要な変形を加えることにより、上述した第２実施形態〜第７実施形態のそれぞれにおける通信システムと同様の機能を有する通信システムを実現可能である。 The eighth embodiment described above can be modified in various ways within the scope of the technical idea of the present invention, as exemplified below. First, by making the necessary modifications to the communication system 8 of the eighth embodiment, it is possible to realize communication systems having the same functions as the communication systems of the second to seventh embodiments described above.

例えば、端末装置２１ｘにはフィルタ等による音声データの加工を行わせず、端末装置２１ｙに、受信する音声データ（以下、「受信音声データ」と呼ぶ）に対する加工を行わせることにより、第２実施形態の通信システム２と同様の機能を有する通信システムを実現することができる。また、端末装置２１ｘと端末装置２１ｙの両方に音声データの加工を行わせることにより、第３実施形態の通信システム３と同様の機能を有する通信システムを実現することができる。 For example, the terminal device 21x does not process the voice data by a filter or the like, but causes the terminal device 21y to process the received voice data (hereinafter referred to as “received voice data”). A communication system having the same function as that of the communication system 2 can be realized. Further, by causing both the terminal device 21x and the terminal device 21y to process audio data, a communication system having the same function as the communication system 3 of the third embodiment can be realized.

また、端末装置２１ｘにセンサ１７と同様のセンサを接続し、センサから入力される信号に応じて端末装置２１ｘに音声データの加工を行うか行わないかの切り替えや、会社Ｙの電話番号へのダイヤル処理を行わせることにより、第４実施形態もしくは第５実施形態の通信システム４もしくは通信システム５と同様の通信システムを実現することができる。また、端末装置２１ｘには自宅Ｘにおける生活音を示す音声データの加工を行わせ、端末装置２１ｙには会社Ｙにおける生活音を示す音声データの加工を行わせ、加工後の曖昧音声データをそれぞれパケットデータ化して送信させることにより、第６実施形態の通信システム６と同様の通信システムを実現することができる。 In addition, a communication system similar to the communication system 4 of the fourth embodiment or the communication system 5 of the fifth embodiment can be realized by connecting a sensor similar to the sensor 17 to the terminal device 21x and having the terminal device 21x, in response to the signal input from the sensor, switch between processing and not processing the voice data or dial the telephone number of company Y. Furthermore, a communication system similar to the communication system 6 of the sixth embodiment can be realized by having the terminal device 21x process the voice data representing the living sounds at home X, having the terminal device 21y process the voice data representing the living sounds at company Y, and having each device packetize and transmit the resulting ambiguous voice data.

また、端末装置２１ｘおよび端末装置２１ｙに、他の端末装置との間においてもＶｏＩＰゲートウェイサーバ２６を介した音声通信接続を確立させることにより、第７実施形態の通信システム７と同様の通信システムを実現することができる。 In addition, a communication system similar to the communication system 7 of the seventh embodiment can be realized by having the terminal device 21x and the terminal device 21y also establish voice communication connections with other terminal devices via the VoIP gateway server 26.

［９．第９実施形態］ [9. Ninth Embodiment]
第９実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第９実施形態が第８実施形態と異なる点のみを説明する。第９実施形態においては、音声データを加工する端末装置２１は、未加工の音声データのサンプリング周波数を下げることにより、曖昧音声データの生成を行う。例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘは音声データに含まれる各サンプリングに対応するデータ（以下、「サンプルデータ」と呼ぶ）を数個毎に取り出し、取り出したデータを音声データとして、フィルタ処理等により加工することなく、端末装置２１ｙに送信する。 Since the ninth embodiment has much in common with the eighth embodiment described above, only the points in which the ninth embodiment differs from the eighth embodiment are described below. In the ninth embodiment, the terminal device 21 that processes voice data generates ambiguous voice data by lowering the sampling frequency of the unprocessed voice data. For example, assuming that the terminal device 21x on the transmitting side is the device that processes the voice data, the terminal device 21x takes out one in every several of the data items corresponding to the individual samples in the voice data (hereinafter referred to as “sample data”) and transmits the extracted data to the terminal device 21y as voice data, without applying filter processing or other processing.

例えば、未加工の音声データのサンプリング周波数が８ｋＨｚであり、端末装置２１ｘが１０個毎にサンプルデータを取り出すとすると、取り出されたサンプルデータの列である音声データは、サンプリング周波数８００Ｈｚの音声データとなる。この８ｋＨｚから８００Ｈｚへのダウンサンプリングにより、元の音声データに含まれていたスペクトラムのうち、サンプリング周波数８００Ｈｚの半分の周波数、すなわち４００Ｈｚより高い周波数帯のスペクトラムに起因した折り返しノイズが発生し、元の音声からかけ離れた曖昧音声が得られ、端末装置２１ｙのスピーカから発音される。第９実施形態によれば、端末装置２１ｘはフィルタ処理等を行うよりも容易に、曖昧音声データを生成することができる。さらに、第９実施形態によれば、曖昧通信において送受信される音声データの量が削減される。 For example, if the sampling frequency of the unprocessed voice data is 8 kHz and the terminal device 21x takes out every tenth sample, the resulting sequence of sample data is voice data with a sampling frequency of 800 Hz. This downsampling from 8 kHz to 800 Hz produces aliasing noise caused by those parts of the original spectrum that lie above 400 Hz, which is half of the 800 Hz sampling frequency. The result is an ambiguous voice far removed from the original sound, which is output from the speaker of the terminal device 21y. According to the ninth embodiment, the terminal device 21x can generate ambiguous voice data more easily than by performing filter processing or the like. Furthermore, the ninth embodiment reduces the amount of voice data transmitted and received in ambiguous communication.
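The downsampling described above amounts to keeping every tenth sample with no anti-aliasing filter. The sketch below illustrates that idea under stated assumptions: the function names `decimate` and `alias_frequency` are hypothetical, and `alias_frequency` is included only to show where a pure tone would land after such unfiltered decimation.

```python
def decimate(samples, factor):
    """Keep every `factor`-th sample, deliberately without low-pass filtering."""
    return samples[::factor]


def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears when sampled at fs_hz.

    Components above fs_hz / 2 fold back into the 0..fs_hz/2 band,
    which is the aliasing noise the text relies on.
    """
    f = f_hz % fs_hz
    return min(f, fs_hz - f)
```

With an 8 kHz source and `factor=10`, the new rate is 800 Hz, so a 1000 Hz component would be heard at `alias_frequency(1000, 800)`, that is, 200 Hz.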

［１０．第１０実施形態］ [10. Tenth Embodiment]
第１０実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第１０実施形態が第８実施形態と異なる点のみを説明する。第１０実施形態においては、音声データを加工する端末装置２１は、未加工の音声データの量子化ビット数を下げることにより、曖昧音声データの生成を行う。例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘは音声データに含まれる各サンプルデータのＭＳＢ（ＭｏｓｔＳｉｇｎｉｆｉｃａｎｔＢｉｔ）側から４ビットを取り出し、取り出したデータを音声データとして、フィルタ等により加工することなく、端末装置２１ｙに送信する。 Since the tenth embodiment has much in common with the eighth embodiment described above, only the points in which the tenth embodiment differs from the eighth embodiment are described below. In the tenth embodiment, the terminal device 21 that processes voice data generates ambiguous voice data by reducing the number of quantization bits of the unprocessed voice data. For example, assuming that the terminal device 21x on the transmitting side is the device that processes the voice data, the terminal device 21x takes the 4 bits on the MSB (Most Significant Bit) side of each sample in the voice data and transmits the extracted data to the terminal device 21y as voice data, without processing it with a filter or the like.

上記のように量子化ビット数が４となった音声データは、ダイナミックレンジが極めて狭いため、音声波形を大まかにしか再現できない。従って、端末装置２１ｙのスピーカから発音される音声は曖昧音声となる。第１０実施形態によれば、第９実施形態と同様に、端末装置２１ｘはフィルタ処理等を行うよりも容易に、曖昧音声データを生成することができる。さらに、第１０実施形態によれば、曖昧通信において送受信される音声データの量が削減される。 As described above, since the audio data having the quantization bit number of 4 has a very narrow dynamic range, the audio waveform can only be roughly reproduced. Therefore, the sound produced from the speaker of the terminal device 21y is an ambiguous sound. According to the tenth embodiment, similarly to the ninth embodiment, the terminal device 21x can generate ambiguous voice data more easily than performing a filtering process or the like. Furthermore, according to the tenth embodiment, the amount of audio data transmitted and received in ambiguous communication is reduced.
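Reducing the number of quantization bits can be sketched as a simple bit shift. The example below assumes unsigned 8-bit samples in the range 0 to 255, which the source does not specify; the names `keep_top_bits` and `expand_for_playback` are hypothetical.

```python
def keep_top_bits(sample, bits_in=8, bits_out=4):
    """Return the `bits_out` most significant bits of an unsigned sample."""
    return sample >> (bits_in - bits_out)


def expand_for_playback(code, bits_in=8, bits_out=4):
    """Shift a reduced code back to the original scale; the low bits stay zero."""
    return code << (bits_in - bits_out)
```

Only 16 distinct levels survive, so the waveform is reproduced coarsely, and each sample needs half as many bits on the wire.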

［１１．第１１実施形態］ [11. Eleventh Embodiment]
第１１実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第１１実施形態が第８実施形態と異なる点のみを説明する。第１１実施形態においては、音声データを加工する端末装置２１は、音声データに対し人間の声のみを取り除く加工を行うことにより、曖昧音声データを生成する。 Since the eleventh embodiment has much in common with the eighth embodiment described above, only the points in which the eleventh embodiment differs from the eighth embodiment are described below. In the eleventh embodiment, the terminal device 21 that processes voice data generates ambiguous voice data by removing only the human voice from the voice data.

例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘのＨＤには、予め複数の人間により発音された各音素のスペクトル成分の各々の平均値を示すデータ（以下、「基準スペクトルデータ」と呼ぶ）が記憶されている。端末装置２１ｘは、マイクおよびＡ／Ｄコンバータを介して得られる未加工の音声データを、例えば１０ミリ秒の時間単位で順次選択し、選択した音声データのスペクトル成分を示すデータ（以下、「対象スペクトルデータ」と呼ぶ）を生成する。 For example, assuming that the terminal device 21x on the transmitting side is the device that processes the voice data, the HD of the terminal device 21x stores in advance data indicating the average value of each spectral component of each phoneme as uttered by a plurality of people (hereinafter referred to as “reference spectrum data”). The terminal device 21x sequentially selects the unprocessed voice data obtained via the microphone and the A/D converter in time units of, for example, 10 milliseconds, and generates data indicating the spectral components of the selected voice data (hereinafter referred to as “target spectrum data”).

続いて、端末装置２１ｘは、対象スペクトルデータと各音素に対応する基準スペクトルデータとの類似度を示す指標として、例えば相関係数を算出し、算出した相関係数が所定値を超えるか否かを判定する。この所定値は、０〜１の範囲内の数値であり、人間の声を含む音声データから生成される対象スペクトルデータを用いて算出される相関係数よりも小さく、人間の声を含まない音声データから生成される対象スペクトルデータを用いて算出される相関係数よりも大きな値となるように調整されている。 Subsequently, the terminal device 21x calculates, for example, a correlation coefficient as an index indicating the degree of similarity between the target spectrum data and the reference spectrum data corresponding to each phoneme, and whether or not the calculated correlation coefficient exceeds a predetermined value. Determine. The predetermined value is a numerical value in the range of 0 to 1, and is smaller than a correlation coefficient calculated using target spectrum data generated from audio data including a human voice, and does not include a human voice. The value is adjusted to be larger than the correlation coefficient calculated using the target spectrum data generated from the data.

上記の判定において、対象スペクトルデータといずれかの音素に対応する基準スペクトルデータとの相関係数が所定値を超えた場合、端末装置２１ｘは対象スペクトルデータから基準スペクトルデータを減算し、その結果得られるスペクトル成分を重畳することにより音声データを生成し、生成した音声データを端末装置２１ｙに送信する。一方、対象スペクトルデータと基準スペクトルデータとの相関係数が、いずれの音素に対応する基準スペクトルデータに関しても所定値を超えなかった場合、端末装置２１ｘは先に選択した音声データをそのまま、端末装置２１ｙに送信する。 In the above determination, when the correlation coefficient between the target spectrum data and the reference spectrum data corresponding to any phoneme exceeds a predetermined value, the terminal device 21x subtracts the reference spectrum data from the target spectrum data and obtains the result. Audio data is generated by superimposing the spectral components to be generated, and the generated audio data is transmitted to the terminal device 21y. On the other hand, if the correlation coefficient between the target spectrum data and the reference spectrum data does not exceed a predetermined value with respect to the reference spectrum data corresponding to any phoneme, the terminal device 21x keeps the previously selected voice data as it is. To 21y.
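The threshold test and subtraction described above can be sketched as follows. This is an illustration under assumptions, not the patented procedure: spectra are taken to be already-computed lists of magnitudes of equal length, the Pearson correlation is used as the "correlation coefficient", negative results of the subtraction are clamped to zero, and resynthesis of the waveform from the remaining components is omitted. The names `correlation` and `remove_voice` are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length magnitude spectra."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape to compare
    return cov / math.sqrt(var_a * var_b)


def remove_voice(target, references, threshold):
    """Subtract the first reference spectrum whose correlation exceeds threshold.

    If no reference matches, the target spectrum is returned unchanged,
    mirroring the case where the frame is sent as is.
    """
    for ref in references:
        if correlation(target, ref) > threshold:
            # Clamping at zero is an assumption; magnitudes cannot be negative.
            return [max(0.0, t - r) for t, r in zip(target, ref)]
    return list(target)
```

In use, `references` would hold one spectrum per phoneme, and `threshold` would be tuned between the correlations typically seen for frames with and without speech.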

上記のような処理により端末装置２１ｘから端末装置２１ｙに送信される音声データは、未加工の音声のうち人間の声を含む部分から、標準的な人間により発音される声の成分を取り除いた音声を示す曖昧音声データである。なお、端末装置２１ｘが人間の声を含む音声部分の音声データを特定および除去する方法は、音素単位のスペクトル成分を対象となる音声データのスペクトル成分と比較する方法に限られず、例えば独立成分分析技術によるブラインド音源分離法など、他の技術が用いられてもよい。 The voice data transmitted from the terminal device 21x to the terminal device 21y by the above processing is a voice obtained by removing a voice component pronounced by a standard human being from a part including a human voice in the raw voice. Is ambiguous voice data. Note that the method of specifying and removing the voice data of the voice part including the human voice by the terminal device 21x is not limited to the method of comparing the spectral component of the phoneme unit with the spectral component of the target voice data, for example, independent component analysis Other techniques may be used, such as a blind source separation technique.

また、基準スペクトルデータを、複数の人間により発音された音声から生成する代わりに、特定の人間により発音された音声から生成するようにしてもよい。その場合、通信の受信側のスピーカからは、通信の送信側の生活音から特定の人間により発音される声の成分のみを取り除いた曖昧音声が発音される。 Further, the reference spectrum data may be generated from a sound pronounced by a specific person instead of being generated from a sound pronounced by a plurality of persons. In this case, the speaker on the communication receiving side produces an ambiguous sound obtained by removing only the voice component pronounced by a specific person from the life sound on the communication transmitting side.

第１１実施形態によれば、曖昧通信の送信側で発せられる全ての人間の声もしくは特定の人間の声以外の生活音が鮮明に受信側に伝達されるため、受信側にいる者はより詳細に送信側の様子を知ることができる。 According to the eleventh embodiment, the living sounds produced on the transmitting side, other than all human voices or other than the voice of a specific person, are conveyed clearly to the receiving side, so a person on the receiving side can learn about the situation on the transmitting side in more detail.

［１２．第１２実施形態］ [12. Twelfth Embodiment]
第１２実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第１２実施形態が第８実施形態と異なる点のみを説明する。第１２実施形態においては、音声データを加工する端末装置２１は、特定の音声を含む音声データに関しては、その特定の音声を曖昧音声に変換しないような処理を行う。 Since the twelfth embodiment has much in common with the eighth embodiment described above, only the points in which the twelfth embodiment differs from the eighth embodiment are described below. In the twelfth embodiment, when the voice data contains a specific sound, the terminal device 21 that processes voice data performs processing so that the specific sound is not converted into ambiguous voice.

例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘのＨＤには、予め曖昧音声に加工したくない音声を、例えば１０ミリ秒毎に分割したそれぞれのスペクトル成分を示すデータが、基準スペクトルデータとして記憶されている。曖昧音声にしたくない音声としては、「おーい」という呼びかけの声、幼児の鳴き声、ドアフォンや警告ブザーの音、ドアの開閉音など、様々なものが考えられる。 For example, assuming that the terminal device 21x on the transmitting side is the device that processes the voice data, the HD of the terminal device 21x stores in advance, as reference spectrum data, data indicating the spectral components of each segment obtained by dividing the sounds that should not be made ambiguous into segments of, for example, 10 milliseconds. Various sounds may be designated as sounds that should not be made ambiguous, such as a call of “Ooi” (“Hey”), the cry of an infant, the sound of a door phone or warning buzzer, or the sound of a door opening and closing.

端末装置２１ｘは、第１１実施形態における場合と同様に、未加工の音声データを、例えば１０ミリ秒の時間単位で順次選択し、選択した音声データのスペクトル成分を示すデータ、すなわち対象スペクトルデータを生成する。続いて、端末装置２１ｘは、基準スペクトルデータと対象スペクトルデータとの間で、第１１実施形態における場合と同様の相関係数による判定を行う。 As in the case of the eleventh embodiment, the terminal device 21x sequentially selects raw audio data in units of time of, for example, 10 milliseconds, and selects data indicating spectral components of the selected audio data, that is, target spectrum data. Generate. Subsequently, the terminal device 21x performs determination using the same correlation coefficient as in the eleventh embodiment between the reference spectrum data and the target spectrum data.

上記の判定において、対象スペクトルデータといずれかの基準スペクトルデータとの相関係数が所定値を超えた場合、端末装置２１ｘは先に選択した音声データをそのまま、端末装置２１ｙに送信する。一方、対象スペクトルデータと基準スペクトルデータとの相関係数が、いずれの基準スペクトルデータに関しても所定値を超えなかった場合、端末装置２１ｘは先に選択した音声データをフィルタ処理等により曖昧音声データに加工し、端末装置２１ｙに送信する。 In the above determination, when the correlation coefficient between the target spectrum data and any of the reference spectrum data exceeds a predetermined value, the terminal device 21x transmits the previously selected voice data to the terminal device 21y as it is. On the other hand, if the correlation coefficient between the target spectrum data and the reference spectrum data does not exceed a predetermined value for any reference spectrum data, the terminal device 21x converts the previously selected audio data into ambiguous audio data by filtering or the like. It is processed and transmitted to the terminal device 21y.

上記のような処理により端末装置２１ｘから端末装置２１ｙに送信される音声データは、特定の音声を含む部分に関してのみ未加工の音声を示す曖昧音声データである。従って、通信の受信側にいる者は、曖昧通信を行っている間に通信の送信側で特定の音声が発せられた場合、例えば幼児の鳴き声がした場合や誰もいないはずの部屋でドアの開閉音がした場合などに、その音声を明瞭に聞くことにより、送信側における異常を容易に知ることができる。 The voice data transmitted from the terminal device 21x to the terminal device 21y by the above processing is ambiguous voice data that represents unprocessed sound only in the portions containing a specific sound. Therefore, when a specific sound is produced on the transmitting side during ambiguous communication, for example when an infant cries or when a door opens or closes in a room that should be empty, a person on the receiving side hears that sound clearly and can easily notice an abnormality on the transmitting side.

なお、基準スペクトルデータと対象スペクトルデータとの間の相関係数が所定値を超えた場合、その後の一定時間もしくはユーザからの指示があるまでの時間、端末装置２１ｘはすべての音声データを未加工のままで端末装置２１ｙに送信するようにしてもよい。その場合、例えば、通信の送信側にいる者は「おーい」等の呼びかけを行うことにより、他の操作を行うことなく曖昧通信を解除し、通常の音声による通信を開始することができる。また、誰もいないはずの通信の送信側においてドアの開閉音がした場合等においても、曖昧通信がしばらくの間解除されるため、通信の受信側にいる者は、送信側の異常等をより容易に確認することができる。 When the correlation coefficient between the reference spectrum data and the target spectrum data exceeds a predetermined value, the terminal device 21x does not process all the audio data for a certain period of time or until the user gives an instruction. You may make it transmit to the terminal device 21y as it is. In this case, for example, a person on the transmission side of the communication can cancel the ambiguous communication and start normal voice communication without performing other operations by calling “Ooi” or the like. Also, even when there is a door opening / closing sound on the sending side of a communication that nobody should have, the ambiguous communication is canceled for a while, so the person on the receiving side of the communication will be more aware of abnormalities on the sending side, etc. It can be easily confirmed.
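The pass-through behaviour of the twelfth embodiment, including the optional hold period after a match, can be sketched as a small state machine. The sketch assumes the per-frame correlation test has already been reduced to a boolean; the function name `select_frames` and the parameter `hold_frames` are hypothetical, and a user-initiated release of the hold is not modelled.

```python
def select_frames(matches, hold_frames=0):
    """Decide per frame whether to send raw or obscured audio.

    matches: one boolean per frame, True when the frame correlates with a
             stored reference sound above the threshold.
    hold_frames: number of additional frames sent raw after each match;
                 0 gives the basic behaviour where only matching frames
                 bypass the processing.
    """
    decisions = []
    remaining = 0
    for matched in matches:
        if matched:
            remaining = hold_frames + 1  # the matching frame plus the hold
        if remaining > 0:
            decisions.append("raw")
            remaining -= 1
        else:
            decisions.append("obscured")
    return decisions
```

With 10 ms frames, `hold_frames=3000` would correspond to roughly 30 seconds of clear audio after a call such as "Ooi" or a door sound.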

なお、端末装置２１ｘが曖昧音声にしたくない音声を含む音声部分の音声データを特定する方法は、単位時間分の基準となる音声データのスペクトル成分を対象となる音声データのスペクトル成分と比較する方法に限られず、他の音声認識技術が用いられてもよい。また、端末装置２１ｘは、特定の音声を曖昧音声にしないために、特定の音声を含む音声部分の音声データに関しては加工処理は行うが、加工処理の方法を異ならせる構成としても良い。例えば、端末装置２１ｘは、独立成分分析技術によるブラインド音源分離法などにより、音声データから曖昧音声にしたくない音声の成分を分離し、曖昧音声にしたくない音声の成分については加工処理を行わず、他の音声の成分については加工処理を行い、それらの音声データを加算する方法などが考えられる。 Note that the method by which the terminal device 21x identifies the voice data of a portion containing a sound that should not be made ambiguous is not limited to comparing the spectral components of reference voice data for a unit time with the spectral components of the target voice data; other speech recognition techniques may be used. In addition, to keep a specific sound from becoming ambiguous, the terminal device 21x may still process the voice data of the portion containing that sound but use a different processing method for it. For example, the terminal device 21x may separate the components of the sound that should not be made ambiguous from the voice data by a blind source separation method based on independent component analysis, leave those components unprocessed, process the other sound components, and then add the resulting voice data together.

［１３．第１３実施形態］ [13. Thirteenth Embodiment]
第１３実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第１３実施形態が第８実施形態と異なる点のみを説明する。第１３実施形態においては、音声データを加工する端末装置２１は、通信の送信側の生活音を快適な音を示す音声データに変換する。 Since the thirteenth embodiment has much in common with the eighth embodiment described above, only the points in which the thirteenth embodiment differs from the eighth embodiment are described below. In the thirteenth embodiment, the terminal device 21 that processes voice data converts the living sounds on the transmitting side into voice data representing a pleasant sound.

例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘのＨＤには、予め風鈴、水のせせらぎ、鳥のさえずり等の人間にとって快く感じられる音を示す音声データ（以下、「背景音声データ」と呼ぶ）が、例えば１分間分記憶されている。背景音声データのそれぞれは、最初の部分と最後の部分とをつないで再生した際にクリック音を発生しないよう、クロスフェード等の処理によりレベル調整等がなされている。 For example, assuming that the terminal device 21x arranged on the communication transmission side is a terminal device that processes audio data, the HD of the terminal device 21x feels pleasant for humans such as wind chimes, water murmurs, and bird chirps in advance. The sound data indicating the sound to be played (hereinafter referred to as “background sound data”) is stored for one minute, for example. Each of the background audio data is subjected to level adjustment or the like by a process such as cross fading so that a click sound is not generated when the first part and the last part are connected and reproduced.

端末装置２１ｘは、まず、マイクおよびＡ／Ｄコンバータを介して得られる一定時間分、例えば１秒間分の未加工の音声データをＲＡＭに記憶し、記憶した音声データの音量を示すデータ、例えば音声データに含まれるサンプルデータの絶対値の平均値を算出する。続いて、端末装置２１ｘは背景音声データに含まれるサンプルデータを順次選択し、選択したサンプルデータに対し、先に算出した音量を示すデータ（以下、「音量データ」と呼ぶ）に応じた値調整の処理を加える。音量データが０〜１２７の範囲であるとすると、端末装置２１ｘは、例えば（サンプルデータ）×（音量データ）×５０／１２７の計算を行い、その結果を新たなサンプルデータとする。 The terminal device 21x first stores in the RAM the unprocessed voice data for a fixed period, for example 1 second, obtained via the microphone and the A/D converter, and calculates data indicating the volume of the stored voice data, for example the average of the absolute values of the sample data contained in it. Next, the terminal device 21x sequentially selects the sample data contained in the background audio data and adjusts the value of each selected sample according to the previously calculated volume (hereinafter referred to as “volume data”). If the volume data is in the range of 0 to 127, the terminal device 21x calculates, for example, (sample data) × (volume data) × 50/127 and uses the result as the new sample data.

このようにして算出されるサンプルデータの列は、音量データに応じて１秒間単位で０％〜５０％の範囲で音量調整のなされた風鈴等の音を示す音声データである。端末装置２１ｘは、上記のように生成した音声データを端末装置２１ｙに送信する。その結果、端末装置２１ｙのスピーカからは、通信の送信側における生活音の音量に応じた音量の風鈴等の音が発音される。そのように発音される風鈴等の音は、通信の送信側の生活音に含まれる情報のうち、音量に関する情報のみを通信の受信側に伝達する曖昧音声である。なお、端末装置２１ｘが背景音声データを加工する方法は、通信の送信側における生活音の音量に応じて音量を調整する方法に限られない。例えば、端末装置２１ｘは、生活音の音高に応じて背景音声データの音高を調整してもよい。 The column of sample data calculated in this way is audio data indicating a sound such as a wind chime whose volume is adjusted in a range of 0% to 50% in units of one second according to the volume data. The terminal device 21x transmits the voice data generated as described above to the terminal device 21y. As a result, from the speaker of the terminal device 21y, a sound such as a wind chime having a volume corresponding to the volume of the living sound on the communication transmission side is generated. The sound such as a wind chime so pronounced is an ambiguous voice that transmits only the information related to the volume to the reception side of the communication among the information included in the life sound on the transmission side of the communication. Note that the method of processing the background audio data by the terminal device 21x is not limited to the method of adjusting the volume according to the volume of the living sound on the communication transmission side. For example, the terminal device 21x may adjust the pitch of the background audio data according to the pitch of the living sound.
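The volume-following background sound can be sketched as below. One point is an assumption rather than something the source states: the formula (sample data) × (volume data) × 50/127, read literally, would multiply each sample by up to 50, so the sketch treats the factor as a percentage in order to match the stated 0% to 50% range. The names `volume_data` and `scale_background` are hypothetical, and signed samples are assumed.

```python
def volume_data(frame):
    """Mean absolute sample value of a frame, clamped to 0..127."""
    return min(127, sum(abs(s) for s in frame) // len(frame))


def scale_background(background, volume):
    """Scale background samples to between 0% and 50% according to volume (0..127)."""
    percent = volume * 50 / 127  # 0.0 .. 50.0, interpreted as a percentage
    return [int(round(s * percent / 100)) for s in background]
```

Each second, the device would compute `volume_data` for the microphone frame and apply `scale_background` to the next second of the stored wind-chime or birdsong recording.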

なお、端末装置２１ｘは、予めＨＤに記憶している背景音声データを利用する代わりに、例えば、インターネット２７を介してサーバから背景音声データを取得して利用するようにしてもよい。また、端末装置２１ｘは、例えば無人島で常時録音される波の音等のように、全く異なる地点で録音される音の情報をリアルタイムに受信し、その情報を背景音声データとして利用してもよい。上記のように端末装置２１ｘが背景音声データを外部の装置から取得するように構成すると、長時間の背景音声データもしくは時間制限のない背景音声データを用いた曖昧通信が容易に実現される。 Note that instead of using background audio data stored in advance on the HD, the terminal device 21x may, for example, acquire background audio data from a server via the Internet 27 and use that. The terminal device 21x may also receive in real time sound recorded at an entirely different location, such as the sound of waves recorded continuously on an uninhabited island, and use it as the background audio data. When the terminal device 21x is configured to acquire background audio data from an external device in this way, ambiguous communication using long background audio data, or background audio data with no time limit, is easily realized.

第１３実施形態によれば、通信の送信側における生活音が不快な雑音を含む場合であっても、通信の受信側にいる者は快適に曖昧通信を利用することができる。 According to the thirteenth embodiment, even if the living sound on the communication transmission side includes unpleasant noise, the person on the communication reception side can comfortably use the ambiguous communication.

［１４．第１４実施形態］ [14. Fourteenth Embodiment]
第１４実施形態は、上述した第８実施形態と多くの点で共通しているため、以下、第１４実施形態が第８実施形態と異なる点のみを説明する。第１４実施形態においては、音声データを加工する端末装置２１は、第１３実施形態と同様に、通信の送信側の生活音を快適な音を示す音声データに変換する。ただし、第１４実施形態においては、音声データを曖昧音声データに変換する際に、楽音の発音を指示する演奏データが用いられる。以下の説明においては、ＭＩＤＩ（ＭｕｓｉｃａｌＩｎｓｔｒｕｍｅｎｔＤｉｇｉｔａｌＩｎｔｅｒｆａｃｅ）規格に従った演奏データ（以下、「ＭＩＤＩデータ」と呼ぶ）を用いて第１４実施形態を実施する場合を例として説明するが、演奏データの形式はＭＩＤＩ規格に従ったものに限られない。 Since the fourteenth embodiment has much in common with the eighth embodiment described above, only the points in which the fourteenth embodiment differs from the eighth embodiment are described below. In the fourteenth embodiment, as in the thirteenth embodiment, the terminal device 21 that processes voice data converts the living sounds on the transmitting side into voice data representing a pleasant sound. In the fourteenth embodiment, however, performance data that instructs the sounding of musical tones is used when converting the voice data into ambiguous voice data. The following description uses, as an example, performance data conforming to the MIDI (Musical Instrument Digital Interface) standard (hereinafter referred to as “MIDI data”), but the format of the performance data is not limited to the MIDI standard.

例えば、通信の送信側に配置された端末装置２１ｘが音声データを加工する端末装置であるとすると、端末装置２１ｘのＨＤには、予めハープやハンドベル等の楽音を示す音声データ（以下、「楽音データ」と呼ぶ）が、各音高に関し記憶されている。楽音データのそれぞれには、音色を０〜１２７の数値で指定するプログラムナンバーおよび音高を０〜１２７の数値で指定するノートナンバーが対応付けられている。また、端末装置２１ｘのＨＤには、ハープやハンドベル等の音色のそれぞれに関し、例えば中音部の「Ｃ」に対応する楽音データのスペクトル成分を示すデータが、基準スペクトルデータとして記憶されている。 For example, assuming that the terminal device 21x arranged on the communication transmission side is a terminal device that processes voice data, the HD of the terminal device 21x has voice data (hereinafter referred to as "musical sound") indicating a musical sound such as a harp or a handbell in advance. Data ") is stored for each pitch. Each of the musical tone data is associated with a program number that designates a tone color with a numerical value of 0 to 127 and a note number that designates a pitch with a numerical value of 0 to 127. The HD of the terminal device 21x stores, for example, data indicating the spectrum component of the musical tone data corresponding to “C” in the middle tone portion for each tone color such as a harp or a hand bell as reference spectrum data.

端末装置２１ｘは、まず、マイクおよびＡ／Ｄコンバータを介して得られる一定時間分、例えば１０ミリ秒間分の未加工の音声データをＲＡＭに記憶し、記憶した音声データの音量を示すデータ、例えば音声データに含まれるサンプルデータの絶対値の平均値を順次算出する。続いて、端末装置２１ｘは先に算出した音量を示すデータ（以下、「音量データ」と呼ぶ）の列に対し微分処理等を行い、音量データの値が急激に増加するタイミング（以下、「発音タイミング」と呼ぶ）および音量データの値が所定値以下となるタイミング（以下、「消音タイミング」と呼ぶ）を特定する。端末装置２１ｘは、発音タイミングに対応する音量データに基づき、ＭＩＤＩデータにおける対応するベロシティを特定する。ベロシティは、音量を０〜１２７の数値で指定するデータである。例えば、未加工の音声データの量子化ビット数が８である場合、サンプルデータの値の絶対値の平均値である音量データは０〜１２７の値をとるので、端末装置２１ｘは音量データの値をそのままベロシティとする。 The terminal device 21x first stores in the RAM the unprocessed voice data for a fixed period, for example 10 milliseconds, obtained via the microphone and the A/D converter, and successively calculates data indicating the volume of the stored voice data, for example the average of the absolute values of the sample data contained in it. Next, the terminal device 21x applies differentiation or similar processing to the sequence of previously calculated volume values (hereinafter referred to as “volume data”) and identifies the timing at which the volume data rises sharply (hereinafter referred to as “sound generation timing”) and the timing at which the volume data falls to or below a predetermined value (hereinafter referred to as “mute timing”). Based on the volume data at the sound generation timing, the terminal device 21x determines the corresponding velocity in the MIDI data. Velocity is data that specifies volume as a number from 0 to 127. For example, when the unprocessed voice data has 8 quantization bits, the volume data, being the average of the absolute values of the sample data, takes a value from 0 to 127, so the terminal device 21x uses the volume data value directly as the velocity.
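The detection of sound generation and mute timings can be sketched with a first difference over the volume sequence. This is one possible reading of "differentiation processing or the like"; the thresholds `rise_threshold` and `silence_threshold` and the function name `find_note_events` are hypothetical.

```python
def find_note_events(volumes, rise_threshold, silence_threshold):
    """Scan per-frame volume data (0..127) for note-on and note-off timings.

    A note starts when the volume rises by at least rise_threshold from one
    frame to the next, and ends when the volume drops to silence_threshold
    or below. The onset volume is reused directly as the MIDI velocity.
    """
    events = []
    sounding = False
    previous = 0
    for index, volume in enumerate(volumes):
        if not sounding and volume - previous >= rise_threshold:
            events.append(("on", index, min(127, volume)))
            sounding = True
        elif sounding and volume <= silence_threshold:
            events.append(("off", index))
            sounding = False
        previous = volume
    return events
```

With 10 ms frames, the frame indices in the returned events translate directly into times in units of 10 milliseconds.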

Next, the terminal device 21x selects, from the unprocessed audio data, the portion lying between the sound generation timing and the mute timing, and calculates the spectral components of the selected audio data. The terminal device 21x determines the note number corresponding to the frequency at which the amplitude of the calculated spectral components is greatest. For example, if the frequency with the greatest spectral amplitude is near 440 Hz, the terminal device 21x sets the note number to 69, which indicates the "A" in the middle register.
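As a rough illustration of this step, the sketch below finds the strongest frequency with a naive discrete Fourier transform and maps it to a note number using the standard equal-temperament relation in which note 69 corresponds to 440 Hz. The function names and the choice of a plain DFT are assumptions; the description does not say how the spectrum is computed.

```python
import cmath
import math
from typing import Sequence


def peak_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Frequency in Hz of the largest-magnitude DFT bin, ignoring the DC bin."""
    n = len(samples)
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mag = abs(acc)
        if mag > best_mag:
            best_bin, best_mag = k, mag
    return best_bin * sample_rate / n


def note_number_from_frequency(freq_hz: float) -> int:
    """Nearest MIDI note number for a frequency, clamped to 0..127."""
    note = round(69 + 12 * math.log2(freq_hz / 440.0))
    return max(0, min(127, note))
```

The naive DFT costs O(n^2) and is used here only to keep the sketch dependency-free; an FFT would be the usual choice for real-time use.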

Next, for each item of reference spectrum data stored in the HD, the terminal device 21x calculates an index of the similarity between that reference spectrum data and the previously calculated spectral components of the audio data between the sound generation timing and the mute timing, for example a correlation coefficient. The terminal device 21x determines the program number of the musical tone data corresponding to the reference spectrum data for which the calculated correlation coefficient is greatest.
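A minimal sketch of this timbre selection, assuming the similarity index is the Pearson correlation coefficient and that the reference spectra are held in a dictionary keyed by program number. The dictionary layout and the function names are hypothetical and not taken from the description.

```python
import math
from typing import Dict, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape information
    return cov / math.sqrt(var_a * var_b)


def best_program(spectrum: Sequence[float],
                 reference_spectra: Dict[int, Sequence[float]]) -> int:
    """Program number whose reference spectrum correlates best with spectrum."""
    return max(reference_spectra,
               key=lambda program: correlation(spectrum, reference_spectra[program]))
```

For example, with two made-up reference spectra stored under the placeholder program numbers 46 and 14, best_program([1.0, 0.5, 0.2, 0.1], {46: [0.9, 0.6, 0.2, 0.1], 14: [0.1, 0.2, 0.5, 1.0]}) selects 46.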

Using the velocity and note number determined as described above, the terminal device 21x generates a note-on message, which is MIDI data instructing that a musical tone be sounded. Using the program number determined as described above, the terminal device 21x also generates a program change message, which is MIDI data designating a timbre.
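For reference, the two messages can be built as raw bytes as sketched below. The byte layout follows the MIDI 1.0 channel message format (status byte 0x90 for note-on and 0xC0 for program change, with the channel in the low four bits); the function names and the default channel 0 are assumptions, since the description does not mention channels.

```python
def _check_range(name: str, value: int, upper: int) -> None:
    """Raise ValueError if value is outside 0..upper."""
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be in 0..{upper}, got {value}")


def note_on_message(note: int, velocity: int, channel: int = 0) -> bytes:
    """Three-byte note-on message: status, note number, velocity."""
    _check_range("note", note, 127)
    _check_range("velocity", velocity, 127)
    _check_range("channel", channel, 15)
    return bytes([0x90 | channel, note, velocity])


def program_change_message(program: int, channel: int = 0) -> bytes:
    """Two-byte program change message: status, program number."""
    _check_range("program", program, 127)
    _check_range("channel", channel, 15)
    return bytes([0xC0 | channel, program])
```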

The terminal device 21x selects from the HD the musical tone data that is associated with the program number contained in the program change message generated as described above and with the note number contained in the note-on message generated as described above. Next, the terminal device 21x adjusts the volume according to the velocity contained in the note-on message. That is, for each item of sample data contained in the selected musical tone data, the terminal device 21x calculates, for example, (sample data) × (velocity) × 50/127 and uses the result as the new sample data.
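The example formula amounts to a linear gain that runs from 0 at velocity 0 to 50 at velocity 127. A minimal sketch follows, with a hypothetical function name; rounding or clipping to the output sample range is not described in the text and is therefore left out.

```python
from typing import List, Sequence


def adjust_volume(samples: Sequence[float], velocity: int) -> List[float]:
    """Scale each sample by velocity * 50 / 127, as in the example formula."""
    if not 0 <= velocity <= 127:
        raise ValueError(f"velocity must be in 0..127, got {velocity}")
    return [s * velocity * 50 / 127 for s in samples]
```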

The terminal device 21x transmits the sequence of sample data obtained by the above processing to the terminal device 21y as audio data. As a result, the speaker of the terminal device 21y sounds musical tones of different timbres, at different volumes and pitches and at various timings, in response to changes in the everyday sounds on the transmitting side. The musical tones sounded in this way constitute ambiguous audio that conveys to the receiving side, out of all the information contained in the everyday sounds on the transmitting side, only information on volume, pitch, and sound quality.

The method of selecting the musical tone data that the terminal device 21x transmits to the terminal device 21y, the method of determining the timing at which the musical tone data is transmitted, and the method of adjusting the volume of the musical tone data are not limited to those described above. For example, the terminal device 21x may transmit musical tone data to the terminal device 21y at the timing at which the pitch of the everyday sounds changes sharply, or may transmit musical tone data of a different timbre depending on the volume.

According to the fourteenth embodiment, as with the thirteenth embodiment, a person on the receiving side can use ambiguous communication comfortably even when the everyday sounds on the transmitting side contain unpleasant noise.

As another mode of the fourteenth embodiment, the terminal device 21y on the receiving side may store the musical tone data in its HD and reproduce the musical tone data based on performance data generated by the terminal device 21x on the transmitting side. In that case, only the performance data is transmitted from the terminal device 21x to the terminal device 21y, which reduces the amount of data exchanged between the terminal device 21x and the terminal device 21y.
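One way to picture this receiver-side variant is sketched below. It assumes the performance data arrives as a sequence of raw MIDI program change and note-on messages and that the tone data on the HD of the terminal device 21y is represented by a dictionary keyed by (program number, note number). Both the function name and the tone_bank structure are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def render_from_performance_data(
        messages: Iterable[bytes],
        tone_bank: Dict[Tuple[int, int], Sequence[float]]) -> List[float]:
    """Turn received MIDI messages into output samples using local tone data."""
    program = 0  # assumed default until a program change arrives
    output: List[float] = []
    for msg in messages:
        status = msg[0] & 0xF0
        if status == 0xC0:                     # program change: remember the timbre
            program = msg[1]
        elif status == 0x90 and msg[2] > 0:    # note-on with non-zero velocity
            note, velocity = msg[1], msg[2]
            tone = tone_bank[(program, note)]
            # same example gain as on the transmitting side
            output.extend(s * velocity * 50 / 127 for s in tone)
    return output
```

In this arrangement each sounded tone costs only a few bytes on the line (two for the program change and three for the note-on), instead of the full sequence of tone samples.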

[15. Fifteenth embodiment]
The fifteenth embodiment has much in common with the eighth embodiment described above, so only the points on which it differs from the eighth embodiment are described below. In the fifteenth embodiment, the audio data is not processed in the terminal devices 21 on the transmitting and receiving sides of the communication; instead, it is processed by a voice processing server located within the Internet 27.

FIG. 9 shows the communication system 9 of the fifteenth embodiment. In addition to the components of the communication system 8 of the eighth embodiment, the communication system 9 has a voice processing server 31 provided between the general gateway server 25x and VoIP gateway server 26x on one side and the general gateway server 25y and VoIP gateway server 26y on the other.

The voice processing server 31 is a device that processes audio data and can exchange packet data with the general gateway servers 25 and the VoIP gateway servers 26. The voice processing server 31 can also be realized by having a general-purpose computer perform processing according to a specific program. In the following description, the voice processing server 31 is assumed to be realized by having a general-purpose computer with a CPU, DSP, ROM, RAM, HD, display unit, operation unit, and NW input/output unit execute the voice processing server program of the communication system 9.

The terminal device 21x transmits the audio data obtained through the microphone and the A/D converter to the terminal device 21y without processing it. The audio data transmitted from the terminal device 21x is received by the VoIP gateway server 26x and forwarded to the voice processing server 31. On receiving the audio data forwarded from the VoIP gateway server 26x, the voice processing server 31 performs the same processing as the audio data processing performed by the terminal device 21x in the eighth embodiment described above, and generates ambiguous audio data. The voice processing server 31 transmits the generated ambiguous audio data toward the terminal device 21y, and the data reaches the terminal device 21y via the VoIP gateway server 26y. As a result, ambiguous audio is sounded from the speaker of the terminal device 21y.

According to the fifteenth embodiment, the user can use ambiguous communication with a terminal device ordinarily used for VoIP voice communication, without making any particular modification to it.

A diagram showing the configuration of the communication system in the first embodiment of the present invention.
A diagram showing the configuration of the communication system in the second embodiment of the present invention.
A diagram showing the configuration of the communication system in the third embodiment of the present invention.
A diagram showing the configuration of the communication system in the fourth embodiment of the present invention.
A diagram showing the configuration of the communication system in the fifth embodiment of the present invention.
A diagram showing the configuration of the communication system in the sixth embodiment of the present invention.
A diagram showing the configuration of the communication system in the seventh embodiment of the present invention.
A diagram showing the configuration of the communication system in the eighth embodiment of the present invention.
A diagram showing the configuration of the communication system in the fifteenth embodiment of the present invention.

Explanation of symbols

11: microphone; 12, 21: terminal device; 13: telephone; 14: public telephone network; 15: amplifier; 16: speaker; 17: sensor; 18: mixer; 22: DSL modem; 23: splitter; 25: general gateway server; 26: VoIP gateway server; 27: Internet; 31: voice processing server; 121: voice processing unit; 122, 152: operation unit; 123: timekeeping unit; 151, 1216: amplification unit; 1211: low-pass filter; 1212, 1214: pitch shifter; 1215: noise reduction filter; 1213: high-pass filter.

Claims

A communication system comprising:
a first communication device comprising sound information generating means for generating sound information based on a first sound collected at a first place, and first transmitting means for transmitting the sound information generated by the sound information generating means;
a second communication device disposed at a second place different from the first place and comprising at least first receiving means for receiving sound information from the first communication device via a communication line, and output means for generating and outputting a second sound based on the received sound information; and
ambiguous sound converting means, provided in any one of the first communication device, the second communication device, and an information processing device provided on the communication line, for converting the sound information generated by the sound information generating means into ambiguous sound information representing another sound that does not convey the content of the conversation contained in the sound information,
wherein the second sound output by the output means is the sound represented by the ambiguous sound information converted by the ambiguous sound converting means.

The communication system according to claim 1, wherein the ambiguous sound converting means has acquisition means for acquiring sound information representing another sound that does not convey the content of the conversation contained in the sound information, and converts the sound information generated by the sound information generating means into the ambiguous sound information by adjusting the acquired sound information to a volume corresponding to the volume of the generated sound information or to a pitch corresponding to the pitch of the generated sound information.

The communication system according to claim 1, wherein the ambiguous sound converting means includes feature specifying means for analyzing a feature of the sound information and specifying at least one of a volume, a pitch, and a timbre based on the analyzed feature, and converts the sound information into the ambiguous sound information by synthesizing a sound according to the result specified by the feature specifying means.

The communication system according to claim 1, wherein the first communication device comprises feature specifying means for analyzing a feature of the sound information generated by the sound information generating means and specifying at least one of a volume, a pitch, and a timbre based on the analyzed feature, and specified-result transmitting means for transmitting the result specified by the feature specifying means, and
the ambiguous sound converting means converts the sound information into the ambiguous sound information by synthesizing a sound according to the result transmitted by the specified-result transmitting means.

The communication system according to claim 1, wherein the ambiguous sound converting means takes the difference between the signal of the first sound represented by the sound information and a sound signal obtained by delaying the first sound, and converts the sound information into the ambiguous sound information based on the difference.

The ambiguous sound conversion means reads a signal indicating a sine wave in accordance with the signal of the first sound represented by the sound information, and converts the sound information to the ambiguous sound information based on the signal indicating the sine wave. The communication system according to claim 1.

The communication system according to any one of claims 1 to 6, further comprising instruction means for receiving a user instruction,
wherein the ambiguous sound converting means, in accordance with an instruction from the instruction means, starts and ends the processing for converting the sound information into the ambiguous sound information or changes a parameter used for the processing.

A communication device comprising:
input means for inputting sound information;
conversion means for converting the sound information input by the input means into ambiguous sound information representing another sound that does not convey the content of the conversation contained in the sound information;
output means for outputting the ambiguous sound information converted by the conversion means; and
at least one of receiving means for receiving sound information from another communication device and outputting it to the input means, and transmitting means for transmitting the ambiguous sound information output by the output means to another communication device.

The communication device according to claim 8, wherein the conversion means includes acquisition means for acquiring sound information representing another sound that does not convey the content of the conversation contained in the sound information input by the input means, and converts the input sound information into the ambiguous sound information by adjusting the acquired sound information to a volume corresponding to the volume of the sound information input by the input means or to a pitch corresponding to the pitch of that sound information.

The communication device according to claim 8, wherein the conversion means analyzes a feature of the sound information input by the input means, specifies at least one of a volume, a pitch, and a timbre based on the analyzed feature, and converts the sound information into the ambiguous sound information by synthesizing a sound according to the specified result.

The communication device according to claim 8, wherein the conversion means receives, as a specification result and together with the sound information input by the input means, at least one of a volume, a pitch, and a timbre specified based on a feature obtained by analyzing the sound information, and converts the sound information into the ambiguous sound information by synthesizing a sound according to the specification result.

A program causing a computer to execute:
input processing for inputting sound information;
conversion processing for converting the sound information input in the input processing into ambiguous sound information representing another sound that does not convey the content of the conversation contained in the sound information;
output processing for outputting the ambiguous sound information converted in the conversion processing; and
at least one of reception processing for receiving, from another communication device, the sound information used in the input processing, and transmission processing for transmitting the ambiguous sound information output in the output processing to another communication device.