JP3165585U

JP3165585U - Speech synthesizer

Info

Publication number: JP3165585U
Application number: JP2010007440U
Authority: JP
Inventors: 宜盟高橋; 秀之渡辺; 千鶴子高橋
Original assignee: 有限会社オフィス結アジア
Priority date: 2010-11-11
Filing date: 2010-11-11
Publication date: 2011-01-27
Anticipated expiration: 2020-11-11

Abstract

[Problem] To provide a speech synthesizer that lets a user convey spoken responses during a call on a mobile phone or similar device without speaking aloud. [Solution] A speech synthesizer that stores a voice generation application and is connected to a terminal device, which receives voice sent from a sender, and to a voice playback device, which plays back voice. The speech synthesizer comprises: voice receiving means for receiving a first voice received by the terminal device; voice conversion means for converting the first voice into a first voice signal and for converting a voice selected by the user in the voice generation application into a second voice signal; voice signal synthesis means for synthesizing the first voice signal and the second voice signal; first transmission means for transmitting the synthesized voice signal to the terminal device; and second transmission means for transmitting the synthesized voice signal to the voice playback device. [Selected drawing] FIG. 2

Description

The present invention relates to a speech synthesizer and, more specifically, to a device that, during a call, lets the user select a group of characters from a built-in application to express what they want to say, converts the selection into a voice signal, and synthesizes it with the received voice signal.

At present, there are 346,000 people who have difficulty speaking, for example people with visual or speech and language impairments (FY2001 survey of children and persons with physical disabilities by the Ministry of Health, Labour and Welfare), and, counting those who have difficulty speaking as a result of vocal cord surgery and similar procedures, tens of thousands of people are said to have difficulty speaking. A person who has difficulty speaking may be able to hear the other party on a mobile phone or similar device but cannot send voice in return.

In addition, a call may arrive while the user is in a public place such as a train or a hospital. In that case the user cannot respond even to an urgent call and has to call back after a suitable amount of time has passed.

For existing mobile phones and similar devices, there are service providing apparatuses with a voice conversion function: when the user is in an environment where making a call is undesirable, the user's terminal device, such as a mobile phone, sends non-voice information instead of the user's own speech, and the other party's terminal device outputs it as voice.
For example, Patent Document 1 discloses a system that converts or modulates text messages, sent through a site set up on a network or through a mobile communication terminal, into voice in order to create a ringtone for a terminal, or translates them into the language of a specific country and delivers them as voice.

As an implementation of an interactive communication method, Patent Document 2, for example, discloses a voice/text mutual conversion device and a system that transmits and receives its output. A voice signal sent from a telephone or other calling device is received over a communication link, recognized as text by a character recognition device that converts it into characters, and shown as text on a display screen. Conversely, text entered at the sending side is converted into a voice signal by speech synthesis, and this voice signal is sent to the other party through the communication system.

Patent Document 1: JP 2003-110754 A
Patent Document 2: JP 2003-8691 A

However, existing telephones (including multi-function telephones) only allow the user either to listen to voice converted from text or to read text converted from voice. Furthermore, although voice-to-text conversion applications that can be installed on mobile phones exist, the voice input of the telephone function and that of the installed application cannot be synthesized, so only one of the two can be used at a time.
The present invention aims to solve the above problems by providing a device with which the user can, even in an environment where a voice call is not possible, respond in real time through a voice generation application while listening to the other party's voice.

The speech synthesizer of the present invention stores a voice generation application and is operably connected, via a wired or wireless device, to a terminal device that receives voice sent from a sender and to a voice playback device that plays back voice. It comprises: voice receiving means for receiving a first voice received by the terminal device; voice conversion means for converting the first voice into a first voice signal and for converting a voice selected by the user in the voice generation application into a second voice signal; voice signal synthesis means for synthesizing the first voice signal and the second voice signal; first transmission means for transmitting the synthesized voice signal to the terminal device; and second transmission means for transmitting the synthesized voice signal to the voice playback device.

With the above means, a user who is in an environment where a call is undesirable, or who has difficulty speaking, can respond by voice in real time while listening to the other party's voice.

The speech synthesizer of the present invention is further characterized in that the voice receiving means receives a third voice transmitted from the voice playback device, the voice conversion means converts the third voice into a third voice signal, the voice signal synthesis means synthesizes the second voice signal and the third voice signal, and the transmission means transmits the synthesized voice signal to the terminal device.
With the above means, the user can convey an intention by voice without speaking and, at the same time, convey surrounding ambient sound such as music.

With the speech synthesizer according to the present invention, a person can use the telephone even without being able to speak. This widens the range of activities open to people who have difficulty speaking, increases what they can do on their own, and thereby promotes their participation in society.

Also, with the speech synthesizer of the present invention, a user who receives a call in a public place such as a train or a hospital can answer it without worrying about bystanders, without moving elsewhere, and without having to call back later, which improves the convenience of the mobile phone.

Also, with the speech synthesizer of the present invention, staff and participants at lectures, meetings, and similar events can respond while receiving instructions by telephone without disturbing those around them, which improves work efficiency.

Furthermore, with the speech synthesizer of the present invention, a user who is being followed by a suspicious person can explain the situation to family or the police without speaking aloud, which helps prevent crime before it occurs.

FIG. 1 is a diagram showing the system configuration of the speech synthesizer of the present invention.
FIG. 2 is a diagram showing the speech synthesizer of the present invention in detail.
FIG. 3 is a flowchart of speech synthesis according to the present invention.
FIG. 4 is another flowchart of speech synthesis according to the present invention.
FIG. 5 is a diagram showing another system configuration of the speech synthesizer of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings. The examples given below are preferred specific examples of the speech synthesizer of the present invention, and the technical scope of the present invention is not limited to these forms unless a statement specifically limits it. The components of the embodiments below can be replaced with existing components as appropriate, and many variations are possible, including combinations with other existing components. The description of the embodiments below therefore does not limit the content of the device described in the claims.

In FIG. 1, the system 1 of the speech synthesizer according to the present invention includes a wired or wireless device 10, a terminal device 11, a speech synthesizer 14 comprising a voice generation application 12 and a voice conversion module 13, a wired or wireless device 15, and a voice playback device 16.
The wired or wireless devices 10 and 15 are preferably, though not necessarily, Bluetooth (registered trademark) devices. Bluetooth uses, for example, the 2.4 GHz frequency band known as the ISM band, adopts a spread spectrum scheme that hops frequency 1600 times per second, and provides wireless communication at a transmission rate of 1 Mbps.
The terminal device is a communication terminal for calls, such as a mobile phone or PHS handset. The speech synthesizer is connected to the terminal device via the wired or wireless device. In use, it receives the voice signal from the terminal device and displays the voice generation application on the display screen of the terminal device via the wired or wireless device. The voice generation application may be existing software that speaks a group of characters aloud when the user selects the desired phrase from search results. In such voice creation software, a set of predefined texts (for example, "Thank you" and "You're welcome") is stored in advance. Instead of typing the entire phrase, the user enters a single kana character, the initial kana of the phrase as it appears in the Japanese syllabary (the 50-sound table), and the software lists the conversational sentences and words that match that kana. Because the user only has to pick a phrase from the list, input time is reduced. The software also includes functions such as voice correction by prosody modification, so the selected phrase can be rendered as intonated, expressive speech.
The voice playback device is assumed to be a headset such as earphones or headphones and is connected to the speech synthesizer via the wired or wireless device. The headset optionally includes a microphone.
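The phrase lookup described above, where a single kana narrows a stored phrase set to a short list, can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's actual implementation: the function names `build_index` and `candidates` are hypothetical, and the last two entries of the phrase list are invented for the example (the source names only "ありがとう" and "どういたしまして").

```python
from collections import defaultdict

# Hypothetical preset phrases. Only the first two appear in the source text;
# the others are made up for illustration.
PRESET_PHRASES = [
    "ありがとう",            # "Thank you"
    "どういたしまして",      # "You're welcome"
    "あとでかけなおします",  # "I will call back later" (assumed entry)
    "どこにいますか",        # "Where are you?" (assumed entry)
]


def build_index(phrases):
    """Group phrases by their first kana so lookup needs one keystroke."""
    index = defaultdict(list)
    for phrase in phrases:
        index[phrase[0]].append(phrase)
    return index


def candidates(index, kana):
    """Return the stored phrases that start with the entered kana."""
    return list(index.get(kana, []))


index = build_index(PRESET_PHRASES)
for phrase in candidates(index, "あ"):
    print(phrase)
```

A real application would presumably also normalize the input (for example, katakana versus hiragana) before the lookup; that step is left out here.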

Next, FIG. 2 shows the internal configuration of the speech synthesizer according to the present invention. The speech synthesizer 14 has the voice generation application 12, voice receiving means, voice conversion means, voice signal synthesis means, first transmission means, and second transmission means.

(Example 1)
With reference to FIG. 3, the following describes an example of an exchange between caller A and caller B using the speech synthesizer according to the present invention. In this example, caller A uses a fixed-line telephone and caller B uses a mobile phone equipped with the speech synthesizer of the present invention.
In S101, caller A uses the telephone to say "Thank you" to caller B. The voice "Thank you" received by the mobile phone (terminal device) is received by the speech synthesizer through its receiving means via the wired or wireless device 10. Caller B can hear the voice "Thank you" from the headset via the speech synthesizer and the wired or wireless device 15.
In S102, the voice "Thank you" is converted into a voice signal by the voice conversion means of the speech synthesizer (first voice signal).
In S103, at caller B's selection, the voice generation application is displayed on the display screen of the terminal device via the wired or wireless device, and caller B selects the character group "You're welcome".
In S104, the character group "You're welcome" is converted into a voice signal by the voice conversion means of the speech synthesizer (second voice signal).
In S105, the voice signal "Thank you" and the voice signal "You're welcome" are synthesized by the voice signal synthesis means of the speech synthesizer.
In S106, the synthesized voice signal is transmitted by the transmission means of the speech synthesizer to the headset and to the terminal device (first transmission means, second transmission means). The voice signal sent to the terminal device is then sent on to caller A's telephone, and caller A hears the voice "You're welcome". By changing a setting, it is also possible to selectively transmit only the first voice or only the second voice.
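Steps S102 to S106 can be sketched as a small mix-and-route routine. The source does not say how the two signals are synthesized, so this sketch makes its own assumptions: both signals are sequences of signed 16-bit PCM samples at the same sample rate, and synthesis is sample-wise addition with clipping. The names `mix` and `route` and the sample values are hypothetical.

```python
PCM_MAX = 32767   # largest signed 16-bit sample value
PCM_MIN = -32768  # smallest signed 16-bit sample value


def mix(first_signal, second_signal):
    """Add two sample sequences, padding the shorter with silence and clipping."""
    length = max(len(first_signal), len(second_signal))
    mixed = []
    for i in range(length):
        a = first_signal[i] if i < len(first_signal) else 0
        b = second_signal[i] if i < len(second_signal) else 0
        mixed.append(max(PCM_MIN, min(PCM_MAX, a + b)))
    return mixed


def route(mixed):
    """Hand one copy to each destination named in S106."""
    return {"terminal": list(mixed), "headset": list(mixed)}


first = [1000, -2000, 30000]   # stands in for the received voice (first voice signal)
second = [500, 500, 5000, 7]   # stands in for the selected phrase (second voice signal)
outputs = route(mix(first, second))
print(outputs["terminal"])
```

Clipping is the simplest way to keep the sum in range; a practical mixer might scale the inputs instead to avoid distortion.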

(Example 2)
In steps S201 to S206 of the following example, caller B may be assumed to be someone in an environment where speaking aloud is not possible, such as a very noisy place, or someone who has difficulty speaking. Suppose, for example, that caller B wants to convey sounds from the surroundings together with a voice message. As shown in FIG. 4, in S201 caller B feeds sound, music in this example, into the microphone attached to the headset; this is transmitted via the wired or wireless device and received by the speech synthesizer through its receiving means. In S202, the music is converted into a voice signal by the voice conversion means of the speech synthesizer (third voice signal).
In S203, at caller B's selection, the voice generation application is displayed on the display screen of the terminal device via the wired or wireless device, and caller B selects the character group "This music is nice, isn't it?".
In S204, the character group "This music is nice, isn't it?" is converted into a voice signal by the voice conversion means of the speech synthesizer (second voice signal).
In S205, the voice signal of the music (third voice signal) and the second voice signal "This music is nice, isn't it?" are synthesized by the voice signal synthesis means of the speech synthesizer.
In S206, the synthesized voice signal is transmitted by the transmission means of the speech synthesizer to the headset and to the terminal device (first transmission means, second transmission means). The voice signal sent to the terminal device is then sent on to caller A's telephone, and caller A hears the music and the voice "This music is nice, isn't it?".
By using the speech synthesizer of the present invention, caller B can deliver particular music to caller A at the desired moment while also conveying his or her own intentions and feelings by voice. Consequently, even people who have difficulty speaking gain more opportunities to communicate, which promotes their participation in society.
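When ambient sound such as music is combined with a spoken phrase, as in S205, the phrase may need to stay intelligible over the background. The source does not describe any level control, so the following is only a sketch of one possible approach: the ambient signal is attenuated by a fixed gain before being added to the phrase. The function name `mix_with_gain`, the default gain of 0.5, and the 16-bit PCM sample format are all assumptions.

```python
def mix_with_gain(ambient, phrase, ambient_gain=0.5):
    """Attenuate the ambient signal, add the phrase, and clip to 16-bit range."""
    length = max(len(ambient), len(phrase))
    mixed = []
    for i in range(length):
        a = ambient[i] if i < len(ambient) else 0
        p = phrase[i] if i < len(phrase) else 0
        value = int(round(a * ambient_gain)) + p
        mixed.append(max(-32768, min(32767, value)))
    return mixed


music = [100, 200, 300]   # stands in for the third voice signal (ambient music)
spoken = [10, 20]         # stands in for the second voice signal (selected phrase)
print(mix_with_gain(music, spoken))
```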

Next, FIG. 5 shows another system configuration for the speech synthesizer system according to the present invention. The system 1 of the speech synthesizer according to the present invention includes a wired or wireless device 10, a terminal device 11 comprising a voice generation application 12 and a voice conversion module 13, a speech synthesizer 14, a wired or wireless device 15, and a voice playback device 16.

The system configuration in the embodiment of FIG. 5 differs from that of FIG. 1 in that the voice generation application 12 and the voice conversion module 13 are provided inside the terminal device 11.
The terminal device 11 is assumed to be, for example, an iPhone (registered trademark), an iPad (registered trademark), an iPod touch (registered trademark), or a terminal running Android (registered trademark). The rest of the system configuration is the same as the system components described for FIG. 1.
In the embodiment of FIG. 5, the voice sent from the sender and the voice selected in the voice generation application 12 are each converted into a voice signal by the voice conversion module 13 inside the terminal device 11 (first voice signal, second voice signal). The converted first and second voice signals are received by the speech synthesizer via the wired or wireless device 10 (voice signal receiving means) and synthesized. The synthesized voice signal is then transmitted via the wired or wireless devices 10 and 15 to the voice playback device and the terminal device. By changing a setting, it is also possible to selectively transmit only the first voice or only the second voice.
When, for example, the user wants to convey sounds from the surroundings together with a voice message, this can be done by the process shown in FIG. 4. The process is the same as that described for FIG. 4 and is omitted here for brevity.
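The setting that selects between sending the synthesized signal, only the first voice, or only the second voice can be sketched as a simple mode switch. The source mentions the setting but not how it is represented, so the `SendMode` enum, the `select_output` function, and the 16-bit PCM sample format are hypothetical choices made for this illustration.

```python
from enum import Enum


class SendMode(Enum):
    """Hypothetical values for the transmission setting."""
    MIXED = "mixed"              # send the synthesized (combined) signal
    FIRST_ONLY = "first_only"    # send only the first voice signal
    SECOND_ONLY = "second_only"  # send only the second voice signal


def select_output(mode, first_signal, second_signal):
    """Return the sample sequence to transmit for the chosen mode."""
    if mode is SendMode.FIRST_ONLY:
        return list(first_signal)
    if mode is SendMode.SECOND_ONLY:
        return list(second_signal)
    # MIXED: sample-wise addition, padded with silence and clipped to 16 bits.
    length = max(len(first_signal), len(second_signal))
    mixed = []
    for i in range(length):
        a = first_signal[i] if i < len(first_signal) else 0
        b = second_signal[i] if i < len(second_signal) else 0
        mixed.append(max(-32768, min(32767, a + b)))
    return mixed


first = [100, 200]
second = [1, 2, 3]
for mode in SendMode:
    print(mode.name, select_output(mode, first, second))
```

Since the setting applies separately to the voice playback device and the terminal device in the claims, an implementation could hold one mode per destination.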

Although the synthesizer of the present invention has been described mainly in terms of voice calls, those skilled in the art will understand that the speech synthesizer can also convert voice into text and send it to the other party as text.

1 System
10, 15 Wired or wireless device
11 Terminal device
12 Voice generation application
13 Voice conversion module
14 Speech synthesizer
16 Voice playback device

Claims

1. A speech synthesizer that stores a voice generation application and is operably connected, via a wired or wireless device, to a terminal device that receives voice sent from a sender and to a voice playback device that plays back voice, the speech synthesizer comprising:
voice receiving means for receiving a first voice received by the terminal device;
voice conversion means for converting the first voice into a first voice signal and converting a voice selected by the user in the voice generation application into a second voice signal;
voice signal synthesis means for synthesizing the first voice signal and the second voice signal;
first transmission means for transmitting the synthesized voice signal to the terminal device; and
second transmission means for transmitting the synthesized voice signal to the voice playback device.

2. The speech synthesizer according to claim 1, wherein
the voice receiving means receives a third voice transmitted from the voice playback device,
the voice conversion means converts the third voice into a third voice signal,
the voice signal synthesis means synthesizes the second voice signal and the third voice signal, and
the transmission means transmits the synthesized voice signal to the terminal device.

3. The speech synthesizer according to claim 1, wherein, of the synthesized voice, only the second voice signal is transmitted to the voice playback device or the terminal device.

4. The speech synthesizer according to claim 1, wherein transmission by the first transmission means and transmission by the second transmission means are performed simultaneously.

5. The speech synthesizer according to claim 2, wherein one of the second voice signal and the third voice signal is transmitted to the voice playback device or the terminal device.

6. The speech synthesizer according to any one of claims 1 to 5, wherein the terminal device is a mobile phone, a PHS handset, or a fixed-line telephone connected by wire or wirelessly.

7. The speech synthesizer according to claim 1, wherein the voice playback device is a headset including headphones or earphones.

8. A speech synthesizer operably connected, via a wired or wireless device, to a terminal device that receives voice sent from a sender and stores a voice generation application, and to a voice playback device that plays back voice, the speech synthesizer comprising:
voice signal receiving means for receiving a first voice signal received by the terminal device and a second voice signal selected by the user in the voice generation application;
synthesis means for synthesizing the first voice signal and the second voice signal;
first transmission means for transmitting the synthesized voice signal to the terminal device; and
second transmission means for transmitting the synthesized voice signal to the voice playback device.

9. The speech synthesizer according to claim 8, wherein
the voice signal receiving means receives a third voice transmitted from the voice playback device,
voice conversion means converts the third voice into a third voice signal,
the synthesis means synthesizes the second voice signal and the third voice signal, and
the transmission means transmits the synthesized voice signal to the terminal device.

10. The speech synthesizer according to claim 8, wherein, of the synthesized voice, only the second voice signal is transmitted to the voice playback device.

11. The speech synthesizer according to claim 8, wherein transmission by the first transmission means and transmission by the second transmission means are performed simultaneously.

12. The speech synthesizer according to claim 9, wherein one of the second voice signal and the third voice signal is transmitted to the voice playback device or the terminal device.

13. The speech synthesizer according to any one of claims 8 to 10, wherein the terminal device is an iPhone, an iPad, an iPod touch, or a terminal running Android.

14. The speech synthesizer according to any one of claims 8 to 11, wherein the voice playback device is a headset including headphones or earphones.