JP2020064151A

JP2020064151A - Reproduction system and program

Info

Publication number: JP2020064151A
Application number: JP2018195184A
Authority: JP
Inventors: Haruka Matsumoto (松本 遥香); Tomoharu Machida (町田 智治); Noboru Miyamoto (宮本 登)
Original assignee: Tokyo Gas Co Ltd
Current assignee: Tokyo Gas Co Ltd
Priority date: 2018-10-16
Filing date: 2018-10-16
Publication date: 2020-04-23
Anticipated expiration: 2038-10-16
Also published as: JP7218143B2

Abstract

To provide a reproduction system and the like capable of changing transmission information to suit the person performing reproduction when the transmission information is reproduced. SOLUTION: A reproduction system comprises: a reproduction section 34 for reproducing transmitted transmission information; a grasping section 36 for grasping feature information on a person who performs a reproduction operation for causing the reproduction section 34 to perform reproduction; a setting section 37 for determining, on the basis of the feature information, a setting for reproduction of the transmission information in the reproduction section 34; and a changing section 39 for changing the transmission information to the setting determined by the setting section 37. SELECTED DRAWING: Figure 4

Description

The present invention relates to a reproduction system and a program.

There are devices that support communication by recording a voice and having another person reproduce the recorded voice.

In the voice processing device described in Patent Document 1, a storage device stores, for each speech unit, unit data representing the speech sound of a speaker. A voice quality conversion unit applies a conversion function to the unit data corresponding to the phonetic characters to be synthesized, thereby sequentially generating unit data corresponding to the voice of another speaker. The conversion function is generated from a mixture distribution model representing the probability distribution between feature information of the first speaker's voice and feature information of the other speaker's voice, and converts the former voice into the latter. A voice synthesis unit generates a voice signal from each piece of unit data generated by the voice quality conversion unit.

Further, in the voice conversion device described in Patent Document 2, one of the voice signals, chosen by a selector, is input to a voice signal analysis unit as a digital signal. The voice signal analysis unit performs voice recognition, and a text data conversion unit converts the recognized data into text data. A syntax analysis unit divides the text data into clauses and determines, clause by clause, whether conversion from the standard language to the dialect of a specific region, or the reverse conversion, is needed. In accordance with this determination, a conversion unit performs the conversion under the control of a control unit. The standard-language and dialect data are stored in a first memory unit, and the form of conversion is set through an input unit. A voice synthesis unit converts the converted text data into a voice signal in accordance with voice quality data in a second memory unit.

Patent Document 1: JP 2012-63501 A
Patent Document 2: JP 2000-112488 A

A recorded voice message is normally reproduced as it is. In some cases, however, the content of the message is conveyed more easily if the voice quality or the like is changed to suit the person who reproduces it.
An object of the present invention is to provide a reproduction system and the like that can change transmission information to suit the person performing reproduction when the transmission information is reproduced.

Thus, according to the present invention, there is provided a reproduction system having: reproducing means for reproducing transmitted transmission information; grasping means for grasping feature information of a person who performs a reproduction operation for causing the reproducing means to perform reproduction; setting means for determining, on the basis of the feature information, a setting for reproduction of the transmission information by the reproducing means; and changing means for changing the transmission information to the setting determined by the setting means.

Here, the system may further have acquisition means for acquiring the voice of the person, and the grasping means may grasp the feature information on the basis of the voice acquired by the acquisition means. In this case, the feature information becomes easier to grasp.
The changing means may change the voice quality of a voice sent as the transmission information to suit the person. In this case, the voice can be reproduced with a voice quality suited to the person who performs the reproduction operation.
Further, the changing means may change the wording of a voice sent as the transmission information to suit the person. In this case, the voice can be reproduced with wording suited to the person who performs the reproduction operation.
Furthermore, the changing means may, depending on the setting, convert among text, human voice, and mechanically synthesized voice. In this case, the transmission information can be put into a form suited to the person who performs the reproduction operation.
The changing means may also, depending on the setting, perform frequency conversion of the voice. In this case, a voice containing frequencies that are hard to hear becomes easier to hear.
Further, the setting means may grasp the situation around its own device and determine the setting on the basis of the grasped situation. In this case, the voice can be reproduced in a manner suited to the situation around the device.

Further, according to the present invention, there is provided a program for causing a computer to realize: a reproducing function of reproducing transmitted transmission information; a grasping function of grasping feature information of a person who performs a reproduction operation for causing the reproducing function to perform reproduction; a setting function of determining, on the basis of the feature information, a setting for reproduction of the transmission information by the reproducing function; and a changing function of changing the transmission information to the setting determined by the setting function.

According to the present invention, it is possible to provide a reproduction system and the like that can change transmission information to suit the person performing reproduction when the transmission information is reproduced.

FIG. 1 is a diagram showing a configuration example of the reproduction system in the present embodiment.
FIG. 2 is a diagram illustrating a case where the terminal device is a robot.
FIG. 3 is a diagram showing an example of the schematic operation of the reproduction system.
FIG. 4 is a block diagram showing an example of the functional configuration of the reproduction system.
FIG. 5 is a flowchart illustrating an example of the operation of the reproduction system of the present embodiment.
FIG. 6 is a diagram showing an example of a method of estimating the age of a user.
FIGS. 7(a) to 7(c) are diagrams showing an example of a method of estimating the gender of a user.
FIG. 8 is a diagram showing feature information and methods of changing the setting.
FIG. 9 is a diagram showing an example of a spectral envelope.
FIGS. 10(a) and 10(b) are diagrams showing frequency conversion of voice.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

<Description of the entire reproduction system 1>
FIG. 1 is a diagram showing a configuration example of the reproduction system 1 in the present embodiment.
As illustrated, the reproduction system 1 of the present embodiment is configured by connecting a mobile terminal 20 and a terminal device 30 via a network 70 and an access point 90. Although FIG. 1 shows only one mobile terminal 20, there may be any number of them.

The mobile terminal 20 is, for example, a mobile computer, a mobile phone, a smartphone, or a tablet. The mobile terminal 20 connects to the access point 90 for wireless communication and, via the access point 90, to the network 70, which communicates by wire.

The mobile terminal 20 includes a CPU (Central Processing Unit) as computing means and a main memory as storage means. The CPU executes various kinds of software such as an OS (operating system) and apps (application software). The main memory is a storage area that stores the software and the data used in executing it. The mobile terminal 20 further includes a communication interface (hereinafter, "communication I/F") for communicating with external devices, a display mechanism including a video memory and a display, and an input mechanism such as input buttons, a touch panel, or a keyboard. The mobile terminal 20 also includes a speaker that outputs voice and a microphone that inputs voice.

The terminal device 30 can be, for example, a robot. The robot is placed in the home of the user who owns it.
FIG. 2 is a diagram illustrating a case where the terminal device 30 is a robot.
The terminal device 30 shown in FIG. 2 as a robot may be a mobile type that can move by walking or the like, or a stationary type that does not move.
The terminal device 30 includes a communication antenna 301 that transmits and receives transmission information, a microphone 302 that acquires voice, a speaker 303 that outputs sound such as voice, a display 304 that displays images, operation buttons 305 operated by the user, and a control unit 306 that controls the terminal device 30 as a whole. The operation buttons 305 include a recording button 305a for recording, a reproduction button 305b for reproducing received transmission information, and a menu button 305c for configuring the terminal device 30 and the like.

The network 70 is communication means used for information communication between the mobile terminal 20 and the terminal device 30, and is, for example, the Internet.

The access point 90 is a device that communicates wirelessly over a wireless communication line and connects to the network 70, which communicates by wire. The access point 90 mediates the transmission and reception of information between the network 70 and the mobile terminal 20 or the terminal device 30.
Usable wireless communication lines include mobile phone lines, PHS (Personal Handy-phone System) lines, Wi-Fi (Wireless Fidelity), Bluetooth (registered trademark), ZigBee, and UWB (Ultra Wideband).

<Overview of the operation of the reproduction system 1>
FIG. 3 is a diagram showing an example of the schematic operation of the reproduction system 1.
First, user A, who owns the terminal device 30, creates transmission information (1A). Transmission information is the electronic information exchanged between the terminal device 30 and the mobile terminal 20. As described in detail later, it is, for example, voice or text information. User A creates a message for user B, who owns the mobile terminal 20, as voice or text. User A and user B are predetermined persons who have a given relationship with each other, for example a parent and child or friends.

Voice information can be created when user A speaks to the terminal device 30 and the microphone 302 acquires and records the voice. Specifically, user A, for example, faces the terminal device 30. When user A presses the recording button 305a among the operation buttons 305, the microphone 302 records only while the button is held down. To stop recording, user A releases the recording button 305a. While holding down the recording button 305a, user A says aloud what he or she wants to tell user B. The recorded voice information is stored in the memory of the control unit 306.
Text information may be entered from a keyboard or the like connected to the terminal device 30, or, for example, the display 304 may be a touch panel through which the text is entered. Alternatively, voice may be input as described above and converted into text by voice recognition.

The control unit 306 then transmits this voice or text information to the mobile terminal 20 as transmission information. The transmission information is sent to the mobile terminal 20 via the communication antenna 301, the access point 90, and the network 70 (1B).
On the mobile terminal 20, a dedicated app for realizing the reproduction system 1 is running, and the communication I/F acquires the transmission information. The CPU stores the transmission information in the memory (1C). At this time, the mobile terminal 20 may notify user B that transmission information has arrived from user A, for example by blinking a separately provided light source such as an LED. The arrival of transmission information from user A may also be announced by a ringtone, voice, or the like.

User B can reproduce the transmission information. Specifically, user B presses a reproduction button or the like on an input mechanism of the mobile terminal 20 such as the touch panel. The voice sent by user A is then read from the memory and output from the speaker (1D), so that user B can listen to the message sent by user A. When the transmission information is text information, the text can be shown on a display mechanism such as the touch panel.

User B then creates transmission information for replying to user A (1E). This transmission information is created in the same way as described above for user A.

The CPU of the mobile terminal 20 then transmits this voice information to the terminal device 30 as transmission information (1F). The transmission information is sent to the terminal device 30 via the communication I/F, the access point 90, and the network 70.
In the terminal device 30, the communication antenna 301 receives the transmission information, and the control unit 306 acquires it and stores it in the memory (1G). In response to an operation by user A, the transmission information sent from user B is read from the memory and reproduced (1H).
The same operations are then repeated. That is, transmission information is exchanged between user A and user B.
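The exchange in steps 1A to 1H can be sketched as a simple store-and-forward loop. This is only an illustration under stated assumptions: the `TransmissionInfo` and `Device` classes and the `send` function are hypothetical names that do not appear in the specification, and the access point 90 and network 70 are reduced to a direct function call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TransmissionInfo:
    """One message exchanged between the terminal device and the mobile terminal."""
    sender: str
    kind: str       # "voice" or "text"
    payload: bytes  # recorded audio bytes, or UTF-8 encoded text


@dataclass
class Device:
    """Hypothetical stand-in for terminal device 30 or mobile terminal 20."""
    owner: str
    memory: List[TransmissionInfo] = field(default_factory=list)

    def create(self, kind: str, payload: bytes) -> TransmissionInfo:
        # Steps 1A / 1E: the owner records voice or enters text.
        return TransmissionInfo(self.owner, kind, payload)

    def receive(self, info: TransmissionInfo) -> None:
        # Steps 1C / 1G: store the received information in memory.
        self.memory.append(info)

    def play_oldest(self) -> TransmissionInfo:
        # Steps 1D / 1H: read stored information back when the owner asks.
        return self.memory.pop(0)


def send(info: TransmissionInfo, destination: Device) -> None:
    # Steps 1B / 1F: stands in for the access point 90 and the network 70.
    destination.receive(info)


terminal = Device("A")  # terminal device 30, owned by user A
mobile = Device("B")    # mobile terminal 20, owned by user B

send(terminal.create("voice", b"<recorded audio>"), mobile)  # 1A, 1B, 1C
heard = mobile.play_oldest()                                 # 1D
send(mobile.create("text", "On my way".encode()), terminal)  # 1E, 1F, 1G
reply = terminal.play_oldest()                               # 1H
```

Keeping received messages in the device's own memory until the owner presses the reproduction button mirrors the description, in which reproduction is triggered by the user rather than by arrival.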

Next, the functional configuration and operation of the reproduction system 1 of the present embodiment will be described in detail.

<Description of the functional configuration of the reproduction system 1>
FIG. 4 is a block diagram showing an example of the functional configuration of the reproduction system 1.
Of the various functions of the reproduction system 1, only those relevant to the present embodiment are selected and shown here.
In the reproduction system 1, the mobile terminal 20 includes a transmission/reception unit 21 that transmits and receives information, a display unit 22 that displays images, an input unit 23 for inputting information, and a voice output unit 24 that outputs voice.

The transmission/reception unit 21 is, for example, a communication I/F, and exchanges information with the terminal device 30 via the access point 90 and the network 70.

The display unit 22 is a display mechanism that shows various kinds of information, for example a display such as a touch panel.
The input unit 23 is an input mechanism for entering text, voice, and the like, for example the touch panel described above, input buttons, or a keyboard. The input unit 23 also serves as the input mechanism for the voice of user B, for example a microphone.
The voice output unit 24 is a speaker that outputs voice.

The terminal device 30 includes a transmission/reception unit 31 that transmits and receives transmission information, a storage unit 32 that stores transmission information, an acquisition unit 33 that acquires voice, a reproduction unit 34 that reproduces voice, a display unit 35 that displays images, a grasping unit 36 that grasps feature information of the person performing the reproduction operation, a setting unit 37 that determines the voice quality of the voice, an operation unit 38 that accepts operations by user A, a changing unit 39 that changes the voice quality of the voice, and a reproduction control unit 40 that controls reproduction of the voice.

The transmission/reception unit 31 receives transmission information from user B, who owns the mobile terminal 20, and transmits transmission information from user A to user B. The transmission/reception unit 31 is, for example, a communication I/F included in the control unit 306, and also includes the communication antenna 301. The transmission/reception unit 31 exchanges transmission information between the terminal device 30 and the mobile terminal 20 via the access point 90 and the network 70.

The storage unit 32 stores received transmission information and outputs it when needed. The storage unit 32 is, for example, a memory, an HDD (Hard Disk Drive), or an SSD (Solid State Drive), and is included in the control unit 306.

The acquisition unit 33 is an example of acquisition means and acquires sound such as the voice of user A. The acquisition unit 33 corresponds to the microphone 302. Any of various existing microphone types, such as dynamic or condenser microphones, may be used. The microphone is preferably an omnidirectional MEMS (Micro Electro Mechanical Systems) microphone.
The reproduction unit 34 is an example of reproducing means and reproduces voice sent from user B as transmission information. The reproduction unit 34 corresponds to the speaker 303, which reproduces voice.
The display unit 35 is also an example of reproducing means and reproduces text sent from user B as transmission information. The display unit 35 is, for example, a touch panel as described above. In this case, the display unit 35 includes a display that shows various kinds of information and a position detection sheet that detects the position touched by a finger, a stylus pen, or the like. Any means of detecting the touched position may be used, such as a resistive method that detects the pressure of the touch or a capacitive method that detects the static electricity of the touching object.

The grasping unit 36 is an example of grasping means and grasps feature information of the person who performs the reproduction operation for reproduction by the reproduction unit 34 or the display unit 35. Here, the person who performs the reproduction operation is user A, the listener of user B's voice. "Feature information" is information that characterizes the way user A, the listener, speaks. One example is the age of user A: because the words people use differ with age, age characterizes the way user A speaks. Another example is the gender of user A: because the words used differ between men and women, gender characterizes the way user A speaks. A further example is the residential area of user A: because the area of residence affects whether a person uses the standard language or the dialect of that area, the residential area characterizes the way user A speaks.
The setting unit 37 is an example of setting means and determines, on the basis of the feature information, the setting for reproduction of the transmission information by the reproduction unit 34 or the display unit 35. For example, the setting unit 37 sets the voice quality of the voice reproduced by the reproduction unit 34 to suit user A, the listener. This setting can be determined from the feature information of user A grasped by the grasping unit 36. Alternatively, user A may decide the setting personally and enter it into the terminal device 30, for example by operating the operation unit 38 described next.
The grasping unit 36 and the setting unit 37 are implemented, for example, by the CPU and are included in the control unit 306.

The operation unit 38 accepts operations by user A for recording and reproduction. The operation unit 38 corresponds to the operation buttons 305. The operation unit 38 may also consist of a keyboard, a mouse, or the like.
The changing unit 39 is an example of changing means and changes the transmission information to the setting determined by the setting unit 37. The changing unit 39 changes the voice quality of the voice sent as transmission information to suit user A. It also converts between voice and text.
The reproduction control unit 40 controls reproduction of voice and text. The reproduction control unit 40 is implemented, for example, by the CPU and is included in the control unit 306.
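The way the grasping, setting, changing, and reproduction units hand data to one another can be sketched as a small pipeline. This is a minimal illustration, not the specification's implementation: the `Features` and `Setting` fields, the `reproduce` function, and the age threshold of 18 used in the stand-in rule are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Features:
    """Feature information of the person performing the reproduction operation."""
    age: Optional[int] = None
    gender: Optional[str] = None
    region: Optional[str] = None


@dataclass
class Setting:
    """Reproduction setting chosen for that person (illustrative fields)."""
    output_form: str = "voice"      # "voice" or "text"
    voice_quality: str = "original"


def reproduce(message, listener_voice, grasp, decide, change, play):
    """Run one received message through the four functional blocks in order."""
    features = grasp(listener_voice)       # grasping unit 36
    setting = decide(features)             # setting unit 37
    converted = change(message, setting)   # changing unit 39
    return play(converted, setting)        # reproduction unit 34 / display unit 35


# Trivial stand-ins for each block, only to show the data flow.
result = reproduce(
    message="hello",
    listener_voice=[0.0, 0.1],
    grasp=lambda voice: Features(age=8),
    decide=lambda f: Setting(output_form="text")
    if f.age is not None and f.age < 18  # hypothetical threshold
    else Setting(),
    change=lambda msg, s: {"form": s.output_form, "body": msg},
    play=lambda converted, s: converted,
)
```

Passing each block in as a callable keeps the sketch neutral about how the units are realized; in the description they are functions of the CPU in the control unit 306.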

<Description of the operation of the reproduction system 1>
Next, the operation of the reproduction system 1 of the present embodiment will be described in more detail.
FIG. 5 is a flowchart illustrating an example of the operation of the reproduction system 1 of the present embodiment.
First, user A operates the operation unit 38 of the terminal device 30 and records a voice using the acquisition unit 33 (step 101). The voice information is stored in the storage unit 32 as transmission information (step 102). The transmission/reception unit 31 then transmits the transmission information to the mobile terminal 20 (step 103). The transmission information may include, for example, the date and time at which it was created.

Meanwhile, in the terminal device 30, the grasping unit 36 grasps the feature information of user A on the basis of the voice acquired by the acquisition unit 33 (step 104).
The following describes how the age, gender, and residential area of user A are estimated as the feature information.

(Estimation of the age of user A)
FIG. 6 is a diagram showing an example of a method of estimating the age of user A.
FIG. 6 shows frequency spectra of voices. The horizontal axis represents frequency and the vertical axis represents spectral intensity. That is, a frequency spectrum shows, for the frequency components contained in a voice, the relationship between frequency and intensity.
The figure shows example frequency spectra of the voices of persons aged 40, 50, 60, and 70. As illustrated, the spectral intensity at 4 kHz and above increases with age. In practice, this increase in spectral intensity at 4 kHz and above makes the voice hoarse, that is, increasingly husky.
The grasping unit 36 can therefore estimate the age of user A by examining the spectral intensity at 4 kHz and above in the frequency spectrum.
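As a rough illustration of the measurement behind this age estimate, the sketch below computes the share of spectral power at or above 4 kHz in a short frame. The function names are hypothetical, and a plain discrete Fourier transform is used as a stand-in because the specification does not name a spectral analysis method. Turning the resulting ratio into an age would need reference spectra such as those in FIG. 6, which are not reproduced here, so no mapping to age is attempted.

```python
import cmath
import math
from typing import List


def power_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT power spectrum for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def high_band_ratio(frame: List[float], sample_rate: int, cutoff_hz: float = 4000.0) -> float:
    """Fraction of the frame's spectral power that lies at or above cutoff_hz."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0
    n = len(frame)
    high = sum(p for k, p in enumerate(power) if k * sample_rate / n >= cutoff_hz)
    return high / total


# Synthetic check: a 500 Hz tone alone, then mixed with an equal 5 kHz tone.
sr, n = 16000, 256
low_only = [math.sin(2 * math.pi * 500 * t / sr) for t in range(n)]
mixed = [
    math.sin(2 * math.pi * 500 * t / sr) + math.sin(2 * math.pi * 5000 * t / sr)
    for t in range(n)
]
ratio_low = high_band_ratio(low_only, sr)
ratio_mixed = high_band_ratio(mixed, sr)
```

A higher ratio corresponds to the stronger high-frequency content that the description associates with older speakers.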

(Estimation of the gender of user A)
FIGS. 7(a) to 7(c) are diagrams showing an example of a method of estimating the gender of user A.
The voice signal shown in FIG. 7(a) can be separated into two parts: the fundamental frequency shown in FIG. 7(b) and the aperiodic component shown in FIG. 7(c). The fundamental frequency represents the pitch of the voice. For example, the fundamental frequency of a male voice is 100 Hz to 200 Hz, and that of a female voice is 250 Hz to 500 Hz. The aperiodic component represents the timbre of the voice. Therefore, the gender of user A can be estimated from the fundamental frequency.
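The estimation from the fundamental frequency can be sketched as follows. The frequency ranges (100 Hz to 200 Hz and 250 Hz to 500 Hz) are taken from the text. The autocorrelation-based pitch estimator, the function names, the search limits `fmin` and `fmax`, and the "unknown" result for values outside both ranges are assumptions for illustration only.

```python
import math


def estimate_f0(samples, sample_rate, fmin=80.0, fmax=600.0):
    """Estimate the fundamental frequency from the autocorrelation peak.

    Searches lags corresponding to fmin..fmax and returns
    sample_rate / best_lag, or None if no positive peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            samples[t] * samples[t + lag] for t in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def classify_by_f0(f0_hz):
    """Classify using the ranges given in the text; otherwise 'unknown'."""
    if f0_hz is None:
        return "unknown"
    if 100.0 <= f0_hz <= 200.0:
        return "male"
    if 250.0 <= f0_hz <= 500.0:
        return "female"
    return "unknown"


def tone(freq_hz, sample_rate, n):
    """Generate a sine wave as a stand-in for a voiced speech frame."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


sr = 8000
print(classify_by_f0(estimate_f0(tone(125, sr, 800), sr)))
print(classify_by_f0(estimate_f0(tone(250, sr, 800), sr)))
```

Real speech is not a pure tone, so a practical estimator would need voicing detection and smoothing over many frames. The sketch only shows how the stated ranges could be applied once a fundamental frequency is available.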

(Estimation of the residential area of user A)
In this case, by providing the grasping unit 36 with a GPS (Global Positioning System) function, the position of the terminal device 30 can be obtained, and the residential area of user A can be estimated from it. Instead of, or in combination with, the GPS function, the position of the terminal device 30 may be obtained using the position information of a Wi-Fi access point.

Returning to FIG. 5, in the terminal device 30, the setting unit 37 determines, based on the characteristic information, the settings for the transmission information to be reproduced by the reproduction unit 34 (step 105).
FIG. 8 is a diagram showing the characteristic information and the corresponding ways of changing the settings.
FIG. 8 is a table summarizing what the setting unit 37 sets when the grasping unit 36 has estimated the age, gender, and residential area of user A as the characteristic information.
First, when the settings for the transmission information are made according to the age of user A, the voice quality can be changed to suit user A. When user A is a young person, such as a minor or an infant, the voice quality is changed to, for example, a parent's voice or a mechanical synthetic voice. For an infant, hearing a parent's voice provides a sense of security. For a child, some research results indicate that instructions are more readily followed when given in a machine voice than in a parent's voice, so the voice quality may instead be changed to a machine voice. The wording of the voice may also be changed to suit user A. For example, the wording may be changed from the usual polite request form "~してください。" ("please do ...") to an imperative form such as "~しなさい。" ("do ..."). Furthermore, young people may understand the content of the transmission information faster and more easily as text than as voice, so the transmission information may be converted from voice to text.

On the other hand, when user A is an elderly person, for example 60 years of age or older, the wording of the voice may be converted into polite language. For example, the casual "おかえり。" ("welcome back") is changed to the polite "おかえりなさい。". Words that are not easy for elderly people to understand, such as youth slang, may also be converted into wording that elderly people can readily understand. Furthermore, elderly people may understand the content of the transmission information more easily as voice than as text, so the transmission information may be converted from text to voice.

When the settings for the transmission information are made according to the gender of user A, the wording of the voice can be changed to suit user A. For example, when user A is male, feminine wording may be converted into masculine wording, and when user A is female, masculine wording may be converted into feminine wording.

Furthermore, when the settings for the transmission information are made according to the residential area of user A, the wording of the voice can be converted into the dialect used in the area where user A lives. For example, the voice sent from user B to user A can be converted from standard language into a dialect, or from a dialect into standard language.
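The correspondence between characteristic information and settings summarized for FIG. 8 can be sketched as a simple lookup. The 60-and-over threshold for elderly users comes from the text. The under-18 threshold for a young person, the setting labels, and the names `Features` and `decide_settings` are hypothetical. The text presents each change as optional ("may"), whereas this sketch simply lists every applicable candidate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Features:
    """Estimated characteristic information; None means 'not estimated'."""
    age: Optional[int] = None
    gender: Optional[str] = None  # "male" or "female"
    region: Optional[str] = None  # e.g. a region name derived from position


def decide_settings(features: Features) -> List[str]:
    """Return candidate reproduction settings for the given features."""
    settings: List[str] = []
    if features.age is not None:
        if features.age < 18:  # assumption: the text gives no numeric threshold
            settings += [
                "voice_quality:parent_or_synthetic",
                "wording:imperative",
                "modality:voice_to_text",
            ]
        elif features.age >= 60:  # example threshold given in the text
            settings += [
                "wording:polite",
                "wording:replace_youth_slang",
                "modality:text_to_voice",
            ]
    if features.gender == "male":
        settings.append("wording:feminine_to_masculine")
    elif features.gender == "female":
        settings.append("wording:masculine_to_feminine")
    if features.region:
        settings.append("wording:dialect:" + features.region)
    return settings


print(decide_settings(Features(age=70, gender="female", region="Osaka")))
```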

The characteristic information described above can be set from a single voice acquisition, but this is not a limitation. That is, the voice may be acquired multiple times and the settings updated successively, which further improves the accuracy of the settings. For example, to set up a parent's voice, the voice of the target person is acquired multiple times to build a voice library, so that the reproduced voice gradually comes closer to the voice quality of the target person.

In addition, because the characteristic information is set from the acquired voice, it is not set when, for example, the transmission information is created only as text. In this case, the terminal device 30 may, for example, notify the user by voice guidance or the like that the settings have not been made and prompt the user to input a voice for making them.

Returning again to FIG. 5, the transmission information transmitted from the terminal device 30 is sent to the mobile terminal 20 via the access point 90 and the network 70. In the mobile terminal 20, the transmission/reception unit 21 acquires the transmission information (step 106). The CPU of the mobile terminal 20 then stores the transmission information in the memory (step 107).

Meanwhile, user B presses a play button or the like in a dedicated application, using an input mechanism of the mobile terminal 20 such as one corresponding to the display unit 22 and the input unit 23. As a result, the voice transmitted from user A is reproduced by the speaker serving as the voice output unit 24 of the mobile terminal 20 (step 108).

User B then creates transmission information for replying to user A (step 109). The transmission information is created in the same way as described above for user A, by recording the voice of user B using the microphone corresponding to the input unit 23. The recorded voice is stored in the memory (step 110). At this time, the transmission information can also be created as text using the input unit 23.

The transmission/reception unit 21 of the mobile terminal 20 then transmits this transmission information to the terminal device 30 (step 111). The transmission information is sent to the terminal device 30 via the transmission/reception unit 21 of the mobile terminal 20, the access point 90, and the network 70.
In the terminal device 30, the transmission/reception unit 31 receives the transmission information (step 112). The received transmission information is then stored in the storage unit 32 (step 113).

Further, in the terminal device 30, user A operates the operation unit 38 to have the reproduction unit 34 reproduce the transmission information returned from user B. The reproduction of the transmission information is controlled by the reproduction control unit 40. At this time, the changing unit 39 converts the transmission information according to the settings determined by the setting unit 37, and the converted transmission information is reproduced (step 114). That is, the voice quality and wording are changed. Depending on the settings, the changing unit 39 may also convert among text, human voice, and mechanical synthetic voice.

For the changing unit 39 to change the wording, for example, pairs of wording before conversion and wording after conversion are registered in advance, and when wording before conversion is detected by voice recognition, that portion is replaced with the corresponding wording after conversion.
To convert voice into text, the changing unit 39 performs voice recognition on the voice and converts it into text. To convert text into voice, a method of performing voice synthesis based on the text can be used.
To convert the voice quality, the changing unit 39, for example, first separates the voice into the fundamental frequency and the aperiodic component described with reference to FIG. 7. The voice signal is also Fourier-transformed to obtain its frequency spectrum, from which the spectral envelope is extracted. The spectral envelope is obtained by further Fourier-transforming the logarithm of the frequency spectrum; it is, so to speak, a spectrum of the spectrum.
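The wording change by pre-registered pairs can be sketched as a single-pass substitution over recognized text. The function name `replace_wording` and the registered pair are illustrative assumptions; the voice recognition step itself is outside the sketch, which starts from text that has already been recognized.

```python
import re


def replace_wording(text, pairs):
    """Replace each registered 'before' wording with its 'after' wording.

    pairs maps wording before conversion to wording after conversion.
    A single regex pass is used so that replaced text is not matched
    again, and longer entries are tried first when entries overlap.
    """
    if not pairs:
        return text
    keys = sorted(pairs, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: pairs[m.group(0)], text)


# Hypothetical registration based on the polite-language example in the text.
registered = {"おかえり。": "おかえりなさい。"}
print(replace_wording("おかえり。ごはんだよ。", registered))
```

Including the sentence-final "。" in the registered entry keeps an already polite "おかえりなさい。" from being matched again.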

FIG. 9 is a diagram showing an example of the spectral envelope.
In FIG. 9, the horizontal axis represents frequency and the vertical axis represents spectral intensity. In the figure, the line denoted Ss is the frequency spectrum, and the line denoted Sh is the spectral envelope. The spectral envelope Sh represents the gradual variation of the frequency spectrum Ss and is obtained by separating out the fine variations (spectral fine structure) of Ss from Ss. The spectral envelope Sh represents the characteristics of the human vocal tract. Therefore, by converting the spectral envelope Sh, the spectral envelope of a different vocal tract can be reproduced; that is, a voice quality different from the original can be obtained. The pitch of the voice can be changed by changing the fundamental frequency. The voice quality also changes when the magnitude of the aperiodic component is changed: the smaller the aperiodic component, the less hoarse the voice, and the larger it is, the more hoarse the voice. Resynthesizing the waveform after these conversions yields a voice with changed voice quality.
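One common way to separate a smooth envelope from the fine structure is cepstral smoothing. The text gives no formulas, so the notation below ($x[n]$ for a frame of $N$ samples, $X[k]$ for its spectrum, $c[q]$ for the cepstrum, and the cutoff $Q$) is an assumption, and the embodiment may use a different procedure.

```latex
% Frequency spectrum of an N-sample frame (S_s corresponds to |X[k]|)
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}

% Cepstrum: inverse transform of the log-magnitude spectrum
c[q] = \frac{1}{N} \sum_{k=0}^{N-1} \log \lvert X[k] \rvert \, e^{\,j 2\pi k q / N}

% Keep only the low-quefrency part and transform back to obtain the envelope S_h
\log S_h[k] = \sum_{q=0}^{N-1} w[q]\, c[q]\, e^{-j 2\pi k q / N},
\qquad
w[q] =
\begin{cases}
1, & q \le Q \ \text{or}\ q \ge N - Q \\
0, & \text{otherwise}
\end{cases}
```

Under this formulation the remainder $\log \lvert X[k] \rvert - \log S_h[k]$ corresponds to the spectral fine structure, and replacing $S_h$ before resynthesis changes the apparent vocal tract characteristics.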

The changing unit 39 may also perform frequency conversion of the voice. Elderly people can often hear the low and middle ranges but have difficulty hearing the high range. Sounds in the high range are therefore converted in frequency into the middle range so that the voice remains audible even if it contains high-range sounds.

FIGS. 10(a) and 10(b) are diagrams showing frequency conversion of voice.
Here, the horizontal axis represents frequency and the vertical axis represents sound pressure.
FIG. 10(a) shows the case where frequency compression is performed as the frequency conversion. In this case, the frequency region of 4000 Hz and above, treated as the high range, of the voice waveform shown by the solid line is compressed to give the waveform shown by the dotted line.
FIG. 10(b) shows the case where frequency shifting is performed as the frequency conversion. In this case, the frequency region of 4000 Hz and above, treated as the high range, of the voice waveform shown by the solid line is slid (shifted) into the middle range to give the waveform shown by the dotted line.
Performing such frequency conversion makes sounds in a region that would otherwise be inaudible audible, so the voice becomes easier to hear.
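The two conversions of FIG. 10 can be illustrated as mappings applied to the frequency of each spectral component. The 4000 Hz boundary is from the text. The compression ratio, the shift amount, the function names, and the choice to compress toward the boundary are assumptions, since the text does not specify them.

```python
def compress_frequency(f_hz, knee_hz=4000.0, ratio=2.0):
    """Frequency compression: squeeze the region at or above knee_hz.

    Frequencies below knee_hz are unchanged; the distance above knee_hz
    is divided by ratio (hypothetical value).
    """
    if f_hz < knee_hz:
        return f_hz
    return knee_hz + (f_hz - knee_hz) / ratio


def shift_frequency(f_hz, knee_hz=4000.0, shift_hz=2000.0):
    """Frequency shifting: slide the region at or above knee_hz downward.

    shift_hz is a hypothetical amount chosen so that high-range
    components land in the middle range.
    """
    if f_hz < knee_hz:
        return f_hz
    return f_hz - shift_hz


for f in (1000.0, 4000.0, 6000.0, 8000.0):
    print(f, compress_frequency(f), shift_frequency(f))
```

With these parameters, a 6000 Hz component maps to 5000 Hz under compression and to 4000 Hz under shifting. An actual implementation would apply such a mapping to spectral bins and resynthesize the waveform.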

<Modification>
In the present embodiment, the setting unit 37 grasps the situation around its own device based on the sound acquired by the acquisition unit 33 and makes settings based on the grasped situation.
For example, the setting unit 37 sets the volume at which voice is reproduced according to the time of day. For example, the volume is lowered at night.
The setting unit 37 may also grasp the situation around its own device based on the sound acquired by the acquisition unit 33 and make settings accordingly. For example, when the surroundings of the device are noisy, the volume is raised.
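The volume setting in this modification can be sketched as follows. Only the two rules (lower at night, higher when the surroundings are noisy) come from the text. The function name `decide_volume`, the night hours, the 60 dB noise threshold, the step size, and the 0 to 10 volume scale are all hypothetical.

```python
def decide_volume(hour, ambient_db, base=5, step=2):
    """Return a playback volume on a hypothetical 0-10 scale.

    hour: local hour of day (0-23).
    ambient_db: estimated ambient sound level from the acquired sound.
    """
    volume = base
    if hour >= 22 or hour < 6:  # assumption: what counts as "night"
        volume -= step
    if ambient_db >= 60.0:  # assumption: what counts as "noisy"
        volume += step
    return max(0, min(10, volume))


print(decide_volume(hour=23, ambient_db=40.0))
print(decide_volume(hour=14, ambient_db=70.0))
```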

The reproduction system 1 described in detail above can change the transmission information to suit the person performing reproduction when the transmission information is reproduced.

In the embodiment described in detail above, the setting unit 37 makes the settings automatically, so to speak, but the settings may also be changed manually. In this case, the menu button 305c shown in FIG. 2 is pressed, and the settings are changed from the menu displayed on the display unit 35.
In the embodiment described in detail above, the reproduction system 1 is configured by connecting the mobile terminal 20 and the terminal device 30 via the network 70 and the access point 90, but the terminal device 30 alone can also be regarded as a reproduction system. The processing performed by the terminal device 30 can likewise be performed by the mobile terminal 20. The mobile terminal 20 can therefore also be regarded as a reproduction system.

Furthermore, in the example described above, the terminal device 30 is a robot, but it is not limited to this. For example, it may be a mobile terminal such as a mobile computer, a mobile phone, a smartphone, or a tablet, or it may be a desktop computer.
Furthermore, in the example described above, the terminal device 30 and the mobile terminal 20 are connected peer-to-peer via the network 70 and the access point 90, but the connection is not limited to this, and they may be connected via a server. In this case, the processing performed by the terminal device 30 can likewise be performed by the server. This server can therefore also be regarded as a reproduction system.

<Description of program>
The processing performed by the terminal device 30 in the present embodiment described above is provided, for example, as a program such as application software. This processing is realized by cooperation between software and hardware resources. That is, a CPU (not shown) inside the computer provided in the terminal device 30 executes a program that realizes each of the functions described above, thereby realizing those functions.

Accordingly, the processing performed by the terminal device 30 in the present embodiment can also be regarded as a program for causing a computer to realize: a reproduction function of reproducing transmitted transmission information; a grasping function of grasping characteristic information of a person who performs a reproduction operation for reproduction by the reproduction function; a setting function of determining, based on the characteristic information, a setting for reproduction of the transmission information by the reproduction function; and a changing function of changing the transmission information in accordance with the setting determined by the setting function.
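The relationship among the four functions can be sketched as a small pipeline. The name `make_reproducer` and the toy stand-ins passed to it are hypothetical; each stand-in would be replaced by actual grasping, setting, changing, and reproduction processing.

```python
def make_reproducer(grasp, decide, change, play):
    """Compose the four functions into one reproduction operation.

    grasp:  listener voice -> characteristic information
    decide: characteristic information -> settings
    change: (transmission information, settings) -> changed information
    play:   changed information -> reproduced result
    """
    def reproduce(transmission_info, listener_voice):
        features = grasp(listener_voice)
        settings = decide(features)
        changed = change(transmission_info, settings)
        return play(changed)

    return reproduce


# Toy stand-ins for illustration only; none of these reflect real processing.
reproduce = make_reproducer(
    grasp=lambda voice: {"age": 70},
    decide=lambda feats: {"polite": feats.get("age", 0) >= 60},
    change=lambda info, s: (
        info.replace("おかえり。", "おかえりなさい。") if s["polite"] else info
    ),
    play=lambda info: info,
)

print(reproduce("おかえり。", b""))
```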

The program that realizes the present embodiment can be provided not only via communication means but also stored on a recording medium such as a CD-ROM.

Although the present embodiment has been described above, the technical scope of the present invention is not limited to the scope described in the above embodiment. It is clear from the claims that embodiments obtained by making various changes or improvements to the above embodiment are also included in the technical scope of the present invention.

DESCRIPTION OF SYMBOLS: 1 ... reproduction system, 20 ... mobile terminal, 30 ... terminal device, 31 ... transmission/reception unit, 32 ... storage unit, 33 ... acquisition unit, 34 ... reproduction unit, 35 ... display unit, 36 ... grasping unit, 37 ... setting unit, 38 ... operation unit, 39 ... changing unit, 40 ... reproduction control unit

Claims

A reproduction system comprising:
reproduction means for reproducing transmitted transmission information;
grasping means for grasping characteristic information of a person who performs a reproduction operation for reproduction by the reproduction means;
setting means for determining, based on the characteristic information, a setting for reproduction of the transmission information by the reproduction means; and
changing means for changing the transmission information in accordance with the setting determined by the setting means.

The reproduction system according to claim 1, further comprising acquisition means for acquiring the voice of the person,
wherein the grasping means grasps the characteristic information based on the voice acquired by the acquisition means.

The reproduction system according to claim 1, wherein the changing means changes the voice quality of the voice transmitted as the transmission information to suit the person.

The reproduction system according to claim 1, wherein the changing means changes the wording of the voice transmitted as the transmission information to suit the person.

The reproduction system according to claim 1, wherein the changing means converts among text, human voice, and mechanical synthetic voice according to the setting.

The reproduction system according to claim 1, wherein the changing means performs frequency conversion of the voice according to the setting.

The reproduction system according to claim 1, wherein the setting means grasps the situation around its own device and makes the setting based on the grasped situation.

A program for causing a computer to realize:
a reproduction function of reproducing transmitted transmission information;
a grasping function of grasping characteristic information of a person who performs a reproduction operation for reproduction by the reproduction function;
a setting function of determining, based on the characteristic information, a setting for reproduction of the transmission information by the reproduction function; and
a changing function of changing the transmission information in accordance with the setting determined by the setting function.