JP2010166324A

JP2010166324A - Portable terminal, voice synthesizing method, and program for voice synthesis

Info

Publication number: JP2010166324A
Application number: JP2009006810A
Authority: JP
Inventors: Yuichi Kameshige; 祐一亀重
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-01-15
Filing date: 2009-01-15
Publication date: 2010-07-29

Abstract

PROBLEM TO BE SOLVED: To provide a portable terminal capable of voice communication that can switch voice synthesis based on a specific keyword, filter the raw voice, and automatically register specific keywords.

SOLUTION: The portable terminal has a voice analyzing means and a voice converting means in the stage preceding a voice synthesizing means. The voice synthesizing means converts a received voice into a synthesized voice that imitates a particular person, such as a celebrity. The terminal determines whether the source of the received voice is a party for whom voice synthesis is to be switched on, and analyzes the emotion expressed in the received voice. When the converting means recognizes a specific keyword preregistered in a keyword database, it stops the voice synthesis and outputs the received voice as it is. When the keyword is registered as an NG word, the predetermined NG word is notified to the other party. When the analysis finds that the received voice expresses a specific emotion such as anger, the voice synthesis is stopped and a voice obtained by filtering the original voice is output.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a portable terminal, and more particularly to a portable terminal that outputs voice.

Currently, in the revenue of mobile phone carriers, the share of data packets is increasing and the share of voice is decreasing.

Accordingly, mobile phone operators aim to raise the profitability of voice, along with their market share, by offering portable terminals and services with unique applications (application software).

However, conventional mobile phones, even those with a voice synthesis function, cannot switch voice synthesis based on a specific keyword, filter the raw voice, or automatically register specific keywords.

As a related technique, Japanese Patent Laid-Open No. 10-097267 (Patent Document 1) discloses a voice quality conversion method and apparatus. This related technique converts the voice quality of the input voice, using an analysis-synthesis method or the like, so that the quality shifts in a direction matching the user's preference, and synthesizes voices of different quality. Its aim is to output good voice, synthesized voice of a desired quality, and synthesized voice of various qualities.

Japanese Patent Laid-Open No. 10-097267

An object of the present invention is to provide a portable terminal that analyzes voice and performs synthesis and conversion.

The portable terminal of the present invention includes a voice analysis means, a conversion means, and a voice synthesis means. The voice analysis means determines whether the identification information of the other party in a voice communication is registered as a voice synthesis target. If it is, the voice analysis means analyzes the spectrum of the voice signal of the voice communication and determines whether the voice based on the voice signal corresponds to a voice pattern indicating a specific emotion. When the voice corresponds to the voice pattern, the conversion means performs a filtering process on the voice and outputs the filtered voice. When the voice does not correspond to the voice pattern, the conversion means determines whether a keyword included in the voice signal is an NG word (prohibited word) and, if so, notifies the source of the voice signal that the keyword is an NG word. The voice synthesis means performs synthesis of a predetermined voice on the voice based on the voice signal and outputs the synthesized sound. Examples of voice communication include telephone calls, reception of television and radio broadcasts, reception of streaming data, and acquisition and playback of voice data from a storage device or the like. The source of the voice signal is the sending party when the voice signal is received from outside, and the user of the portable terminal when the voice signal is transmitted to the outside. In practice, however, the invention is not limited to these examples.
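
As a minimal illustrative sketch, the decision flow summarized above could be written as follows. The names `Action` and `decide_action` and the way the inputs are represented are assumptions made for illustration; they do not appear in the specification. Passing the voice through unchanged for an unregistered party follows step S104 of the first embodiment, and returning a single action per stretch of voice is a simplification.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "output the received voice unchanged"
    FILTER = "output the filtered original voice"
    NOTIFY_NG = "notify the source that the keyword is an NG word"
    SYNTHESIZE = "output the synthesized voice"


def decide_action(caller_id, synthesis_targets, matches_emotion_pattern,
                  keywords, ng_words):
    """Return the action for one stretch of voice (hypothetical helper)."""
    # A party not registered as a voice synthesis target is left untouched.
    if caller_id not in synthesis_targets:
        return Action.PASS_THROUGH
    # A voice matching the emotion pattern is filtered rather than synthesized.
    if matches_emotion_pattern:
        return Action.FILTER
    # Otherwise, an NG word among the keywords triggers a notification.
    if any(word in ng_words for word in keywords):
        return Action.NOTIFY_NG
    return Action.SYNTHESIZE
```

For example, `decide_action("0901", {"0901"}, False, ["hello"], {"loan"})` selects `Action.SYNTHESIZE`, because the party is registered, no emotion pattern matched, and no NG word was heard.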

In the voice synthesis method of the present invention, it is determined whether the identification information of the other party in a voice communication is registered as a voice synthesis target. If the identification information is registered as a voice synthesis target, the spectrum of the voice signal of the voice communication is analyzed to determine whether the voice based on the voice signal corresponds to a voice pattern indicating a specific emotion. When the voice corresponds to the voice pattern, a filtering process is performed on the voice and the filtered voice is output. When the voice does not correspond to the voice pattern, it is determined whether a keyword included in the voice signal is an NG word and, if so, the source of the voice signal is notified that the keyword is an NG word. In addition, synthesis of a predetermined voice is performed on the voice based on the voice signal, and the synthesized sound is output.

The voice synthesis program of the present invention causes a computer to execute the steps of: determining whether the identification information of the other party in a voice communication is registered as a voice synthesis target; if the identification information is registered as a voice synthesis target, analyzing the spectrum of the voice signal of the voice communication and determining whether the voice based on the voice signal corresponds to a voice pattern indicating a specific emotion; when the voice corresponds to the voice pattern, performing a filtering process on the voice and outputting the filtered voice; when the voice does not correspond to the voice pattern, determining whether a keyword included in the voice signal is an NG word and, if so, notifying the source of the voice signal that the keyword is an NG word; and performing synthesis of a predetermined voice on the voice based on the voice signal and outputting the synthesized sound.

The invention improves the utility value of voice communication functions, including the telephone function.

A block diagram showing a configuration example of the portable terminal in the first and second embodiments of the present invention.
A flowchart showing the operation of the portable terminal in the first embodiment of the present invention.
A flowchart showing the operation of the portable terminal in the second embodiment of the present invention.
A block diagram showing a configuration example of the portable terminal in the third and fourth embodiments of the present invention.
A flowchart showing the operation of the portable terminal in the third embodiment of the present invention.
A flowchart showing the operation of the portable terminal in the fourth embodiment of the present invention.

<First Embodiment>
Hereinafter, a first embodiment of the present invention will be described with reference to the accompanying drawings.
Referring to FIG. 1, the portable terminal of the present invention includes an antenna unit 11, a receiving unit 12, a voice analysis function unit 13, a conversion function unit 14, a voice synthesis function unit 15, a voice output unit 16, a voice input unit 17, a transmission unit 18, a telephone number book 21, a keyword database 22, and a voice database 23.

Here, a mobile phone is assumed as an example of the portable terminal. In practice, however, the portable terminal may be a PC (personal computer), a mobile notebook PC, a thin client terminal, a car navigation system, a portable music player, a portable game machine, a home game console, an interactive TV, a digital tuner, a digital recorder, an information home appliance, an OA (office automation) device, or the like. Essentially, the portable terminal of the present invention may be any electronic device that can output voice.

The antenna unit 11 transmits and receives signals by radio waves. Wireless communication is assumed here as the communication method of the portable terminal, but wired communication may be used in practice.

The receiving unit 12 converts the external signal received by the antenna unit 11 into a voice signal. Here, the receiving unit 12 demodulates the high-frequency signal received by the antenna unit 11 and converts it into a baseband signal. A baseband signal is the signal before modulation, or the information signal after demodulation (voice, video, digital data, etc.), in a system that performs modulation and demodulation. Here, the baseband signal is a voice signal.

The voice analysis function unit 13 determines whether identification information such as the telephone number of the calling party is registered in the telephone number book 21 as a voice synthesis target number. The voice analysis function unit 13 also analyzes the spectrum of the voice signal and determines whether the voice based on the voice signal indicates a specific emotion such as anger. For example, the voice analysis function unit 13 refers to a voice pattern indicating a specific emotion, stored inside or outside the portable terminal, and determines whether the voice pattern based on the voice signal matches it. In practice, however, the determination is not limited to these examples. The voice analysis function unit 13 then sends the voice signal to the conversion function unit 14. At this time, if the voice based on the voice signal indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. When the voice analysis function unit 13 does not determine whether the voice indicates a specific emotion, it sends the voice signal to the conversion function unit 14 unconditionally.
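
The specification does not state how the spectrum is compared with the stored voice pattern. As one possible sketch, assuming the spectrum has already been reduced to a short vector of per-band energies, the comparison could use cosine similarity against each stored pattern. The function names, the band-energy representation, the sample pattern, and the threshold are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent frame matches nothing
    return dot / (norm_a * norm_b)


def matches_emotion(band_energies, stored_patterns, threshold=0.95):
    """Return the name of the first stored pattern the spectrum matches, or None."""
    for name, pattern in stored_patterns.items():
        if cosine_similarity(band_energies, pattern) >= threshold:
            return name
    return None


# Hypothetical stored pattern: energy concentrated in the upper bands.
patterns = {"anger": [0.1, 0.2, 0.9, 1.0]}
loud_same_shape = matches_emotion([0.2, 0.4, 1.8, 2.0], patterns)
opposite_shape = matches_emotion([1.0, 0.9, 0.2, 0.1], patterns)
```

Cosine similarity ignores overall loudness, so the first call matches the stored pattern even though every band is twice as strong, while the second call, whose energy sits in the lower bands, does not.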

The conversion function unit 14 refers to the keyword database 22 and outputs keyword data indicating a specific keyword according to the voice signal analyzed by the voice analysis function unit 13. Here, when a keyword included in the voice signal from the calling party is registered in the keyword database 22 as an NG word, the conversion function unit 14 transmits to that party, via the transmission unit 18, a predetermined NG word or a notification that the keyword is an NG word. For example, when the keyword "XX" included in the voice signal is registered in the keyword database 22 as an NG word, the conversion function unit 14 transmits a notification to the effect that "XX is an NG word" to the other party via the transmission unit 18. When a keyword included in the voice signal is registered in the keyword database 22 as an NG word, the conversion function unit 14 may also convert the voice of the NG word portion into silence or a beep. In addition, if the voice based on the voice signal indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) that outputs a voice obtained by filtering the original voice.
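
A minimal sketch of the NG word notification, assuming that speech recognition has already turned the voice signal into a list of words. The function names and the use of a plain set for the keyword database 22 are assumptions for illustration; the message wording follows the "XX is an NG word" example above.

```python
def find_ng_words(recognized_words, ng_words):
    """Return the NG words found in the recognized words, in order, without repeats."""
    found = []
    for word in recognized_words:
        if word in ng_words and word not in found:
            found.append(word)
    return found


def build_notifications(recognized_words, ng_words):
    """Build one notification message per NG word, to be sent back to the caller."""
    return [f"{word} is an NG word"
            for word in find_ng_words(recognized_words, ng_words)]
```

Each message would then be handed to the transmission unit 18; how the message is encoded for transmission is outside the scope of this sketch.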

The voice synthesis function unit 15 refers to the voice database 23 and synthesizes a predetermined voice according to the received voice signal. In the portable terminal of the present invention, the voice analysis function unit 13 and the conversion function unit 14 are arranged in the stage preceding the voice synthesis function unit 15.

The voice output unit 16 outputs voice based on the received voice signal.

The voice input unit 17 generates a voice signal based on voice input by a user or an application.

The transmission unit 18 converts the voice signal and transmits the converted signal to the outside via the antenna unit 11. Here, the transmission unit 18 modulates the baseband signal, which is a voice signal, into a high-frequency signal and transmits the high-frequency signal to the outside via the antenna unit 11.

The telephone number book 21 stores identification information such as the telephone numbers of calling parties. Here, the telephone number book 21 stores identification information, such as telephone numbers, of those calling parties who are targets of voice synthesis. The telephone number is merely an example; in practice, any identification information that can identify the calling party may be used. The telephone number book 21 may also be provided separately from the ordinary telephone number book of the portable terminal itself.
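
A minimal sketch of such a lookup, assuming the identification information is a telephone number held as a string. The class name, the method names, and the digit-only normalization are assumptions for illustration and are not taken from the specification.

```python
def normalize_number(raw):
    """Keep only the digits so that '090-1234-5678' and '09012345678' compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


class SynthesisPhoneBook:
    """Holds the identifiers of parties registered as voice synthesis targets."""

    def __init__(self):
        self._targets = set()

    def register(self, number):
        self._targets.add(normalize_number(number))

    def is_synthesis_target(self, number):
        return normalize_number(number) in self._targets
```

A set keeps the membership check fast regardless of how many parties are registered, and keeping it separate from the ordinary contact list mirrors the option described above.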

The keyword database 22 stores keyword data indicating specific keywords. Here, sentences of specific conversations and NG words are registered in the keyword database 22. To register a specific keyword, when the operator receiving the call presses a button for operating the conversion function unit 14, the voice analysis function unit 13 analyzes the voice from a predetermined time before the button press up to the press, detects a specific keyword, and automatically registers keyword data indicating the detected keyword in the keyword database 22. For this purpose, the voice analysis function unit 13 temporarily stores the voice from the predetermined time before the button press up to the press. The conversion function unit 14 may also download specific keyword data from a website or the like and register it in the keyword database 22 in response to an operation by the operator. In this case, the conversion function unit 14 may download the keyword database 22 itself from a website or the like.
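
The automatic registration described above can be sketched with a rolling buffer of recently recognized words. This is an illustration under stated assumptions: the class and method names are hypothetical, the voice is assumed to be already recognized into timestamped words, and every word in the window is registered, whereas the specification only says that a specific keyword is detected from that stretch of voice.

```python
from collections import deque


class KeywordRecorder:
    """Keeps recently recognized words so they can be registered on a button press."""

    def __init__(self, window_seconds=5.0):
        self.window_seconds = window_seconds
        self._recent = deque()          # (timestamp, word) pairs, oldest first
        self.keyword_database = set()   # stands in for the keyword database 22

    def hear(self, timestamp, word):
        """Record a recognized word and drop entries older than the window."""
        self._recent.append((timestamp, word))
        self._drop_older_than(timestamp - self.window_seconds)

    def button_pressed(self, timestamp):
        """Register every word heard within the window before the press."""
        self._drop_older_than(timestamp - self.window_seconds)
        registered = [word for _, word in self._recent]
        self.keyword_database.update(registered)
        return registered

    def _drop_older_than(self, cutoff):
        while self._recent and self._recent[0][0] < cutoff:
            self._recent.popleft()
```

Because old entries are discarded as new words arrive, the buffer never holds more than the predetermined time span, which matches the temporary storage described above.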

The voice database 23 stores predetermined voice data. Here, it is assumed that the voice database 23 stores voice data resembling the voice of a specific person. Examples of the specific person include entertainers, other celebrities, and the user's family members and acquaintances. In practice, however, the specific person is not limited to these examples. The voice synthesis function unit 15 downloads predetermined voice data from a website or the like and registers it in the voice database 23 in response to an operation by the operator. In this case, the voice synthesis function unit 15 may download the voice database 23 itself from a website or the like.

Examples of communication lines used by the antenna unit 11 include a mobile phone network, WiMAX, 3G (third-generation mobile phone), the Internet, a LAN (Local Area Network), a wireless LAN, a cable television (CATV) line, a fixed telephone network, a leased line, IrDA (Infrared Data Association), Bluetooth (registered trademark), and a serial communication line. In practice, however, the lines are not limited to these examples.

Examples of the receiving unit 12 and the transmission unit 18 include a network adapter such as a NIC (Network Interface Card) and a communication port such as a connector. In practice, however, they are not limited to these examples.

Examples of the voice analysis function unit 13, the conversion function unit 14, and the voice synthesis function unit 15 include a processing device such as a CPU (Central Processing Unit) or a microprocessor, and a semiconductor integrated circuit (IC) having a similar function. The voice analysis function unit 13, the conversion function unit 14, and the voice synthesis function unit 15 may also be programs (voice recognition software, voice synthesis software, etc.) that cause the portable terminal to execute the respective functions. In practice, however, they are not limited to these examples.

The conversion function unit 14 may also cooperate with an input device for receiving external input such as user operations, or with an output device for notifying the user. Examples of the input device include a keyboard, a keypad, an on-screen keypad, a touch panel, and a tablet. Alternatively, the input device may be an interface (I/F) for acquiring information from an external input device or storage device. Examples of the output device include a display device such as an LCD (liquid crystal display), a PDP (plasma display panel), or an organic EL (electroluminescence) display, a projection device such as a projector that projects the display contents onto a wall or screen, and a printing device such as a printer that prints the display contents on paper or the like. Alternatively, the output device may be an interface for outputting information to an external display device or storage device. In practice, however, they are not limited to these examples.

Examples of the voice output unit 16 include a speaker, earphones, and headphones. The voice output unit 16 may be integrated with a display device. In practice, however, it is not limited to these examples.

Examples of the voice input unit 17 include a sound collector such as a microphone, text-to-speech software, artificial voice software, and an electronic musical instrument. The voice input unit 17 may also be an interface (I/F) for acquiring voice data stored in a storage device inside or outside the portable terminal. In practice, however, it is not limited to these examples.

Examples of the telephone number book 21, the keyword database 22, and the voice database 23 include a semiconductor storage device such as a memory, an external storage device (storage) such as a hard disk, and a storage medium (media). The telephone number book 21, the keyword database 22, and the voice database 23 are not limited to storage devices built into the main body of the portable terminal; they may be storage devices installed in a peripheral device (an external HDD or the like) or an external server (a storage server or the like), or a NAS (Network Attached Storage). In practice, however, they are not limited to these examples.

The operation of the portable terminal of the present invention will be described with reference to FIG. 2.

(1) Step S101
The antenna unit 11 receives a high-frequency signal when the other party places a call.

(2) Step S102
The receiving unit 12 demodulates the received high-frequency signal, converts it into a baseband signal, and sends it to the voice analysis function unit 13. At this time, the receiving unit 12 sends to the voice analysis function unit 13, together with the baseband signal, an information signal indicating the telephone number of the calling party. Here, the baseband signal is a voice signal. The baseband signal may include digital data indicating the telephone number of the calling party.

(3) Step S103
The voice analysis function unit 13 determines whether the telephone number of the calling party is registered in the telephone number book 21 as a voice synthesis target number. The telephone number is merely an example; in practice, any identification information that can identify the calling party may be used.

(4) Step S104
When the telephone number of the calling party is not registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 sends the voice signal directly to the voice output unit 16. The voice output unit 16 outputs voice based on that voice signal. Alternatively, the voice analysis function unit 13 may notify the conversion function unit 14 and the voice synthesis function unit 15 not to operate and then send the voice signal to the voice output unit 16 via the conversion function unit 14 and the voice synthesis function unit 15.

(5) Step S105
When the telephone number of the calling party is registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 recognizes that voice synthesis is ON and operates the conversion function unit 14. For example, on recognizing that voice synthesis is ON, the voice analysis function unit 13 activates the conversion function unit 14 from the OFF state. Alternatively, it changes notification from the voice analysis function unit 13 to the conversion function unit 14 from a prohibited state to a permitted state. That is, on recognizing that voice synthesis is ON, the voice analysis function unit 13 enables the voice signal to be provided to the conversion function unit 14 for the duration of the call with that party.

(6) Step S106
The voice analysis function unit 13 analyzes the spectrum of the voice signal and determines whether the voice based on the voice signal indicates a specific emotion such as anger. The voice analysis function unit 13 then sends the voice signal to the conversion function unit 14. At this time, if the voice based on the voice signal indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. When the voice analysis function unit 13 does not determine whether the voice indicates a specific emotion, it sends the voice signal to the conversion function unit 14 unconditionally.

(7) Step S107
If the voice based on the voice signal indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) that outputs a voice obtained by filtering the original voice. Here, the conversion function unit 14 sends the filtered voice signal to the voice output unit 16, and the voice output unit 16 outputs the filtered voice. Alternatively, the conversion function unit 14 may send the voice signal and a filtering request to the voice synthesis function unit 15, so that the voice synthesis function unit 15 filters the original voice and sends the filtered voice signal to the voice output unit 16.
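
The specification does not say what kind of filter is applied to the original voice. Purely as an assumed example, the sketch below softens the voice by combining a one-pole low-pass filter with a fixed gain reduction. The function name `soften`, its parameters, and the choice of filter are illustrative assumptions, with samples taken to be floats in the range [-1.0, 1.0].

```python
def soften(samples, gain=0.5, smoothing=0.8):
    """Attenuate and low-pass filter a sequence of samples in [-1.0, 1.0]."""
    filtered = []
    previous = 0.0
    for sample in samples:
        # One-pole low-pass filter: each output leans on the previous output,
        # which smooths abrupt changes in the waveform.
        previous = smoothing * previous + (1.0 - smoothing) * sample
        # The fixed gain lowers the overall level.
        filtered.append(gain * previous)
    return filtered
```

With the default parameters the output never exceeds half of full scale, so a shouted passage reaches the listener both quieter and less harsh.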

(8) Step S108
If the voice based on the voice signal does not indicate a specific emotion such as anger, the conversion function unit 14 refers to the keyword database 22 and determines whether the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword. If the voice signal does not contain keyword data indicating a specific keyword, the conversion function unit 14 sends the voice signal to the voice synthesis function unit 15.

（９）ステップＳ１０９
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、音声合成機能部１５に対して、音声合成を中止又は禁止し、元の音声を音声出力部１６に出力する旨の通知を送る。このとき、音声合成機能部１５は、音声信号のうち、特定のキーワードに該当する部分の音声を無音又はビープ音に変換して音声出力部１６に送り、他の部分を元の音声で音声出力部１６に送る。音声出力部１６は、当該音声信号に基づいて、音声出力を行う。 (9) Step S109
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 sends the voice synthesis function unit 15 a notification to stop or prohibit voice synthesis and to output the original voice to the voice output unit 16. At this time, the voice synthesis function unit 15 converts the portion of the voice signal corresponding to the specific keyword into silence or a beep and sends it to the voice output unit 16, and sends the remaining portions to the voice output unit 16 as the original voice. The voice output unit 16 performs voice output based on that voice signal.

（１０）ステップＳ１１０
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該キーワードデータに基づいて、予め決められたＮＧワードを送信部１８に出力する。送信部１８は、アンテナ部１１を介して、電話を掛けて来た相手に、予め決められたＮＧワードを送信する。例えば、変換機能部１４は、キーワードデータに基づいて、特定のキーワードがＮＧワードである旨を、送信部１８を介して、電話を掛けて来た相手に通知する。このとき、変換機能部１４は、ＮＧワードの部分の音声を、無音又はビープ音に変換するようにしても良い。なお、変換機能部１４は、特定のキーワードがＮＧワードである旨を、電話を掛けて来た相手に通知しない場合、特定のキーワードを送信部１８に出力しなくても良い。 (10) Step S110
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 outputs a predetermined NG word to the transmission unit 18 based on the keyword data. The transmission unit 18 transmits the predetermined NG word, via the antenna unit 11, to the party who made the call. For example, based on the keyword data, the conversion function unit 14 notifies the calling party via the transmission unit 18 that the specific keyword is an NG word. At this time, the conversion function unit 14 may convert the voice of the NG word portion into silence or a beep. Note that when the calling party is not to be notified that the specific keyword is an NG word, the conversion function unit 14 need not output the specific keyword to the transmission unit 18.

（１１）ステップＳ１１１
音声合成機能部１５は、音声データベース２３を参照して、受け取った音声信号に応じて、特定の人物に似た音声の合成を行う。ここでは、音声合成機能部１５は、受け取った音声信号に基づく音声の全体に対して、特定の人物に似た音声の合成を行う。 (11) Step S111
The voice synthesis function unit 15 refers to the voice database 23 and, according to the received voice signal, synthesizes a voice resembling a specific person. Here, the voice synthesis function unit 15 applies this synthesis to the entire voice based on the received voice signal.

（１２）ステップＳ１１２
音声合成機能部１５は、音声合成された音声信号を音声出力部１６に送る。音声出力部１６は、音声合成された音声信号に基づいて、スピーカー等から合成音を出力する。 (12) Step S112
The voice synthesis function unit 15 sends the synthesized voice signal to the voice output unit 16. The voice output unit 16 outputs the synthesized sound from a speaker or the like based on that signal.

（１３）ステップＳ１１３
音声分析機能部１３は、高周波信号の受信が終了するまで、継続的に受話の音声の感情を分析し、音声信号に基づく音声が怒り等の特定の感情を示す音声であるとの結果を得たら、現在の合成音の出力を中止して元の音声にフィルタをかけた音声を出力する処理に切り替える。音声分析機能部１３は、通話が終了したら、一連の処理を終了する。 (13) Step S113
The voice analysis function unit 13 continuously analyzes the emotion of the received voice until reception of the high-frequency signal ends. When it obtains a result that the voice based on the voice signal indicates a specific emotion such as anger, it stops the output of the current synthesized sound and switches to the processing that outputs the filtered original voice. When the call ends, the voice analysis function unit 13 ends the series of processes.
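
The branching in steps S103 to S112 can be summarized as a small decision function. The following is a minimal sketch, not the patented implementation: the names `Route`, `Frame`, and `route_first_embodiment` are hypothetical, and the emotion and keyword results are assumed to have been computed already by the analysis and conversion stages.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum, auto


class Route(Enum):
    PASS_THROUGH = auto()     # S104: caller is not a voice synthesis target
    FILTERED = auto()         # S107: specific emotion (e.g. anger) detected
    MASKED_ORIGINAL = auto()  # S109: keyword found, synthesis stopped
    SYNTHESIZED = auto()      # S111-S112: imitate the registered person


@dataclass
class Frame:
    """Hypothetical container for one analyzed portion of the received voice."""
    caller_id: str
    is_specific_emotion: bool  # assumed result of the analysis in S106
    contains_keyword: bool     # assumed result of the lookup in S108


def route_first_embodiment(frame: Frame, synthesis_targets: set[str]) -> Route:
    """Choose the output path in the order the steps describe."""
    if frame.caller_id not in synthesis_targets:
        return Route.PASS_THROUGH
    if frame.is_specific_emotion:
        return Route.FILTERED
    if frame.contains_keyword:
        return Route.MASKED_ORIGINAL
    return Route.SYNTHESIZED
```

The emotion check precedes the keyword check because step S108 is reached only when no specific emotion was detected in step S107.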

本発明の携帯端末は、テレビやラジオ、カーナビ、無線通信対応の携帯音楽プレーヤーのように、受信専用の音声再生装置でも良い。すなわち、本発明の携帯端末は、音声入力部１７や、送信部１８を備えていなくても良い。この場合、本発明の携帯端末は、外部に対して、ＮＧワード等の通知を行わない。 The portable terminal of the present invention may be a receive-only audio reproduction device, such as a television, a radio, a car navigation system, or a portable music player that supports wireless communication. That is, the portable terminal of the present invention need not include the voice input unit 17 or the transmission unit 18. In this case, the portable terminal does not send notifications such as NG words to the outside.

本発明の携帯端末では、音声データベース２３に、タレント等の特定の人物の音声データを登録して置けば、受信音声を、その特定の人物を真似た合成音で聞く事ができる。 In the portable terminal of the present invention, if voice data of a specific person such as a talent is registered and placed in the voice database 23, the received voice can be heard with a synthesized sound imitating the specific person.

音声合成の切り替え相手の選択は、予め電話番号帳２１に相手を登録しておく事により、又は通話途中でもスイッチを押す事で相手を登録する事により、音声合成をＯＮ、ＯＦＦする事ができる。 Voice synthesis can be turned ON or OFF for a given party by registering that party in the telephone number book 21 in advance, or by pressing a switch to register the party even in the middle of a call.
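
As an illustration of this ON/OFF registration, the sketch below keeps the synthesis-target numbers in a set. The class `SynthesisTargets` and its methods are hypothetical names, and treating the in-call switch as a toggle is an assumption; the description only says that pressing the switch registers the party.

```python
class SynthesisTargets:
    """Hypothetical stand-in for the synthesis-target entries of telephone number book 21."""

    def __init__(self, numbers=()):
        self._numbers = set(numbers)

    def is_target(self, number: str) -> bool:
        """Return True if voice synthesis is ON for this party."""
        return number in self._numbers

    def toggle(self, number: str) -> bool:
        """Flip the ON/OFF state for a party (assumed switch behavior) and return the new state."""
        if number in self._numbers:
            self._numbers.remove(number)
            return False
        self._numbers.add(number)
        return True
```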

キーワードデータベース２２に予め登録された特定のキーワードの音声を認識すると、音声合成を中止して受話音そのままの音声を出力することもできる。 When the voice of a specific keyword registered in advance in the keyword database 22 is recognized, the voice synthesis can be stopped and the voice as the received voice can be output.

また、キーワードがＮＧワードとして登録されていた場合は、相手に対して、決められたＮＧワードや、ＮＧワードである旨の通知を出力する事もできる。 If the keyword is registered as an NG word, a predetermined NG word or a notification that it is an NG word can be output to the other party.

また、受話の音声の感情を分析し、音声信号に基づく音声が怒り等の特定の感情を示す音声であるとの結果を得たら、特定の人物の音声に真似た合成音を中止して元の音声にフィルタをかけた音声を出力する処理を行う事ができる。 In addition, the emotion of the received voice is analyzed, and when the result shows that the voice based on the voice signal indicates a specific emotion such as anger, the synthesized sound imitating the voice of the specific person can be stopped and processing that outputs the filtered original voice can be performed.
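
The continuous monitoring described here (step S113) might be sketched as a generator that yields the output mode for each analyzed frame. The function name `monitor` and the `latch` parameter are hypothetical; the description does not say whether the terminal returns to synthesized output once the emotion subsides, so both behaviors are offered as assumptions.

```python
from typing import Iterable, Iterator


def monitor(emotion_flags: Iterable[bool], latch: bool = True) -> Iterator[str]:
    """Yield "synthesized" or "filtered" for each frame of an ongoing call.

    emotion_flags: per-frame results of the emotion analysis (True = e.g. anger).
    latch: if True, stay in filtered mode once triggered (an assumption).
    """
    mode = "synthesized"
    for is_specific_emotion in emotion_flags:
        if is_specific_emotion:
            mode = "filtered"
        elif not latch:
            mode = "synthesized"
        yield mode
```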

特定のキーワードの登録は、受話の操作者がボタンを押すとその所定時間前の音声を分析してキーワードデータベース２２に自動的に登録する事ができる。 A specific keyword can be registered automatically: when the operator on the receiving side presses a button, the voice from a predetermined time before the press is analyzed and registered in the keyword database 22.
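
One way to make the voice from "a predetermined time before the press" available is a fixed-length rolling buffer. The sketch below is illustrative only: `KeywordAutoRegistrar`, `window_frames`, and the `recognize` callable are hypothetical, and the recognizer that turns buffered audio into a keyword is assumed to be supplied from elsewhere.

```python
from collections import deque
from typing import Callable, Deque, List, Set


class KeywordAutoRegistrar:
    """Keeps the most recent frames and registers a keyword on a button press."""

    def __init__(self, window_frames: int, recognize: Callable[[List[bytes]], str]):
        # deque(maxlen=...) silently discards the oldest frame when full
        self._recent: Deque[bytes] = deque(maxlen=window_frames)
        self._recognize = recognize          # hypothetical speech recognizer
        self.keyword_db: Set[str] = set()    # stand-in for keyword database 22

    def on_frame(self, frame: bytes) -> None:
        """Call for every received audio frame during the call."""
        self._recent.append(frame)

    def on_button_press(self) -> str:
        """Analyze the buffered audio and register the recognized keyword, if any."""
        keyword = self._recognize(list(self._recent))
        if keyword:
            self.keyword_db.add(keyword)
        return keyword
```

The buffer length corresponds to the predetermined time; with 20 ms frames, for example, 100 frames would cover the last two seconds.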

この携帯端末を使用すれば、特定の相手から掛かって来た着信音を好みのタレントとそっくりな音声に合成し、あたかもタレントと会話をしている雰囲気を作り出す事ができる。 With this mobile terminal, the incoming sound from a specific party can be synthesized into a voice that sounds just like the user's favorite talent (celebrity), creating the atmosphere of a conversation with that talent.

また、キーワードデータベース２２と組合せる事により、相手の感情や、直接音声で言って貰いたくないＮＧワードを会話から遠ざける事ができる。 Also, in combination with the keyword database 22, the other party's emotions, and NG words that the user does not want to hear spoken directly, can be kept out of the conversation.
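
The first embodiment keeps NG words out of the conversation by converting the keyword portion into silence or a beep. A minimal sketch of that masking on a list of samples follows; `mask_span`, the 8000 Hz default sample rate, and the 0.5 beep amplitude are assumptions for illustration, and locating the keyword's start and end positions is assumed to be done by the recognizer.

```python
import math
from typing import List


def mask_span(samples: List[float], start: int, end: int,
              sample_rate: int = 8000, beep_hz: float = 0.0) -> List[float]:
    """Return a copy with samples[start:end] replaced.

    beep_hz == 0 produces silence; otherwise a sine beep at that frequency.
    """
    if not 0 <= start <= end <= len(samples):
        raise ValueError("span out of range")
    out = list(samples)
    for i in range(start, end):
        if beep_hz == 0:
            out[i] = 0.0
        else:
            phase = 2 * math.pi * beep_hz * (i - start) / sample_rate
            out[i] = 0.5 * math.sin(phase)
    return out
```

Samples outside the span are left untouched, which matches the description that the remaining portions are output as the original voice.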

ここでは、携帯端末による電話（通話）における音声変換を例に説明しているが、実際には、テレビ機能やラジオ機能等を使用した際（放送視聴時）における音声変換、或いは、記憶装置や記憶媒体等に記憶されている音声データを再生した際（音声再生時）における音声変換も可能である。 Here, voice conversion during a telephone call on a mobile terminal is described as an example. In practice, however, voice conversion is also possible when a TV function, a radio function, or the like is used (during broadcast viewing), or when voice data stored in a storage device, a storage medium, or the like is reproduced (during audio playback).

＜第２実施形態＞
以下に、本発明の第２実施形態について説明する。
第１実施形態では、変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、音声合成を中止又は禁止し、ＮＧワードを通知しているが、本実施形態では、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合に、音声合成を行うようにする。 <Second Embodiment>
The second embodiment of the present invention will be described below.
In the first embodiment, when the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 stops or prohibits voice synthesis and sends a notification of the NG word. In the present embodiment, by contrast, voice synthesis is performed when the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword.

本実施形態における携帯端末の構成については、基本的に第１実施形態と同じである。すなわち、本実施形態における携帯端末の構成は、図１に示す通りである。 The configuration of the mobile terminal in the present embodiment is basically the same as in the first embodiment. That is, the configuration of the mobile terminal in the present embodiment is as shown in FIG. 1.

図３を参照して、本実施形態における携帯端末の動作について説明する。 With reference to FIG. 3, the operation of the mobile terminal in the present embodiment will be described.

（１）ステップＳ２０１
アンテナ部１１は、相手が電話を掛けて来た場合、高周波信号を受信する。 (1) Step S201
The antenna unit 11 receives a high-frequency signal when the other party makes a call.

（２）ステップＳ２０２
受信部１２は、受信された高周波信号を復調し、ベースバンド信号に変換して音声分析機能部１３に送る。このとき、受信部１２は、ベースバンド信号と共に、電話を掛けて来た相手の電話番号を示す情報信号を音声分析機能部１３に送る。ここでは、ベースバンド信号は、音声信号である。なお、ベースバンド信号は、電話を掛けて来た相手の電話番号を示すデジタルデータを含んでいても良い。 (2) Step S202
The receiving unit 12 demodulates the received high-frequency signal, converts it to a baseband signal, and sends it to the voice analysis function unit 13. At this time, the receiving unit 12 sends to the voice analysis function unit 13 an information signal indicating the telephone number of the other party who made the call together with the baseband signal. Here, the baseband signal is an audio signal. The baseband signal may include digital data indicating the telephone number of the other party who made the call.

（３）ステップＳ２０３
音声分析機能部１３は、電話を掛けて来た相手の電話番号が電話番号帳２１に音声合成対象番号として登録されているかを判断する。なお、電話番号は例示に過ぎず、実際には、電話を掛けて来た相手を特定できる識別情報であれば良い。 (3) Step S203
The voice analysis function unit 13 determines whether or not the telephone number of the other party who made the call is registered in the telephone number book 21 as a voice synthesis target number. Note that the telephone number is merely an example, and in practice, it may be any identification information that can identify the other party who made the call.

（４）ステップＳ２０４
音声分析機能部１３は、電話を掛けて来た相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号でない場合、直接、音声出力部１６に音声信号を送る。音声出力部１６は、当該音声信号に基づいて、音声出力を行う。このとき、音声分析機能部１３は、変換機能部１４及び音声合成機能部１５に対して動作しないように通知した上で、変換機能部１４及び音声合成機能部１５を介して音声出力部１６に音声信号を送るようにしても良い。 (4) Step S204
If the telephone number of the party who made the call is not registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 sends the voice signal directly to the voice output unit 16. The voice output unit 16 performs voice output based on that voice signal. Alternatively, the voice analysis function unit 13 may notify the conversion function unit 14 and the voice synthesis function unit 15 not to operate, and then send the voice signal to the voice output unit 16 via the conversion function unit 14 and the voice synthesis function unit 15.

（５）ステップＳ２０５
音声分析機能部１３は、電話を掛けて来た相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号である場合、音声合成ＯＮと認識し、変換機能部１４を動作させる。例えば、音声分析機能部１３は、音声合成ＯＮと認識した場合、ＯＦＦ状態の変換機能部１４を起動させる。或いは、音声分析機能部１３から変換機能部１４への通知を禁止状態から許可状態に変更する。すなわち、音声分析機能部１３は、音声合成ＯＮと認識した場合、当該相手との電話中、変換機能部１４への音声信号の提供を可能にする。 (5) Step S205
If the telephone number of the party who made the call is registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 recognizes that voice synthesis is ON and operates the conversion function unit 14. For example, on recognizing that voice synthesis is ON, the voice analysis function unit 13 activates the conversion function unit 14 from its OFF state, or changes notification from the voice analysis function unit 13 to the conversion function unit 14 from a prohibited state to a permitted state. That is, when the voice analysis function unit 13 recognizes that voice synthesis is ON, it enables the voice signal to be provided to the conversion function unit 14 for the duration of the call with that party.

（６）ステップＳ２０６
音声分析機能部１３は、音声信号のスペクトラムを分析し、音声信号に基づく音声が怒り等の特定の感情を示す音声であるかどうか判断する。その後、音声分析機能部１３は、変換機能部１４に音声信号を送る。このとき、音声分析機能部１３は、音声信号に基づく音声が怒り等の特定の感情を示す音声であれば、特定の感情を示す音声である旨を、或いは、元の音声にフィルタをかける旨の指示を、変換機能部１４に通知する。なお、音声分析機能部１３は、音声信号に基づく音声が怒り等の特定の感情を示す音声であるかどうか判断しない場合、無条件で変換機能部１４に音声信号を送る。 (6) Step S206
The voice analysis function unit 13 analyzes the spectrum of the voice signal and determines whether the voice based on the voice signal indicates a specific emotion such as anger. It then sends the voice signal to the conversion function unit 14. At this time, if the voice indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. If the voice analysis function unit 13 does not perform this emotion determination, it sends the voice signal to the conversion function unit 14 unconditionally.

（７）ステップＳ２０７
変換機能部１４は、音声信号に基づく音声が怒り等の特定の感情を示す音声であれば、元の音声にフィルタをかけた音声を出力する処理（フィルタリング）を行う。ここでは、変換機能部１４は、音声出力部１６に対して、元の音声にフィルタをかけた後の音声信号を送る。音声出力部１６は、元の音声にフィルタをかけた音声を出力する。このとき、変換機能部１４は、音声合成機能部１５に対し、音声信号と、元の音声にフィルタをかける旨の通知（フィルタリング依頼）を送り、音声合成機能部１５で元の音声にフィルタをかけ、音声合成機能部１５から音声出力部１６に元の音声にフィルタをかけた後の音声信号を送るようにしても良い。 (7) Step S207
If the voice based on the voice signal indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) to output a voice obtained by applying a filter to the original voice. Here, the conversion function unit 14 sends the filtered voice signal to the voice output unit 16, and the voice output unit 16 outputs the filtered voice. Alternatively, the conversion function unit 14 may send the voice signal and a notification requesting filtering of the original voice (filtering request) to the voice synthesis function unit 15, in which case the voice synthesis function unit 15 filters the original voice and sends the filtered voice signal to the voice output unit 16.

（８）ステップＳ２０８
変換機能部１４は、音声信号に基づく音声が怒り等の特定の感情を示す音声でなければ、キーワードデータベース２２を参照して、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれているかどうか判断する。このとき、変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該音声信号を音声合成機能部１５に出力する。なお、変換機能部１４は、音声信号のうち特定のキーワードに該当する部分のみ音声合成機能部１５に出力し、他の部分を音声出力部１６に出力するようにしても良い。また、変換機能部１４は、当該キーワードデータを音声合成機能部１５に出力するようにしても良い。 (8) Step S208
If the voice based on the voice signal is not a voice indicating a specific emotion such as anger, the conversion function unit 14 refers to the keyword database 22 and determines whether the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword. If it does, the conversion function unit 14 outputs the voice signal to the voice synthesis function unit 15. Note that the conversion function unit 14 may instead output only the portion of the voice signal corresponding to the specific keyword to the voice synthesis function unit 15 and output the remaining portions to the voice output unit 16. The conversion function unit 14 may also output the keyword data to the voice synthesis function unit 15.

（９）ステップＳ２０９
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれていない場合、音声合成機能部１５に対して、音声合成を中止又は禁止し、元の音声を音声出力部１６に出力する旨の通知を送る。音声出力部１６は、当該音声信号に基づいて、元の音声を出力する。 (9) Step S209
If the voice signal analyzed by the voice analysis function unit 13 does not contain keyword data indicating a specific keyword, the conversion function unit 14 sends the voice synthesis function unit 15 a notification to stop or prohibit voice synthesis and to output the original voice to the voice output unit 16. The voice output unit 16 outputs the original voice based on that voice signal.

（１０）ステップＳ２１０
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該キーワードデータに基づいて、特定のキーワードを送信部１８に出力する。送信部１８は、アンテナ部１１を介して、電話を掛けて来た相手に、特定のキーワードを送信する。例えば、変換機能部１４は、キーワードデータに基づいて、特定のキーワードに対して音声合成を行う旨を、送信部１８を介して、電話を掛けて来た相手に通知する。なお、変換機能部１４は、特定のキーワードに対して音声合成を行う旨を、電話を掛けて来た相手に通知しない場合、特定のキーワードを送信部１８に出力しなくても良い。 (10) Step S210
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 outputs the specific keyword to the transmission unit 18 based on the keyword data. The transmission unit 18 transmits the specific keyword, via the antenna unit 11, to the party who made the call. For example, based on the keyword data, the conversion function unit 14 notifies the calling party via the transmission unit 18 that voice synthesis will be performed for the specific keyword. Note that when the calling party is not to be notified that voice synthesis is performed for the specific keyword, the conversion function unit 14 need not output the specific keyword to the transmission unit 18.

（１１）ステップＳ２１１
音声合成機能部１５は、音声データベース２３を参照して、受け取った音声信号に応じて、特定の人物に似た音声の合成を行う。ここでは、音声合成機能部１５は、受け取った音声信号に基づく音声の全体に対して、特定の人物に似た音声の合成を行う。なお、音声合成機能部１５は、変換機能部１４からキーワードデータを受け取り、キーワードデータに基づいて、受け取った音声信号に基づく音声に含まれる特定のキーワードの音声のみ、特定の人物に似た音声の合成を行うようにしても良い。例えば、音声合成機能部１５は、キーワードデータが特定の人物の著名な発言を示している場合、受け取った音声信号に基づく音声にこの著名な発言と同じ内容が含まれていれば、その発言の箇所のみ、特定の人物に似た音声での合成を行う。 (11) Step S211
The voice synthesis function unit 15 refers to the voice database 23 and, according to the received voice signal, synthesizes a voice resembling a specific person. Here, the voice synthesis function unit 15 applies this synthesis to the entire voice based on the received voice signal. Alternatively, the voice synthesis function unit 15 may receive the keyword data from the conversion function unit 14 and, based on it, synthesize a voice resembling the specific person only for the specific keyword contained in the voice based on the received voice signal. For example, if the keyword data represents a well-known remark by a specific person and the voice based on the received voice signal contains the same content as that remark, the voice synthesis function unit 15 synthesizes only that portion in a voice resembling the specific person.

（１２）ステップＳ２１２
音声合成機能部１５は、音声合成された音声信号を音声出力部１６に送る。音声出力部１６は、音声合成された音声信号に基づいて、スピーカー等から合成音を出力する。 (12) Step S212
The voice synthesis function unit 15 sends the synthesized voice signal to the voice output unit 16. The voice output unit 16 outputs the synthesized sound from a speaker or the like based on that signal.

（１３）ステップＳ２１３
音声分析機能部１３は、高周波信号の受信が終了するまで、継続的に受話の音声の感情を分析し、音声信号に基づく音声が怒り等の特定の感情を示す音声であるとの結果を得たら、現在の合成音の出力を中止して元の音声にフィルタをかけた音声を出力する処理に切り替える。音声分析機能部１３は、通話が終了したら、一連の処理を終了する。 (13) Step S213
The voice analysis function unit 13 continuously analyzes the emotion of the received voice until reception of the high-frequency signal ends. When it obtains a result that the voice based on the voice signal indicates a specific emotion such as anger, it stops the output of the current synthesized sound and switches to the processing that outputs the filtered original voice. When the call ends, the voice analysis function unit 13 ends the series of processes.
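
The keyword handling of the second embodiment (steps S208, S209, and S211) inverts that of the first: a keyword hit triggers synthesis rather than stopping it. The sketch below works on recognized words instead of audio, which is a simplification; `plan_second_embodiment` and the `keyword_only` flag are hypothetical names, and keywords are assumed to be single words.

```python
from typing import Iterable, List, Set, Tuple


def plan_second_embodiment(words: Iterable[str], keywords: Set[str],
                           keyword_only: bool = False) -> List[Tuple[str, str]]:
    """Label each word "synthesized" or "original".

    keyword_only=False: synthesize the whole utterance when any keyword is present.
    keyword_only=True:  synthesize only the keyword portions (variant in S208/S211).
    """
    words = list(words)
    if not any(w in keywords for w in words):
        # S209: no keyword, so synthesis is stopped and the original voice is output
        return [(w, "original") for w in words]
    if keyword_only:
        return [(w, "synthesized" if w in keywords else "original") for w in words]
    return [(w, "synthesized") for w in words]
```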

これにより、本発明の携帯端末は、通話や音声再生の際に、ユーザの知人の口癖や、著名人の有名な台詞（セリフ）が使用された場合、その本人（当人）の声色で再生する事ができる。このとき、声色の音声データ（音源）については、当人の使用許諾を得ているものとする。使用許諾を得る方法として、対価の支払い等が考えられる。声色の音声データ（音源）の入手方法としては、許可を受けた上での当人からの録音や、所定のＷｅｂサイトからのダウンロード等が考えられる。 As a result, when a verbal habit of the user's acquaintance or a famous line of a celebrity is spoken during a call or voice playback, the portable terminal of the present invention can reproduce it in the voice of that person. It is assumed that a license has been obtained from that person for the voice data (sound source) of his or her voice. Payment of consideration is one conceivable way to obtain the license. Conceivable ways to obtain the voice data (sound source) include recording the person with permission and downloading from a predetermined website.

＜第３実施形態＞
以下に、本発明の第３実施形態について説明する。
本実施形態では、第１実施形態とは逆に、携帯端末のユーザが音声入力した音声を合成し、合成音を外部に送信するようにする。 <Third Embodiment>
The third embodiment of the present invention will be described below.
In the present embodiment, contrary to the first embodiment, the voice input by the user of the mobile terminal is synthesized and the synthesized sound is transmitted to the outside.

図４を参照すると、本実施形態における携帯端末は、アンテナ部１１と、受信部１２と、音声分析機能部１３と、変換機能部１４と、音声合成機能部１５と、音声出力部１６と、音声入力部１７と、送信部１８と、電話番号帳２１と、キーワードデータベース２２と、音声データベース２３を備える。 Referring to FIG. 4, the mobile terminal in the present embodiment includes an antenna unit 11, a receiving unit 12, a voice analysis function unit 13, a conversion function unit 14, a voice synthesis function unit 15, a voice output unit 16, A voice input unit 17, a transmission unit 18, a telephone number book 21, a keyword database 22, and a voice database 23 are provided.

アンテナ部１１、受信部１２、音声分析機能部１３、変換機能部１４、音声合成機能部１５、音声出力部１６、音声入力部１７、送信部１８、電話番号帳２１、キーワードデータベース２２、及び音声データベース２３については、基本的に第１実施形態と同じである。 Antenna unit 11, receiving unit 12, speech analysis function unit 13, conversion function unit 14, speech synthesis function unit 15, speech output unit 16, speech input unit 17, transmission unit 18, telephone number book 21, keyword database 22, and speech The database 23 is basically the same as in the first embodiment.

本実施形態では、音声分析機能部１３は、電話を掛ける相手の電話番号等の識別情報が電話番号帳２１に音声合成対象番号として登録されているかを判断する。また、音声分析機能部１３は、音声入力部１７に入力された音声に基づく音声信号についても分析を行う。すなわち、音声分析機能部１３は、送話又は受話の少なくとも一方の音声信号のスペクトラムを分析し、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるかどうか判断する。ここでは、受信部１２から受け取った音声信号、及び音声入力部１７に入力された音声に基づく音声信号の両方のスペクトラムを分析し、分析された音声信号のうち少なくとも一方の音声信号が怒り等の特定の感情を示す音声信号であるかどうか判断する。その後、音声分析機能部１３は、変換機能部１４に音声信号を送る。このとき、音声分析機能部１３は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、特定の感情を示す音声である旨を、或いは、元の音声にフィルタをかける旨の指示を、変換機能部１４に通知する。なお、音声分析機能部１３は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるかどうか判断しない場合、無条件で変換機能部１４に音声信号を送る。 In the present embodiment, the voice analysis function unit 13 determines whether identification information such as the telephone number of the party to be called is registered in the telephone number book 21 as a voice synthesis target number. The voice analysis function unit 13 also analyzes the voice signal based on the voice input to the voice input unit 17. That is, the voice analysis function unit 13 analyzes the spectrum of the voice signal of at least one of the transmitted and received voices, and determines whether at least one of them indicates a specific emotion such as anger. Here, it analyzes the spectra of both the voice signal received from the receiving unit 12 and the voice signal based on the voice input to the voice input unit 17, and determines whether at least one of the analyzed voice signals indicates a specific emotion such as anger. It then sends the voice signal to the conversion function unit 14. At this time, if at least one of the transmitted and received voices indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. If the voice analysis function unit 13 does not perform this emotion determination, it sends the voice signal to the conversion function unit 14 unconditionally.

変換機能部１４は、キーワードデータベース２２を参照して、音声分析機能部１３で分析された音声信号に従い、特定のキーワードを示すキーワードデータを出力する。ここでは、変換機能部１４は、音声入力部１７から受け取った音声信号に含まれるキーワードがＮＧワードとしてキーワードデータベース２２に登録されていた場合、音声出力部１６に対して、予め決められたＮＧワードや、ＮＧワードである旨の通知を送る。例えば、変換機能部１４は、ユーザにより音声入力された音声信号に含まれていた「○○」というキーワードがＮＧワードとしてキーワードデータベース２２に登録されていた場合、音声出力部１６を介して、ユーザに対し、「○○はＮＧワードです」という旨を通知する。なお、変換機能部１４は、音声信号に含まれるキーワードがＮＧワードとしてキーワードデータベース２２に登録されていた場合、ＮＧワードの部分の音声を、無音又はビープ音に変換するようにしても良い。また、変換機能部１４は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、元の音声にフィルタをかけた音声を出力する処理（フィルタリング）を行う。 The conversion function unit 14 refers to the keyword database 22 and outputs keyword data indicating a specific keyword according to the voice signal analyzed by the voice analysis function unit 13. Here, if a keyword contained in the voice signal received from the voice input unit 17 is registered in the keyword database 22 as an NG word, the conversion function unit 14 sends the predetermined NG word, or a notification that the keyword is an NG word, to the voice output unit 16. For example, if the keyword "XX" contained in the voice signal input by the user is registered in the keyword database 22 as an NG word, the conversion function unit 14 notifies the user via the voice output unit 16 that "XX is an NG word". Note that when a keyword contained in the voice signal is registered in the keyword database 22 as an NG word, the conversion function unit 14 may convert the voice of the NG word portion into silence or a beep. In addition, if at least one of the transmitted and received voices indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) to output a voice obtained by applying a filter to the original voice.
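
The outgoing NG-word handling in this paragraph can be sketched at the word level as follows. `check_outgoing` and the `"<beep>"` placeholder are hypothetical; a real terminal would operate on audio and play the notice through the voice output unit 16.

```python
from typing import Iterable, List, Set, Tuple


def check_outgoing(words: Iterable[str], ng_words: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (masked_words, notices) for the user's own speech.

    masked_words: the utterance with each NG word replaced by a beep placeholder.
    notices: messages for the user of the form "XX is an NG word".
    """
    masked: List[str] = []
    notices: List[str] = []
    for word in words:
        if word in ng_words:
            masked.append("<beep>")
            notices.append(f"{word} is an NG word")
        else:
            masked.append(word)
    return masked, notices
```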

電話番号帳２１は、電話を掛ける相手の電話番号等の識別情報を格納する。ここでは、電話番号帳２１は、電話を掛ける相手のうち、音声合成の対象とする相手の電話番号等の識別情報を格納する。なお、電話番号は例示に過ぎず、実際には、電話を掛ける相手を特定できる識別情報であれば良い。また、電話番号帳２１は、携帯端末自体が有する通常の電話番号帳とは別に設けられていても良い。 The telephone number book 21 stores identification information such as the telephone numbers of parties to be called. Here, the telephone number book 21 stores identification information, such as telephone numbers, of those parties to be called who are targets of voice synthesis. Note that the telephone number is merely an example; in practice, any identification information that can identify the party to be called may be used. The telephone number book 21 may also be provided separately from the ordinary telephone number book of the portable terminal itself.

図５を参照して、本実施形態における携帯端末の動作について説明する。 With reference to FIG. 5, the operation of the portable terminal in the present embodiment will be described.

（１）ステップＳ３０１
音声入力部１７は、所定の相手に電話を掛ける際、音声入力に応じて、音声信号を発生する。ここでは、音声入力部１７は、ユーザ又はアプリケーションによる音声入力に応じて、音声信号を生成し、その音声信号を音声分析機能部１３に送る。このとき、音声分析機能部１３は、音声信号と、電話を掛ける相手の電話番号を受け取る。例えば、音声分析機能部１３は、ユーザにより入力された電話番号を受け取った後、音声入力部１７から音声信号を受け取るようにしても良い。 (1) Step S301
The voice input unit 17 generates a voice signal in response to voice input when calling a predetermined partner. Here, the voice input unit 17 generates a voice signal in response to voice input by a user or an application, and sends the voice signal to the voice analysis function unit 13. At this time, the voice analysis function unit 13 receives the voice signal and the telephone number of the other party to call. For example, the voice analysis function unit 13 may receive a voice signal from the voice input unit 17 after receiving a telephone number input by the user.

（２）ステップＳ３０２
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録されているかを判断する。なお、電話番号は例示に過ぎず、実際には、電話を掛ける相手を特定できる識別情報であれば良い。 (2) Step S302
The voice analysis function unit 13 determines whether the telephone number of the party to be called is registered in the telephone number book 21 as a voice synthesis target number. Note that the telephone number is merely an example; in practice, any identification information that can identify the party to be called may be used.

（３）ステップＳ３０３
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号でない場合、直接、送信部１８に音声信号を送る。このとき、音声分析機能部１３は、変換機能部１４に対して動作しないように通知した上で、変換機能部１４を介して送信部１８に音声信号を送るようにしても良い。送信部１８は、受け取った音声信号を変調して高周波信号に変換し、アンテナ部１１を介して、電話を掛ける相手に対し、その高周波信号を送信する。 (3) Step S303
If the telephone number of the party to be called is not registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 sends the voice signal directly to the transmission unit 18. Alternatively, the voice analysis function unit 13 may notify the conversion function unit 14 not to operate and then send the voice signal to the transmission unit 18 via the conversion function unit 14. The transmission unit 18 modulates the received voice signal into a high-frequency signal and transmits it, via the antenna unit 11, to the party being called.

（４）ステップＳ３０４
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号である場合、音声合成ＯＮと認識し、変換機能部１４を動作させる。例えば、音声分析機能部１３は、音声合成ＯＮと認識した場合、ＯＦＦ状態の変換機能部１４を起動させる。或いは、音声分析機能部１３から変換機能部１４への通知を禁止状態から許可状態に変更する。すなわち、音声分析機能部１３は、音声合成ＯＮと認識した場合、当該相手との電話中、変換機能部１４への音声信号の提供を可能にする。 (4) Step S304
The voice analysis function unit 13 recognizes that the voice synthesis is ON when the telephone number of the other party to be called is a number registered in the telephone number book 21 as a voice synthesis target number, and operates the conversion function unit 14. For example, when the speech analysis function unit 13 recognizes that speech synthesis is ON, the speech analysis function unit 13 activates the conversion function unit 14 in the OFF state. Alternatively, the notification from the voice analysis function unit 13 to the conversion function unit 14 is changed from the prohibited state to the permitted state. That is, when the voice analysis function unit 13 recognizes that the voice synthesis is ON, the voice analysis function unit 13 can provide a voice signal to the conversion function unit 14 during a call with the other party.

（５）ステップＳ３０５
アンテナ部１１は、相手が電話に出た場合、相手からの高周波信号を受信する。 (5) Step S305
The antenna unit 11 receives a high-frequency signal from the other party when the other party answers the call.

（６）ステップＳ３０６
受信部１２は、受信された高周波信号を復調し、ベースバンド信号に変換して音声分析機能部１３に送る。このとき、受信部１２は、ベースバンド信号と共に、電話を掛けて来た相手の電話番号を示す情報信号を音声分析機能部１３に送る。ここでは、ベースバンド信号は、音声信号である。なお、ベースバンド信号は、電話を掛けて来た相手の電話番号を示すデジタルデータを含んでいても良い。 (6) Step S306
The receiving unit 12 demodulates the received high-frequency signal, converts it to a baseband signal, and sends it to the voice analysis function unit 13. At this time, the receiving unit 12 sends to the voice analysis function unit 13 an information signal indicating the telephone number of the other party who made the call together with the baseband signal. Here, the baseband signal is an audio signal. The baseband signal may include digital data indicating the telephone number of the other party who made the call.

（７）ステップＳ３０７
音声分析機能部１３は、送話又は受話の少なくとも一方の音声信号のスペクトラムを分析し、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるかどうか判断する。すなわち、音声分析機能部１３は、受信部１２から受け取った音声信号（相手側からの音声信号）と、音声入力部１６から受け取った音声信号（ユーザ側からの音声信号）のうち、少なくとも一方の音声信号が怒り等の特定の感情を示す音声信号であるかどうか判断する。その後、音声分析機能部１３は、変換機能部１４に音声信号を送る。このとき、音声分析機能部１３は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、特定の感情を示す音声である旨を、或いは、元の音声にフィルタをかける旨の指示を、変換機能部１４に通知する。なお、音声分析機能部１３は、怒り等の特定の感情を示す音声であるかどうか判断しない場合、無条件で変換機能部１４に音声信号を送る。 (7) Step S307
The voice analysis function unit 13 analyzes the spectrum of the voice signal of at least one of the transmitted and received voices, and determines whether at least one of them indicates a specific emotion such as anger. That is, the voice analysis function unit 13 determines whether at least one of the voice signal received from the receiving unit 12 (the voice signal from the other party) and the voice signal received from the voice input unit 17 (the voice signal from the user) indicates a specific emotion such as anger. It then sends the voice signal to the conversion function unit 14. At this time, if at least one of the transmitted and received voices indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. If the voice analysis function unit 13 does not perform this emotion determination, it sends the voice signal to the conversion function unit 14 unconditionally.

（８）ステップＳ３０８
変換機能部１４は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、元の音声にフィルタをかけた音声を出力する処理（フィルタリング）を行う。ここでは、変換機能部１４は、送信部１８に対して、元の音声にフィルタをかけた後の音声信号を送る。送信部１８は、元の音声にフィルタをかけた音声を出力する。このとき、変換機能部１４は、音声合成機能部１５に対し、音声信号と、元の音声にフィルタをかける旨の通知（フィルタリング依頼）を送り、音声合成機能部１５で元の音声にフィルタをかけ、音声合成機能部１５から送信部１８に元の音声にフィルタをかけた後の音声信号を送るようにしても良い。 (8) Step S308
If at least one of the transmitted and received voices indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) that outputs a filtered version of the original voice. Here, the conversion function unit 14 sends the filtered voice signal to the transmission unit 18, and the transmission unit 18 outputs the filtered voice. Alternatively, the conversion function unit 14 may send the voice signal together with a notification that the original voice is to be filtered (a filtering request) to the voice synthesis function unit 15, in which case the voice synthesis function unit 15 filters the original voice and sends the filtered voice signal to the transmission unit 18.
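
The specification leaves the filter itself unspecified. One plausible reading is a filter that softens an angry voice, so the sketch below assumes a one-pole low-pass filter followed by a gain reduction. The function name `soften_voice` and the parameters `alpha` and `gain` are hypothetical.

```python
def soften_voice(samples, alpha=0.2, gain=0.5):
    """Apply a one-pole low-pass filter, then attenuate the result.

    alpha in (0, 1]: smaller values smooth more strongly.
    gain: linear output gain applied after smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    prev = 0.0
    for x in samples:
        # Exponential smoothing: move a fraction alpha toward the new sample.
        prev = prev + alpha * (x - prev)
        out.append(gain * prev)
    return out
```

Either the conversion function unit 14 or the voice synthesis function unit 15 could host such a routine, matching the two alternatives described in this step.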

（９）ステップＳ３０９
変換機能部１４は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声でなければ、キーワードデータベース２２を参照して、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれているかどうか判断する。このとき、変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれていない場合、音声合成機能部１５に音声信号を送る。 (9) Step S309
If the voice analyzed (transmitted, received, or both) does not indicate a specific emotion such as anger, the conversion function unit 14 refers to the keyword database 22 and determines whether the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword. If the voice signal contains no such keyword data, the conversion function unit 14 sends the voice signal to the voice synthesis function unit 15.

（１０）ステップＳ３１０
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、音声合成機能部１５に対して、音声合成を中止又は禁止し、元の音声を送信部１８に出力する旨の通知を送る。このとき、音声合成機能部１５は、音声信号のうち、特定のキーワードに該当する部分の音声を無音又はビープ音に変換して送信部１８に送り、他の部分を元の音声で送信部１８に送る。送信部１８は、受け取った音声信号を変調して高周波信号に変換し、アンテナ部１１を介して、電話を掛ける相手に対し、その高周波信号を送信する。 (10) Step S310
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 notifies the voice synthesis function unit 15 to stop or prohibit voice synthesis and to output the original voice to the transmission unit 18. At this time, the voice synthesis function unit 15 converts the portion of the voice signal corresponding to the specific keyword into silence or a beep and sends it to the transmission unit 18, and sends the remaining portions to the transmission unit 18 as the original voice. The transmission unit 18 modulates the received voice signal into a high-frequency signal and transmits that signal via the antenna unit 11 to the party being called.
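
The keyword masking in this step can be sketched as follows, assuming a recognizer has already reported each keyword as a `(start, end)` range of sample indices. That span format, the function name, and the 1000 Hz beep are assumptions made for illustration.

```python
import math


def mask_keyword_spans(samples, spans, mode="silence",
                       sample_rate=8000, beep_hz=1000.0, beep_level=0.5):
    """Return a copy of samples with each keyword span silenced or beeped.

    spans: iterable of (start, end) sample indices, end exclusive.
    mode: "silence" or "beep".
    """
    if mode not in ("silence", "beep"):
        raise ValueError("mode must be 'silence' or 'beep'")
    out = list(samples)
    for start, end in spans:
        # Clamp the span so out-of-range indices are ignored.
        for i in range(max(0, start), min(len(out), end)):
            if mode == "beep":
                out[i] = beep_level * math.sin(2 * math.pi * beep_hz * i / sample_rate)
            else:
                out[i] = 0.0
    return out
```

Samples outside the spans are returned unchanged, which corresponds to sending the remaining portions as the original voice.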

（１１）ステップＳ３１１
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該キーワードデータに基づいて、特定のキーワードを音声出力部１６に出力する。音声出力部１６は、特定のキーワードを出力する。例えば、変換機能部１４は、キーワードデータに基づいて、特定のキーワードがＮＧワードである旨を、音声出力部１６を介して、ユーザに通知する。このとき、変換機能部１４は、ＮＧワードの部分の音声を、無音又はビープ音に変換するようにしても良い。なお、変換機能部１４は、特定のキーワードがＮＧワードである旨を、ユーザに通知しない場合、特定のキーワードを音声出力部１６に出力しなくても良い。 (11) Step S311
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 outputs the specific keyword to the voice output unit 16 based on that keyword data. The voice output unit 16 outputs the specific keyword. For example, the conversion function unit 14 notifies the user, via the voice output unit 16, that the specific keyword is an NG word. At this time, the conversion function unit 14 may convert the NG-word portion of the voice into silence or a beep. If the user is not to be notified that the specific keyword is an NG word, the conversion function unit 14 need not output the specific keyword to the voice output unit 16.

（１２）ステップＳ３１２
音声合成機能部１５は、音声データベース２３を参照して、受け取った音声信号に応じて、特定の人物に似た音声の合成を行う。ここでは、音声合成機能部１５は、受け取った音声信号に基づく音声の全体に対して、特定の人物に似た音声の合成を行う。 (12) Step S312
The voice synthesis function unit 15 refers to the voice database 23 and, according to the received voice signal, synthesizes a voice resembling a specific person. Here, the voice synthesis function unit 15 applies this synthesis to the entire voice based on the received voice signal.

（１３）ステップＳ３１３
音声合成機能部１５は、音声合成された音声信号を送信部１８に送る。送信部１８は、受け取った音声信号を変調して高周波信号に変換し、アンテナ部１１を介して、電話を掛ける相手に対し、その高周波信号を送信する。 (13) Step S313
The voice synthesis function unit 15 sends the synthesized voice signal to the transmission unit 18. The transmission unit 18 modulates the received voice signal into a high-frequency signal and transmits that signal via the antenna unit 11 to the party being called.

（１４）ステップＳ３１４
音声分析機能部１３は、高周波信号の送信が終了するまで、継続的に、送話又は受話の少なくとも一方の音声の感情を分析し、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるとの結果を得たら、現在の合成音の出力を中止して元の音声にフィルタをかけた音声を出力する処理に切り替える。音声分析機能部１３は、通話が終了したら、一連の処理を終了する。 (14) Step S314
Until transmission of the high-frequency signal ends, the voice analysis function unit 13 continuously analyzes the emotion of at least one of the transmitted and received voices. When it obtains a result that at least one of them indicates a specific emotion such as anger, it stops the current synthesized-voice output and switches to the processing that outputs a filtered version of the original voice. When the call ends, the voice analysis function unit 13 ends the series of processes.

本実施形態における携帯端末では、デコメ（デコレーションメール）（登録商標）における顔文字や絵文字のように、声色や台詞を「素材」として、音声をデコレーション（装飾）する事ができる。 In the mobile terminal according to the present embodiment, voice can be decorated using tones of voice and lines of dialogue as "materials", in the same way that emoticons and pictograms are used in Deco-mail (Decoration Mail) (registered trademark).

＜第４実施形態＞
以下に、本発明の第４実施形態について説明する。
第３実施形態では、変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、音声合成を中止又は禁止し、ユーザにＮＧワードを通知しているが、本実施形態では、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合に、音声合成を行うようにする。 <Fourth embodiment>
The fourth embodiment of the present invention will be described below.
In the third embodiment, when the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 stops or prohibits voice synthesis and notifies the user of the NG word. In the present embodiment, by contrast, voice synthesis is performed when the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword.

本実施形態における携帯端末の構成については、基本的に第３実施形態と同じである。すなわち、本実施形態における携帯端末の構成は、図４に示す通りである。 The configuration of the mobile terminal in the present embodiment is basically the same as that in the third embodiment. That is, the configuration of the mobile terminal in the present embodiment is as shown in FIG. 4.

図６を参照して、本実施形態における携帯端末の動作について説明する。 With reference to FIG. 6, the operation of the mobile terminal in the present embodiment will be described.

（１）ステップＳ４０１
音声入力部１７は、所定の相手に電話を掛ける際、音声入力に応じて、音声信号を発生する。ここでは、音声入力部１７は、ユーザ又はアプリケーションによる音声入力に応じて、音声信号を生成し、その音声信号を音声分析機能部１３に送る。このとき、音声分析機能部１３は、音声信号と、電話を掛ける相手（通話相手）の電話番号を受け取る。例えば、音声分析機能部１３は、ユーザにより入力された電話番号を受け取った後、音声入力部１７から音声信号を受け取るようにしても良い。 (1) Step S401
The voice input unit 17 generates a voice signal in response to voice input when calling a predetermined partner. Here, the voice input unit 17 generates a voice signal in response to voice input by a user or an application, and sends the voice signal to the voice analysis function unit 13. At this time, the voice analysis function unit 13 receives the voice signal and the telephone number of the other party (calling party) to call. For example, the voice analysis function unit 13 may receive a voice signal from the voice input unit 17 after receiving a telephone number input by the user.

（２）ステップＳ４０２
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録されているかを判断する。なお、電話番号は例示に過ぎず、実際には、電話を掛ける相手を特定できる識別情報であれば良い。 (2) Step S402
The voice analysis function unit 13 determines whether the other party's telephone number to be called is registered in the telephone number book 21 as a voice synthesis target number. Note that the telephone number is merely an example, and in practice, it may be any identification information that can identify the other party to call.

（３）ステップＳ４０３
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号でない場合、直接、送信部１８に音声信号を送る。このとき、音声分析機能部１３は、変換機能部１４に対して動作しないように通知した上で、変換機能部１４を介して送信部１８に音声信号を送るようにしても良い。送信部１８は、受け取った音声信号を変調して高周波信号に変換し、アンテナ部１１を介して、電話を掛ける相手に対し、その高周波信号を送信する。 (3) Step S403
If the telephone number of the party being called is not registered in the telephone number book 21 as a voice synthesis target number, the voice analysis function unit 13 sends the voice signal directly to the transmission unit 18. Alternatively, the voice analysis function unit 13 may notify the conversion function unit 14 not to operate and then send the voice signal to the transmission unit 18 via the conversion function unit 14. The transmission unit 18 modulates the received voice signal into a high-frequency signal and transmits that signal via the antenna unit 11 to the party being called.

（４）ステップＳ４０４
音声分析機能部１３は、電話を掛ける相手の電話番号が電話番号帳２１に音声合成対象番号として登録された番号である場合、音声合成ＯＮと認識し、変換機能部１４を動作させる。例えば、音声分析機能部１３は、音声合成ＯＮと認識した場合、ＯＦＦ状態の変換機能部１４を起動させる。或いは、音声分析機能部１３から変換機能部１４への通知を禁止状態から許可状態に変更する。すなわち、音声分析機能部１３は、音声合成ＯＮと認識した場合、当該相手との電話中、変換機能部１４への音声信号の提供を可能にする。 (4) Step S404
The voice analysis function unit 13 recognizes that the voice synthesis is ON when the telephone number of the other party to be called is a number registered in the telephone number book 21 as a voice synthesis target number, and operates the conversion function unit 14. For example, when the speech analysis function unit 13 recognizes that speech synthesis is ON, the speech analysis function unit 13 activates the conversion function unit 14 in the OFF state. Alternatively, the notification from the voice analysis function unit 13 to the conversion function unit 14 is changed from the prohibited state to the permitted state. That is, when the voice analysis function unit 13 recognizes that the voice synthesis is ON, the voice analysis function unit 13 can provide a voice signal to the conversion function unit 14 during a call with the other party.
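
Steps S402 to S404 reduce to a lookup keyed by the other party's identifier. The sketch below assumes the telephone number book is a dictionary from normalized numbers to entries carrying a boolean `synthesis_target` field. The data layout, the field name, and the normalization rule are all hypothetical.

```python
def normalize_id(number):
    """Keep digits and a leading-style '+', dropping spaces and hyphens."""
    return "".join(ch for ch in number if ch.isdigit() or ch == "+")


def is_synthesis_target(phone_book, number):
    """True if the party is registered as a voice synthesis target (S402)."""
    entry = phone_book.get(normalize_id(number))
    return bool(entry and entry.get("synthesis_target"))


# Example phone book; keys are assumed to be stored in normalized form.
phone_book = {
    "09012345678": {"name": "A", "synthesis_target": True},
    "0312345678": {"name": "B", "synthesis_target": False},
}
```

Because the text notes that a telephone number is only one possible identifier, the same lookup would work with any key that identifies the other party.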

（５）ステップＳ４０５
アンテナ部１１は、相手が電話に出た場合、相手からの高周波信号を受信する。 (5) Step S405
The antenna unit 11 receives a high-frequency signal from the other party when the other party answers the call.

（６）ステップＳ４０６
受信部１２は、受信された高周波信号を復調し、ベースバンド信号に変換して音声分析機能部１３に送る。このとき、受信部１２は、ベースバンド信号と共に、電話を掛けて来た相手の電話番号を示す情報信号を音声分析機能部１３に送る。ここでは、ベースバンド信号は、音声信号である。なお、ベースバンド信号は、電話を掛けて来た相手の電話番号を示すデジタルデータを含んでいても良い。 (6) Step S406
The receiving unit 12 demodulates the received high-frequency signal, converts it to a baseband signal, and sends it to the voice analysis function unit 13. At this time, the receiving unit 12 sends to the voice analysis function unit 13 an information signal indicating the telephone number of the other party who made the call together with the baseband signal. Here, the baseband signal is an audio signal. The baseband signal may include digital data indicating the telephone number of the other party who made the call.

（７）ステップＳ４０７
音声分析機能部１３は、送話又は受話の少なくとも一方の音声信号のスペクトラムを分析し、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるかどうか判断する。すなわち、音声分析機能部１３は、受信部１２から受け取った音声信号（相手側からの音声信号）と、音声入力部１６から受け取った音声信号（ユーザ側からの音声信号）のうち、少なくとも一方の音声信号が怒り等の特定の感情を示す音声信号であるかどうか判断する。その後、音声分析機能部１３は、変換機能部１４に音声信号を送る。このとき、音声分析機能部１３は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、特定の感情を示す音声である旨を、或いは、元の音声にフィルタをかける旨の指示を、変換機能部１４に通知する。なお、音声分析機能部１３は、怒り等の特定の感情を示す音声であるかどうか判断しない場合、無条件で変換機能部１４に音声信号を送る。 (7) Step S407
The voice analysis function unit 13 analyzes the spectrum of at least one of the transmitted and received voice signals and determines whether at least one of the transmitted and received voices indicates a specific emotion such as anger. That is, the voice analysis function unit 13 determines whether at least one of the voice signal received from the receiving unit 12 (the voice signal from the other party) and the voice signal received from the voice input unit 16 (the voice signal from the user side) indicates a specific emotion such as anger. The voice analysis function unit 13 then sends the voice signal to the conversion function unit 14. At this time, if at least one of the transmitted and received voices indicates a specific emotion such as anger, the voice analysis function unit 13 notifies the conversion function unit 14 either that the voice indicates a specific emotion or that the original voice is to be filtered. If the voice analysis function unit 13 does not perform this emotion determination, it sends the voice signal to the conversion function unit 14 unconditionally.

（８）ステップＳ４０８
変換機能部１４は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であれば、元の音声にフィルタをかけた音声を出力する処理（フィルタリング）を行う。ここでは、変換機能部１４は、送信部１８に対して、元の音声にフィルタをかけた後の音声信号を送る。送信部１８は、元の音声にフィルタをかけた音声を出力する。このとき、変換機能部１４は、音声合成機能部１５に対し、音声信号と、元の音声にフィルタをかける旨の通知（フィルタリング依頼）を送り、音声合成機能部１５で元の音声にフィルタをかけ、音声合成機能部１５から送信部１８に元の音声にフィルタをかけた後の音声信号を送るようにしても良い。 (8) Step S408
If at least one of the transmitted and received voices indicates a specific emotion such as anger, the conversion function unit 14 performs processing (filtering) that outputs a filtered version of the original voice. Here, the conversion function unit 14 sends the filtered voice signal to the transmission unit 18, and the transmission unit 18 outputs the filtered voice. Alternatively, the conversion function unit 14 may send the voice signal together with a notification that the original voice is to be filtered (a filtering request) to the voice synthesis function unit 15, in which case the voice synthesis function unit 15 filters the original voice and sends the filtered voice signal to the transmission unit 18.

（９）ステップＳ４０９
変換機能部１４は、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声でなければ、キーワードデータベース２２を参照して、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれているかどうか判断する。このとき、変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該音声信号を音声合成機能部１５に出力する。なお、変換機能部１４は、音声信号のうち特定のキーワードに該当する部分のみ音声合成機能部１５に出力し、他の部分を送信部１８に出力するようにしても良い。また、変換機能部１４は、当該キーワードデータを音声合成機能部１５に出力するようにしても良い。 (9) Step S409
If the voice analyzed (transmitted, received, or both) does not indicate a specific emotion such as anger, the conversion function unit 14 refers to the keyword database 22 and determines whether the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword. If the voice signal contains such keyword data, the conversion function unit 14 outputs the voice signal to the voice synthesis function unit 15. The conversion function unit 14 may instead output only the portion of the voice signal corresponding to the specific keyword to the voice synthesis function unit 15 and output the remaining portions to the transmission unit 18. The conversion function unit 14 may also output the keyword data to the voice synthesis function unit 15.

（１０）ステップＳ４１０
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれていない場合、音声合成機能部１５に対して、音声合成を中止又は禁止し、元の音声を送信部１８に出力する旨の通知を送る。送信部１８は、当該音声信号に基づいて、元の音声を出力する。 (10) Step S410
If the voice signal analyzed by the voice analysis function unit 13 does not contain keyword data indicating a specific keyword, the conversion function unit 14 notifies the voice synthesis function unit 15 to stop or prohibit voice synthesis and to output the original voice to the transmission unit 18. The transmission unit 18 outputs the original voice based on that voice signal.
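
The branch conditions of steps S402 to S410 can be summarized as a small decision function. This is a sketch of the control flow only; the three boolean inputs stand in for the results of the phone book check, the emotion analysis, and the keyword lookup, and the `Action` names are hypothetical labels.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "send original voice directly (S403)"
    FILTER = "send filtered voice (S408)"
    SYNTHESIZE = "send synthesized voice (S412, S413)"
    ORIGINAL = "stop synthesis and send original voice (S410)"


def route_fourth_embodiment(is_synthesis_target, emotion_detected, has_keyword):
    """Pick the output path for one voice signal in the fourth embodiment."""
    if not is_synthesis_target:
        return Action.PASS_THROUGH   # party not registered for synthesis
    if emotion_detected:
        return Action.FILTER         # emotion takes priority over keywords
    if has_keyword:
        return Action.SYNTHESIZE     # keyword present: synthesize
    return Action.ORIGINAL           # no keyword: original voice
```

In the third embodiment the last two branches are reversed: a detected keyword stops synthesis, and its absence leads to synthesis.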

（１１）ステップＳ４１１
変換機能部１４は、音声分析機能部１３で分析された音声信号に、特定のキーワードを示すキーワードデータが含まれている場合、当該キーワードデータに基づいて、特定のキーワードを音声出力部１６に出力する。このとき、変換機能部１４は、音声合成機能部１５を介して、特定のキーワードに該当する部分の合成音を音声出力部１６に出力するようにしても良い。音声出力部１６は、ユーザに、特定のキーワードを通知する。すなわち、変換機能部１４は、キーワードデータに基づいて、特定のキーワードに対して音声合成を行う旨を、音声出力部１６を介して、ユーザに通知する。なお、変換機能部１４は、特定のキーワードに対して音声合成を行う旨を、ユーザに通知しない場合、特定のキーワードを音声出力部１６に出力しなくても良い。 (11) Step S411
If the voice signal analyzed by the voice analysis function unit 13 contains keyword data indicating a specific keyword, the conversion function unit 14 outputs the specific keyword to the voice output unit 16 based on that keyword data. At this time, the conversion function unit 14 may output the synthesized voice for the portion corresponding to the specific keyword to the voice output unit 16 via the voice synthesis function unit 15. The voice output unit 16 notifies the user of the specific keyword. That is, the conversion function unit 14 notifies the user, via the voice output unit 16, that voice synthesis is performed for the specific keyword. If the user is not to be notified that voice synthesis is performed for the specific keyword, the conversion function unit 14 need not output the specific keyword to the voice output unit 16.

（１２）ステップＳ４１２
音声合成機能部１５は、音声データベース２３を参照して、受け取った音声信号に応じて、特定の人物に似た音声の合成を行う。ここでは、音声合成機能部１５は、受け取った音声信号に基づく音声の全体に対して、特定の人物に似た音声の合成を行う。なお、音声合成機能部１５は、変換機能部１４からキーワードデータを受け取り、キーワードデータに基づいて、受け取った音声信号に基づく音声に含まれる特定のキーワードの音声のみ、特定の人物に似た音声の合成を行うようにしても良い。例えば、音声合成機能部１５は、キーワードデータが特定の人物の著名な発言を示している場合、受け取った音声信号に基づく音声にこの著名な発言と同じ内容が含まれていれば、その発言の箇所のみ、特定の人物に似た音声での合成を行う。 (12) Step S412
The voice synthesis function unit 15 refers to the voice database 23 and, according to the received voice signal, synthesizes a voice resembling a specific person. Here, the voice synthesis function unit 15 applies this synthesis to the entire voice based on the received voice signal. Alternatively, the voice synthesis function unit 15 may receive the keyword data from the conversion function unit 14 and, based on the keyword data, apply the synthesis only to the specific keyword contained in the voice based on the received voice signal. For example, when the keyword data indicates a well-known remark by a specific person and the voice based on the received voice signal contains the same content as that remark, the voice synthesis function unit 15 synthesizes only that remark in a voice resembling the specific person.
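
The keyword-only variant of this step can be sketched by splitting the signal into keyword and non-keyword segments and passing only the keyword segments through a synthesis callback. The `(start, end)` span format and the `synthesize` callback are assumptions; the actual synthesis against the voice database 23 is not modeled.

```python
def split_by_keyword_spans(total, spans):
    """Split [0, total) into (start, end, is_keyword) segments.

    spans: (start, end) sample ranges, end exclusive; overlaps are trimmed.
    """
    segments = []
    cursor = 0
    for start, end in sorted(spans):
        start = max(start, cursor)
        end = min(end, total)
        if start >= end:
            continue
        if cursor < start:
            segments.append((cursor, start, False))
        segments.append((start, end, True))
        cursor = end
    if cursor < total:
        segments.append((cursor, total, False))
    return segments


def synthesize_keywords_only(samples, spans, synthesize):
    """Apply synthesize() to keyword segments and keep the rest unchanged."""
    out = []
    for start, end, is_keyword in split_by_keyword_spans(len(samples), spans):
        chunk = samples[start:end]
        out.extend(synthesize(chunk) if is_keyword else chunk)
    return out
```

Passing a single span that covers the whole signal reproduces the default behavior in which the entire voice is synthesized.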

（１３）ステップＳ４１３
音声合成機能部１５は、音声合成された音声信号を送信部１８に送る。送信部１８は、受け取った音声信号を変調して高周波信号に変換し、アンテナ部１１を介して、電話を掛ける相手に対し、その高周波信号を送信する。 (13) Step S413
The voice synthesis function unit 15 sends the synthesized voice signal to the transmission unit 18. The transmission unit 18 modulates the received voice signal into a high-frequency signal and transmits that signal via the antenna unit 11 to the party being called.

（１４）ステップＳ４１４
音声分析機能部１３は、高周波信号の送信が終了するまで、継続的に、送話又は受話の少なくとも一方の音声の感情を分析し、送話又は受話の少なくとも一方の音声が怒り等の特定の感情を示す音声であるとの結果を得たら、現在の合成音の出力を中止して元の音声にフィルタをかけた音声を出力する処理に切り替える。音声分析機能部１３は、通話が終了したら、一連の処理を終了する。 (14) Step S414
Until transmission of the high-frequency signal ends, the voice analysis function unit 13 continuously analyzes the emotion of at least one of the transmitted and received voices. When it obtains a result that at least one of them indicates a specific emotion such as anger, it stops the current synthesized-voice output and switches to the processing that outputs a filtered version of the original voice. When the call ends, the voice analysis function unit 13 ends the series of processes.
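
The continuous monitoring in this step can be sketched as a per-frame loop with a single mode flag. The three callbacks are placeholders, and the choice to stay in filtering mode for the rest of the call once an emotion is detected is an assumption; the text only states that processing switches to filtering.

```python
def monitor_call(frames, detect_emotion, synthesize, apply_filter):
    """Yield one output frame per input frame until the call ends.

    Starts in synthesis mode and switches to filtering the first time
    detect_emotion(frame) is true (assumed not to switch back).
    """
    filtering = False
    for frame in frames:
        if not filtering and detect_emotion(frame):
            filtering = True
        yield apply_filter(frame) if filtering else synthesize(frame)
```

Because the function is a generator, it ends naturally when the frame source is exhausted, which corresponds to ending the series of processes when the call ends.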

なお、本発明における各実施形態は、組み合わせて実施することも可能である。 It should be noted that the embodiments of the present invention can be implemented in combination.

以上のように、本発明の携帯端末は、音声合成機能と、特定のキーワードや特定の感情の音声に基づいた音声合成の切換機能、生の音声へのフィルタリング機能、及び特定キーワードの自動登録の機能を持つ。 As described above, the mobile terminal of the present invention has a voice synthesis function, a function for switching voice synthesis based on a specific keyword or a voice indicating a specific emotion, a function for filtering the raw voice, and a function for automatically registering specific keywords.

本発明の携帯端末では、受話音を特定の人物の音声に真似た合成音で聞く事ができる。 In the portable terminal of the present invention, the received sound can be heard with a synthesized sound imitating the voice of a specific person.

また、本発明の携帯端末では、音声合成の切り替え相手の選択は予め電話番号帳２１に登録した相手を登録でき、また、通話途中でもスイッチを押す事により音声合成をＯＮ、ＯＦＦする事ができる。 In the mobile terminal of the present invention, the parties for whom voice synthesis is switched on can be selected from those registered in advance in the telephone number book 21, and voice synthesis can also be turned ON and OFF during a call by pressing a switch.

また、本発明の携帯端末では、キーワードデータベース２２に登録された特定のキーワードの音声を認識すると音声合成を中止して受話音そのままの音声を出力できる。 Further, in the mobile terminal of the present invention, when the voice of a specific keyword registered in the keyword database 22 is recognized, the voice synthesis can be stopped and the voice as the received voice can be outputted.

また、本発明の携帯端末では、キーワードがＮＧワードとして登録されていた場合、相手に対しＮＧワードを出力する事もできる。 In the portable terminal of the present invention, when a keyword is registered as an NG word, the NG word can be output to the other party.

また、本発明の携帯端末では、受話の音声の感情を分析し、怒り等の特定の感情を示す音声であるとの結果を得たら、特定の人物の音声に真似た合成音を中止して元の音声にフィルタをかけた音声を出力する処理を行う事ができる。 Further, in the mobile terminal of the present invention, when the emotion of the received voice is analyzed and a result indicating that the voice indicates a specific emotion such as anger is obtained, the synthesized sound imitating the voice of the specific person is stopped. It is possible to perform processing to output a sound obtained by filtering the original sound.

また、本発明の携帯端末では、受話の操作者がボタンを押すとその所定時間前の音声を分析して、特定のキーワードをキーワードデータベース２２に自動的に登録する事ができる。 Further, in the mobile terminal of the present invention, when the operator on the receiving side presses a button, the voice from a predetermined time before the press is analyzed, and a specific keyword can be automatically registered in the keyword database 22.
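
Automatic registration of this kind implies keeping a short history of recent speech. The sketch below assumes that recognized words arrive with timestamps and are held in a sliding window; on a button press, the words still inside the window are added to the keyword database. The class name, the set-based database, and the `select` hook for choosing which words to register are hypothetical.

```python
from collections import deque


class RecentSpeechBuffer:
    """Keeps (timestamp, word) pairs from the last window_seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._items = deque()

    def add(self, timestamp, word):
        self._items.append((timestamp, word))
        self._trim(timestamp)

    def _trim(self, now):
        # Drop entries older than the window, oldest first.
        while self._items and self._items[0][0] < now - self.window:
            self._items.popleft()

    def words_before(self, now):
        self._trim(now)
        return [word for t, word in self._items if t <= now]


def register_on_button_press(buffer, keyword_db, now, select=lambda words: words):
    """Add the selected recent words to keyword_db (a set) and return them."""
    chosen = list(select(buffer.words_before(now)))
    keyword_db.update(chosen)
    return chosen
```

How the terminal decides which of the recent words counts as the specific keyword is not described, so the default `select` simply registers every word in the window.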

以上の機能を持ったこの携帯端末を使用すれば、収益比率が減少中の音声ユーザの需要を喚起する事ができ、引いては携帯端末の販売シェアを伸ばす事ができる。 If this mobile terminal having the above functions is used, the demand of voice users whose profit ratio is decreasing can be stimulated, and the sales share of the mobile terminal can be increased.

本発明は、携帯電話機の開発設計製造会社及びその部門での利用が考えられる。 The present invention is considered to be used in mobile phone development and design companies and their departments.

以上、本発明の実施形態を詳述してきたが、実際には、上記の実施形態に限られるものではなく、本発明の要旨を逸脱しない範囲の変更があっても本発明に含まれる。 Although embodiments of the present invention have been described in detail above, the present invention is not limited to these embodiments, and changes that do not depart from the gist of the present invention are included in the present invention.

１１… アンテナ部
１２… 受信部
１３… 音声分析機能部
１４… 変換機能部
１５… 音声合成機能部
１６… 音声出力部
１７… 音声入力部
１８… 送信部
２１… 電話番号帳
２２… キーワードデータベース
２３… 音声データベース
DESCRIPTION OF SYMBOLS
11 ... Antenna unit
12 ... Receiving unit
13 ... Voice analysis function unit
14 ... Conversion function unit
15 ... Voice synthesis function unit
16 ... Voice output unit
17 ... Voice input unit
18 ... Transmission unit
21 ... Telephone number book
22 ... Keyword database
23 ... Voice database

Claims

A mobile terminal comprising:
voice analysis means for determining whether identification information of the other party of a voice communication is registered as a voice synthesis target and, when the identification information is registered as a voice synthesis target, analyzing the spectrum of a voice signal related to the voice communication and determining whether the voice based on the voice signal corresponds to a voice pattern indicating a specific emotion;
conversion means for performing, when the voice corresponds to the voice pattern, a filtering process that filters the voice and outputting the filtered voice, and for determining, when the voice does not correspond to the voice pattern, whether a keyword included in the voice signal corresponds to an NG word and, when the keyword corresponds to an NG word, notifying the source of the voice signal that the keyword is an NG word; and
voice synthesis means for synthesizing a predetermined voice with the voice based on the voice signal and outputting the synthesized voice.

The mobile terminal according to claim 1,
wherein the conversion means, in response to a predetermined operation by the user during the voice communication, registers in a keyword database, as an NG word, a specific keyword included in the voice signal transmitted during the period from a predetermined time before the predetermined operation up to the predetermined operation, and refers to the keyword database when determining whether a keyword included in the voice signal corresponds to an NG word.

The mobile terminal according to claim 1 or 2,
wherein the voice analysis means analyzes the spectrum of the voice signals on both the transmitting side and the receiving side of the voice communication and determines whether the voice based on at least one of those voice signals corresponds to the voice pattern indicating a specific emotion, and
the conversion means performs the filtering process on the voice indicating the specific emotion when the voice based on at least one of the voice signals on the transmitting side and the receiving side corresponds to the voice pattern.

The mobile terminal according to any one of claims 1 to 3,
wherein the conversion means determines whether a keyword included in the voice signal corresponds to a specific keyword and, when the keyword corresponds to a specific keyword, sends the voice signal to the voice synthesis means.

The mobile terminal according to any one of claims 1 to 4,
wherein the conversion means determines whether a keyword included in the voice signal corresponds to a specific keyword and, when the keyword corresponds to a specific keyword, sends to the voice synthesis means the portion of the voice signal corresponding to the specific keyword, and
the voice synthesis means synthesizes a predetermined voice with the portion of the voice signal corresponding to the specific keyword and outputs the synthesized voice.

Determine whether the identification information of the voice communication partner is registered as the target for speech synthesis,
When the identification information is registered as a voice synthesis target, analyze a spectrum of a voice signal related to the voice communication, determine whether the voice based on the voice signal is a voice indicating a specific emotion,
If the voice is a voice corresponding to the voice pattern, perform a filtering process to filter the voice, and output the filtered voice,
If the voice is not a voice corresponding to the voice pattern, determine whether a keyword included in the voice signal corresponds to an NG word and, if the keyword corresponds to an NG word, notify the source of the voice signal that the keyword is an NG word,
A speech synthesis method for synthesizing predetermined speech with speech based on the speech signal and outputting synthesized speech.

The speech synthesis method according to claim 6,
During the voice communication, in response to a predetermined operation by the user, a specific keyword included in the voice signal transmitted during the period from a predetermined time before the predetermined operation up to the predetermined operation is registered as an NG word in the keyword database,
A speech synthesis method for referring to the keyword database when determining whether a keyword included in the speech signal corresponds to an NG word.

The speech synthesis method according to claim 6 or 7,
The voice analysis unit analyzes a spectrum of a voice signal on both a transmission side and a reception side in the voice communication, and a voice based on at least one voice signal of the transmission side and the reception side indicates a specific emotion To determine whether
A speech synthesizing method that performs a filtering process for filtering the speech indicating the specific emotion when speech based on at least one speech signal of the transmitting side and the receiving side is speech that corresponds to the speech pattern.

The speech synthesis method according to any one of claims 6 to 8,
Determining whether a keyword included in the audio signal corresponds to a specific keyword;
A speech synthesis method for synthesizing predetermined speech with respect to the speech signal and outputting a synthesized speech when the keyword corresponds to a specific keyword.

The speech synthesis method according to any one of claims 6 to 9,
Determining whether a keyword included in the audio signal corresponds to a specific keyword;
When the keyword corresponds to a specific keyword, a voice synthesis method for synthesizing a predetermined voice with respect to a voice signal at a location corresponding to the specific keyword in the voice signal and outputting a synthesized voice.

Determining whether the identification information of the voice communication partner is registered as a speech synthesis target;
When the identification information is registered as a voice synthesis target, the spectrum of the voice signal related to the voice communication is analyzed to determine whether the voice based on the voice signal is a voice corresponding to a voice pattern indicating a specific emotion. Steps,
If the voice is a voice corresponding to the voice pattern, performing a filtering process to filter the voice, and outputting the filtered voice;
If the voice is not a voice corresponding to the voice pattern, determining whether a keyword included in the voice signal corresponds to an NG word and, if the keyword corresponds to an NG word, notifying the source of the voice signal that the keyword is an NG word;
A speech synthesis program for causing a computer to execute a step of synthesizing predetermined speech with speech based on the speech signal and outputting the synthesized speech.

The speech synthesis program according to claim 11,
During the voice communication, a specific keyword included in a voice signal transmitted between the predetermined operation and a predetermined time before is registered in the keyword database as an NG word according to a predetermined operation of the user. Steps,
A speech synthesis program for causing a computer to further execute a step of referring to the keyword database when determining whether or not a keyword included in the speech signal corresponds to an NG word.

The speech synthesis program according to claim 11 or 12,
The voice analysis unit analyzes a spectrum of a voice signal on both a transmission side and a reception side in the voice communication, and a voice based on at least one voice signal of the transmission side and the reception side indicates a specific emotion Determining whether the sound corresponds to the pattern;
A step of performing a filtering process for filtering the voice indicating the specific emotion when the voice based on at least one voice signal of the transmitting side and the receiving side is a voice corresponding to the voice pattern; A speech synthesis program to be executed by a computer.

A speech synthesis program according to any one of claims 11 to 13,
Determining whether a keyword included in the audio signal corresponds to a specific keyword;
A speech synthesis program for causing a computer to further execute a step of synthesizing predetermined speech with respect to the speech signal and outputting a synthesized sound when the keyword corresponds to a specific keyword.

A speech synthesis program according to any one of claims 11 to 14,
Determining whether a keyword included in the audio signal corresponds to a specific keyword;
When the keyword corresponds to a specific keyword, the computer further executes a step of synthesizing a predetermined voice with respect to a voice signal at a position corresponding to the specific keyword in the voice signal and outputting a synthesized voice. A program for speech synthesis.