JP2007259427A

JP2007259427A - Mobile terminal unit

Info

Publication number: JP2007259427A
Application number: JP2007039006A
Authority: JP
Inventors: Mari Iino; まり飯野; Yutaka Yokota; 裕横田; Osamu Yamamoto; 修山本
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2006-02-23
Filing date: 2007-02-20
Publication date: 2007-10-04

Abstract

PROBLEM TO BE SOLVED: To provide a mobile terminal unit that outputs an effect predetermined by the user during a call.

SOLUTION: An effect corresponding to each parameter, such as a keyword, a variation in voice power, or a pitch fluctuation, is defined in an effect pattern management part and saved in an effect pattern database. The mobile terminal unit analyzes the speech data with an analysis means that uses voice recognition or emotion estimation, and detects a particular keyword, voice power variation, pitch fluctuation, or the like. When the detected keyword or other parameter is collated with the effect patterns and a match is found, the previously set effect is applied: it is converted to audio by a data voice part and output to a speaker, so that the voice data is sounded with the effect added.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a mobile terminal device having a function of adding an effect desired by the user during a call and outputting the result. In particular, it relates to a mobile terminal device that adds an effect desired by the user, such as a sound effect, background sound, or effect image, to at least one of the audio and the image output at the user's own terminal during a call, and to a mobile terminal device that adds such an effect to at least one of the audio and the image transmitted to the counterpart terminal during a call.

Mobile phones, a type of mobile terminal device, have spread widely and are already indispensable to daily life, and mobile terminal devices continue to gain performance and functions. In particular, functions that let users set their own ringtones, such as ringtone melodies, have won wide support, and in recent years mobile terminal devices with high-performance sound source chips for ringtone output have come into practical use. In addition, to make calls more entertaining through arbitrary sound effects, a mobile terminal device has been proposed that outputs a sound effect or background sound during a call in response to key button operation (see, for example, Patent Document 1).

FIG. 13 shows a conventional mobile terminal device. The configuration shown in FIG. 13 includes: a key button unit 180 having a plurality of key buttons; a storage unit 110 that stores information on the key buttons and the sound effects or background sounds to be transmitted to the counterpart terminal during a call; a key tone output unit 142 that outputs the sound effect or background sound corresponding to a key button; a transmission-side signal processing unit 160 that performs signal processing; and a main controller 170 that, in response to a key button selection signal from the key button unit 180, reads a sound effect or background sound from the storage unit 110, outputs it to the key tone output unit 142, and transfers the sound effect or background sound output through the transmission-side signal processing unit 160 over a predetermined channel. In this conventional device, pressing any key button of the key button unit 180 during a call adds the corresponding sound effect or background sound to the call audio, which the wireless communication unit 130 then transmits to the counterpart terminal via the server 300.

Patent Document 1: Japanese Patent Laying-Open No. 2004-312662 (first page, FIG. 1)

However, in the conventional mobile terminal device described above, operating the key buttons to produce a sound effect is cumbersome. Because the operation unit cannot be seen during a call, it is difficult to pick the right key button at the right moment to add the desired sound effect. The user may also press the wrong button, so that a sound effect unsuited to the situation is added. An object of the present invention is to provide a mobile terminal device that automatically adds an effect desired by the user during a call, without any key button operation, and outputs the result at the user's own terminal or transmits it to the counterpart terminal.

To solve the above problems, the mobile terminal device of the present invention includes: analysis means for analyzing call data; an effect pattern database that stores effect patterns to be collated with the analysis results, together with the effects associated with those effect patterns; effect pattern management means for managing the output of the effect patterns; analysis collation means for collating the analysis results with the effect patterns; and synthesis means for adding an effect to at least one of audio and an image. The analysis collation means collates the result produced by the analysis means with the effect patterns whose output is managed by the effect pattern management means, and the effect associated with a matching effect pattern is added to at least one of the audio and the image, which is then output at the user's own terminal or transmitted to the counterpart terminal.

With this configuration, the user's preferred effects can be assigned in the effect pattern database in advance. During a call, specific parameters are detected from the call audio, and an effect corresponding to the pattern of the detected parameters is added and either output at the user's own terminal or transmitted to the counterpart terminal, so that the user or the other party hears the effect.

The present invention further includes storage means for storing the effect associated with a matching effect pattern, or identification information identifying that effect, and synthesis means that, when the analysis means detects a silent interval of a certain duration in the call data, adds the effect stored in the storage means, or the effect indicated by the identification information, to at least one of the audio and the image.

With this configuration, silent intervals can be put to good use: effects are not added to the audio or image too frequently during conversation, and an effect can be output without disturbing the user's concentration on the conversation.

Further, by additionally providing image synthesis means for combining an effect image with the image shown on the display means of the mobile terminal device during a call, a visual effect can be output.

Further, by using analysis means that includes voice recognition means, the present invention can detect a specific keyword in the call audio and output the effect corresponding to the detected keyword.

Further, by using analysis means that includes emotion estimation means, the present invention can extract voice power fluctuation, pitch fluctuation, and the like, detect an emotion pattern, and output the effect corresponding to the detected emotion pattern.

(Embodiment 1)
A first embodiment of the present invention is described below with reference to the drawings. In addition to the communication functions of a mobile phone terminal, PHS terminal, or the like, the mobile terminal device 100 according to this embodiment is configured to add a sound effect during a call and output it at the user's own terminal. Examples of sound effects include various kinds of BGM (background music); environmental sounds of stations, roads, parks, and so on; human voices such as shouts and interjections; applause and cheers; animal cries; voices of celebrities or audio recorded in advance by the user; and non-voice sounds such as electronic and mechanical sounds. Other examples are sounds derived from the call audio itself, such as repetitions of it or versions with modified tempo or tone.

FIG. 1 is a block diagram showing the configuration of the mobile terminal device 100 according to the present invention. The mobile terminal device 100 includes: a wireless unit 4 that receives voice data from the counterpart terminal; a voice analysis unit 8, serving as the analysis means, that analyzes patterns in the received voice data; an effect pattern database 10 in which effect patterns and their associated effects are stored in advance; an effect pattern management unit 9 that manages the output of the effect patterns; an analysis collation unit 7 that collates the analysis results with the effect patterns; a data voice conversion unit 6 that converts the effect corresponding to an effect pattern into audio; a voice synthesis unit 5 that combines the call audio received from the counterpart terminal with the audio rendering of the effect; a control unit 1 that sends the combined voice data to the speaker; an operation unit 2 that accepts user operations; and a speaker 3 that sounds the voice data.

Here, "during a call" refers to the state in which the mobile terminal device 100 used by the user and the other party's mobile terminal device (hereinafter, the counterpart terminal) are connected over a telecommunication line so that a call is possible. A "sound effect" means any audio other than the voice signal input in real time through the microphone during the call; it may therefore be a recording of a real voice. The "mobile terminal device 100" is a concept covering any equipment that can communicate over a telecommunication line to enable conversation among several people.

In the mobile terminal device 100, effects are defined in advance for the patterns of each parameter, such as a keyword obtained by voice recognition, a voice power fluctuation, or a pitch fluctuation, and the defined correspondences are saved in the effect pattern database 10. FIG. 2 shows an example of the data stored in the storage area of the effect pattern database 10. As shown in FIG. 2, the storage area holds a plurality of records, each consisting of an effect pattern 90 and the effect 91 associated with it. Output of the effect patterns 90 and effects 91 is managed by the effect pattern management unit 9, which outputs them to the analysis collation unit 7 according to the operation of the mobile terminal device 100. Specifically, the effect patterns 90 are output to the analysis collation unit 7 one after another when an analysis result is collated with the effect patterns. When an analysis result matches an effect pattern, the effect 91 associated with the matching pattern is identified and output to the analysis collation unit 7.
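The pairing of effect patterns 90 with effects 91 can be pictured as a small lookup table. The Python sketch below is illustrative only: the names `EffectEntry`, `EffectPatternDatabase`, and `find_effect`, and the sample records, are assumptions made for this example and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class EffectEntry:
    """One record: an effect pattern (90) and its associated effect (91)."""
    pattern: str  # e.g. a keyword or the label of an emotion pattern
    effect: str   # e.g. the name of a sound clip (hypothetical)


class EffectPatternDatabase:
    """Hypothetical stand-in for the effect pattern database 10."""

    def __init__(self, entries: List[EffectEntry]) -> None:
        self._entries = list(entries)

    def patterns(self) -> Iterator[str]:
        # Patterns are handed out one after another for collation.
        for entry in self._entries:
            yield entry.pattern

    def find_effect(self, pattern: str) -> Optional[str]:
        # Return the effect tied to a matched pattern, or None if absent.
        for entry in self._entries:
            if entry.pattern == pattern:
                return entry.effect
        return None


db = EffectPatternDatabase([
    EffectEntry("hee", "hee-hee-hee-hee-hee"),
    EffectEntry("ah", "ah-ah-ah"),
])
```

A linear scan is used here only because it mirrors the "output one after another" description; a dictionary would serve equally well for exact-match keywords.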

The effect pattern database 10 holds many effect patterns, and a single defined effect pattern may itself be a combination of several patterns. Effect patterns are stored in the effect pattern database 10 by, for example, downloading them over a telecommunication line or recording them in advance through a microphone (not shown).

FIG. 3 is a flowchart of the operation of the mobile terminal device 100 when it analyzes call audio and adds an effect. When the mobile terminal device 100 starts a voice call with the counterpart terminal, which originated the call (step 201), it receives voice data containing the other party's voice and surrounding sounds (step 202). The voice analysis unit 8 then analyzes the various parameters contained in the voice data (step 203). Once the analysis result is available, the effect pattern management unit 9 outputs the effect patterns 90 one after another, and the analysis collation unit 7 collates the parameter pattern in the analysis result with them (step 204). If a match is found, the effect 91 of the matching effect pattern is identified as the effect to output (step 206). The effect pattern management unit 9 passes the identified effect 91 through the analysis collation unit 7 to the data voice conversion unit 6, which converts it to audio (step 207). The voice synthesis unit 5 then combines the voice data received from the counterpart terminal with the audio rendering of the effect 91 (step 208), and the control unit 1 sends the result to the speaker 3, which sounds the voice data with the effect added (step 209).

If the collation in step 204 finds no effect pattern in the effect pattern database that matches the parameter pattern of the voice data, no special processing is performed (step 205) and the ordinary voice data is sounded as is (step 209).
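The flow of steps 203 to 209 can be sketched as a single function. This is a minimal illustration under loud assumptions: plain strings stand in for audio, the analyzer is passed in as a callable, and the function name `process_received_audio` is invented for the example.

```python
from typing import Callable, Dict, List, Optional


def process_received_audio(
    voice: str,
    analyze: Callable[[str], List[str]],
    effect_db: Dict[str, str],
) -> str:
    """Return what the speaker would play for one chunk of call audio."""
    # Step 203: analyze the received audio into parameter patterns.
    detected = analyze(voice)
    # Step 204: collate the detected patterns with the stored effect patterns.
    matched: Optional[str] = next(
        (p for p in detected if p in effect_db), None
    )
    if matched is None:
        # Step 205: no match, so no special processing is applied.
        return voice
    # Steps 206-207: identify the effect and render it (a string here).
    effect_audio = effect_db[matched]
    # Step 208: combine the call audio with the effect audio.
    return f"{voice} + [{effect_audio}]"
```

For instance, with whitespace splitting as a toy analyzer, `process_received_audio("wow really", str.split, {"wow": "applause"})` yields the call audio with the applause effect appended, while input with no registered pattern passes through unchanged.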

(Embodiment 2)
Next, a mobile terminal device according to a second embodiment of the present invention is described. The mobile terminal device 200 according to the second embodiment adds a voice recognition unit 82 to the voice analysis unit 8 of FIG. 1, described in Embodiment 1, so that the voice analysis unit 81 serving as the analysis means includes voice recognition means. FIG. 4 is a block diagram of the mobile terminal device 200 according to the second embodiment.

The operation of the mobile terminal device according to the second embodiment, which uses analysis means including voice recognition means, is described below. FIG. 5 is a flowchart in which analysis by the voice recognition means is added to the voice data analysis of steps 203 and 204 in the flowchart of FIG. 3. The analysis performed by the voice recognition means during call audio analysis in this embodiment is described below with reference to the drawings.

In FIG. 5, the voice analysis unit 81, which includes the voice recognition unit 82, extracts a parameter pattern such as a keyword from the voice data received from the counterpart terminal (step 303). Take the keyword "hee" (へー) as an example. If the effect pattern database 10 already holds the effect "hee-hee-hee-hee-hee" (へーへーへーへーへー) for the keyword "hee" and the voice analysis unit 81 detects that keyword, the analysis collation unit 7 reports a match (step 204) and the effect "hee-hee-hee-hee-hee" corresponding to the keyword is identified (step 206). Likewise, if the effect pattern database 10, whose output is managed by the effect pattern management unit 9, holds the effect "ah-ah-ah..." (あっあっあっ…) for the keyword "ah" (あっ), and the keyword "ah" occurs in the voice data and is matched against the database by the analysis collation unit 7, the effect corresponding to the keyword is identified (step 206) and converted to audio (step 207). The audio rendering of the effect is then combined with the call audio (step 208), and the voice data is sounded (step 209).

If in step 204 no effect pattern 90 in the effect pattern database 10 matches a keyword in the voice data, no special processing is performed (step 205) and the ordinary voice data is sounded as is (step 209).
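Keyword collation of this kind can be illustrated with a toy keyword spotter. The sketch assumes the voice recognition step has already produced a text transcript; the table `KEYWORD_EFFECTS` reuses the two keyword examples from the text, while the function names and the substring-search approach are assumptions for illustration, not the patent's method.

```python
from typing import Dict, List

# Hypothetical keyword table mirroring the two examples in the text.
KEYWORD_EFFECTS: Dict[str, str] = {
    "へー": "へーへーへーへーへー",
    "あっ": "あっあっあっ…",
}


def spot_keywords(transcript: str, table: Dict[str, str]) -> List[str]:
    """Return registered keywords found in the transcript, by first appearance."""
    hits = [(transcript.find(k), k) for k in table if k in transcript]
    return [k for _, k in sorted(hits)]


def effects_for(transcript: str, table: Dict[str, str]) -> List[str]:
    """Return the effects to render for a recognized transcript."""
    return [table[k] for k in spot_keywords(transcript, table)]
```

A transcript containing no registered keyword produces an empty list, which corresponds to the no-match branch (step 205).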

(Embodiment 3)
Next, a mobile terminal device according to a third embodiment of the present invention is described. The mobile terminal device according to the third embodiment replaces the voice recognition unit 82 of FIG. 4, described in Embodiment 2, with emotion estimation means, so that the voice analysis unit serves as analysis means including emotion estimation means. Because the rest of the configuration is the same as in FIG. 4, the block diagram is omitted.

The operation of the mobile terminal device according to the third embodiment, which uses analysis means including emotion estimation means, is described below. FIG. 6 is a flowchart in which analysis by the emotion estimation means is added to the voice data analysis of steps 203 and 204 in the flowchart of FIG. 3. The analysis performed by the emotion estimation means during call audio analysis in this embodiment is described below with reference to the drawings.

In FIG. 6, the emotion estimation means extracts parameters from the voice data received from the counterpart terminal, such as voice power fluctuation, which reflects the loudness and duration of the voice, and pitch fluctuation, which reflects variation in frequency, and detects an emotion pattern (step 403). For example, suppose the voice data contains the emotion pattern "wahahaha" (わははは, laughter) and the effect pattern database 10 already holds, for "wahahaha", an effect of a venue erupting in excitement. When the emotion estimation means of the voice analysis unit detects the "wahahaha" pattern, processing continues and the corresponding effect is sounded. If no emotion pattern is found, or no effect pattern corresponds to the detected emotion pattern, no special processing is performed and the voice data is sounded as is.
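This section does not say how voice power fluctuation is computed. As one plausible reading, the sketch below measures frame-by-frame RMS power and the average change between consecutive frames. The function names, the non-overlapping framing, and the choice of mean absolute difference are all assumptions for illustration; pitch estimation is left out.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_len: int) -> List[float]:
    """Root-mean-square power of each full, non-overlapping frame."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def power_fluctuation(rms: Sequence[float]) -> float:
    """Mean absolute change in power between consecutive frames."""
    if len(rms) < 2:
        return 0.0
    return sum(abs(b - a) for a, b in zip(rms, rms[1:])) / (len(rms) - 1)
```

Bursty speech such as laughter would tend to give a larger fluctuation value than steady speech, but any threshold for declaring an emotion pattern would have to be tuned and is not given in the source.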

Similarly, if the effect pattern database 10 holds the effect "zawa-zawa-zawa" (ざわざわざわ, an onomatopoeia for murmuring) for a silent interval of at least a certain duration, that effect is converted to audio when such an interval occurs. When there is no silent interval of at least that duration, normal processing is performed. Data created by the user, such as interjections like "sou sou" (そうそう), "da yo ne" (だよね), "Yeah", and "Hey", or human voices such as rap-style call and response, can also be adopted as effects, allowing more parameters and a large number of effect patterns. A plurality of effect pattern management units 9, each with its own effect pattern database 10, may also be provided.
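Detecting "a silent interval of at least a certain duration" can be sketched as a run-length check over per-frame power levels. The function name `has_silent_interval`, the power threshold, and the frame-count representation of duration are assumptions for this example; the source does not specify them.

```python
from typing import Sequence


def has_silent_interval(
    rms: Sequence[float], threshold: float, min_frames: int
) -> bool:
    """True if at least min_frames consecutive frames fall below threshold."""
    run = 0
    for level in rms:
        # Extend the current quiet run, or reset it on a loud frame.
        run = run + 1 if level < threshold else 0
        if run >= min_frames:
            return True
    return False
```

With 20 ms frames, for example, `min_frames=100` would correspond to two seconds of silence; both figures are illustrative only.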

(Embodiment 4)
Next, a mobile terminal device according to a fourth embodiment of the present invention is described. In the fourth embodiment, the mobile terminal device serves as a videophone, so that in addition to audio effects, image effects can be seen during a call. FIG. 7 is a block diagram of a mobile terminal device 300 in which audio and images are sent together during a call. In the mobile terminal device 300, an image synthesis unit 11 is provided separately from the voice synthesis unit 5, and the effect pattern database 20 also stores effect patterns corresponding to images. Image data received by the wireless unit 4 is processed by the image processing unit 93 and output to the image synthesis unit 11. Images captured by the camera 92 are likewise processed by the image processing unit 93 and output to the image synthesis unit 11.

When the analysis collation unit 7 of the mobile terminal device 300 finds that the pattern of the voice data matches an effect pattern, the image corresponding to that effect pattern is read from the effect pattern database 20 and output to the image synthesis unit 11 via the effect pattern management unit 9 and the analysis collation unit 7. The image synthesis unit 11 combines the received image data, or the image captured by the camera 92, with the image corresponding to the matching effect pattern. Because the display unit 12 is connected to the control unit 1, the combined image is shown on the display unit 12.

For example, suppose that while the received image of the other party is shown on the videophone display, the received voice data is analyzed and the keyword "dame jan" (だめじゃん, roughly "that's no good") is detected, and that the effect pattern database 20 holds, for "dame jan", an effect in which the text "suimasen" (すいません, "I'm sorry") descends from the top of the display screen. When collation shows that the analysis result of the voice data matches the effect pattern, the image synthesis unit 11 combines the corresponding effect with the image. As a result, the display unit 12 shows the video of the other party while the text "suimasen" descends from the top edge of the screen. With the line to say in reply appearing above the face of the angry caller, the user can simply read it out and apologize. If there is no match, normal processing is performed and no effect is output.
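The descending-text effect amounts to choosing a vertical position for the caption on each video frame before it is composited. The helper below is a hypothetical sketch: the name `caption_y`, the constant per-frame speed, and the resting position are assumptions, since the source only says that the text comes down from the top of the screen.

```python
def caption_y(frame: int, speed_px: int, rest_y: int) -> int:
    """Vertical position of a caption that descends from the top edge (y = 0)
    by speed_px pixels per frame and stops once it reaches rest_y."""
    return min(frame * speed_px, rest_y)
```

The image synthesis step would then draw the caption at this position over the received image for each successive frame.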

In the videophone of FIG. 7, the user's own face is captured by the camera 92 while talking, and the screen of the display unit 12 is split so that the user's face can be shown alongside the other party's face. If the effect image derived from analyzing the other party's received voice data is combined with the other party's image, and the effect image derived from analyzing the user's own outgoing voice data is combined with the user's image, what each person says can be emphasized as text over that person's image. For example, if the other party laughs "wahahaha" (わははは), the text "wahahaha" appears on the image of the laughing party, and if the user laughs "hohohoho" (ホホホホ), the text "hohohoho" appears on the user's own image.

(Embodiment 5)
The embodiments described so far add an effect to the other party's received voice and output it at the user's own terminal. As a fifth embodiment of the present invention, a mobile terminal device that adds an effect to the user's own voice and transmits it to the counterpart terminal is described. FIG. 8 is a block diagram of the mobile terminal device 400 according to the fifth embodiment, which applies effects to the voice originating at the user's own terminal.

In this embodiment, the voice data spoken into the microphone 150 of the user's own terminal is analyzed by the voice analysis unit 8, the analysis means, which extracts parameters and analyzes their pattern. The analysis collation unit 7 collates the analyzed pattern with the effect patterns. If a match is found, the data voice conversion unit 6 converts the effect corresponding to the effect pattern into audio and the voice synthesis unit 5 combines it with the call audio, adding the effect to the user's own voice, which is then output from the wireless unit 4 under the control of the control unit 1. In this way an effect is automatically added to the user's own voice, transmitted, and output at the counterpart terminal.

(Embodiment 6)
Next, a mobile terminal device according to a sixth embodiment of the present invention is described. Rather than adding an effect every time the analysis collation unit 7 described in the first to fifth embodiments finds a match, the mobile terminal device of the sixth embodiment adds the effect when the conversation pauses.

FIG. 9 is a block diagram of the mobile terminal device 500 according to the sixth embodiment of the present invention.

The configuration of FIG. 9 adds a storage unit 13 to the configuration of FIG. 4 described in the second embodiment. The storage unit 13 is general-purpose memory of the kind a mobile terminal device has as standard. When the analysis collation unit 7 finds that an analysis result matches an effect pattern, the storage unit 13 stores identification information (here, an identification number) that identifies the effect output from the effect pattern management unit 9.

FIG. 10 is a diagram showing an example of the data structure stored in the storage area of the effect pattern database 30 of the mobile terminal device 500 according to the sixth embodiment of the present invention.

As illustrated, the effect pattern database 30 stores a plurality of records, each consisting of an identification number 31, an effect pattern 90, and an effect 91. When the effect pattern 90 is a silent section lasting a certain time or longer, the corresponding effect 91 indicates that the effects recorded in the storage unit 13 are to be read out.
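The record layout of the effect pattern database 30 can be pictured with a minimal Python sketch. It is only an illustration: the class name `EffectRecord`, the `SILENCE` and `READ_STORED` sentinels, the identification number "0009" for the silence record, and the plain-text effect descriptions are assumptions, because the contents of FIG. 10 are not reproduced in this text. Only the numbers "0003" and "0005" and their effects are taken from the description of this embodiment.

```python
from dataclasses import dataclass

# Sentinel values (hypothetical names, not from the patent).
SILENCE = "<silence>"          # "silent section of a certain time or longer"
READ_STORED = "<read-stored>"  # "read out the effects recorded in storage unit 13"


@dataclass(frozen=True)
class EffectRecord:
    identification_number: str  # identification number 31
    effect_pattern: str         # effect pattern 90
    effect: str                 # effect 91


# Illustrative contents of effect pattern database 30.
EFFECT_PATTERN_DB = [
    EffectRecord("0003", "Hee", "Hee hee hee hee hee"),
    EffectRecord("0005", "wahahaha", "sound of a venue erupting in laughter"),
    EffectRecord("0009", SILENCE, READ_STORED),  # "0009" is an assumed number
]


def find_record(pattern):
    """Return the record whose effect pattern equals `pattern`, or None."""
    for record in EFFECT_PATTERN_DB:
        if record.effect_pattern == pattern:
            return record
    return None
```

Looking up `find_record("Hee")` yields the record numbered "0003", while a word with no registered pattern, such as "I see", yields `None`.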

FIG. 11 is a flowchart showing the processing flow of the mobile terminal device 500 according to the sixth embodiment of the present invention. This flow adds two steps to the processing flow of FIG. 5 already described in the second embodiment: a step of saving the identification number 31 that identifies the effect associated with the matched effect pattern (step 501), and a step of determining that the conversation has paused (step 502).

The processing flow of the mobile terminal device 500 is described below, taking as an example the case where the voice data received from the counterpart terminal in step 202 of FIG. 11 is "Hee (へー), I see (そうなんだ), wahahaha (わははは), (silent section of a certain time)".

First, the mobile terminal device 500 starts a voice call with the counterpart terminal, which is the originating side of the call (step 201), and during the call receives the voice data "Hee, I see, wahahaha, (silent section of a certain time)" transmitted from the counterpart terminal (step 202).

Next, the voice analysis unit 81, which includes the voice recognition unit 82, extracts patterns of parameters such as keywords from the voice data (step 203). Here, the voice analysis unit 81 extracts four patterns, "Hee", "I see", "wahahaha", and "(silent section of a certain time)" (step 303), and outputs them one by one to the analysis collation unit 7.

Next, the analysis collation unit 7 collates each pattern obtained as an analysis result with the effect patterns 90 in the effect pattern database 30 (step 204).

If the analysis collation unit 7 finds that the pattern matches an effect pattern 90, it determines whether the matched effect pattern is a silent section of a certain time (step 502). For this determination, a timer (not shown) that measures the duration of the silent portion may be set when the voice data is analyzed in step 203, and the determination made from the elapsed time. Alternatively, the silent-section selection processing using the effect pattern database 10 described in the third embodiment may be used.
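The timer-based determination can be sketched as follows. This is a hedged illustration, not the method of the patent: the function name `find_silent_sections`, the per-frame level input, the 20 ms frame length, the level threshold, and the 1000 ms minimum duration are all assumed values chosen for the example.

```python
def find_silent_sections(frame_levels, frame_ms=20, level_threshold=0.01,
                         min_silence_ms=1000):
    """Return (start_ms, end_ms) for each quiet run lasting at least min_silence_ms."""
    sections = []
    run_start = None  # index of the first frame of the current quiet run
    for index, level in enumerate(frame_levels):
        if level < level_threshold:
            if run_start is None:
                run_start = index  # the "timer" starts here
        else:
            # Speech resumed: keep the run only if the elapsed time is long enough.
            if run_start is not None and (index - run_start) * frame_ms >= min_silence_ms:
                sections.append((run_start * frame_ms, index * frame_ms))
            run_start = None
    # Close a quiet run that reaches the end of the input.
    total = len(frame_levels)
    if run_start is not None and (total - run_start) * frame_ms >= min_silence_ms:
        sections.append((run_start * frame_ms, total * frame_ms))
    return sections
```

With these defaults, a quiet run of 50 frames (1000 ms) counts as a silent section, while a run of 10 frames (200 ms) is ignored as an ordinary pause between words.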

If, on the other hand, the analysis collation unit 7 finds that the pattern does not match any effect pattern 90, no special processing is performed (step 205) and the ordinary voice data is output as it is (step 209).

Next, if the analysis collation unit 7 determines that the matched effect pattern 90 is not a silent section of a certain time, it identifies the effect 91 of the matched pattern as an effect to be output (step 206). It then takes the identification number 31 corresponding to the matched effect pattern from the effect pattern database 30 and saves it in the storage unit 13 together with its order of occurrence (step 501). Steps 202, 203, 303, 204, 502, 206, and 501 are repeated until the matched effect pattern is a silent section of a certain time.

FIG. 12 shows an example of the data structure stored in the storage unit 13. As shown in FIG. 12, the left column holds the storage order 121 ("1", "2", and so on), and the right column holds the identification number corresponding to each order 121. FIG. 12 shows the data for the case where the voice data received from the counterpart terminal is "Hee, I see, wahahaha, (silent section of a certain time)".

The storage unit 13 stores the identification number "0003", corresponding to the effect pattern "Hee", and the identification number "0005", corresponding to the effect pattern "wahahaha", in the order in which the analysis collation unit 7 determined the matches.

Next, if the analysis collation unit 7 determines that the matched effect pattern is a silent section of a certain time, it reads the identification numbers 92 "0003" and "0005" stored in the storage unit 13 in the order 121, and the effect 91 "Hee hee hee hee hee" and the effect 91 "sound of a venue erupting in laughter" obtained from the effect pattern management unit 9 are converted into sound (step 207). The voice synthesis unit 5 then synthesizes the voice data received from the counterpart terminal (the silent portion) with the sound data of the effects 91 (step 208), and the speaker 3 sounds the synthesized voice data (step 209).
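The flow of steps 204 to 209 can be summarized in a short Python sketch. It rests on several assumptions: the names `process_patterns`, `SILENCE`, `EFFECT_PATTERN_DB`, and `EFFECT_BY_ID` are hypothetical, effects are represented as text rather than audio, the silent section is checked directly instead of being matched as a database record first, and the stored numbers are cleared after output, which the description does not state.

```python
SILENCE = "<silence>"  # hypothetical marker for "silent section of a certain time"

# Effect pattern 90 -> (identification number 31, effect 91); contents are illustrative.
EFFECT_PATTERN_DB = {
    "Hee": ("0003", "Hee hee hee hee hee"),
    "wahahaha": ("0005", "sound of a venue erupting in laughter"),
}
EFFECT_BY_ID = {number: effect for number, effect in EFFECT_PATTERN_DB.values()}


def process_patterns(patterns):
    """Collate patterns in order and return the effects emitted at each silent section."""
    stored_ids = []  # storage unit 13: list position plays the role of order 121
    emitted = []     # one list of effects per silent section
    for pattern in patterns:                    # patterns from steps 203 and 303
        if pattern == SILENCE:                  # step 502: silent section reached
            emitted.append([EFFECT_BY_ID[n] for n in stored_ids])  # steps 207 to 209
            stored_ids.clear()                  # assumption: storage is emptied after output
        elif pattern in EFFECT_PATTERN_DB:      # step 204: match found
            number, _effect = EFFECT_PATTERN_DB[pattern]           # step 206
            stored_ids.append(number)           # step 501
        # No match: the voice is passed through unchanged (steps 205 and 209).
    return emitted
```

Calling `process_patterns(["Hee", "I see", "wahahaha", SILENCE])` stores "0003" and then "0005", skips the unregistered "I see", and releases both effects together when the silent section arrives.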

This embodiment has described outputting the effect 91 in a silent section based on voice data received from the counterpart terminal. By providing the storage unit 13 in the configuration of FIG. 8, the same can be done for transmitting the effect 91 in a silent section based on voice data transmitted from the own terminal.

This embodiment has also described the voice recognition unit 82 extracting keywords from the voice data received from the counterpart terminal (step 303) and adding an effect. By replacing step 303 with step 403 of FIG. 6, an effect can instead be added according to an emotion pattern.

Although this embodiment only outputs audio effects, providing the storage unit 13 in the videophone configuration (FIG. 7), in which voice and images can be transmitted and received simultaneously during a call, makes it possible to output or transmit an image with the effect 91 added in the silent section.

Although this embodiment has described storing, in the storage unit 13, the identification number 31 of the effect 91 matched by the analysis collation unit 7, the effect pattern database 30 itself may serve as the storage unit 13, with a flag identifying the matched effect 91, the order 121, and the like added to the effect pattern database 30.

Although this embodiment has described sequentially storing in the storage unit 13 the identification numbers 31 of the effects 91 matched by the analysis collation unit 7, only the identification number of the most recent effect determined to match may be stored. This reduces how often the effect 91 is output or transmitted, so that excessive addition of effects is prevented even when a keyword in the effect pattern database 30 is repeated many times during a call.
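A minimal sketch of this "latest match only" variant follows, assuming a hypothetical class `LatestOnlyStore` standing in for the storage unit 13. Each new match overwrites the previous one, so a repeated keyword produces a single effect at the next silent section.

```python
class LatestOnlyStore:
    """Keeps only the most recently matched identification number."""

    def __init__(self):
        self._latest = None

    def save(self, identification_number):
        # Overwrite any earlier match instead of appending to a list.
        self._latest = identification_number

    def read_all(self):
        """Return the stored numbers as a list (empty if nothing has matched)."""
        return [] if self._latest is None else [self._latest]


store = LatestOnlyStore()
for number in ["0003", "0003", "0003", "0005"]:  # a keyword repeated, then another
    store.save(number)
```

After the loop, `store.read_all()` holds only "0005", the last number saved.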

Alternatively, the storage unit 13 may store, in association with the identification number 31, the number of times keywords used in one sentence matched each effect pattern 90, and the effect 91 corresponding to the most frequently used effect pattern 90 may be output or transmitted.
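The frequency-based variant can be sketched with Python's `collections.Counter`. The function name `most_frequent_id` is hypothetical, and the tie-breaking rule (the number matched first wins) is an assumption, since the description does not say how ties are handled.

```python
from collections import Counter


def most_frequent_id(matched_ids):
    """Return the identification number matched most often, or None if nothing matched."""
    if not matched_ids:
        return None
    counts = Counter(matched_ids)  # identification number -> number of matches
    # most_common(1) returns [(identification_number, count)];
    # equal counts keep the order in which the numbers were first seen.
    return counts.most_common(1)[0][0]
```

For the matches "0003", "0005", "0003" in one sentence, the function selects "0003", so only the effect tied to that number would be output or transmitted.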

As described above, the mobile terminal device according to the sixth embodiment of the present invention makes effective use of silent sections, prevents effects from being added to the voice or image frequently during conversation, and can output effects without disturbing the user's concentration on the conversation.

In addition, the specific configuration of each unit is not limited to the above embodiments, and various modifications are possible without departing from the spirit of the present invention.

According to the present invention described in detail above, voice data is analyzed, and any effect set in advance as the user wishes can be output at the own terminal or transmitted to the counterpart terminal. The invention can therefore be applied to mobile terminal devices that enhance entertainment value during a call and provide advanced sound effect functions.

FIG. 1 is a block diagram of the mobile terminal device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the data structure of the effect pattern database of the mobile terminal device according to the first embodiment of the present invention.
FIG. 3 is a flowchart showing the operation during analysis in the first embodiment of the present invention.
FIG. 4 is a block diagram of the mobile terminal device according to the second embodiment of the present invention.
FIG. 5 is a flowchart showing the operation during analysis in the second embodiment of the present invention.
FIG. 6 is a flowchart showing the operation during analysis in the third embodiment of the present invention.
FIG. 7 is a block diagram of the mobile terminal device according to the fourth embodiment of the present invention.
FIG. 8 is a block diagram of the mobile terminal device according to the fifth embodiment of the present invention.
FIG. 9 is a block diagram of the mobile terminal device according to the sixth embodiment of the present invention.
FIG. 10 is a diagram showing the data structure of the effect pattern database of the mobile terminal device according to the sixth embodiment of the present invention.
FIG. 11 is a flowchart showing the operation during analysis in the sixth embodiment of the present invention.
FIG. 12 is a diagram showing the data structure of the storage unit in the sixth embodiment of the present invention.
FIG. 13 is a block diagram of a conventional mobile terminal device.

Explanation of symbols

1 Control unit
2 Operation unit
3 Speaker
4 Wireless unit
5 Voice synthesis unit
6 Data voice conversion unit
7 Analysis collation unit
8 Voice analysis unit
9 Effect pattern management unit
10 Effect pattern database
11 Image synthesis unit
12 Display unit
13 Storage unit
100 Mobile terminal device

Claims

1. A mobile terminal device comprising:
analysis means for analyzing received call data;
an effect pattern database that stores effect patterns to be collated with an analysis result of the call data and effects associated with the effect patterns;
effect pattern management means for managing output of the effect patterns;
analysis collation means for collating the analysis result with the effect patterns; and
synthesis means for adding the effect to at least one of voice and an image output to the own terminal during a call,
wherein the analysis result obtained by the analysis means and the effect patterns whose output is managed by the effect pattern management means are collated by the analysis collation means, and the effect associated with the matched effect pattern is added to at least one of the voice and the image output to the own terminal and is output.

2. A mobile terminal device comprising:
analysis means for analyzing call data to be transmitted;
an effect pattern database that stores effect patterns to be collated with an analysis result of the call data and effects associated with the effect patterns;
effect pattern management means for managing output of the effect patterns;
analysis collation means for collating the analysis result with the effect patterns; and
synthesis means for adding the effect to at least one of voice and an image transmitted to a counterpart terminal during a call,
wherein the analysis result obtained by the analysis means and the effect patterns whose output is managed by the effect pattern management means are collated by the analysis collation means, and the effect associated with the matched effect pattern is added to at least one of the voice and the image transmitted to the counterpart terminal and is transmitted.

3. The mobile terminal device according to claim 1, wherein, each time the analysis collation means collates the analysis result with the effect patterns and determines that they match, the synthesis means adds the effect associated with the matched effect pattern to at least one of the voice and the image.

4. The mobile terminal device according to claim 1, further comprising storage means for storing the effect associated with the matched effect pattern, or identification information identifying the effect associated with the matched effect pattern,
wherein, when the analysis means detects a silent section of a predetermined time in the call data, the synthesis means adds the effect stored in the storage means, or the effect indicated by the identification information, to at least one of the voice and the image.

5. The mobile terminal device according to any one of claims 1 to 4, further comprising image synthesis means for combining an effect with an image displayed during a call.

6. The mobile terminal device according to claim 1, wherein the analysis means includes voice recognition means.

7. The mobile terminal device according to claim 1, wherein the analysis means includes emotion estimation means.