JP2005274791A

JP2005274791A - Mobile communication terminal

Info

Publication number: JP2005274791A
Application number: JP2004085751A
Authority: JP
Inventors: Takeshi Ikagawa; 武五百川
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-03-23
Filing date: 2004-03-23
Publication date: 2005-10-06

Abstract

PROBLEM TO BE SOLVED: To enable a reader to better understand a document obtained by converting voice information into text information, without using images.

SOLUTION: A voice information measurement part 104 measures the sound volume and voice tone of a voice signal transmitted from a mobile phone 102 and of a voice signal input by the user of a mobile phone 101. A voice/text conversion part 105 uses a voice recognition technique to convert the voice information output from a transmission/reception part 103 into text information, which is a string of digital character codes. A rich text generation part 106 generates rich characters by adding, to the text information output from the voice/text conversion part 105, the corresponding measurement result output from the voice information measurement part 104. A display part 108 displays the series of generated rich characters as rich text, representing sound volume by character size and tone height by the stroke thickness of the characters.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a mobile communication terminal device having a speech-to-text conversion device that converts voice information into text information.

As a conventional speech-to-text conversion device, the technique described in Patent Document 1 has been disclosed. That technique generates attribute information identifying the speaker from the input voice, and displays the document converted from voice information into text information together with the attribute information in a visually recognizable form. The reader can thus understand the document more clearly while confirming who provided the voice input.

Patent Document 2 discloses converting voice information into text information to enter characters for e-mail and the like, in order to eliminate the inconvenience of entering characters on a mobile phone or similar device that has only a few keys.

Patent Document 3 discloses that message data is composed of the message sender's voice and proxy image data that conveys the message on behalf of the sender, and that an emotion set by the sender's input operation is selected on the basis of quantified parameters, so that a message combining voice and image is sent. This allows the sender's intent to be communicated reliably.

Japanese Patent Laid-Open No. H11-242669
Japanese Patent Laid-Open No. 2002-344574
Japanese Patent Laid-Open No. 2002-041279

However, with the techniques described in Patent Documents 1 and 2, the document converted from voice information into text information contains few elements from which the speaker's emotion can be read, so these techniques are not suited to sophisticated, smooth communication. The technique described in Patent Document 3 requires a large amount of image data to be prepared in advance, which increases the required memory capacity.

The present invention has been made in view of these points, and its object is to provide a mobile communication terminal device that allows a reader to better understand a document converted from voice information into text information, without using images.

The mobile communication terminal device of the present invention comprises: voice measuring means for measuring volume and tone based on input voice information; voice/text conversion means for converting the input voice information into text information; generating means for generating rich text using the text information converted by the voice/text conversion means and the combination of the volume and tone measured by the voice measuring means as parameters for modifying characters; and display means for displaying the rich text generated by the generating means.

With this configuration, the volume and tone measured from the voice information are shown as character modifications, so the voice information is presented visually and the reader can better understand the content of the conversation.

In the above configuration, the mobile communication terminal device of the present invention further comprises: receiving means for receiving voice information and physical information transmitted from a communication partner; detecting means for detecting physical information of the user of the device itself; and physical information analyzing means for analyzing the physical information transmitted from the communication partner and the physical information detected by the detecting means. The generating means generates rich text using, in addition to the combination of volume and tone, the result analyzed by the physical information analyzing means as a parameter for modifying characters.

With this configuration, the physical information is shown as character modifications, so the speaker's psychological state is presented visually and the reader can understand the content of the conversation even better.

In the above configuration, the mobile communication terminal device of the present invention adopts a configuration in which the physical information is at least one of body temperature, blood pressure, and heart rate.

As described above, according to the present invention, voice information such as the loudness and pitch of the speaker's voice is shown as character modifications, so the voice information is presented visually and the reader can better understand the content of the conversation without the use of images.

Embodiments of the present invention are described in detail below with reference to the drawings.

(Embodiment 1)
FIG. 1 is a block diagram showing the configuration of mobile phone 101 according to Embodiment 1 of the present invention. In this figure, mobile phone 101 conducts a voice call with mobile phone 102. The configuration of mobile phone 101 is described here; mobile phone 102 may or may not have the same configuration as mobile phone 101.

The transmission/reception unit 103 receives the voice information transmitted from mobile phone 102, performs predetermined reception processing on it, and outputs the processed voice information to the voice information measurement unit 104 and the voice/text conversion unit 105. It also outputs the voice input by the user of mobile phone 101 to the voice information measurement unit 104 and the voice/text conversion unit 105.

The voice information measurement unit 104 measures the voice signal (voice information) output from the transmission/reception unit 103. Specifically, it measures the volume (loudness of the voice) and the tone (pitch of the voice) and quantizes the measurement results into multi-level digital values. Details of the voice information measurement unit 104 are described later. The measurement results are output to the rich text generation unit 106.

The voice/text conversion unit 105 uses a voice recognition technique to convert the voice information output from the transmission/reception unit 103 into a string of digital character codes (hereinafter, "text information"). The text information is output to the rich text generation unit 106. The unit also detects phrase boundaries and breaks in the speech, and outputs an end-of-line code to the rich text generation unit 106.

Based on the text information output from the voice/text conversion unit 105 and the measurement results output from the voice information measurement unit 104, the rich text generation unit 106 generates a rich character by adding to each piece of text information the measurement result that corresponds to it. In this way, the text information (character codes) input sequentially to the rich text generation unit 106 is turned into rich characters in the order of input. A series of rich characters generated in this way is called rich text. When the rich text generation unit 106 receives an end-of-line code from the voice/text conversion unit 105, it outputs the rich text to the log storage unit 107.
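The pairing of character codes with measurements, and the flush on an end-of-line code, can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `RichChar`, `END_OF_LINE`, and `generate_rich_text`, and the representation of the input as a stream of `(char_code, (volume, tone))` pairs, are hypothetical and do not come from the description.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-in for the end-of-line code sent by the conversion unit.
END_OF_LINE = None


@dataclass(frozen=True)
class RichChar:
    """One character code with the volume and tone measured for it."""
    char_code: str
    volume: int  # 0..15 in the description's example
    tone: int    # 0..15 in the description's example


def generate_rich_text(
    stream: Iterable[Tuple[Optional[str], Optional[Tuple[int, int]]]]
) -> Iterator[List[RichChar]]:
    """Yield one rich text (a list of rich characters) per end-of-line code."""
    line: List[RichChar] = []
    for char_code, measurement in stream:
        if char_code is END_OF_LINE:
            # End-of-line code: hand the accumulated rich text to the log.
            if line:
                yield line
                line = []
            continue
        volume, tone = measurement
        line.append(RichChar(char_code, volume, tone))


stream = [("a", (4, 3)), ("b", (3, 2)), (END_OF_LINE, None), ("c", (2, 3)), (END_OF_LINE, None)]
lines = list(generate_rich_text(stream))
```

Characters that arrive after the last end-of-line code stay buffered in this sketch, mirroring the description's statement that rich text is output when an end-of-line code is received.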

The log storage unit 107 adds a time stamp (time information) to the rich text output from the rich text generation unit 106 and stores it.
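A minimal Python sketch of such a time-stamped log is shown below. The class name `LogStore`, its `store` method, and the in-memory list are assumptions made for illustration; the description does not say how the log is organized.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class LogStore:
    """Keeps each rich text together with the time at which it was stored."""
    entries: List[Tuple[datetime, list]] = field(default_factory=list)

    def store(self, rich_text: list, timestamp: Optional[datetime] = None) -> datetime:
        # Use the supplied time stamp if given, otherwise the current time.
        stamp = timestamp if timestamp is not None else datetime.now()
        self.entries.append((stamp, list(rich_text)))
        return stamp


log = LogStore()
fixed_time = datetime(2004, 3, 23, 12, 0, 0)
log.store([("a", 4, 3), ("b", 3, 2)], fixed_time)
```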

The display unit 108 reads the rich text and the time stamp from the log storage unit 107 and displays the text, converting the loudness of the volume into character size and the height of the tone into the stroke thickness (thin or thick) of the characters.
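One way to picture this conversion is to render each rich character as an HTML span. The Python sketch below is hypothetical: the pixel range, the use of CSS `font-weight`, and in particular the choice that a lower tone gives thicker strokes are assumptions, since the description states only that tone height is mapped to thin or thick characters.

```python
import html

MAX_LEVEL = 15  # the description's example uses levels 0..15


def font_size_px(volume: int, min_px: int = 10, max_px: int = 40) -> int:
    """Larger volume -> larger characters (pixel range is an assumption)."""
    return min_px + (max_px - min_px) * volume // MAX_LEVEL


def font_weight(tone: int) -> int:
    """Assumed direction: lower tone -> thicker strokes (CSS weight 100..900)."""
    return 900 - 100 * (tone * 8 // MAX_LEVEL)


def render_char(char: str, volume: int, tone: int) -> str:
    """Render one rich character as an HTML span with size and weight."""
    style = f"font-size:{font_size_px(volume)}px;font-weight:{font_weight(tone)}"
    return f'<span style="{style}">{html.escape(char)}</span>'


loud_low = render_char("!", 15, 0)
```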

FIG. 2 is a block diagram showing the internal configuration of the voice information measurement unit 104 according to Embodiment 1 of the present invention. In this figure, the volume measurement unit 201 measures the volume (loudness of the voice) from the voice signal (voice information) output from the transmission/reception unit 103, converts the measured value into a multi-level digital value (for example, 16 levels from 0 to 15), and outputs it to the measurement information storage unit 203.

The tone measurement unit 202 measures the tone (pitch of the voice) from the voice signal (voice information) output from the transmission/reception unit 103, converts the measured value into a multi-level digital value (for example, 16 levels from 0 to 15), and outputs it to the measurement information storage unit 203.
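The 16-level conversion performed by the volume and tone measurement units can be pictured as a clamp-and-scale step. The Python sketch below is illustrative only: the function names `quantize` and `rms`, the use of RMS amplitude as the loudness measure, and the input ranges are assumptions, because the description does not specify how the measurements are taken.

```python
import math
from typing import Sequence

LEVELS = 16  # the description's example: 16 levels, 0 to 15


def quantize(value: float, lo: float, hi: float, levels: int = LEVELS) -> int:
    """Map a measurement in [lo, hi] to an integer level, clamping out-of-range input."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    ratio = (value - lo) / (hi - lo)
    return max(0, min(levels - 1, int(ratio * levels)))


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a frame (an assumed loudness measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Assumed ranges: normalized amplitude 0..1 for volume, 80..400 Hz for tone.
volume_level = quantize(rms([0.5, -0.5, 0.5, -0.5]), 0.0, 1.0)
tone_level = quantize(220.0, 80.0, 400.0)
```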

The measurement information storage unit 203 is, for example, a dual-port RAM (DPRAM). It stores the measured values output from the volume measurement unit 201 and the tone measurement unit 202, and the stored values are read out by the rich text generation unit 106.

Next, the operation of the voice information measurement unit 104 and the rich text generation unit 106 is described. As a concrete example, assume that the voice information "こんにちは" ("hello") is to be turned into rich text. For the input voice information "こんにちは", the voice information measurement unit 104 measures the volume and tone character by character, using the volume measurement unit 201 and the tone measurement unit 202 respectively. The measured volume and tone are stored as digital values in the measurement information storage unit 203.

The rich text generation unit 106 obtains "こんにちは" as character codes (text information) from the voice/text conversion unit 105 and, for each character, that is, for each character code, reads the corresponding volume and tone from the measurement information storage unit 203 of the voice information measurement unit 104. FIG. 3 shows a concrete example of the volume and tone corresponding to each character (character code). The volume and tone are each divided into 16 levels from 0 to 15. FIG. 3 shows the case where "こ" has volume 4 and tone 3, "ん" has volume 3 and tone 2, "に" has volume 3 and tone 2, "ち" has volume 3 and tone 2, and "は" has volume 2 and tone 3. The rich characters, in which volume and tone are combined with the character code, are stored in the log storage unit 107 with an end-of-line code and a time stamp added. Examples of a rich character and rich text are shown in FIG. 4, where char_code denotes the character code, volume the volume, and tone the tone.
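The FIG. 3 values can be written out as data to make the per-character pairing concrete. The Python sketch below uses the field names char_code, volume, and tone from FIG. 4; representing each rich character as a dictionary and using the Unicode code point as the character code are assumptions, since the description does not specify the encoding.

```python
TEXT = "こんにちは"
VOLUME = [4, 3, 3, 3, 2]  # per-character volume from the FIG. 3 example
TONE = [3, 2, 2, 2, 3]    # per-character tone from the FIG. 3 example

# One rich character per character code, in input order.
rich_text = [
    {"char_code": ord(char), "volume": volume, "tone": tone}
    for char, volume, tone in zip(TEXT, VOLUME, TONE)
]

for rich_char in rich_text:
    print(chr(rich_char["char_code"]), rich_char["volume"], rich_char["tone"])
```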

FIG. 5 shows a display example of rich text in Embodiment 1. The example is a conversation record (log) between A, who has telephoned a customer service desk, and B, the service representative. Volume and tone are each expressed in 16 levels from 0 to 15: a quieter voice is given a smaller value and a louder voice a larger value, and likewise a lower tone is given a smaller value and a higher tone a larger value.

In this figure, lines 1 to 5 show that A's volume and tone are both moderate (for example, volume 6, tone 6), and that B's volume is moderate (for example, volume 6) while B's tone is somewhat low (for example, tone 2).

Line 6 shows that A, angered by B's response, spoke in a low, loud voice (for example, volume 15, tone 2). Line 7 shows that B, intimidated by A's anger, spoke in a small voice (for example, volume 2, tone 5). Line 8 shows that A, having regained some composure, spoke at a slightly lower volume (for example, volume 13, tone 2). Line 9 shows that B, still intimidated, continued to respond in a small voice (for example, volume 2, tone 5).

As this figure shows, the content of the conversation is not merely transcribed: the loudness and pitch of each speaker's voice are displayed visually through decorated characters. A third party reading the log can therefore more easily get a sense of the atmosphere that the wording alone does not convey, and infer the speakers' emotions.

Thus, according to this embodiment, the loudness of the speaker's voice is represented by character size and the pitch of the voice by the stroke thickness of the characters, so the voice information is presented visually and the reader can better understand the content of the conversation.

This embodiment has described the case where the voice/text conversion unit converts Japanese voice information into Japanese rich text, but the voice/text conversion unit may also have a translation function. For example, Japanese voice information may be converted into English rich text.

(Embodiment 2)
In Embodiment 1, volume and tone were parameterized and displayed as the size and stroke thickness of characters. Embodiment 2 of the present invention describes a case that uses parameters other than voice to convey emotion.

FIG. 6 is a block diagram showing the configuration of mobile phone 601 according to Embodiment 2 of the present invention. Parts of FIG. 6 that are common to FIG. 1 are given the same reference numerals as in FIG. 1, and their detailed description is omitted. FIG. 6 differs from FIG. 1 mainly in that a sensor unit 604 and a physical information analysis unit 605 are provided. The configuration of mobile phone 601 is described here; mobile phone 602 has the same configuration as mobile phone 601.

FIG. 6 shows mobile phone 601 and mobile phone 602 performing data communication at the same time as a voice call.

The transmission/reception unit 603 receives the voice information transmitted from mobile phone 602, performs predetermined reception processing on it, and outputs the processed voice information to the voice information measurement unit 104 and the voice/text conversion unit 105. It also outputs the voice input by the user of mobile phone 601 to the voice information measurement unit 104 and the voice/text conversion unit 105. In addition, it receives the data transmitted from mobile phone 602 by data communication, performs predetermined reception processing on it, and outputs the processed data to the physical information analysis unit 605.

The sensor unit 604 measures the body temperature, blood pressure, heart rate, and the like of the user of mobile phone 601 (hereinafter collectively, "physical information") and outputs the measurement results to the physical information analysis unit 605.

The physical information analysis unit 605 obtains the physical information of the user of mobile phone 602 contained in the data output from the transmission/reception unit 603, and the physical information output from the sensor unit 604. It converts each measured value into a multi-level digital value (for example, 16 levels from 0 to 15) and outputs the analysis results to the rich text generation unit 606.

Based on the text information output from the voice/text conversion unit 105, the measurement results output from the voice information measurement unit 104, and the analysis results output from the physical information analysis unit 605, the rich text generation unit 606 generates a rich character by adding to each piece of text information the results that correspond to it. When the rich text generation unit 606 receives an end-of-line code from the voice/text conversion unit 105, it outputs the rich text to the log storage unit 107.

The display unit 607 reads the rich text and the time stamp from the log storage unit 107 and displays the text, converting the loudness of the volume into character size, the height of the tone into the stroke thickness of the characters, and the physical information into character color. For example, the characters are shown in red when body temperature, blood pressure, and heart rate are all high, and in blue when they are all low. This lets the reader infer, for example, whether the speaker is agitated or calm.
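The red and blue example can be sketched in Python as a threshold test on the three quantized values. The thresholds `HIGH_THRESHOLD` and `LOW_THRESHOLD`, the default color for mixed readings, and the function name `text_color` are assumptions; the description gives only the all-high and all-low cases.

```python
HIGH_THRESHOLD = 11  # assumed: levels 11..15 count as "high"
LOW_THRESHOLD = 4    # assumed: levels 0..4 count as "low"


def text_color(body_temperature: int, blood_pressure: int, heart_rate: int) -> str:
    """Pick a character color from three physical readings quantized to 0..15."""
    levels = (body_temperature, blood_pressure, heart_rate)
    if all(level >= HIGH_THRESHOLD for level in levels):
        return "red"    # all high: speaker is presumably agitated
    if all(level <= LOW_THRESHOLD for level in levels):
        return "blue"   # all low: speaker is presumably calm
    return "black"      # assumed default when the readings are mixed


agitated = text_color(14, 13, 15)
calm = text_color(2, 3, 1)
```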

Thus, according to this embodiment, in addition to the volume and tone of the speaker's voice, physical information such as the speaker's body temperature, blood pressure, and heart rate is quantized into multi-level digital values and displayed as character color. The speaker's psychological state is thereby presented visually, and the reader can understand the content of the conversation even better.

In this embodiment, the character color may be determined from at least one of body temperature, blood pressure, and heart rate. Although body temperature, blood pressure, and heart rate were given as examples of physical information, the present invention is not limited to these; the force with which the user grips the mobile phone, the amount of perspiration, and the like may be added. In short, any parameter that reflects the speaker's psychological state may be used.

This embodiment has been described with volume mapped to character size, tone to stroke thickness, and physical information to character color, but the present invention is not limited to this, and these may be combined arbitrarily. Voice quality may also be added to volume, tone, and physical information, and typeface may be added to character size, stroke thickness, and color.

The embodiments above have been described on the assumption that conversation records or meeting minutes are being created, but they can also be applied to composing e-mail and the like.

The mobile communication terminal device according to the present invention reflects voice information, such as the loudness and pitch of the speaker's voice, in the modification of characters, so the voice information is presented visually. It therefore has the effect of allowing the reader to better understand the content of a conversation, and is useful for creating conversation records, meeting minutes, and the like.

FIG. 1 is a block diagram showing the configuration of the mobile phone according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the internal configuration of the voice information measurement unit according to Embodiment 1 of the present invention.
FIG. 3 is a diagram showing a concrete example of the volume and tone corresponding to each character (character code).
FIG. 4 is a diagram showing examples of a rich character and rich text.
FIG. 5 is a diagram showing a display example of rich text in Embodiment 1.
FIG. 6 is a block diagram showing the configuration of the mobile phone according to Embodiment 2 of the present invention.

Explanation of symbols

101, 102 Mobile phone
103, 603 Transmission/reception unit
104 Voice information measurement unit
105 Voice/text conversion unit
106, 606 Rich text generation unit
107 Log storage unit
108, 607 Display unit
201 Volume measurement unit
202 Tone measurement unit
203 Measurement information storage unit
604 Sensor unit
605 Physical information analysis unit

Claims

1. A mobile communication terminal device comprising:
voice measuring means for measuring volume and tone based on input voice information;
voice/text conversion means for converting the input voice information into text information;
generating means for generating rich text using the text information converted by the voice/text conversion means and the combination of the volume and tone measured by the voice measuring means as parameters for modifying characters; and
display means for displaying the rich text generated by the generating means.

2. The mobile communication terminal device according to claim 1, further comprising:
receiving means for receiving voice information and physical information transmitted from a communication partner;
detecting means for detecting physical information of the user of the device itself; and
physical information analyzing means for analyzing the physical information transmitted from the communication partner and the physical information detected by the detecting means,
wherein the generating means generates the rich text using, in addition to the combination of the volume and tone, a result analyzed by the physical information analyzing means as a parameter for modifying characters.

3. The mobile communication terminal device according to claim 2, wherein the physical information is at least one of body temperature, blood pressure, and heart rate.