JP2001318915A

JP2001318915A - Font conversion device

Info

Publication number: JP2001318915A
Application number: JP2000138282A
Authority: JP
Inventors: Takashi Tsuzuki; 貴史続木; Toshio Niwa; 寿男丹羽; Satoru Inagaki; 悟稲垣; Yoshihiro Kojima; 良宏小島; Kazuhiro Koyama; 和宏小山
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2000-05-11
Filing date: 2000-05-11
Publication date: 2001-11-16

Abstract

PROBLEM TO BE SOLVED: To recognize a sentence spoken by a speaker and to convert the font of the resulting text based on voice information indicating features of the input voice. SOLUTION: When a speaker utters a message, voice data is input from a voice input part 101 to a voice recognition part 105 and a power information extraction part 106. The voice recognition part 105 recognizes the voice and converts the message into text. The power information extraction part 106 extracts the voice power from the voice data and supplies it to a font conversion control part 102. Font information corresponding to each voice power class is stored in a font table 1001. Based on the obtained voice power, the font conversion control part 102 retrieves the required font information from the font table 1001 through a font table reference part 104, converts the font of the input text, and outputs the converted text to a display part 103.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a font conversion device that performs font conversion on input speech or text, based either on voice information indicating features of the input speech or on certainty information for text entered by voice.

[0002]

[Prior Art] Some document processing devices, such as Japanese word processors, accept voice input. Such a device recognizes the input speech, converts it into text, and shows the resulting sentences on a display or the like.

[0003] For example, Japanese Patent Application Laid-Open No. Sho 60-129795 describes a speaker recognition technique that recognizes speech and determines whether the voice is male or female. This technique uses separate standard patterns for men and women to recognize the input speech of unspecified speakers, and determines the speaker's gender from the recognition result.

[0004] Japanese Patent Application Laid-Open No. Hei 05-323990 describes a speaker identification technique. In this technique, a model is created and registered for each speaker, and speech is recognized using these models in order to identify the speaker who uttered it.

[0005]

[Problem to Be Solved by the Invention] In conventional document processing devices, however, changing the font of a document required the user to issue a font-change instruction from a predetermined menu using an input means such as a mouse. Moreover, conventional document processing devices capable of voice input merely displayed the recognized document; the certainty of the recognition was not reflected in the text.

[0006] The present invention has been made in view of these conventional problems. The inventions of claims 1 to 10 aim to make it possible to change the font of text by voice input, and the invention of claim 11 aims to realize a font conversion device that can reflect the certainty of recognition of voice-input characters in the font of the text.

[0007]

[Means for Solving the Problem] The invention of claim 1 of the present application is a font conversion device that recognizes speech uttered by a speaker, converts it into text data, and converts the font of the text data based on voice information contained in the speaker's input speech, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics from the voice data input by the voice input unit; a font table that divides the voice information into a plurality of classes or attributes and stores text font information for each class or attribute; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font conversion control unit that, when given the speaker's voice information by the voice information extraction unit and the speaker's text data by the voice recognition unit, passes the voice information to the font table reference unit and converts the font of the text data using the font information acquired from the font table; and a display unit that displays the text converted by the font conversion control unit.

[0008] The invention of claim 2 of the present application is the font conversion device of claim 1, wherein the voice information extraction unit is a power information extraction unit that extracts the speaker's voice power, as the voice information, from the voice data input from the voice input unit.

[0009] The invention of claim 3 of the present application is the font conversion device of claim 1, wherein the voice information extraction unit comprises: a speaker identification feature extraction unit that extracts, as the voice information, voice features used for speaker identification from the voice data input by the voice input unit; a speaker identification label acquisition unit that receives a speaker identification label corresponding to each speaker; a speaker feature learning unit that creates templates used for speaker identification from the voice features extracted by the speaker identification feature extraction unit and the speaker identification labels received by the speaker identification label acquisition unit; a speaker identification unit that identifies the speaker using the voice features extracted by the speaker identification feature extraction unit and the templates created by the speaker feature learning unit, and outputs the speaker identification label of the speaker who uttered the speech; and a font table creation unit that creates the font table using the speaker identification labels obtained from the speaker identification label acquisition unit and the font information specified for each speaker.

[0010] The invention of claim 4 of the present application is the font conversion device of claim 1, wherein the voice information extraction unit comprises: an emotion identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's emotion from the voice data input by the voice input unit; and an emotion identification unit that identifies the speaker's emotion using the voice features extracted by the emotion identification feature extraction unit.

[0011] The invention of claim 5 of the present application is the font conversion device of claim 1, wherein the voice information extraction unit comprises: a gender identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's gender from the voice data input by the voice input unit; and a gender identification unit that identifies the speaker's gender using the voice features extracted by the gender identification feature extraction unit.

[0012] The invention of claim 6 of the present application is the font conversion device of claim 1, wherein the voice information extraction unit comprises: an age identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's age from the voice data input by the voice input unit; and an age identification unit that identifies the speaker's age using the voice features extracted by the age identification feature extraction unit.

[0013] The invention of claim 7 of the present application is a font conversion device that newly creates a font table storing font information for each class of voice feature, recognizes speech uttered by a speaker, converts it into text data, and converts the font of the text data, using the font table, based on voice information contained in the speaker's input speech, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics from the voice data input by the voice input unit; a font designation input unit for entering font information; a font table that divides the voice information into a plurality of classes and stores text font information for each class; a font table creation unit that newly stores, in the font table, the font information corresponding to each piece of voice information, using the font information entered through the font designation input unit and the voice information extracted by the voice information extraction unit; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font conversion control unit that, when given the speaker's voice information by the voice information extraction unit and the speaker's text data by the voice recognition unit, passes the voice information to the font table reference unit and converts the font of the text data using the font information acquired from the font table; and a display unit that displays the text converted by the font conversion control unit.

[0014] The invention of claim 8 of the present application is a font conversion device that recognizes speech uttered by a speaker, converts it into text data, and converts the font of the text data based on voice information contained in the speaker's input speech, and in which, once a font change has been applied to the text, the changed font is used for subsequent font conversions, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics from the voice data input by the voice input unit; a font table that divides the voice information into a plurality of classes and stores text font information for each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font change input unit through which the speaker enters text information, including the text whose font is to be changed and the display position of that text, together with the font information to apply after the change; a font conversion control unit that, when given the speaker's voice information by the voice information extraction unit and the speaker's text data by the voice recognition unit, passes the voice information to the font table reference unit and converts the font of the text using the font information acquired from the font table, and that also outputs the voice information that was used when the text entered through the font change input unit was font-converted and changes the font of that text to the font information entered through the font change input unit; a font learning unit that modifies the data in the font table based on the voice information output from the font conversion control unit and the font information entered through the font change input unit; and a display unit that displays the text converted by the font conversion control unit.

[0015] The invention of claim 9 of the present application is a font conversion device in which a speaker reads aloud text data that has already been entered, the speaker's voice information for each piece of entered text data is extracted, and the font of the entered text data is converted, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics from the voice data input by the voice input unit; a font table that divides the voice information into a plurality of classes and stores text font information for each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a text input unit for entering text information including text and its display position; a read-aloud designation display unit that designates part of the text entered through the text input unit, instructs the speaker to utter the designated text, and outputs the text information of the designated text; a font conversion control unit that, when given the speaker's voice information by the voice information extraction unit and the text information by the read-aloud designation display unit, passes the voice information to the font table reference unit and converts the font of the text designated by the read-aloud designation display unit using the font information acquired from the font table; and a display unit that displays the text entered through the text input unit and also displays the text converted by the font conversion control unit.

[0016] The invention of claim 10 of the present application is a font conversion device in which a speaker reads aloud text data that has already been entered, the speaker's voice information for each piece of entered text data is extracted, and the font of the entered text data is converted, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics from the voice data input by the voice input unit; a font table that divides the voice information into a plurality of classes and stores text font information for each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a text input unit for entering text information including text and its display position; a position information detection unit that compares the text recognized by the voice recognition unit with the text entered through the text input unit and outputs the text information for which the comparison matches; a font conversion control unit that, when given the speaker's voice information by the voice information extraction unit and the text information of the text uttered by the speaker by the position information detection unit, passes the voice information to the font table reference unit and converts the font of the text output from the position information detection unit using the font information acquired from the font table; and a display unit that displays the text entered through the text input unit and also displays the text converted by the font conversion control unit.
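
The comparison performed by the position information detection unit can be pictured with a small Python sketch. It assumes the simplest possible matching rule, an exact substring search that reports the first occurrence; the source only states that the recognized text is compared with the entered text, so the rule, the function name `locate_spoken_text`, and the `(start, end)` return shape are all illustrative assumptions.

```python
from typing import Optional, Tuple


def locate_spoken_text(entered: str, recognized: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first exact match of `recognized` in `entered`.

    Hypothetical helper: exact matching is an assumption, not the patent's rule.
    """
    if not recognized:
        return None
    start = entered.find(recognized)
    if start < 0:
        return None  # the spoken text does not occur in the entered text
    return start, start + len(recognized)


span = locate_spoken_text("Tomorrow will be sunny.", "sunny")
print(span)
```

A practical device would probably need to tolerate recognition errors and repeated phrases, which this sketch does not attempt.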

[0017] The invention of claim 11 of the present application is a font conversion device that recognizes speech uttered by a speaker, converts it into text data, and converts the font of the text data based on the certainty of the voice recognition, the device comprising: a voice input unit that receives the speech uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit, converts it into text data, and outputs the certainty of the recognized text as likelihood information; a font table that divides the likelihood information into a plurality of classes and stores font information for each class; a font table reference unit that refers to the font table and acquires the text font information corresponding to the input likelihood information; a font conversion control unit that, when given the speaker's text data and the likelihood information of the text by the voice recognition unit, passes the likelihood information to the font table reference unit and converts the font of the text using the font information acquired from the font table; and a display unit that displays the text converted by the font conversion control unit.
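
As a rough illustration of the likelihood-based arrangement of claim 11, the following Python sketch divides a likelihood value into classes and returns a font description for each class. Everything concrete in it is an assumption made for the example: the source does not give the likelihood scale (a 0 to 1 range is assumed here), the class thresholds, or the fonts assigned to each class.

```python
from typing import List, Tuple

# (minimum likelihood, font description), highest class first.
# All thresholds and fonts are hypothetical; the source does not list them.
LIKELIHOOD_CLASSES: List[Tuple[float, str]] = [
    (0.8, "standard, black"),
    (0.5, "italic, black"),
    (0.0, "italic, gray"),
]


def font_for_likelihood(likelihood: float) -> str:
    """Pick the font of the first class whose minimum the likelihood reaches."""
    for minimum, font in LIKELIHOOD_CLASSES:
        if likelihood >= minimum:
            return font
    raise ValueError("likelihood must be non-negative")


for word, likelihood in [("Tomorrow", 0.93), ("sunny", 0.42)]:
    print(word, "->", font_for_likelihood(likelihood))
```

With a table of this kind, words the recognizer is unsure about stand out visually, which is the effect the claim describes.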

[0018]

[Embodiments of the Invention] (Embodiment 1) A font conversion device according to Embodiment 1 of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of the font conversion device according to Embodiment 1. The device comprises a voice input unit 101, a font conversion control unit 102, a display unit 103, a font table reference unit 104, a voice recognition unit 105, a power information extraction unit 106, and a font table 1001.

[0019] The voice input unit 101 captures a message (speech) uttered by a speaker and outputs voice data. The power information extraction unit 106 is a voice information extraction unit that, when voice data is input from the voice input unit 101, extracts power information as the voice information of that data. The voice recognition unit 105, when voice data is input, recognizes it and outputs the recognition result as text data in units of words, phrases, or sentences. The unit of conversion is determined by how the speaker speaks (breaks between words, presence or absence of pauses).
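
The source does not say how the power information extraction unit 106 computes its value, nor what reference level its dB figures use. As one possible reading, the Python sketch below computes the mean power of a segment of samples in decibels relative to an assumed reference amplitude; the function name `power_db`, the `reference` parameter, and the formula choice are assumptions for illustration only.

```python
import math
from typing import Sequence


def power_db(samples: Sequence[float], reference: float = 1.0) -> float:
    """Mean power of one segment in dB relative to `reference` (assumed scale)."""
    if not samples:
        raise ValueError("segment is empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 10.0 * math.log10(mean_square / (reference * reference))


print(power_db([10.0, -10.0, 10.0, -10.0]))
```

One value would be produced per recognized unit (word, phrase, or sentence), so that each piece of text can later be paired with its own power figure.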

[0020] The font table 1001 is a table that divides voice power information into a plurality of classes or attributes and stores font information for each class or attribute. FIG. 2 shows an example of the font table 1001. In FIG. 2, voice power is divided into classes 10 dB wide: below 0 dB, 0 to 10 dB, ..., 40 to 50 dB. The font setting value, which is the font information, is a combination of font name, style, size, character decoration, color, and so on, set for each voice power class.
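
A table of this shape could be held in memory roughly as follows. This Python sketch is only an illustration: the names `FontInfo`, `EDGES`, `power_class`, and `lookup_font` are invented here, the treatment of values exactly on a class boundary (lower bound inclusive) is an assumption, and only the two rows quoted in the worked example of Embodiment 1 are filled in, because FIG. 2 itself is not reproduced in this text.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FontInfo:
    name: str
    style: str
    size: int
    decoration: str
    color: str


# Class boundaries in dB: below 0, 0-10, 10-20, 20-30, 30-40, 40-50.
EDGES = [0, 10, 20, 30, 40, 50]


def power_class(power_db: float) -> int:
    """Class index: 0 is below 0 dB, 1 is 0-10 dB, ..., 5 is 40-50 dB."""
    return bisect_right(EDGES, power_db)


# Only the rows that the description quotes; other rows of FIG. 2 are unknown.
FONT_TABLE: Dict[int, FontInfo] = {
    1: FontInfo("Gothic", "bold", 24, "none", "black"),      # 0 to 10 dB
    5: FontInfo("Gothic", "standard", 12, "none", "black"),  # 40 to 50 dB
}


def lookup_font(power_db: float) -> Optional[FontInfo]:
    """Return the font stored for the power class, or None if no row is known."""
    return FONT_TABLE.get(power_class(power_db))


print(lookup_font(45.0))
```

Keeping the class boundaries separate from the font rows means the same lookup works unchanged if the table is later rebuilt with different fonts.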

[0021] When power information is input from the font conversion control unit 102 as input voice information, the font table reference unit 104 of FIG. 1 refers to the font table 1001, acquires the font information corresponding to that power information, and returns it to the font conversion control unit 102.

[0022] When power information is input from the power information extraction unit 106 and the speaker's text data is input from the voice recognition unit 105, the font conversion control unit 102 outputs the power information to the font table reference unit 104 and, based on the font information returned by the font table reference unit 104, converts the font of the text data input from the voice recognition unit 105. The display unit 103 displays text (sentences) in the standard font before font conversion, and displays the font-converted text. The contents of the font table 1001 may also be loaded into memory once so that the font conversion control unit 102 can perform the lookup itself.

[0023] An example of the operation of the font conversion device of this embodiment, configured as described above, follows. In FIG. 1, suppose that the speaker utters, as an example, "Tomorrow will be sunny." (「明日は、晴れでしょう。」) and that the voice data is input from the voice input unit 101 to the power information extraction unit 106 and the voice recognition unit 105. Suppose further that the speaker utters "Tomorrow," (「明日は、」) at a level of 45 dB. When the voice data for "Tomorrow," is passed from the voice input unit 101 to the power information extraction unit 106 and the voice recognition unit 105, the power information extraction unit 106 extracts the power information of the voice data and outputs "45 dB". This power information is passed to the font table reference unit 104 via the font conversion control unit 102.

[0024] Meanwhile, the voice recognition unit 105 recognizes the input voice data and passes the resulting text data to the font conversion control unit 102. In the example above, the voice recognition unit 105 recognizes the voice data for "Tomorrow," (「明日は、」), converts it into the text "Tomorrow,", and outputs this text data to the font conversion control unit 102.

[0025] The font table reference unit 104 refers to the font table 1001 of FIG. 2, acquires the font information corresponding to the power information "45 dB", namely "(font) Gothic, (style) standard, (size) 12, (character decoration) none, (color) black", and passes this font information to the font conversion control unit 102.

[0026] Based on this font information from the font table reference unit 104, the font conversion control unit 102 converts the font of the text "Tomorrow," (「明日は、」) input from the voice recognition unit 105. The text is then shown on the display unit 103 in "(font) Gothic, (style) standard, (size) 12, (character decoration) none, (color) black", as illustrated in FIG. 3(a).

[0027] Next, suppose that the speaker utters "sunny" (「晴れ」) at a level of 4 dB and then "it will be." (「でしょう。」) at a level of 41 dB. The font table reference unit 104 refers to the font table 1001 again, acquires the font information corresponding to the power information "4 dB", namely "(font) Gothic, (style) bold, (size) 24, (character decoration) none, (color) black", and passes it to the font conversion control unit 102. The font table reference unit 104 then acquires the font information corresponding to the power information "41 dB" of "it will be.", namely "(font) Gothic, (style) standard, (size) 12, (character decoration) none, (color) black", and passes it to the font conversion control unit 102. As a result, "Tomorrow will be sunny." (「明日は、晴れでしょう。」) is shown on the display unit 103 in the fonts illustrated in FIG. 3(b).
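
The worked example above can be traced in a few lines of Python. This is a sketch under stated assumptions, not the device's actual logic: the function `convert`, the tuple layout of a font, the class index computed as floor(power / 10), and the fallback `DEFAULT_FONT` for classes the text does not quote are all hypothetical. English glosses stand in for the three Japanese segments.

```python
from typing import Dict, List, Tuple

Font = Tuple[str, str, int, str, str]  # (name, style, size, decoration, color)

# Keyed by class index floor(power / 10); only rows quoted in the text are known.
TABLE: Dict[int, Font] = {
    0: ("Gothic", "bold", 24, "none", "black"),      # 0 to 10 dB
    4: ("Gothic", "standard", 12, "none", "black"),  # 40 to 50 dB
}
# Assumed fallback for classes whose font the description does not state.
DEFAULT_FONT: Font = ("Gothic", "standard", 12, "none", "black")


def convert(segments: List[Tuple[str, float]]) -> List[Tuple[str, Font]]:
    """Pair each recognized segment with the font of its power class."""
    styled = []
    for text, power in segments:
        class_index = int(power // 10)
        styled.append((text, TABLE.get(class_index, DEFAULT_FONT)))
    return styled


utterance = [("Tomorrow,", 45.0), ("sunny", 4.0), ("it will be.", 41.0)]
for text, font in convert(utterance):
    print(text, font)
```

The first and third segments fall into the 40 to 50 dB class and share one font, while the 4 dB segment receives the bold size-24 font, matching the display described for FIG. 3(b).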

[0028] In this way, the font of voice-input characters can be converted and displayed on the screen according to how strongly the speaker speaks. For example, by using an emphasis font whenever the power is at or above a certain level, the speaker can switch between the normal font and the emphasis font through the loudness of the voice.

[0029] (Embodiment 2) Next, a font conversion device according to Embodiment 2 of the present invention is described with reference to the drawings. Whereas Embodiment 1 referred to a fixed font table 1001, the font conversion device of this embodiment creates a new font table 1001 before voice input takes place. When the speaker subsequently speaks, the device extracts the power information of the input voice data and outputs the recognized text data in the fonts of the newly created table.

[0030] FIG. 4 is a block diagram of the font conversion device according to this embodiment. Blocks identical to those of Embodiment 1 operate in the same way, are given the same reference numerals, and are not described again in detail.

【００３１】本実施の形態のフォント変換装置は、音声
入力部１０１、フォント変換制御部１０２、表示部１０
３、フォントテーブル参照部１０４、音声認識部１０
５、パワー情報抽出部１０６、フォントテーブル１００
１に加えて、フォント指定入力部１０７、フォントテー
ブル作成部１０８を含んで構成される。The font conversion apparatus according to the present embodiment includes a voice input unit 101, a font conversion control unit 102, a display unit 103, a font table reference unit 104, a voice recognition unit 105, a power information extraction unit 106, and a font table 1001, and additionally includes a font designation input unit 107 and a font table creation unit 108.

【００３２】フォント指定入力部１０７は、キーボード
等の入力装置で構成され、ユーザがフォント情報を入力
するものである。フォントテーブル作成部１０８は、パ
ワー情報抽出部１０６から音声情報であるパワー情報が
入力され、フォント指定入力部１０７からフォント情報
が入力されると、そのパワー情報とフォント情報に基づ
いて、フォントテーブル１００１のデータを新規に作成
するものである。The font designation input unit 107 is composed of an input device such as a keyboard, and is used by the user to input font information. When power information, which is voice information, is input from the power information extraction unit 106 and font information is input from the font designation input unit 107, the font table creation unit 108 newly creates the data of the font table 1001 based on that power information and font information.

【００３３】このように構成された本実施の形態による
フォント変換装置の動作例について説明する。図４にお
いて、実際のテキストを音声入力する前に話者が試験用
のテキストの発声を行い、音声入力部１０１から試験用
の音声データをパワー情報抽出部１０６に与える。具体
的な一例として、話者が試験用の文章を強弱をつけて発
声する。このとき、強い発声が１０ｄＢの大きさとし、
弱い発声が３０ｄＢの大きさとする。これらの音声デー
タは音声入力部１０１を介してパワー情報抽出部１０６
に入力される。An example of the operation of the font conversion apparatus according to the present embodiment, configured as described above, will now be described. In FIG. 4, before the actual text is input by voice, the speaker utters a test text, and the test voice data is supplied from the voice input unit 101 to the power information extraction unit 106. As a specific example, the speaker utters the test sentence with strong and weak emphasis. Here, the strong utterance is assumed to have a magnitude of 10 dB and the weak utterance a magnitude of 30 dB. This voice data is input to the power information extraction unit 106 via the voice input unit 101.

【００３４】パワー情報抽出部１０６では、音声入力部
１０１から入力された音声データのパワー情報を抽出す
る。ここで抽出されたパワー情報はフォントテーブル作
成部１０８に出力される。上記の例では、パワー情報抽
出部１０６で、音声入力部１０１から入力された音声の
パワー情報は１０ｄＢと３０ｄＢとして認識され、フォ
ントテーブル作成部１０８に出力される。即ち、現在の
話者の音声パワーは、１０ｄＢ〜３０ｄＢの範囲を中心
として変動すると判定され、音声パワーの階級として、
１０ｄＢ未満、１０ｄＢ以上３０ｄＢ未満、３０ｄＢ以
上の３階級が設定される。この時点でのフォントテーブ
ル１００１の状態は図５（ａ）のようになる。尚、階級
の幅は任意に設定できるものとする。The power information extraction unit 106 extracts the power information of the voice data input from the voice input unit 101. The extracted power information is output to the font table creation unit 108. In the above example, the power information extraction unit 106 recognizes the power information of the voice input from the voice input unit 101 as 10 dB and 30 dB, and outputs it to the font table creation unit 108. That is, the current speaker's voice power is judged to vary mainly within the range of 10 dB to 30 dB, and three voice power classes are set: less than 10 dB, 10 dB or more and less than 30 dB, and 30 dB or more. The state of the font table 1001 at this point is as shown in FIG. 5(a). The width of each class can be set arbitrarily.
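The class setting described above can be sketched as follows. This is a minimal illustration only, not the apparatus's actual implementation: the function name `make_power_classes` and the representation of a class as a half-open interval (low, high) are assumptions introduced here.

```python
import math

def make_power_classes(boundaries_db):
    """Turn boundary values in dB into half-open classes [low, high).

    Hypothetical helper: two calibration values (10 dB and 30 dB in the
    example) yield three classes that cover the whole power axis.
    """
    edges = [-math.inf] + sorted(boundaries_db) + [math.inf]
    return list(zip(edges[:-1], edges[1:]))

# Calibration values measured from the test utterance in the example.
classes = make_power_classes([10, 30])
for low, high in classes:
    print(f"{low} dB <= power < {high} dB")
```

Because each class is a pair of edges, changing the class width only requires passing different boundary values.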

【００３５】次に話者は、フォント指定入力部１０７か
らフォントテーブル作成部１０８に対してフォント情報
を入力する。上記の例では、１０ｄＢ未満の階級に対し
て、「（フォント）ゴシック、（スタイル）太字、（サ
イズ）３２、（文字飾り）浮き出し、（色）赤」という
フォント情報を設定する。また１０ｄＢ以上３０ｄＢ未
満の階級に対して、「（フォント）ゴシック、（スタイ
ル）太字、（サイズ）２４、（文字飾り）無し、（色）
黒」というフォント情報を設定する。更に３０ｄＢ以上
の階級に対して、「（フォント）ゴシック、（スタイ
ル）標準、（サイズ）１２、（文字飾り）無し、（色）
黒」というフォント情報を設定する。こうして、図５
（ｂ）のようなフォントテーブル１００１を作成する。Next, the speaker inputs font information from the font designation input unit 107 to the font table creation unit 108. In the above example, the font information "(Font) Gothic, (Style) bold, (Size) 32, (Character decoration) embossed, (Color) red" is set for the class of less than 10 dB. The font information "(Font) Gothic, (Style) bold, (Size) 24, (Character decoration) none, (Color) black" is set for the class of 10 dB or more and less than 30 dB. The font information "(Font) Gothic, (Style) standard, (Size) 12, (Character decoration) none, (Color) black" is set for the class of 30 dB or more. In this way, a font table 1001 as shown in FIG. 5(b) is created.
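As a rough sketch of the resulting table and of how a power value could be resolved against it, the following Python fragment encodes the three classes and their font information from the example. The names `FontInfo`, `FONT_TABLE`, and `lookup_font` are hypothetical; the patent text does not specify a data layout.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class FontInfo:
    font: str
    style: str
    size: int
    decoration: str
    color: str

# One row per power class: (low dB inclusive, high dB exclusive, font information).
FONT_TABLE = [
    (-math.inf, 10, FontInfo("Gothic", "bold", 32, "embossed", "red")),
    (10, 30, FontInfo("Gothic", "bold", 24, "none", "black")),
    (30, math.inf, FontInfo("Gothic", "standard", 12, "none", "black")),
]

def lookup_font(table, power_db):
    """Return the font information of the class that contains power_db."""
    for low, high, info in table:
        if low <= power_db < high:
            return info
    raise ValueError(f"no class covers {power_db} dB")

print(lookup_font(FONT_TABLE, 20))
```

A lookup of this kind corresponds to what the font table reference unit 104 does once the table exists.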

【００３６】フォントテーブル１００１を作成した後に
は、話者が実際のテキストの音声入力をする。この場合
はフォントテーブル１００１が作成済みなので、実施の
形態１と同様の動作が行われる。即ち、フォントテーブ
ル参照部１０４は、新規作成されたフォントテーブル１
００１を参照してパワー情報に基づいてフォント情報を
取得し、フォント変換制御部１０２に与える。After the font table 1001 has been created, the speaker inputs the actual text by voice. Since the font table 1001 already exists at this point, the same operation as in the first embodiment is performed. That is, the font table reference unit 104 refers to the newly created font table 1001, obtains the font information based on the power information, and provides it to the font conversion control unit 102.

【００３７】このようにすると、話者の発声の強弱によ
り、音声入力された文字のフォントを、予めユーザが設
定した値に変換して画面に表示することができる。即ち
個人の音声パワー情報を学習することにより、声の大小
の個人差を吸収することができる。In this way, the font of the character input by voice can be converted into a value set in advance by the user and displayed on the screen according to the strength of the speaker's utterance. That is, by learning the voice power information of the individual, it is possible to absorb individual differences in voice volume.

【００３８】（実施の形態３）次に本発明の実施の形態
３によるフォント変換装置について、図面を参照しなが
ら説明する。前述した実施の形態２のフォント変換装置
は、フォントテーブル１００１を作成した後、音声入力
中にはフォントテーブル１００１のフォント情報を変更
することはできなかった。しかし、本実施の形態のフォ
ント変換装置は、話者が表示部１０３で表示されたテキ
ストを見て、そのテキストのフォントを変更できるよう
にすることを特徴とする。さらに、本実施の形態のフォ
ント変換装置は、フォント変換後のテキストのフォント
変更をフォントテーブル参照部１０４に反映し、フォン
トテーブル参照機能も更新可能にすることも特徴とす
る。(Embodiment 3) Next, a font conversion apparatus according to Embodiment 3 of the present invention will be described with reference to the drawings. After the font conversion apparatus of the second embodiment described above has created the font table 1001, it was not possible to change the font information of the font table 1001 during voice input. However, the font conversion apparatus according to the present embodiment is characterized in that a speaker can view a text displayed on display unit 103 and change the font of the text. Further, the font conversion apparatus according to the present embodiment is characterized in that the font change of the text after the font conversion is reflected in the font table reference unit 104, and the font table reference function can be updated.

【００３９】図６は本実施の形態によるフォント変換装
置の構成図である。実施の形態２と同一のブロックにつ
いては同一の符号をつけ、それらの詳細な説明は省略す
る。本実施の形態のフォント変換装置は、音声入力部１
０１、フォント変換制御部１０９、表示部１０３、フォ
ントテーブル参照部１０４、音声認識部１０５、パワー
情報抽出部１０６、フォントテーブル１００１に加え
て、フォント学習部１１０、フォント変更入力部１１１
を含んで構成される。FIG. 6 is a block diagram of the font conversion apparatus according to the present embodiment. Blocks identical to those of the second embodiment are given the same reference numerals, and their detailed description is omitted. The font conversion apparatus according to the present embodiment includes a voice input unit 101, a font conversion control unit 109, a display unit 103, a font table reference unit 104, a voice recognition unit 105, a power information extraction unit 106, and a font table 1001, and additionally includes a font learning unit 110 and a font change input unit 111.

【００４０】フォント変更入力部１１１は、マウスやキ
ーボード等の入力装置で構成され、表示部１０３で表示
されているテキストにおいて、話者が変更したいテキス
トやこのテキストの表示位置情報を含むテキスト情報を
入力したり、変更後のフォント情報を入力するものであ
る。ここで入力されたテキスト情報とフォント情報は、
フォント変換制御部１０９に出力され、フォント情報は
フォント学習部１１０にも出力される。The font change input unit 111 is composed of an input device such as a mouse or a keyboard. For the text displayed on the display unit 103, it is used to input text information, which includes the text the speaker wishes to change and the display position of that text, and to input the font information after the change. The text information and font information entered here are output to the font conversion control unit 109, and the font information is also output to the font learning unit 110.

【００４１】フォント学習部１１０は、フォント変更入
力部１１１から変更後のフォント情報が入力され、更に
フォント変換制御部１０９からパワー情報が入力される
と、フォントテーブル１００１を参照し、フォント変換
制御部１０９から入力されたパワー情報とフォント変更
入力部１１１から入力されたフォント情報とを用いて、
フォントテーブル１００１の一部のデータを更新するも
のである。When the changed font information is input from the font change input unit 111 and the power information is input from the font conversion control unit 109, the font learning unit 110 refers to the font table 1001 and updates part of its data, using the power information input from the font conversion control unit 109 and the font information input from the font change input unit 111.

【００４２】フォント変換制御部１０９は、実施の形態
１及び２のフォント変換制御部１０２の機能に加えて、
フォント変更入力部１１１からテキスト情報とフォント
情報とが入力されると、このフォント情報を用いて、入
力されたテキストのフォントを変換すると共に、フォン
ト変更したテキストを表示部１０３に出力するものであ
る。また、フォント変換制御部１０９は、フォント変更
入力部１１１からテキストデータが入力されると、この
テキストデータをフォント変更する際に用いたパワー情
報をフォント学習部１１０に与える機能も有する。In addition to the functions of the font conversion control unit 102 of the first and second embodiments, the font conversion control unit 109, when text information and font information are input from the font change input unit 111, converts the font of the input text using this font information and outputs the font-changed text to the display unit 103. Furthermore, when text data is input from the font change input unit 111, the font conversion control unit 109 also provides the font learning unit 110 with the power information that was used when this text data was font-converted.

【００４３】このように構成された本実施の形態による
フォント変換装置の動作例について説明する。図６にお
いて、話者の発話によって生じた音声データが音声認識
部１０５に与えられると、音声認識されて認識結果のテ
キストデータが出力される。またパワー情報抽出部１０
６で抽出された音声のパワー情報に基づいて、フォント
変換制御部１０９がテキストをフォント変換し、フォン
ト変換されたテキストを表示部１０３に表示させる。以
上の動作は実施の形態２と同じである。An example of the operation of the font conversion apparatus according to the present embodiment, configured as described above, will now be described. In FIG. 6, when voice data produced by the speaker's utterance is given to the voice recognition unit 105, the voice is recognized and text data is output as the recognition result. In addition, based on the voice power information extracted by the power information extraction unit 106, the font conversion control unit 109 converts the font of the text and causes the display unit 103 to display the font-converted text. The above operation is the same as in the second embodiment.

【００４４】表示部１０３でフォント変換されたテキス
トが表示された後、話者が表示内容を見ながら、変更後
のフォント情報と変更するテキスト情報とを入力する場
合を考える。この場合変更するテキスト情報と変更後の
フォント情報は、フォント変更入力部１１１を介してフ
ォント変換制御部１０９に入力され、変更後のフォント
情報はフォント学習部１１０にも入力される。Consider a case where, after the font-converted text is displayed on the display unit 103, the speaker looks at the displayed content and inputs the font information after the change together with the text information to be changed. In this case, the text information to be changed and the changed font information are input to the font conversion control unit 109 via the font change input unit 111, and the changed font information is also input to the font learning unit 110.

【００４５】一例として、図７（ａ）のように「明日
は、晴れでしょう。」というテキストが既に表示部１０
３で表示されているとする。そして、図７（ａ）のテキ
ストをフォント変換したときに参照したフォントテーブ
ル１００１は図８（ａ）に示す内容のものとする。ま
た、図７（ａ）のテキスト「晴れ」は１２ｄＢの大きさ
で発声されたため、図８（ａ）のフォントテーブル１０
０１で、音声パワーの大きさが１０ｄＢ以上３０ｄＢ未
満の階級におけるフォント情報が参照される。次に話者
が、図７（ａ）のテキスト「晴れ」を「太字」から「太
字斜体」に変更するため、フォント変更入力部１１１か
ら「晴れ」のテキスト情報と変更後のフォント情報「太
字斜体」とを入力する。この「晴れ」のテキスト情報と
変更後のフォント情報「太字斜体」はフォント変換制御
部１０９に出力される。また変更後のフォント情報「太
字斜体」はフォント学習部１１０にも出力される。As an example, suppose that the text "Tomorrow will be fine." is already displayed on the display unit 103 as shown in FIG. 7(a), and that the font table 1001 referred to when the text of FIG. 7(a) was font-converted has the contents shown in FIG. 8(a). Since the text "fine" in FIG. 7(a) was uttered at a magnitude of 12 dB, the font information of the class with voice power of 10 dB or more and less than 30 dB in the font table 1001 of FIG. 8(a) was referred to. Next, to change the text "fine" in FIG. 7(a) from "bold" to "bold italic", the speaker inputs the text information of "fine" and the changed font information "bold italic" from the font change input unit 111. The text information of "fine" and the changed font information "bold italic" are output to the font conversion control unit 109. The changed font information "bold italic" is also output to the font learning unit 110.

【００４６】フォント変換制御部１０９では、フォント
変更入力部１１１から入力されたテキストデータをフォ
ント変換した際に用いたパワー情報を、フォント学習部
１１０に出力する。上記の例において、フォント変換制
御部１０９で、フォント変更入力部１１１から「晴れ」
のテキスト情報が入力される。そして、この「晴れ」の
テキストデータをフォント変換する際に用いたパワー情
報１２ｄＢがフォント学習部１１０に出力される。The font conversion control unit 109 outputs to the font learning unit 110 the power information that was used when the text data input from the font change input unit 111 was font-converted. In the above example, the text information of "fine" is input to the font conversion control unit 109 from the font change input unit 111. The power information of 12 dB, which was used when the text data "fine" was font-converted, is then output to the font learning unit 110.
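For the control unit to hand over the power information later, it has to remember which power value was used for each piece of converted text. The patent does not say how this is stored; the sketch below assumes a simple mapping from a text span (start and end character positions) to the power in dB. The class name `ConversionHistory` and its methods are hypothetical.

```python
class ConversionHistory:
    """Remembers the power (dB) used when each text span was font-converted."""

    def __init__(self):
        self._power_by_span = {}

    def record(self, start, end, power_db):
        # Called whenever a span of recognized text is font-converted.
        self._power_by_span[(start, end)] = power_db

    def power_for(self, start, end):
        # Returns None if the span was never font-converted.
        return self._power_by_span.get((start, end))

history = ConversionHistory()
history.record(4, 6, 12.0)       # "fine" was uttered at 12 dB
print(history.power_for(4, 6))   # value handed to the font learning unit
```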

【００４７】フォント変換制御部１０９は、フォント変
更入力部１１１から入力されたテキストである太字の
「晴れ」に対して、太字斜体の「晴れ」にフォント変更
する。その結果、図７（ｂ）のような太字斜体の「晴
れ」が表示部１０３に表示される。フォント学習部１１
０は、フォント変換制御部１０９からパワー情報１２ｄ
Ｂが入力されると、音声パワー１０ｄＢ以上３０ｄＢの
階級のフォント情報のスタイルの項を、フォント変更入
力部１１１から入力された変更後のフォント情報「太字
斜体」に書き換える。この結果、図８（ｂ）に示すフォ
ントテーブル１００１のように更新される。The font conversion control unit 109 changes the font of the bold "fine", which is the text input from the font change input unit 111, to bold italic. As a result, "fine" in bold italic is displayed on the display unit 103 as shown in FIG. 7(b). When the power information of 12 dB is input from the font conversion control unit 109, the font learning unit 110 rewrites the style item of the font information for the class with voice power of 10 dB or more and less than 30 dB to the changed font information "bold italic" input from the font change input unit 111. As a result, the font table 1001 is updated as shown in FIG. 8(b).

【００４８】次に、例えば話者が「明後日は、雨でしょ
う。」と発声したとする。このとき発声の一部である
「雨」の音声パワーが２０ｄＢと抽出されれば、テキス
ト「雨」は図８（ｂ）のフォントテーブル１００１を用
いてフォント変換され、図７（ｃ）に示すように、テキ
スト「雨」のスタイルは太字斜体になる。このように本
実施の形態では、「晴れ」のフォント変更が、次に音声
入力された「雨」のフォント変換に反映される。Next, suppose for example that the speaker utters "The day after tomorrow will be rainy." If the voice power of "rainy", which is part of the utterance, is extracted as 20 dB, the text "rainy" is font-converted using the font table 1001 of FIG. 8(b), and its style becomes bold italic as shown in FIG. 7(c). Thus, in the present embodiment, the font change made to "fine" is reflected in the font conversion of "rainy", which is input next by voice.
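The learning step in this example can be sketched as below. The table layout (a list of dictionaries), the function names `find_class` and `learn_font_change`, and the font values of the two outer classes are assumptions made for illustration; only the 10 dB to 30 dB class and the 12 dB and 20 dB values come from the description.

```python
import copy
import math

def find_class(table, power_db):
    """Return the table row whose class [low, high) contains power_db."""
    for row in table:
        if row["low"] <= power_db < row["high"]:
            return row
    raise ValueError(f"no class covers {power_db} dB")

def learn_font_change(table, power_db, changed_items):
    """Return a copy of the table with the matching class's items overwritten."""
    updated = copy.deepcopy(table)
    find_class(updated, power_db)["font"].update(changed_items)
    return updated

# Table before learning; the outer classes are assumed values.
table_before = [
    {"low": -math.inf, "high": 10, "font": {"style": "bold", "size": 32}},
    {"low": 10, "high": 30, "font": {"style": "bold", "size": 24}},
    {"low": 30, "high": math.inf, "font": {"style": "standard", "size": 12}},
]

# "fine" was converted at 12 dB and then changed to bold italic by the speaker.
table_after = learn_font_change(table_before, 12, {"style": "bold italic"})

# A later word uttered at 20 dB falls in the same class and inherits the change.
print(find_class(table_after, 20)["font"])
```

Only the items the speaker actually changed are overwritten, so the size of the class stays as it was.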

【００４９】尚、本実施の形態のフォント変換装置に更
に実施の形態２のフォントテーブル作成部１０８を設
け、実施の形態２におけるフォント指定入力部１０７の
代わりに、実施の形態３のフォント変更入力部１１１を
用いて、本実施の形態のフォントテーブル１００１を新
規に作成するようにしてもよい。The font conversion apparatus according to the present embodiment may further be provided with the font table creation unit 108 of the second embodiment, and the font table 1001 of the present embodiment may be newly created using the font change input unit 111 of the third embodiment in place of the font designation input unit 107 of the second embodiment.

【００５０】本実施の形態のフォント変換装置によれ
ば、一旦変換されたフォントを随時に修正する機能が実
現され、その変更内容を学習することにより、話者の好
みに応じたフォントにテキストを変換することができ
る。According to the font conversion apparatus of the present embodiment, a function for correcting an already converted font at any time is realized, and by learning the content of each change, text can be converted to a font that matches the speaker's preference.

【００５１】（実施の形態４）次に本発明の実施の形態
４によるフォント変換装置について、図面を参照しなが
ら説明する。前述した実施の形態１〜３では、音声入力
されたテキストをフォント変換した後、フォント変換さ
れたテキストを表示部１０３で表示する装置として述べ
た。本実施の形態によるフォント変換装置は、既に入力
されているテキストのフォント変更を可能にすることを
特徴とする。(Embodiment 4) Next, a font conversion apparatus according to Embodiment 4 of the present invention will be described with reference to the drawings. In the above-described first to third embodiments, the apparatus has been described in which the text input by voice is font-converted and then the font-converted text is displayed on the display unit 103. The font conversion apparatus according to the present embodiment is characterized in that the font of already input text can be changed.

【００５２】図９は本発明の実施の形態４におけるフォ
ント変換装置の構成図である。尚、実施の形態１と同一
ブロックは同一符号をつけ、詳細な説明は省略する。本
実施の形態のフォント変換装置は、音声入力部１０１、
フォント変換制御部１０２、表示部１０３、フォントテ
ーブル参照部１０４、パワー情報抽出部１０６、フォン
トテーブル１００１に加えて、テキスト入力部１１２、
読み上げ指定表示部１１３を含んで構成される。FIG. 9 is a block diagram of the font conversion apparatus according to the fourth embodiment of the present invention. Blocks identical to those of the first embodiment are given the same reference numerals, and their detailed description is omitted. The font conversion apparatus according to the present embodiment includes a voice input unit 101, a font conversion control unit 102, a display unit 103, a font table reference unit 104, a power information extraction unit 106, and a font table 1001, and additionally includes a text input unit 112 and a reading designation display unit 113.

【００５３】テキスト入力部１１２は、キーボード等の
入力装置で構成され、話者がテキストデータを入力し、
テキストデータ及びテキスト表示位置を含むテキスト情
報を出力するものである。ここで入力されたテキストデ
ータは表示部１０３に出力されて表示され、テキスト情
報は読み上げ指定表示部１１３に出力されるものとす
る。The text input unit 112 is composed of an input device such as a keyboard. The speaker uses it to input text data, and it outputs text information that includes the text data and the text display position. The text data input here is output to the display unit 103 and displayed, and the text information is output to the reading designation display unit 113.

【００５４】読み上げ指定表示部１１３は、テキスト入
力部１１２から入力されたテキスト情報に基づいて、文
字毎、単語毎、文章毎等のように話者が読み上げるテキ
ストを指示するため、下線表示等によって自動的に一定
の速さでフォント変更の順番を指定し、表示部１０３に
読み上げテキストデータ（指定表示テキスト）を与える
ものである。また、この指定表示テキストのデータはフ
ォント変換制御部１０２にも出力される。Based on the text information input from the text input unit 112, the reading designation display unit 113 indicates the text the speaker should read aloud, in units such as characters, words, or sentences. To do so, it automatically designates the order of font changes at a constant speed by means of underlining or the like, and supplies the read-aloud text data (designated display text) to the display unit 103. The data of this designated display text is also output to the font conversion control unit 102.

【００５５】このように構成された本実施の形態による
フォント変換装置の動作例について説明する。一例とし
て、テキスト入力部１１２から、「明日は、晴れでしょ
う。」というテキストが入力され、このテキストが図１
０（ａ）のように表示部１０３で標準フォントで表示さ
れている場合を考える。テキスト入力部１１２から上記
のテキスト情報が与えられると、読み上げ指定表示部１
１３は、文字毎、単語毎、文章毎等のように、読み上げ
指定テキストを生成し、表示部１０３に出力する。また
この読み上げ指定テキストのテキスト情報はフォント変
換制御部１０２にも出力される。上記の例において、読
み上げ指定表示部１１３では、先ず読み上げを指定する
テキストとして「明日は、」を指定し、表示部１０３で
図１０（ｂ）のように下線表示する。An example of the operation of the font conversion apparatus according to the present embodiment, configured as described above, will now be described. As an example, consider a case where the text "Tomorrow will be fine." is input from the text input unit 112 and is displayed on the display unit 103 in the standard font as shown in FIG. 10(a). When this text information is supplied from the text input unit 112, the reading designation display unit 113 generates a reading designation text in units such as characters, words, or sentences, and outputs it to the display unit 103. The text information of this reading designation text is also output to the font conversion control unit 102. In the above example, the reading designation display unit 113 first designates "Tomorrow" (明日は、) as the text to be read aloud, and the display unit 103 displays it underlined as shown in FIG. 10(b).

【００５６】話者は、表示部１０３で指定表示されてい
るテキスト「明日は、」を見ながら「明日は、」を２０
ｄＢの大きさで発声したとする。音声入力部１０１はこ
の音声データをパワー情報抽出部１０６に与える。パワ
ー情報抽出部１０６は、音声入力部１０１から入力され
た音声データ「明日は、」のパワーが２０ｄＢであるこ
とを検出する。フォント変換制御部１０２はこのパワー
情報が入力されると、「２０ｄＢ」をフォントテーブル
参照部１０４に与える。Suppose that the speaker, looking at the designated text "Tomorrow" (明日は、) on the display unit 103, utters it at a magnitude of 20 dB. The voice input unit 101 supplies this voice data to the power information extraction unit 106. The power information extraction unit 106 detects that the power of the voice data "Tomorrow" input from the voice input unit 101 is 20 dB. When this power information is input, the font conversion control unit 102 gives "20 dB" to the font table reference unit 104.

【００５７】フォントテーブル参照部１０４では、フォ
ントテーブル１００１を参照し、フォント変換制御部１
０２から入力されたパワー情報に対応したフォント情報
を、フォント変換制御部１０２に出力する。フォントテ
ーブル参照部１０４は、例えば図８（ａ）に示すフォン
トテーブル１００１を参照する。そして上記の例におい
て、音声パワー２０ｄＢが含まれる階級を検索し、この
音声パワーに対応したフォント情報を読み出す。こうし
て図８（ａ）から、「（フォント）ゴシック、（スタイ
ル）太字、（サイズ）２４、（文字飾り）無し、（色）
黒」のフォント情報がフォント変換制御部１０２に出力
される。The font table reference unit 104 refers to the font table 1001 and outputs the font information corresponding to the power information input from the font conversion control unit 102 back to the font conversion control unit 102. The font table reference unit 104 refers to, for example, the font table 1001 shown in FIG. 8(a). In the above example, it searches for the class that contains the voice power of 20 dB and reads out the font information corresponding to this voice power. In this way, the font information "(Font) Gothic, (Style) bold, (Size) 24, (Character decoration) none, (Color) black" is read from FIG. 8(a) and output to the font conversion control unit 102.

【００５８】フォント変換制御部１０２は、フォントテ
ーブル参照部１０４から入力された上記のフォント情報
に基づいて、読み上げ指定表示部１１３から入力された
テキストをフォント変換する。フォント変換後のテキス
トが、読み上げ指定表示部１１３から入力された表示位
置情報に基づいて、表示部１０３で表示される。上記の
例において、読み上げ指定表示部１１３から入力された
テキスト「明日は、」がフォント変換され、図１０
（ｃ）に示すようにフォント変換された「明日は、」が
表示部１０３で表示される。Based on the above font information input from the font table reference unit 104, the font conversion control unit 102 converts the font of the text input from the reading designation display unit 113. The font-converted text is displayed on the display unit 103 based on the display position information input from the reading designation display unit 113. In the above example, the text "Tomorrow" (明日は、) input from the reading designation display unit 113 is font-converted, and the font-converted "Tomorrow" is displayed on the display unit 103 as shown in FIG. 10(c).

【００５９】次に、読み上げ指定表示部１１３は、一定
時間後に現在指定しているテキストの次のテキストを表
示部１０３で指定表示する。上記の例において、図１０
（ｄ）に示すように、前回指定表示されていたテキスト
は「明日は、」であるので、次のテキスト「晴れ」を表
示部１０３に指定表示する。Next, after a fixed time, the reading designation display unit 113 causes the display unit 103 to designate and display the text that follows the currently designated text. In the above example, since the previously designated text was "Tomorrow" (明日は、), the next text "fine" (晴れ) is designated and displayed on the display unit 103 as shown in FIG. 10(d).

【００６０】尚、話者が、次のテキストを指定表示する
命令を、読み上げ指定表示部１１３に入力することで、
読み上げ指定表示部１１３が次のテキストを指定表示す
るようにしてもよい。この場合、フォント変換の間隔を
話者のペースに合わすことができる。Alternatively, the reading designation display unit 113 may designate and display the next text when the speaker inputs a command to do so into the reading designation display unit 113. In this case, the interval between font conversions can be matched to the speaker's pace.
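The stepping behaviour of the reading designation display can be sketched as a cursor over consecutive text segments. This is an illustrative sketch under assumptions: the segmentation into "明日は、", "晴れ", and "でしょう。" is supplied by hand, and the names `segment_positions` and `ReadAloudCursor` are hypothetical. Whether `advance` is triggered by a timer or by a command from the speaker is left to the caller.

```python
def segment_positions(text, segments):
    """Return (start, end, segment) for consecutive segments that tile text."""
    spans, pos = [], 0
    for seg in segments:
        if not text.startswith(seg, pos):
            raise ValueError(f"segment {seg!r} does not follow position {pos}")
        spans.append((pos, pos + len(seg), seg))
        pos += len(seg)
    return spans

class ReadAloudCursor:
    """Tracks which segment is currently designated (for example, underlined)."""

    def __init__(self, spans):
        self._spans = spans
        self._index = 0

    def current(self):
        # None once every segment has been designated.
        if self._index < len(self._spans):
            return self._spans[self._index]
        return None

    def advance(self):
        # Called after a fixed time, or on a command from the speaker.
        self._index += 1
        return self.current()

text = "明日は、晴れでしょう。"
cursor = ReadAloudCursor(segment_positions(text, ["明日は、", "晴れ", "でしょう。"]))
print(cursor.current())   # first designated segment
print(cursor.advance())   # next designated segment
```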

【００６１】また、実施の形態２のフォントテーブル作
成部１０８とフォント指定入力部１０７を本実施の形態
のフォント変換装置に設け、本実施の形態のフォントテ
ーブル１００１を新規に作成するようにしてもよい。Further, the font table creation unit 108 and the font designation input unit 107 of the second embodiment may be provided in the font conversion apparatus according to the present embodiment so that the font table 1001 of the present embodiment is newly created.

【００６２】また、実施の形態３のフォント変更入力部
１１１とフォント学習部１１０とを本実施の形態のフォ
ント変換装置に設け、本実施の形態におけるフォント変
換制御部１０２の代わりに実施の形態３のフォント変換
制御部１０９を用いて、本実施の形態のフォントテーブ
ル１００１を変更できるようにしてもよい。Further, the font change input unit 111 and the font learning unit 110 of the third embodiment may be provided in the font conversion apparatus according to the present embodiment, and the font conversion control unit 109 of the third embodiment may be used in place of the font conversion control unit 102 of the present embodiment, so that the font table 1001 of the present embodiment can be changed.

【００６３】以上のように本実施の形態のフォント変換
装置によれば、入力されている文章の一部を指定し、指
定されているテキストを話者が発話することにより、指
定されたテキストのフォントを音声パワーに応じて自由
に変更することができる。As described above, according to the font conversion apparatus of the present embodiment, part of the already input text is designated, and when the speaker utters the designated text, the font of that text can be freely changed according to the voice power.

【００６４】（実施の形態５）次に本発明の実施の形態
５によるフォント変換装置について、図面を参照しなが
ら説明する。前述した実施の形態４のフォント変換装置
は、フォント変更可能なテキストが下線等により指定さ
れる装置とした。本実施の形態のフォント変換装置は、
話者が発話した音声を認識し、既に入力されている文章
と比較することで、変換すべきテキストを自動で検出で
きるようにしたことを特徴とする。(Embodiment 5) Next, a font conversion apparatus according to the fifth embodiment of the present invention will be described with reference to the drawings. In the font conversion apparatus of the fourth embodiment described above, the text whose font can be changed is designated by underlining or the like. The font conversion apparatus according to the present embodiment is characterized in that it recognizes the voice uttered by the speaker and compares it with the already input text, so that the text to be converted can be detected automatically.

【００６５】図１１は本実施の形態によるフォント変換
装置の構成図である。尚、実施の形態１及び４と同一ブ
ロックは同一の符号を付け、それらの詳細な説明は省略
する。このフォント変換装置は、音声入力部１０１、フ
ォント変換制御部１０２、表示部１０３、フォントテー
ブル参照部１０４、音声認識部１０５、パワー情報抽出
部１０６、テキスト入力部１１２、フォントテーブル１
００１に加えて、位置情報検出部１１５を含んで構成さ
れる。FIG. 11 is a block diagram of the font conversion apparatus according to the present embodiment. Blocks identical to those of the first and fourth embodiments are given the same reference numerals, and their detailed description is omitted. This font conversion apparatus includes a voice input unit 101, a font conversion control unit 102, a display unit 103, a font table reference unit 104, a voice recognition unit 105, a power information extraction unit 106, a text input unit 112, and a font table 1001, and additionally includes a position information detection unit 115.

【００６６】位置情報検出部１１５は、テキスト入力部
１１２から入力されたテキスト情報を参照して、音声認
識部１０５から入力されたテキストと同一のテキストを
比較検出し、比較結果が一致するテキストの表示位置情
報やテキストデータを含むテキスト情報をフォント変換
制御部１０２に出力するものである。The position information detection unit 115 refers to the text information input from the text input unit 112, searches it for text identical to the text input from the voice recognition unit 105, and outputs text information, which includes the display position information and the text data of the matching text, to the font conversion control unit 102.

【００６７】このように構成された本実施の形態による
フォント変換装置の動作例について説明する。ユーザの
操作により、テキスト入力部１１２から位置情報検出部
１１５に対してテキスト情報が与えられる。また、表示
部１０３ではテキスト入力部１１２から入力されたテキ
ストが表示される。一例として、テキスト入力部１１２
から、「明日は、晴れでしょう。」というテキストデー
タが入力され、図１２（ａ）のように表示部１０３で
「明日は、晴れでしょう。」が標準フォントで表示され
る場合を考える。このときテキスト入力部１１２から
「明日は、晴れでしょう。」のテキスト情報が位置情報
検出部１１５にも入力される。An example of the operation of the font conversion apparatus according to the present embodiment, configured as described above, will now be described. Through the user's operation, text information is supplied from the text input unit 112 to the position information detection unit 115. The display unit 103 displays the text input from the text input unit 112. As an example, consider a case where the text data "Tomorrow will be fine." is input from the text input unit 112 and is displayed on the display unit 103 in the standard font as shown in FIG. 12(a). At this time, the text information of "Tomorrow will be fine." is also input from the text input unit 112 to the position information detection unit 115.

【００６８】次に話者（ユーザ）は表示部１０３で表示
されているテキストを発声する。話者の音声が入力され
ると、音声入力部１０１はパワー情報抽出部１０６と音
声認識部１０５に音声データを与える。上記の例におい
て、話者は「明日は、」を２０ｄＢの大きさで発声した
とする。パワー情報抽出部１０６は、音声入力部１０１
から入力された音声データのパワー情報が２０ｄＢであ
ることを抽出する。このパワー情報はフォント変換制御
部１０２に出力される。Next, the speaker (user) utters the text displayed on the display unit 103. When the speaker's voice is input, the voice input unit 101 supplies the voice data to the power information extraction unit 106 and the voice recognition unit 105. In the above example, suppose the speaker utters "Tomorrow" (明日は、) at a magnitude of 20 dB. The power information extraction unit 106 extracts the power information of the voice data input from the voice input unit 101 as 20 dB. This power information is output to the font conversion control unit 102.

【００６９】一方、音声認識部１０５は音声入力部１０
１から入力された音声データを音声認識し、テキスト
「明日は、」を出力する。このテキストは位置情報検出
部１１５に与えられる。位置情報検出部１１５は、テキ
スト入力部１１２から入力されたテキスト情報と音声認
識部１０５から入力されたテキストデータを用いて、音
声認識部１０５から入力されたテキストデータと同一の
テキストデータの記載位置を探索し、当該テキストの表
示位置を検出する。この表示位置情報とテキストデータ
とはテキスト情報としてフォント変換制御部１０２に出
力される。上記の例において、位置情報検出部１１５に
対してテキスト入力部１１２から「明日は、晴れで
す。」のテキスト情報が入力される。また、音声認識部
１０５から位置情報検出部１１５に対してテキスト「明
日は、」が入力される。位置情報検出部１１５は、入力
された「明日は、晴れでしょう。」のテキスト情報と、
「明日は、」のテキストとを比較する。そして位置情報
検出部１１５は、テキスト「明日は、」と、表示部１０
３でテキスト「明日は、」が表示されている位置情報と
をフォント変換制御部１０２に出力する。Meanwhile, the voice recognition unit 105 recognizes the voice data input from the voice input unit 101 and outputs the text "Tomorrow" (明日は、). This text is supplied to the position information detection unit 115. Using the text information input from the text input unit 112 and the text data input from the voice recognition unit 105, the position information detection unit 115 searches for where text data identical to that input from the voice recognition unit 105 appears, and detects the display position of that text. The display position information and the text data are output to the font conversion control unit 102 as text information. In the above example, the text information of "Tomorrow will be fine." is input to the position information detection unit 115 from the text input unit 112, and the text "Tomorrow" is input to it from the voice recognition unit 105. The position information detection unit 115 compares the input text information of "Tomorrow will be fine." with the text "Tomorrow". It then outputs to the font conversion control unit 102 the text "Tomorrow" together with the position information indicating where that text is displayed on the display unit 103.
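The comparison performed by the position information detection unit can be sketched as a plain substring search. This is an assumption made for illustration: the patent only states that identical text is searched for and its display position detected, and it does not describe the matching method or how repeated occurrences are handled. The function name `detect_position` is hypothetical, and the position is expressed as character offsets rather than screen coordinates.

```python
def detect_position(entered_text, recognized_text):
    """Return (start, end) of the first occurrence of recognized_text, or None."""
    if not recognized_text:
        return None
    start = entered_text.find(recognized_text)
    if start < 0:
        return None
    return (start, start + len(recognized_text))

entered = "明日は、晴れでしょう。"
print(detect_position(entered, "明日は、"))  # span handed to the control unit
```

Returning `None` for unmatched text lets the caller leave the display unchanged when recognition yields something that is not in the entered text.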

【００７０】フォント変換制御部１０２は、パワー情報
抽出部１０６から入力されたパワー情報２０ｄＢをフォ
ントテーブル参照部１０４に出力する。フォントテーブ
ル参照部１０４は図８（ａ）のようなフォントテーブル
１００１を参照する。そして、フォント変換制御部１０
２から入力されたパワー情報２０ｄＢに対応したフォン
ト情報「（フォント）ゴシック、（スタイル）太字、
（サイズ）２４、（文字飾り）無し、（色）黒」が取得
され、フォント変換制御部１０２に出力される。The font conversion control unit 102 outputs the power information of 20 dB input from the power information extraction unit 106 to the font table reference unit 104. The font table reference unit 104 refers to a font table 1001 such as that shown in FIG. 8(a). The font information "(Font) Gothic, (Style) bold, (Size) 24, (Character decoration) none, (Color) black", which corresponds to the power information of 20 dB input from the font conversion control unit 102, is then obtained and output to the font conversion control unit 102.

【００７１】フォント変換制御部１０２は、フォントテ
ーブル参照部１０４から入力されたフォント情報に基づ
いて、位置情報検出部１１５から入力されたテキスト
「明日は、」を「（フォント）ゴシック、（スタイル）
太字、（サイズ）２４、（文字飾り）無し、（色）黒」
にフォント変換する。こうしてフォント変換されたテキ
スト「明日は、」が標準フォントの「晴れでしょう。」
と共に、図１２（ｂ）に示すように表示部１０３で表示
される。Based on the font information input from the font table reference unit 104, the font conversion control unit 102 converts the font of the text "Tomorrow" (明日は、) input from the position information detection unit 115 to "(Font) Gothic, (Style) bold, (Size) 24, (Character decoration) none, (Color) black". The font-converted text "Tomorrow" is then displayed on the display unit 103 together with "will be fine." in the standard font, as shown in FIG. 12(b).
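Applying the font to only the detected span while leaving the rest in the standard font can be sketched as follows. The representation of the displayed text as a list of (text, font) runs and the function name `apply_font` are assumptions made here; the font is abbreviated to a short label for readability.

```python
def apply_font(runs, start, end, font):
    """Return new runs in which characters [start, end) use the given font.

    runs is a list of (text, font) pairs covering the displayed text in order.
    """
    result, pos = [], 0
    for text, run_font in runs:
        run_start, run_end = pos, pos + len(text)
        pos = run_end
        lo, hi = max(start, run_start), min(end, run_end)
        if lo >= hi:                      # run lies outside the target span
            result.append((text, run_font))
            continue
        if lo > run_start:                # part before the span keeps its font
            result.append((text[:lo - run_start], run_font))
        result.append((text[lo - run_start:hi - run_start], font))
        if hi < run_end:                  # part after the span keeps its font
            result.append((text[hi - run_start:], run_font))
    return result

display = [("明日は、晴れでしょう。", "standard")]
print(apply_font(display, 0, 4, "Gothic bold 24"))
```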

【００７２】尚、実施の形態２のフォントテーブル作成
部１０８とフォント指定入力部１０７を本実施の形態の
フォント変換装置に更に設け、フォントテーブル１００
１を新規に作成するようにしてもよい。The font table creation unit 108 and the font designation input unit 107 of the second embodiment may further be provided in the font conversion apparatus according to the present embodiment so that the font table 1001 is newly created.

【００７３】また、実施の形態３のフォント変更入力部
１１１とフォント学習部１１０を本実施の形態のフォン
ト変換装置に更に設け、フォント変換制御部１０２の代
わりに実施の形態３のフォント変換制御部１０９を用い
て、本実施の形態のフォントテーブル１００１を変更で
きるようにしてもよい。Further, the font change input unit 111 and the font learning unit 110 of the third embodiment may be additionally provided in the font conversion apparatus according to the present embodiment, and the font conversion control unit 109 of the third embodiment may be used in place of the font conversion control unit 102, so that the font table 1001 of the present embodiment can be changed.

【００７４】以上のように本実施の形態のフォント変換
装置によれば、入力された音声を認識し、既に入力され
ている文章と比較することで、変更されるテキストの文
章内の位置情報を得る機能が加えられる。このため話者
は自然な読み方をすることが可能になる。As described above, according to the font conversion apparatus of the present embodiment, a function is added that recognizes the input voice and compares it with the already input text to obtain the position, within the text, of the portion to be changed. This allows the speaker to read the text aloud naturally.

[0075] (Embodiment 6) Next, a font conversion apparatus according to Embodiment 6 of the present invention will be described with reference to the drawings. The font conversion apparatuses of Embodiments 1 to 5 used voice power (power information) as the voice information for font-converting text. In contrast, the font conversion apparatus of the present embodiment is characterized in that it identifies, as the voice information, the speaker who uttered the text, and performs font conversion for each speaker.

[0076] FIG. 13 is a block diagram of the font conversion apparatus according to the present embodiment. Blocks identical to those of the font conversion apparatus of Embodiment 2 are given the same reference numerals, and their detailed description is omitted. The font conversion apparatus of the present embodiment comprises a voice input unit 101, a display unit 103, a voice recognition unit 105, a font designation input unit 107, a speaker identification label acquisition unit 201, a speaker feature learning unit 202, a speaker identification unit 203, a font conversion control unit 204, a font table creation unit 205, a font table reference unit 206, a speaker identification feature extraction unit 208, and a font table 1002.

[0077] The speaker identification label acquisition unit 201 consists of an input device such as a keyboard and is used to enter a speaker identification label such as the speaker's name. When voice data is input from the voice input unit 101, the speaker identification feature extraction unit 208 extracts from it voice features, such as vocal tract information, that capture the speaker's characteristics well.

[0078] The speaker feature learning unit 202 receives the voice features of each speaker from the speaker identification feature extraction unit 208 and the speaker identification label from the speaker identification label acquisition unit 201. Using the label and the voice features, it creates a template to be used for speaker identification and outputs the template to the speaker identification unit 203.

[0079] The speaker identification unit 203 identifies the speaker using the voice features for speaker identification input from the speaker identification feature extraction unit 208 and the templates created by the speaker feature learning unit 202, and supplies the resulting speaker identification label to the font conversion control unit 204.

[0080] The font table creation unit 205 newly creates the font table 1002 using the speaker identification labels input from the speaker identification label acquisition unit 201 and the font information input from the font designation input unit 107. An example of the font table 1002 is shown in FIG. 14. Alternatively, a font table such as that of FIG. 14 may be completed by entering speaker identification labels into a font table in which font information has already been entered.

[0081] When a speaker identification label is input from the font conversion control unit 204, the font table reference unit 206 refers to the font table 1002 and outputs the font information corresponding to that label. When text data is input from the voice recognition unit 105, the font conversion control unit 204 passes the speaker identification label supplied by the speaker identification unit 203 to the font table reference unit 206 and, based on the font information returned by the font table reference unit 206, font-converts the text data input from the voice recognition unit 105. The display unit 103 displays the text in the standard font before font conversion as well as the text font-converted by the font conversion control unit 204.

[0082] The operation of the font conversion apparatus of the present embodiment configured as above will now be described. In the present embodiment, before the actual text is input by voice, the templates used for speaker identification are created and the font table 1002 is also created. Then, when a speaker performs voice input, the speaker identification unit 203 identifies the speaker. The voice recognition unit 105 recognizes the input voice data and outputs the result to the font conversion control unit 204. The font conversion control unit 204 font-converts the text data of the recognition result according to the font information defined for each speaker.

[0083] The above operation will be described with a concrete example. First, before inputting the actual text by voice, a speaker must enter a speaker identification label through the speaker identification label acquisition unit 201 so that the template used by the speaker identification unit 203 can be created. As shown in FIG. 14, the registered labels are entered as speaker names such as ICHIRO, JIRO, ..., GORO, .... The speaker also utters test text, and the voice data is input from the voice input unit 101 to the speaker identification feature extraction unit 208. As a concrete example, consider a case in which two speakers, Ichiro and Jiro, each input test text by voice. First, Ichiro enters the label "ICHIRO" into the speaker feature learning unit 202 through the speaker identification label acquisition unit 201. Ichiro then utters "Ichiro" and reads other test sentences aloud at a normal sound pressure. The voice data "Ichiro" and the other test voice data are supplied from the voice input unit 101 to the speaker identification feature extraction unit 208.

[0084] The speaker identification feature extraction unit 208 extracts the voice features used for speaker identification from the voice data "Ichiro" and the other voice data input from the voice input unit 101, and outputs these features to the speaker feature learning unit 202. Using the voice features supplied by the speaker identification feature extraction unit 208, the speaker feature learning unit 202 creates a template for the label "ICHIRO" input from the speaker identification label acquisition unit 201. The ICHIRO template is created in this way, and the JIRO template is created in the same manner.

[0085] Next, the font table 1002 is created. Before inputting the actual text by voice, the speakers enter their speaker identification labels into the font table creation unit 205 through the speaker identification label acquisition unit 201. In the above example, when the labels "ICHIRO" and "JIRO" are input from the speaker identification label acquisition unit 201 to the font table creation unit 205, the font table 1002 is in the state shown in FIG. 15(a).

[0086] Each speaker further enters, from the font designation input unit 107, the font that he or she will use into the font table creation unit 205. The font table creation unit 205 creates the font table 1002 using the speaker identification labels input from the speaker identification label acquisition unit 201 and the font information input from the font designation input unit 107. In the above example, for the speaker label "ICHIRO" of FIG. 15(a), "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black" is input from the font designation input unit 107 to the font table creation unit 205. For the speaker label "JIRO", "(font) Gothic, (style) bold italic, (size) 24, (character decoration) none, (color) black" is input. The font table 1002 shown in FIG. 15(b) is thus created.
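The two-stage creation of the font table 1002 (labels first, then font information) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `FontInfo` class and the functions `register_labels` and `designate_font` are hypothetical names, and representing an unfilled entry as `None` is an assumption about the state of FIG. 15(a).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class FontInfo:
    """One row of font settings, using the five fields named in the description."""
    font: str
    style: str
    size: int
    decoration: str
    color: str


def register_labels(table: Dict[str, Optional[FontInfo]], labels: Iterable[str]) -> None:
    """Stage 1: enter speaker identification labels with no font info yet."""
    for label in labels:
        table.setdefault(label, None)


def designate_font(table: Dict[str, Optional[FontInfo]], label: str, info: FontInfo) -> None:
    """Stage 2: attach font info to an already registered label."""
    if label not in table:
        raise KeyError(f"speaker label not registered: {label}")
    table[label] = info


font_table_1002: Dict[str, Optional[FontInfo]] = {}
register_labels(font_table_1002, ["ICHIRO", "JIRO"])
state_a = dict(font_table_1002)  # snapshot corresponding to FIG. 15(a) (assumed layout)

designate_font(font_table_1002, "ICHIRO", FontInfo("Gothic", "bold", 32, "none", "black"))
designate_font(font_table_1002, "JIRO", FontInfo("Gothic", "bold italic", 24, "none", "black"))
```

After both stages, `font_table_1002` holds the two rows given in the text for FIG. 15(b).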

[0087] Next, the speaker inputs the actual text by voice. In FIG. 13, voice data is input from the voice input unit 101 to the speaker identification feature extraction unit 208 and the voice recognition unit 105. In the above example, suppose that Ichiro utters "Tomorrow, it will be sunny." Ichiro's voice data "Tomorrow, it will be sunny." is input through the voice input unit 101 to both the speaker identification feature extraction unit 208 and the voice recognition unit 105.

[0088] The speaker identification feature extraction unit 208 extracts Ichiro's voice features from the voice data "Tomorrow, it will be sunny." uttered by Ichiro. The speaker identification unit 203 compares the voice features input from the speaker identification feature extraction unit 208 against the ICHIRO and JIRO templates created by the speaker feature learning unit 202 and identifies the speaker. The speaker label "ICHIRO", which is the identification result, is output to the font conversion control unit 204.
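The template creation and matching described above can be sketched as follows. The description does not specify the feature representation or the matching rule, so this Python sketch assumes fixed-length numeric feature vectors, a template formed by averaging them, and nearest-template matching by Euclidean distance. The function names and the toy feature values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def build_template(feature_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Assumed learning rule: the template is the mean of the training feature vectors."""
    count = len(feature_vectors)
    dim = len(feature_vectors[0])
    return [sum(vec[i] for vec in feature_vectors) / count for i in range(dim)]


def identify_speaker(features: Sequence[float], templates: Dict[str, List[float]]) -> str:
    """Assumed matching rule: return the label whose template is nearest to the features."""
    return min(templates, key=lambda label: math.dist(features, templates[label]))


# Toy two-dimensional features standing in for vocal tract information.
templates = {
    "ICHIRO": build_template([[1.0, 2.0], [1.2, 1.8]]),
    "JIRO": build_template([[4.0, 0.5], [3.8, 0.7]]),
}
label = identify_speaker([1.1, 2.1], templates)
```

With these toy values, the input features lie much closer to the ICHIRO template, so `label` is the string passed on to the font conversion control unit 204.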

[0089] Next, the voice recognition unit 105 recognizes the voice data input from the voice input unit 101 as "Tomorrow, it will be sunny." and outputs the text data of this recognition result to the font conversion control unit 204. The font conversion control unit 204 outputs the speaker label "ICHIRO" input from the speaker identification unit 203 to the font table reference unit 206.

[0090] The font table reference unit 206 refers to the font table 1002 of FIG. 15(b), obtains the font information "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black" corresponding to the label "ICHIRO" input from the font conversion control unit 204, and outputs it to the font conversion control unit 204.

[0091] Based on this font information input from the font table reference unit 206, the font conversion control unit 204 font-converts the text "Tomorrow, it will be sunny." input from the voice recognition unit 105. The font-converted text is displayed on the display unit 103 as shown in FIG. 18(a). If another speaker, Jiro, utters a similar sentence, "The day after tomorrow, it will rain.", the font-converted text is displayed on the display unit 103 as shown in FIG. 18(b).
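The flow from speaker label to displayed text can be summarized in a short Python sketch. It is an illustration only: the dictionary layout, the function names, and the choice to return `None` as the font for an unregistered label (leaving the text in the standard font) are assumptions not stated in the description.

```python
from typing import Dict, Optional

# Rows as given in the text for FIG. 15(b).
FONT_TABLE_1002: Dict[str, Dict[str, object]] = {
    "ICHIRO": {"font": "Gothic", "style": "bold", "size": 32,
               "decoration": "none", "color": "black"},
    "JIRO": {"font": "Gothic", "style": "bold italic", "size": 24,
             "decoration": "none", "color": "black"},
}


def reference_font_table(label: str) -> Optional[Dict[str, object]]:
    """Role of font table reference unit 206: map a speaker label to font info."""
    return FONT_TABLE_1002.get(label)


def convert_font(text: str, speaker_label: str) -> Dict[str, object]:
    """Role of font conversion control unit 204: pair recognized text with the speaker's font.

    Assumption: a font of None means the text is left in the standard font.
    """
    return {"text": text, "font": reference_font_table(speaker_label)}


by_ichiro = convert_font("Tomorrow, it will be sunny.", "ICHIRO")
by_jiro = convert_font("The day after tomorrow, it will rain.", "JIRO")
```

The two results differ only in their attached font settings, which is what lets a reader tell the speakers apart on the display.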

[0092] Note that the font conversion control unit, the font change input unit, and the font learning unit of Embodiment 3 may additionally be provided in the font conversion apparatus of the present embodiment so that the information in the font table 1002 can be updated.

[0093] The text input unit 112 of Embodiment 4 may also be added to the font conversion apparatus of the present embodiment, and the read-aloud designation display unit 113 of Embodiment 4 may be used in place of the voice recognition unit 105 of the present embodiment, so that, as in Embodiment 4, the font of text that has been entered can be converted for each speaker.

[0094] The text input unit 112 and the position information detection unit 115 of Embodiment 5 may also be added to the font conversion apparatus of the present embodiment, so that, as in Embodiment 5, the font of text that has been entered can be converted for each speaker.

[0095] The method of font-converting text for each speaker described in the present embodiment can also be applied to the font conversion apparatuses of Embodiments 1, 3, 4, and 5. This turns those apparatuses into ones that can be used by multiple people.

[0096] As described above, the font conversion apparatus of the present embodiment can convert the font of voice-input text according to the speaker who uttered it and display the result on the screen. For example, when several people each input their own sentences, designating a font for each speaker lets a third party understand who input each sentence.

[0097] (Embodiment 7) Next, a font conversion apparatus according to Embodiment 7 of the present invention will be described with reference to the drawings. The font conversion apparatus of the present embodiment is characterized in that, when font-converting text, it uses personal information such as gender, emotion, or age as the speaker's voice information.

[0098] FIG. 16 is a block diagram of the font conversion apparatus according to the present embodiment. Blocks identical to those of Embodiment 1 are given the same reference numerals, and their detailed description is omitted. The font conversion apparatus of the present embodiment comprises a voice input unit 101, a display unit 103, a personal information extraction unit 301, a personal information recognition unit 302, a font conversion control unit 303, a font table reference unit 305, and a font table 1003.

[0099] When the speaker's voice data is input from the voice input unit 101, the personal information extraction unit 301 extracts from it the speaker's personal information, which consists of vocal cord information, vocal tract information, and the like. This personal information is supplied to the personal information recognition unit 302. When the speaker's personal information is input, the personal information recognition unit 302 recognizes it and generates, as the recognition result, personal state information such as gender, emotion, or age. This personal state information is supplied to the font conversion control unit 303.

[0100] The font table 1003 is a table that divides a person's state into a plurality of attributes and stores font setting values for each attribute. Examples of the font table 1003 are shown in FIG. 17. The font table 1003a of FIG. 17(a) uses gender as the attribute, the font table 1003b of FIG. 17(b) uses emotion, and the font table 1003c of FIG. 17(c) uses age. The personal state attributes and the font information in the font table 1003 may be entered by the speaker or may be provided in advance.

[0101] When the speaker's personal state information is input from the font conversion control unit 303, the font table reference unit 305 refers to the font table 1003, obtains the font information corresponding to the personal state information, and supplies it to the font conversion control unit 303. When personal state information is input from the personal information recognition unit 302 and text data of the voice recognition result is input from the voice recognition unit 105, the font conversion control unit 303 passes the personal state information to the font table reference unit 305, obtains the font information corresponding to the speaker's personal state information, and, based on this font information, font-converts the text data input from the voice recognition unit 105. The display unit 103 displays the text in the standard font before font conversion as well as the text font-converted by the font conversion control unit 303.

[0102] An example of the operation of the font conversion apparatus of the present embodiment configured as above will now be described. When the speaker speaks and the voice input unit 101 captures the voice, the voice data is output to the personal information extraction unit 301 and the voice recognition unit 105. As an example, suppose that an adult man utters "Tomorrow, it will be sunny." First, the man's voice data "Tomorrow, it will be sunny." is output from the voice input unit 101 to the personal information extraction unit 301 and the voice recognition unit 105.

[0103] The personal information extraction unit 301 extracts personal information such as vocal cord and vocal tract information from the voice data "Tomorrow, it will be sunny." input from the voice input unit 101 and supplies this personal information to the personal information recognition unit 302. The personal information recognition unit 302 converts the vocal cord and vocal tract information input from the personal information extraction unit 301 into a frequency spectrum and determines the speaker's gender from its formant data. Here it judges that the low-order formant frequencies are low and recognizes the speaker as "male". The recognition result "male" is output to the font conversion control unit 303.
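The gender decision from low-order formants can be illustrated with a deliberately simple Python sketch. The description only says that low low-order formant frequencies lead to the result "male"; the use of the mean of the first two formants, the threshold value, and the function name are placeholder assumptions for illustration, not values taken from the source or from phonetics references.

```python
from typing import Sequence


def classify_gender(formants_hz: Sequence[float], threshold_hz: float = 1000.0) -> str:
    """Classify a speaker from low-order formant frequencies.

    Assumptions (hypothetical): only the first two formants are used, their
    mean is compared against a single placeholder threshold, and any speaker
    not judged "male" is labeled "female".
    """
    low_order = formants_hz[:2]
    mean_low = sum(low_order) / len(low_order)
    return "male" if mean_low < threshold_hz else "female"


# Toy formant values (Hz), chosen only to exercise both branches.
result_low = classify_gender([400.0, 1200.0, 2500.0])
result_high = classify_gender([600.0, 1800.0, 2900.0])
```

A practical recognizer would likely use more features and a trained model; the sketch only mirrors the threshold-style judgment described in the text.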

[0104] Meanwhile, the voice recognition unit 105 recognizes the voice data input from the voice input unit 101 as "Tomorrow, it will be sunny." and outputs the recognized text to the font conversion control unit 303. The font conversion control unit 303 passes the personal state information "male" input from the personal information recognition unit 302 to the font table reference unit 305.

[0105] The font table reference unit 305 obtains, as the font information corresponding to the personal state information "male" input from the font conversion control unit 303, for example the font information "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black" shown in FIG. 17(a), and outputs it to the font conversion control unit 303.

[0106] Based on the font information "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black" input from the font table reference unit 305, the font conversion control unit 303 font-converts the text data "Tomorrow, it will be sunny." input from the voice recognition unit 105. The font-converted text is displayed on the display unit 103 as shown in FIG. 18(a).

[0107] When a woman inputs "The day after tomorrow, it will rain." by voice, the same processing as in the preceding example is performed, and the font-converted text is displayed on the display unit 103 as shown in FIG. 18(b).

[0108] Although gender has been used as the personal information in the concrete example above, the personal information may instead be the speaker's emotion. In that case, the font of the voice-input characters is converted according to the speaker's emotion at the time of utterance. For example, by changing the color of the characters according to the recognized emotion, the speaker, who is the writer of the text, can convey emotion to the reader. The font table 1003b used in this case is as shown in FIG. 17(b). When the speaker is happy, the font of the displayed text is, for example, "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black"; when the speaker is sad, it is, for example, "(font) Mincho, (style) italic, (size) 32, (character decoration) none, (color) blue".

[0109] The personal information may also be age. In this case, the font of the voice-input text is converted based on the estimated age of the speaker. The font table 1003c used in this case is as shown in FIG. 17(c). For a speaker aged 20 or over, the font of the displayed text is, for example, "(font) Gothic, (style) bold, (size) 32, (character decoration) none, (color) black"; for a speaker under 20, it is, for example, "(font) Mincho, (style) italic, (size) 14, (character decoration) none, (color) blue".

[0110] For example, when the personal state is set to age and a child inputs text by voice, the font table 1003c is used and the input text is converted to the font for speakers under 20. By looking at the converted text, a reader can tell that the speaker is a child. In any case, whether the personal state is to be gender, emotion, or age must be set in advance.
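The attribute-specific tables 1003a to 1003c and the pre-set choice of attribute can be sketched in Python as follows. The font values are those quoted in the text; the dictionary layout, the key names, and the helper functions are assumptions. The text quotes only the "male" row of the gender table, so only that row is filled in here rather than guessing the contents of FIG. 17(a).

```python
from typing import Dict, Union

Font = Dict[str, object]

GOTHIC_BOLD_32: Font = {"font": "Gothic", "style": "bold", "size": 32,
                        "decoration": "none", "color": "black"}

# One sub-table per attribute, mirroring font tables 1003a, 1003b, and 1003c.
FONT_TABLE_1003: Dict[str, Dict[str, Font]] = {
    "gender": {"male": GOTHIC_BOLD_32},  # other rows of FIG. 17(a) are not quoted in the text
    "emotion": {
        "happy": GOTHIC_BOLD_32,
        "sad": {"font": "Mincho", "style": "italic", "size": 32,
                "decoration": "none", "color": "blue"},
    },
    "age": {
        "20_or_over": GOTHIC_BOLD_32,
        "under_20": {"font": "Mincho", "style": "italic", "size": 14,
                     "decoration": "none", "color": "blue"},
    },
}


def age_class(estimated_age: int) -> str:
    """Map an estimated age to the two classes described for table 1003c."""
    return "20_or_over" if estimated_age >= 20 else "under_20"


def lookup_font(attribute: str, state: Union[str, int]) -> Font:
    """Look up font info for a pre-set attribute ("gender", "emotion", or "age")."""
    key = age_class(state) if attribute == "age" else state
    return FONT_TABLE_1003[attribute][key]


child_font = lookup_font("age", 8)
sad_font = lookup_font("emotion", "sad")
```

The `attribute` argument plays the role of the setting that must be chosen in advance; the recognized state then selects the row within that table.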

[0111] Note that a font table creation unit and a font designation input unit like those of Embodiment 2 may additionally be provided in the font conversion apparatus of the present embodiment so that the font table 1003 of the present embodiment can be newly created.

[0112] A font conversion control unit, a font change input unit, and a font learning unit like those of Embodiment 3 may also be added to the font conversion apparatus of the present embodiment so that the font table 1003 of the present embodiment can be updated.

[0113] The text input unit 112 of Embodiment 4 may also be added to the font conversion apparatus of the present embodiment, and the read-aloud designation display unit 113 of Embodiment 4 may be used in place of the voice recognition unit 105 of the present embodiment, so that the font of text that has been entered can be converted for each speaker.

[0114] The text input unit 112 and the position information detection unit 115 of Embodiment 5 may also be added to the font conversion apparatus of the present embodiment, so that, as in Embodiment 5, the font of text that has been entered can be converted for each speaker.

[0115] As described above, the font conversion apparatus of the present embodiment can convert the font of voice-input characters according to the speaker's emotion at the time of utterance and display the result on the screen. For example, by changing the color of the characters according to the recognized emotion, the writer of the text (the speaker) can convey emotion to the reader.

[0116] The font of voice-input text can also be converted according to the gender of the speaker and displayed on the screen. For example, by converting the font of the characters according to the recognized gender, the reader can tell at a glance whether the writer of the text (the speaker) is male or female.

[0117] The font of voice-input characters can also be converted according to the age of the speaker and displayed on the screen. For example, when a child inputs text by voice, a reader who reads the converted text can understand that the writer of the text (the speaker) is a child.

[0118] The personal information extraction unit 301 and the personal information recognition unit 302 described above together serve as a voice information extraction unit that extracts voice information indicating the speaker's voice characteristics. Depending on the content of the voice information, they may instead be configured as one of the following first to third voice information extraction units.

[0119] The first voice information extraction unit includes an emotion identification feature extraction unit that extracts, as voice information, the voice features used to identify the speaker's emotion from the voice data input by the voice input unit, and an emotion identification unit that identifies the speaker's emotion using the voice features extracted by the emotion identification feature extraction unit.

[0120] The second voice information extraction unit includes a gender identification feature extraction unit that extracts, as voice information, the voice features used to identify the speaker's gender from the voice data input by the voice input unit, and a gender identification unit that identifies the speaker's gender using the voice features extracted by the gender identification feature extraction unit.

[0121] The third voice information extraction unit includes an age identification feature extraction unit that extracts, as voice information, the voice features used to identify the speaker's age from the voice data input by the voice input unit, and an age identification unit that identifies the speaker's age using the voice features extracted by the age identification feature extraction unit.

[0122] (Embodiment 8) Next, a font conversion apparatus according to Embodiment 8 of the present invention will be described with reference to the drawings. The font conversion apparatus of the present embodiment is characterized in that it changes the font setting values used when font-converting text based on certainty information indicating how reliably the voice data was recognized.

【０１２３】図１９は本実施の形態によるフォント変換
装置の構成図である。尚、実施の形態１と同一のブロッ
クは、同一の符号を付けて詳細な説明は省略する。本実
施の形態のフォント変換装置は、音声入力部１０１、表
示部１０３、音声認識部４０１、フォント変換制御部４
０２、フォントテーブル参照部４０３、フォントテーブ
ル１００４を含んで構成される。FIG. 19 is a block diagram of the font converter according to the present embodiment. The same blocks as in the first embodiment are denoted by the same reference numerals, and detailed description is omitted. The font conversion device according to the present embodiment includes a voice input unit 101, a display unit 103, a voice recognition unit 401, a font conversion control unit 4
02, a font table reference unit 403, and a font table 1004.

【０１２４】音声認識部４０１は、音声入力部１０１か
ら音声データが入力されると、音声データを認識し、認
識結果のテキストデータを出力すると共に、音声データ
を認識する際に用いた確かさ情報（尤度情報）を出力
し、フォント変換制御部４０２に与えるものである。When voice data is input from the voice input unit 101, the voice recognition unit 401 recognizes the voice data, outputs text data as a recognition result, and outputs the reliability information used in recognizing the voice data. (Likelihood information) to be output to the font conversion control unit 402.

[0125] The font table 1004 divides the likelihood information into a plurality of classes and stores font settings for each class. FIG. 20 shows an example of the font table 1004. In this example, two classes are defined: a likelihood of 3000 or more, and a likelihood below 3000. The likelihood classes of the font table 1004 and the font information corresponding to them may be entered by the speaker or may be supplied in advance.
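As a minimal sketch, the two-class table of FIG. 20 could be represented as an ordered list of lower bounds with their font settings. The font values are the ones quoted in the operation example of this embodiment; the names `FontInfo`, `FONT_TABLE_1004`, and `lookup_font` are illustrative assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FontInfo:
    font: str
    style: str
    size: int
    decoration: str
    color: str


# Each class is (inclusive lower bound of likelihood, font settings),
# listed from the highest class to the lowest.
FONT_TABLE_1004: List[Tuple[float, FontInfo]] = [
    (3000, FontInfo("Gothic", "standard", 20, "none", "black")),
    (float("-inf"), FontInfo("Gothic", "bold", 32, "none", "black")),
]


def lookup_font(likelihood: float, table=FONT_TABLE_1004) -> FontInfo:
    """Return the font settings of the first class whose bound is met."""
    for lower_bound, info in table:
        if likelihood >= lower_bound:
            return info
    raise ValueError("font table has no class for this likelihood")


print(lookup_font(4000))
print(lookup_font(2000))
```

Keeping the classes in a list rather than hard-coding one threshold makes it straightforward to add further likelihood classes later.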

[0126] When likelihood information is input from the font conversion control unit 402, the font table reference unit 403 in FIG. 19 refers to the font table 1004, acquires the font information corresponding to that likelihood information, and supplies it to the font conversion control unit 402. When the text recognized by the voice recognition unit 401 and the likelihood information of that text are input, the font conversion control unit 402 converts the font of the text data using the font information corresponding to the likelihood information of the text. The display unit 103 displays text (sentences) in the standard font and displays font-converted text.

[0127] An example of the operation of the font conversion device of this embodiment, configured as described above, will now be described. Voice data is input from the voice input unit 101 to the voice recognition unit 401. As a concrete example, consider a speaker who enters the sentence "Tomorrow, it will be sunny." by voice. First, when the speaker utters "Tomorrow,", the voice data for "Tomorrow," is input from the voice input unit 101 to the voice recognition unit 401.

[0128] The voice recognition unit 401 recognizes the input voice data and converts it into the text "Tomorrow,". Assume that the likelihood of this text is 4000. The text data "Tomorrow," and the likelihood 4000 are output to the font conversion control unit 402.

[0129] The font conversion control unit 402 outputs the likelihood 4000 received from the voice recognition unit 401 to the font table reference unit 403. On receiving the likelihood 4000 from the font conversion control unit 402, the font table reference unit 403 refers to the font table 1004 shown in FIG. 20. It then acquires the font information corresponding to the likelihood 4000, namely "(font) Gothic, (style) standard, (size) 20, (decoration) none, (color) black", and outputs it to the font conversion control unit 402.

[0130] The font conversion control unit 402 converts the font of the text data "Tomorrow," received from the voice recognition unit 401, based on the font information received from the font table reference unit 403. The display unit 103 then displays the font-converted text as shown in FIG. 21(a).

[0131] Next, assume that the speaker utters "it will be sunny." and that the text "it will be sunny." is recognized with a likelihood of 2000. The same processing as for "Tomorrow," is then performed, and the display unit 103 displays the text "it will be sunny." in "(font) Gothic, (style) bold, (size) 32, (decoration) none, (color) black", as shown in FIG. 21(b).
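The operation example can be traced in a few lines of Python. This is a sketch under stated assumptions: the recognizer is replaced by a fixed list of (text, likelihood) pairs, the two phrases are English renderings of the example utterances, and the function names are hypothetical stand-ins for the font table reference unit 403 and the font conversion control unit 402.

```python
from typing import List, NamedTuple, Tuple


class FontInfo(NamedTuple):
    font: str
    style: str
    size: int
    decoration: str
    color: str


# Font settings quoted in the example for the two likelihood classes.
STANDARD = FontInfo("Gothic", "standard", 20, "none", "black")
EMPHASIZED = FontInfo("Gothic", "bold", 32, "none", "black")


def refer_to_font_table(likelihood: float) -> FontInfo:
    # Stand-in for the font table reference unit 403 (3000 is the class boundary).
    return STANDARD if likelihood >= 3000 else EMPHASIZED


def convert_fonts(recognized: List[Tuple[str, float]]) -> List[Tuple[str, FontInfo]]:
    # Stand-in for the font conversion control unit 402: attach font
    # information to each recognized phrase according to its likelihood.
    return [(text, refer_to_font_table(likelihood)) for text, likelihood in recognized]


# Assumed recognizer output: (text, likelihood) for each uttered phrase.
recognition_results = [("Tomorrow,", 4000), ("it will be sunny.", 2000)]
styled = convert_fonts(recognition_results)
for text, font in styled:
    print(f"{text!r}: {font.font}, {font.style}, size {font.size}")
```

The low-likelihood phrase ends up in the larger bold font, which is how the display signals to the reader that the recognition result is less certain.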

[0132] The font conversion device of this embodiment may additionally be provided with a font table creation unit and a font designation input unit like those of Embodiment 2, so that the font table 1004 of this embodiment can be newly created.

[0133] The font conversion device of this embodiment may also be provided with a font conversion control unit, a font change input unit, and a font learning unit like those of Embodiment 3, so that the font table 1004 of this embodiment can be updated.

[0134] As described above, the font conversion device of this embodiment can convert the font of voice-input text according to the certainty of the recognized text and display the result on the screen. For example, if the certainty of the recognized text is at or below a reference value, emphasizing the font of the text conveys the degree of certainty of the text to the reader.

【０１３５】[0135]

[Effects of the Invention] According to the font conversion device of claim 1, the font of voice-input text can be converted according to the speaker's voice information and displayed on the screen.

[0136] According to the font conversion device of claim 2, the font of voice-input text can be converted according to the loudness of the utterance and displayed on the screen. For example, by using an emphasized font whenever the voice power is at or above a certain level, the speaker can switch between the normal font and the emphasized font simply by speaking more softly or more loudly.
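A minimal Python sketch of this power-based selection follows. The power measure (mean-square amplitude in decibels relative to full scale) and the -20 dB threshold are assumptions chosen for illustration; the text above only requires that some voice power value be compared against a fixed level.

```python
import math
from typing import Sequence

POWER_THRESHOLD_DB = -20.0  # hypothetical boundary between normal and emphasized


def mean_power_db(samples: Sequence[float]) -> float:
    """Mean-square power of samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def font_class_for(samples: Sequence[float]) -> str:
    # At or above the threshold the emphasized font is used.
    return "emphasized" if mean_power_db(samples) >= POWER_THRESHOLD_DB else "normal"


loud = [0.5, -0.5, 0.5, -0.5]      # about -6 dB
quiet = [0.01, -0.01, 0.01, -0.01]  # about -40 dB
print(font_class_for(loud), font_class_for(quiet))
```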

[0137] According to the font conversion device of claim 3, the font of voice-input text can be converted according to which speaker uttered it and displayed on the screen. For example, when several people enter text, designating a font for each speaker makes it possible to tell which speaker entered each passage.

[0138] According to the font conversion device of claim 4, the font of voice-input text can be converted according to the speaker's emotion at the time of utterance and displayed on the screen. For example, by changing the character color according to the recognized emotion of the speaker, the writer (speaker) of the text can convey his or her emotion to the reader.

[0139] According to the font conversion device of claim 5, the font of voice-input text can be converted according to the gender of the speaker and displayed on the screen. For example, by converting the font according to the recognized gender, the reader can tell at a glance whether the writer (speaker) of the text is male or female.

[0140] According to the font conversion device of claim 6, the font of voice-input text can be converted according to the age of the speaker and displayed on the screen. For example, if a children's font is used for text entered by voice by a child, the reader can judge from the font-converted text that the writer (speaker) is a child.

[0141] According to the font conversion device of claim 7, individual differences in voice loudness can be absorbed by learning each individual's voice power information.
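One way to picture how learning a speaker's own voice power absorbs individual differences is to derive the class boundary from that speaker's sample utterances. The Python sketch below is an assumption-laden illustration: the midpoint rule, the function names, and the power values are hypothetical and are not taken from the specification.

```python
from statistics import mean
from typing import Sequence


def learn_power_boundary(normal_powers: Sequence[float],
                         emphasized_powers: Sequence[float]) -> float:
    """Place the class boundary midway between the speaker's own
    normal and emphasized utterance powers (assumed rule)."""
    return (mean(normal_powers) + mean(emphasized_powers)) / 2


def classify(power: float, boundary: float) -> str:
    return "emphasized" if power >= boundary else "normal"


# Hypothetical calibration utterances (arbitrary power units).
quiet_speaker_boundary = learn_power_boundary([30, 34], [50, 54])
loud_speaker_boundary = learn_power_boundary([50, 54], [70, 74])

# The same measured power is emphatic for one speaker but ordinary for the other.
print(classify(55, quiet_speaker_boundary), classify(55, loud_speaker_boundary))
```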

[0142] According to the font conversion device of claim 8, a learning function is obtained in which a font, once converted, is corrected and the correction is stored. This learning function allows characters to be converted into fonts that match the user's preference.
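The learning function can be sketched as an update of the font table entry for the voice class that produced the corrected text. In this Python illustration the class names, the dictionary layout, and `learn_font` are hypothetical; only the idea of storing the corrected font against the voice information used for the original conversion comes from the text.

```python
from typing import Dict

FontSettings = Dict[str, object]


def learn_font(font_table: Dict[str, FontSettings],
               voice_class: str,
               corrected_font: FontSettings) -> Dict[str, FontSettings]:
    """Return a copy of the table in which the entry for voice_class
    is replaced by the font the user chose when correcting the text."""
    if voice_class not in font_table:
        raise KeyError(f"unknown voice class: {voice_class}")
    updated = dict(font_table)
    updated[voice_class] = dict(corrected_font)
    return updated


font_table = {
    "normal": {"style": "standard", "size": 20},
    "loud": {"style": "bold", "size": 32},
}

# The user shrinks a loudly spoken phrase from size 32 to 28; later
# conversions for the "loud" class then use the corrected size.
learned_table = learn_font(font_table, "loud", {"style": "bold", "size": 28})
print(learned_table["loud"])
```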

[0143] According to the font conversion device of claim 9, when part of the already input text is designated, the font of the designated text can be changed automatically by speaking that text.

[0144] According to the font conversion device of claim 10, a function is added that automatically searches for the text to be changed by recognizing the input voice and comparing it with the text already entered. The speaker can therefore read the text aloud naturally when changing fonts.
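The search for the text to be changed can be illustrated with a plain substring match. This Python sketch assumes the recognized phrase appears verbatim in the entered text; a practical system would likely need approximate matching to tolerate recognition errors, which the specification does not detail here. The function name and sample sentence are illustrative.

```python
from typing import Optional, Tuple


def detect_position(entered_text: str, recognized_text: str) -> Optional[Tuple[int, int]]:
    """Return the (start, end) character span of the recognized phrase
    within the already entered text, or None if it does not occur."""
    start = entered_text.find(recognized_text)
    if start < 0:
        return None
    return (start, start + len(recognized_text))


entered = "Tomorrow will be sunny."
span = detect_position(entered, "sunny")
print(span)
```

The returned span tells the font conversion control unit which characters of the entered text should receive the font selected from the speaker's voice information.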

[0145] According to the font conversion device of claim 11, the font of voice-input text can be converted according to the certainty of the recognized text and displayed on the screen. For example, if the certainty of the recognized text is at or below a certain level, emphasizing the text conveys the degree of certainty of the text to the reader.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of the font conversion device according to Embodiment 1 of the present invention.

FIG. 2 is an example of the font table used in the font conversion device of Embodiment 1.

FIG. 3 is a display example showing the operation of the font conversion device of Embodiment 1.

FIG. 4 is a configuration diagram of the font conversion device according to Embodiment 2 of the present invention.

FIG. 5 is an example of the font table used in the font conversion device of Embodiment 2.

FIG. 6 is a configuration diagram of the font conversion device according to Embodiment 3 of the present invention.

FIG. 7 is a display example showing the operation of the font conversion device of Embodiment 3.

FIG. 8 is an example of the font table used in the font conversion device of Embodiment 3.

FIG. 9 is a configuration diagram of the font conversion device according to Embodiment 4 of the present invention.

FIG. 10 is a display example showing the operation of the font conversion device of Embodiment 4.

FIG. 11 is a configuration diagram of the font conversion device according to Embodiment 5 of the present invention.

FIG. 12 is a display example showing the operation of the font conversion device of Embodiment 5.

FIG. 13 is a configuration diagram of the font conversion device according to Embodiment 6 of the present invention.

FIG. 14 is an example of the font table used in the font conversion device of Embodiment 6.

FIG. 15 is an explanatory diagram showing the font table creation operation of the font conversion device of Embodiment 6.

FIG. 16 is a configuration diagram of the font conversion device according to Embodiment 7 of the present invention.

FIG. 17 is an example of the font table used in the font conversion device of Embodiment 7.

FIG. 18 is a display example showing the operation of the font conversion device of Embodiment 7.

FIG. 19 is a configuration diagram of the font conversion device according to Embodiment 8 of the present invention.

FIG. 20 is an example of the font table used in the font conversion device of Embodiment 8.

FIG. 21 is a display example showing the operation of the font conversion device of Embodiment 8.

[Explanation of symbols]

101: voice input unit
102, 109, 204, 303, 402: font conversion control unit
103: display unit
104, 305, 403: font table reference unit
105, 401: voice recognition unit
106: power information extraction unit
107: font designation input unit
108: font table creation unit
110: font learning unit
111: font change input unit
112: text input unit
113: reading designation display unit
115: position information detection unit
201: speaker identification label acquisition unit
202: speaker feature learning unit
203: speaker identification unit
205: font table creation unit
208: speaker identification feature extraction unit
301: personal information extraction unit
302: personal information recognition unit
1001, 1002, 1003, 1004: font table

Continued from the front page

(72) Inventor: Satoru Inagaki, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka
(72) Inventor: Yoshihiro Kojima, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka
(72) Inventor: Kazuhiro Koyama, c/o Matsushita Electric Industrial Co., Ltd., 1006 Oaza Kadoma, Kadoma-shi, Osaka

F-terms (reference): 5B009 KB00 RB31; 5D015 AA03 HH01 KK02

Claims


1. A font conversion device that recognizes a voice uttered by a speaker, converts it into text data, and converts the font of the text data based on voice information contained in the speaker's input voice, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts, from the voice data input by the voice input unit, voice information indicating the speaker's voice characteristics; a font table that classifies the voice information into a plurality of classes or attributes and stores font information of the text corresponding to each class or attribute; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font conversion control unit that, when supplied with the speaker's voice information from the voice information extraction unit and with the speaker's text data from the voice recognition unit, supplies the voice information to the font table reference unit and converts the font of the text data using the font information obtained from the font table; and a display unit that displays the text converted by the font conversion control unit.

2. The font conversion device according to claim 1, wherein the voice information extraction unit is a power information extraction unit that extracts the speaker's voice power as the voice information from the voice data input from the voice input unit.

3. The font conversion device according to claim 1, further comprising: a speaker identification feature extraction unit that extracts, as the voice information, voice features used for speaker identification from the voice data input by the voice input unit; a speaker identification label acquisition unit that receives a speaker identification label corresponding to the speaker; a speaker feature learning unit that creates a template used for speaker identification, using the voice features extracted by the speaker identification feature extraction unit and the speaker identification label input from the speaker identification label acquisition unit; a speaker identification unit that identifies the speaker using the voice features extracted by the speaker identification feature extraction unit and the template created by the speaker feature learning unit, and outputs the speaker identification label of the speaker who uttered the voice; and a font table creation unit that creates the font table using the speaker identification label obtained from the speaker identification label acquisition unit and font information designated for each speaker.

4. The font conversion device according to claim 1, wherein the voice information extraction unit includes: an emotion identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's emotion from the voice data input by the voice input unit; and an emotion identification unit that identifies the speaker's emotion using the voice features extracted by the emotion identification feature extraction unit.

5. The font conversion device according to claim 1, wherein the voice information extraction unit includes: a gender identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's gender from the voice data input by the voice input unit; and a gender identification unit that identifies the speaker's gender using the voice features extracted by the gender identification feature extraction unit.

6. The font conversion device according to claim 1, wherein the voice information extraction unit includes: an age identification feature extraction unit that extracts, as the voice information, voice features used to identify the speaker's age from the voice data input by the voice input unit; and an age identification unit that identifies the speaker's age using the voice features extracted by the age identification feature extraction unit.

7. A font conversion device that newly creates a font table storing font information corresponding to each class of voice features, recognizes a voice uttered by a speaker, converts it into text data, and converts the font of the text data using the font table, based on voice information contained in the speaker's input voice, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts, from the voice data input by the voice input unit, voice information indicating the speaker's voice characteristics; a font designation input unit for inputting font information; a font table that classifies the voice information into a plurality of classes and stores font information of the text corresponding to each class; a font table creation unit that newly stores, in the font table, font information corresponding to each piece of voice information, using the font information input by the font designation input unit and the voice information extracted by the voice information extraction unit; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font conversion control unit that, when supplied with the speaker's voice information from the voice information extraction unit and with the speaker's text data from the voice recognition unit, supplies the voice information to the font table reference unit and converts the font of the text data using the font information obtained from the font table; and a display unit that displays the text converted by the font conversion control unit.

8. A font conversion device that recognizes a voice uttered by a speaker, converts it into text data, converts the font of the text data based on voice information contained in the speaker's input voice, and, when the font of the text is changed, uses the changed font for subsequent font conversion, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts, from the voice data input by the voice input unit, voice information indicating the speaker's voice characteristics; a font table that classifies the voice information into a plurality of classes and stores font information of the text corresponding to each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a font change input unit through which the speaker inputs text information, including the text whose font is to be changed and the display position of that text, together with the font information after the change; a font conversion control unit that, when supplied with the speaker's voice information from the voice information extraction unit and with the speaker's text data from the voice recognition unit, supplies the voice information to the font table reference unit and converts the font of the text using the font information obtained from the font table, and that outputs the voice information used when the text input from the font change input unit was font-converted and changes the font of the text to be changed, input from the font change input unit, using the font information input from the font change input unit; a font learning unit that changes the data of the font table based on the voice information output from the font conversion control unit and the font information input from the font change input unit; and a display unit that displays the text converted by the font conversion control unit.

9. A font conversion device in which a speaker reads aloud text data that has already been input, voice information of the speaker is extracted for each piece of input text data, and the font of the input text data is converted, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice information extraction unit that extracts, from the voice data input by the voice input unit, voice information indicating the speaker's voice characteristics; a font table that classifies the voice information into a plurality of classes and stores font information of the text corresponding to each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a text input unit for inputting text information including a text and the display position of the text; a reading designation display unit that designates part of the text input by the text input unit, instructs the speaker to read the designated text aloud, and outputs the text information of the designated text; a font conversion control unit that, when supplied with the speaker's voice information from the voice information extraction unit and with text information from the reading designation display unit, supplies the voice information to the font table reference unit and converts the font of the text designated by the reading designation display unit using the font information obtained from the font table; and a display unit that displays the text input by the text input unit and displays the text converted by the font conversion control unit.

10. A font conversion device in which a speaker reads aloud text data that has already been input, voice information of the speaker is extracted for each piece of input text data, and the font of the input text data is converted, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice recognition unit that recognizes the voice data input from the voice input unit and converts it into text data; a voice information extraction unit that extracts, from the voice data input by the voice input unit, voice information indicating the speaker's voice characteristics; a font table that classifies the voice information into a plurality of classes and stores font information of the text corresponding to each class; a font table reference unit that refers to the font table and acquires the font information corresponding to the input voice information; a text input unit for inputting text information including a text and the display position of the text; a position information detection unit that compares the text recognized by the voice recognition unit with the text input from the text input unit and outputs the text information of the text uttered by the speaker; a font conversion control unit that, when supplied with the speaker's voice information from the voice information extraction unit and with the text information of the text uttered by the speaker from the position information detection unit, supplies the voice information to the font table reference unit and converts the font of the text output from the position information detection unit using the font information obtained from the font table; and a display unit that displays the text input by the text input unit and displays the text converted by the font conversion control unit.

11. A font conversion device that recognizes a voice uttered by a speaker, converts it into text data, and converts the font of the text data based on the certainty of the voice recognition, the device comprising: a voice input unit that receives the voice uttered by the speaker and outputs voice data; a voice recognition unit that converts the voice data input from the voice input unit into text data by voice recognition and outputs likelihood information on the certainty of the recognized text; a font table that classifies the likelihood information into a plurality of classes and stores font information corresponding to each class; a font table reference unit that refers to the font table and acquires the font information of the text corresponding to the input likelihood information; a font conversion control unit that, when supplied with the speaker's text data and the likelihood information of the text from the voice recognition unit, supplies the likelihood information to the font table reference unit and converts the font of the text using the font information obtained from the font table; and a display unit that displays the text converted by the font conversion control unit.