JP2008090332A

JP2008090332A - Character information-speech converting method

Info

Publication number: JP2008090332A
Application number: JP2007334190A
Authority: JP
Inventors: Masaru Washizu; 優鷲頭
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2007-12-26
Filing date: 2007-12-26
Publication date: 2008-04-17

Abstract

PROBLEM TO BE SOLVED: To provide a character information-speech converting method in which the kind of speech to be output is determined according to the contents of the document to be read aloud.

SOLUTION: The character information-speech converting method comprises: inputting character data 7 as electronic data; extracting character strings in unique representation from the character data 7; accumulating, for the character data 7, predetermined points for each kind of speech corresponding to the unique representations; and outputting the accumulated point data 10 for each kind of speech. The method further outputs the character data 7 as speech in the kind of speech with the highest points. Points are set higher for unique representations that are more characteristic of the kind of speech.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a character information speech conversion method, and more particularly to a character information speech conversion method for text reading systems used in personal computers, mobile phones, and the like.

Technology for converting digitized (electronic) character information (character data) into human speech has advanced to the point where it can reproduce natural utterances comparable to a voice actor reading aloud. This text reading technology is used in, for example, personal computers and mobile phones. The technology also makes it possible to synthesize a document (character information) mechanically in the voice of a specific voice actor, so that text can be read aloud in the voice of a favorite voice actor.

A technique is known that reads text aloud by synthesizing a female voice for sentences written in a feminine style and a male voice for sentences written in a masculine style (see, for example, Patent Document 1). A technique is also known that, when outputting as speech a sentence in a language whose articles, adjectives, and so on have feminine and masculine forms, synthesizes and outputs a voice of the gender that matches the grammatical form (see, for example, Patent Document 2).
Patent Document 1: JP 11-296193 A
Patent Document 2: JP 58-225483 A

As described above, in text reading technology it is known to select a voice by distinguishing gender according to the content of the text. However, outputting the text in a voice suited to its content, or changing the text (sentence endings) to match the voice being output, has not been done.

Considering the various uses of text reading systems, however, even the same line of dialogue feels more familiar (less jarring) to the user (listener or reader) when its sentence-ending expressions are changed according to the character who owns the voice (gender, age, era, anime protagonist, and so on). In addition, when text data is converted to speech and output, it is desirable to output it in a voice suited to the content of the text.

An object of the present invention is to provide a character information speech conversion method that determines the type of voice to be output according to the content of the text to be read aloud.

The character information speech conversion method of the present invention is a method used in a speech conversion device that converts character information and outputs speech information. The method inputs character information that is electronic data, extracts character strings consisting of unique expressions from the character information, accumulates for the character information the points predetermined for each voice type corresponding to those unique expressions, and outputs the accumulated points for each voice type.

According to the character information speech conversion method of the present invention, when text data is converted to speech and output, it can be output in a voice suited to the content of the text. As a result, the speech can be played back with lines that feel more familiar (less jarring) to the user (listener or reader). The range of uses of text reading systems can therefore be expanded.

FIG. 1 is a configuration diagram of the character information speech conversion method and shows the configuration of the character information speech conversion method of the present invention.

The character information speech conversion device includes an input device 1, a display device 2 (showing a voice selection screen), a conversion processing device 3, an output device 4, a conversion processing database (DB) 5, and a voice sample DB 6. The conversion processing device 3 consists of a CPU (central processing unit) and a main memory, and performs the expression conversion process or the voice determination process. The conversion processing DB 5 is used to perform the expression conversion process and the voice determination process on the input character information (character data) 7, and consists of a DB used in the expression conversion process (expression conversion DB 51) and a DB used in the voice determination process (voice determination DB 52) (see FIG. 4). The voice sample DB 6 is used to output the character data 7 as voice data 9.

The expression conversion process and the voice determination process in the conversion processing device 3 are realized by the CPU executing an expression conversion processing program and a voice determination processing program residing in the main memory. These programs can be provided by storing them on various computer-readable recording media such as flexible disks, CD-ROMs, and DVDs, or by downloading them via the Internet.

The input device 1 is the means by which the user inputs character data 7 as electronic data to the conversion processing device 3, and consists of, for example, a keyboard. The input device 1 may also be, for example, the display screen of the display device 2, a mobile phone, or an electronic book. The display device 2 is the means for selecting and inputting a voice type, and displays an instruction input screen to the user in accordance with instructions from the conversion processing device 3. The output device 4 is the means for outputting the converted character data, and consists of, for example, a speaker that outputs sound, a printer that outputs documents, or a storage device that records electronic data or its storage medium (recording medium). A speaker outputs the input character data 7 as speech (outputs voice data 9), for example in the voice with the highest accumulated points. A printer prints the converted character data. A storage device or its storage medium stores electronic data such as text data 8 (a .txt file) on a medium such as a flexible disk. The output device 4 also outputs point data 10, as described later.

FIG. 2 is an explanatory diagram of the character information speech conversion process and shows the expression conversion process in the character information speech conversion method of the present invention.

In the expression conversion process, the conversion processing device 3 uses the expression conversion DB 51 to extract, based on the selected voice type, character strings to be converted that consist of normal expressions from the character data 7 input from the input device 1, replaces those portions with replacement character strings that consist of unique expressions, and outputs the converted character data from the output device 4 as voice data 9 or text data 8. When outputting voice data 9, the conversion processing device 3 uses the voice sample DB 6 (omitted in FIG. 2).

The character data (text data in this example) 7 is input, for example, by typing the sentence to be read aloud, that is, the sentence to be processed, "Ore ni mo hitotsu wakete kure" (おれにもひとつわけてくれ, "Share one with me too"), directly on the keyboard 1 that serves as the input device 1 of the conversion processing device 3. Alternatively, a mobile phone 1 may be connected to the conversion processing device 3 as the input device 1, and the input may be made by selecting the portion "Ore ni mo hitotsu wakete kure" from the text displayed on the display screen of the mobile phone 1. Furthermore, with a so-called electronic book 1 as the input device 1, in which character data 7 has been stored in advance as electronic data on a recording medium (CD-ROM, DVD), the recorded text may be displayed on the display screen and the input may be made by selecting the portion "Ore ni mo hitotsu wakete kure" from it.

To let the user select a voice type, the conversion processing device 3 displays a voice selection screen 21 on the display device 2. The voice selection screen 21 displays several options for the voice type, such as "manga protagonist A," and allows exactly one of them to be selected. The user looks at the screen and selects one voice type, for example "manga protagonist A." The voice type may designate a voice actor, or may be a designation of the form "in the style of ..." (〜風). The voice types displayed are, for example, those whose voices have been sampled in the voice sample DB 6. The voice sample DB 6, although not illustrated, is a DB of well-known configuration that stores sampled voice data for each character string consisting of a normal expression and each character string consisting of a unique expression.

When the character data (text data) 7 has been input and a voice type has been selected, the conversion processing device 3 starts the expression conversion process using the expression conversion DB 51. For each voice type, the expression conversion DB 51 stores pairs of a character string to be converted, consisting of a normal expression, and the corresponding replacement character string, consisting of a unique expression. A normal expression is an expression (word) in general use, such as "ore" (おれ, "I") or "-kure" (〜くれ, "give me"). A unique expression is an expression (word) peculiar to the voice type, that is, to the person. For the voice type of manga protagonist A, for example, "maro" (まろ) corresponds to "ore" and "-tamo" or "-tamore" (〜たも, 〜たもれ) corresponds to "-kure". Selecting a different voice type changes these correspondences. The expression conversion DB 51 collects, as that speaker's unique expressions, as many as possible of the sentence-ending expressions, greetings, and expressions referring to oneself or the other party that the speaker of the voice type habitually uses.

Using the expression conversion DB 51, the conversion processing device 3 searches the character data 7 in order from the beginning, extracts each character string to be converted that consists of a normal expression, and replaces that portion with the replacement character string that consists of a unique expression. For example, when the character data 7 "Ore ni mo hitotsu wakete kure" (おれにもひとつわけてくれ) is searched with the expression conversion DB 51 of the selected voice type "manga protagonist A," the character string to be converted "ore" (おれ) is extracted first and is replaced with the corresponding replacement character string "maro" (まろ). Next, the character string to be converted "-kure" (〜くれ) is extracted and is replaced with the corresponding replacement character string "-tamo" (〜たも). The result is the converted character data "Maro ni mo hitotsu wakete tamo" (まろにもひとつわけてたも). In this way the character data (text) to be processed can be changed according to the voice type that will read it, so that the voice and the text are consistent with each other.
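The replacement step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the dictionary `EXPRESSION_DB`, the key `"manga_protagonist_A"`, and the function `convert_expressions` are hypothetical names, and the "〜" marker of the example entries is dropped so that "くれ" is matched as a plain substring.

```python
# Hypothetical stand-in for the expression conversion DB 51:
# voice type -> {normal expression: unique expression}.
EXPRESSION_DB = {
    "manga_protagonist_A": {"おれ": "まろ", "くれ": "たも"},
}


def convert_expressions(text: str, voice_type: str) -> str:
    """Scan text from the beginning and replace each normal expression
    with the unique expression registered for the given voice type."""
    table = EXPRESSION_DB[voice_type]
    # Try longer keys first so that a longer entry wins over its prefix.
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No entry matches at this position; copy the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


print(convert_expressions("おれにもひとつわけてくれ", "manga_protagonist_A"))
# prints: まろにもひとつわけてたも
```

A single left-to-right pass is used here, rather than repeated global replacements, so that text produced by one replacement is never rescanned and rewritten by another entry.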

The conversion processing device 3 outputs this converted character data from the output device 4 as voice data 9 or text data 8. In this way, not only the converted voice data 9 but also the converted text (text data 8) can be obtained. When outputting voice data 9, the conversion processing device 3 uses the voice sample DB 6 of the selected voice type "manga protagonist A."

In the same way, if the voice of a male voice actor playing a period-drama style character is selected on the voice selection screen 21, for example, and the character data 7 "Anata, nakanaka yaru wa ne" (あなた、なかなかやるわね, "You're pretty good") is input, then replacing "anata" (あなた) with "onushi" (おぬし) and the sentence ending "-wa ne" (〜わね) with "-na" (〜な) yields text data 8 and voice data 9 reading "Onushi, nakanaka yaru na" (おぬし、なかなかやるな). Likewise, modern expressions such as "yoku dekita" (よくできた), "yoku dekimashita" (よくできました), and "umaku dekita" (うまくできた), all meaning "well done," can be turned into the period-drama style expression "you dekita" (ようできた).
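The period-drama example combines two kinds of entries: words that can be replaced wherever they occur, and a sentence ending that should only be replaced at the end of a sentence. The Python sketch below shows one way to separate the two. The names `ANYWHERE`, `SENTENCE_FINAL`, and `to_period_drama` are hypothetical, and treating the "〜" prefix as "sentence-final only" is an assumption made for this illustration.

```python
import re

# Hypothetical entries for a period-drama voice type.
# Assumption: entries written with a leading "〜" apply only at the end
# of a sentence; all other entries apply anywhere in the text.
ANYWHERE = {
    "あなた": "おぬし",
    "よくできました": "ようできた",
    "よくできた": "ようできた",
    "うまくできた": "ようできた",
}
SENTENCE_FINAL = {"わね": "な"}

# A sentence end is the end of the text or a closing punctuation mark.
_END = r"(?=$|[。！？!?])"


def to_period_drama(text: str) -> str:
    # Replace the position-independent expressions, longest key first.
    for src in sorted(ANYWHERE, key=len, reverse=True):
        text = text.replace(src, ANYWHERE[src])
    # Replace sentence-final expressions only where a sentence ends.
    for src, dst in SENTENCE_FINAL.items():
        text = re.sub(re.escape(src) + _END, dst, text)
    return text


print(to_period_drama("あなた、なかなかやるわね"))
# prints: おぬし、なかなかやるな
```

Several modern phrasings map to the same period-drama phrase here, which mirrors the many-to-one correspondence given in the example.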

As another example, when an e-mail (in its entirety) received on the mobile phone 1 serving as the input device 1 is selected as the character data 7 and the desired voice actor is selected from a list of voice actors, the e-mail can be read aloud in that voice actor's voice. In this case, the text of the e-mail is converted into expressions unique to that voice actor by a conversion processing device 3 installed by the communication service provider.

Furthermore, in learning software for young children, for example, where spoken explanations are used to attract interest, the learning effect may be greater if the voice is that of (the voice actor of) a popular manga character. Normally, however, such software can only say what was recorded in advance, and preparing recordings to match every step of the progression is practically difficult. According to the present invention, text data 8 and voice data 9 that sound as if a manga character or the like were reading naturally can be obtained from ordinary text.

FIG. 3(A) is a character information speech conversion processing flow and shows the processing of the character information speech conversion method according to the present invention.

In accordance with the user's input instruction, the input device 1 designates or inputs the character data 7 to be converted to the conversion processing device 3 (step S11). The conversion processing device 3 then displays the voice selection screen 21 on the display screen of the display device 2, and the user selects or inputs one voice type from that screen (step S12). In response, the conversion processing device 3 uses the expression conversion DB 51 of the voice type selected in step S12 to extract character strings to be converted, consisting of normal expressions, from the character data 7 input in step S11, and replaces those portions with replacement character strings consisting of unique expressions (step S13). It then outputs the converted character data, for example by storing the converted character data (text data 8) in a unique expression output file (step S14).

FIGS. 4 and 5 are explanatory diagrams of the character information speech conversion process and show the voice determination process in the character information speech conversion method of the present invention.

In the voice determination process, the conversion processing device 3 extracts character strings consisting of unique expressions from the character data 7 input from the input device 1, uses the voice determination DB 52 to accumulate the points predetermined for each voice type corresponding to those unique expressions, and, using the voice sample DB 6 (omitted in FIG. 4), outputs the character data from the output device 4 as voice data 9 in the voice type with the highest points.

As in the conversion process described above, the character data 7 is either input directly from the keyboard 1 or selected from part of the text on the mobile phone 1 or the electronic book 1. Here it is assumed that "Maro ni mo hitotsu wakete tamo" (まろにもひとつわけてたも) has been input.

When the character data 7 is input, the conversion processing device 3 starts the voice determination process using the voice determination DB 52. As shown in FIG. 5, the voice determination DB 52 stores, for each voice type, the expression points for each unique expression (replacement character string) corresponding to a normal expression (character string to be converted). For the voice type of manga protagonist A, for example, the unique expression "maro" (まろ) corresponding to "ore" (おれ) is worth 3 points and the unique expression "tamo" or "tamore" (たも, たもれ) corresponding to "kure" (くれ) is worth 2 points. The more characteristic a unique expression is of the voice type (the better it represents that voice type), the higher its points are set.

As shown in FIG. 5, the voice determination DB 52 also stores a voice link and morpheme information for each unique expression. The voice link indicates the storage address, in the voice sample DB 6, of the word (phoneme segment) of the unique expression that was sampled in advance with the cooperation of the voice actor of manga protagonist A, the voice type in question. The morpheme information indicates the morpheme of the unique expression.
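One row of the voice determination DB 52 as described above could be modeled as follows. This is only a sketch: the class name `VoiceEntry`, the field names, and the `voice_link` and `morpheme_info` values are hypothetical placeholders, since the text does not give the concrete format of FIG. 5. Only the expressions and their points (3 and 2) come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceEntry:
    voice_type: str         # e.g. the voice of manga protagonist A
    normal_expression: str  # ordinary wording the entry corresponds to
    unique_expression: str  # wording characteristic of the voice type
    points: int             # higher = more characteristic of the voice type
    voice_link: str         # where the sampled audio sits in voice sample DB 6
    morpheme_info: str      # morpheme of the unique expression


# Hypothetical rows; voice_link and morpheme_info are placeholder values.
VOICE_DB = [
    VoiceEntry("manga_protagonist_A", "おれ", "まろ", 3,
               "sample://protagonist_a/maro", "pronoun"),
    VoiceEntry("manga_protagonist_A", "くれ", "たも", 2,
               "sample://protagonist_a/tamo", "sentence-final"),
]

for entry in VOICE_DB:
    print(entry.unique_expression, entry.points)
```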

Using the voice determination DB 52, the conversion processing device 3 searches the character data 7 in order from the beginning, extracts each unique expression (replacement character string), obtains its points, and accumulates them for the character data 7. For example, when the character data 7 "Maro ni mo hitotsu wakete tamo" (まろにもひとつわけてたも) is searched with the voice determination DB 52, the unique expression "maro" (まろ) is extracted first. The corresponding 3 points and the voice type "manga protagonist A (voice actor)" are obtained, and the 3 points are added to the total for the voice type "manga protagonist A." Next, the unique expression "tamo" (たも) is extracted, the corresponding 2 points and the voice type "manga protagonist A" are obtained, and the points are added (counted) for that voice type, bringing its total to 5 points. As a result, for the character data 7 "Maro ni mo hitotsu wakete tamo," the voice type "manga protagonist A" scores 5 points and the other voice types (voice actors) score 0 points.
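The point accumulation described above can be sketched in Python as follows. The names `POINT_DB`, `VOICE_TYPES`, and `accumulate_points` are hypothetical, and the entry for "おぬし" with its 3 points is an assumed value added only so that a second voice type exists. The 3 and 2 points for "まろ" and "たも" are the values given in the description.

```python
from typing import Dict

# Hypothetical stand-in for the voice determination DB 52:
# unique expression -> (voice type, points).
POINT_DB = {
    "まろ": ("manga_protagonist_A", 3),
    "たも": ("manga_protagonist_A", 2),
    "おぬし": ("period_drama_male", 3),  # assumed entry, not from the text
}
VOICE_TYPES = ["manga_protagonist_A", "period_drama_male"]


def accumulate_points(text: str) -> Dict[str, int]:
    """Scan text from the beginning and add up, per voice type, the
    points of every unique expression that is found."""
    totals = {voice: 0 for voice in VOICE_TYPES}
    keys = sorted(POINT_DB, key=len, reverse=True)
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                voice_type, points = POINT_DB[key]
                totals[voice_type] += points
                i += len(key)
                break
        else:
            i += 1  # no unique expression starts here
    return totals


print(accumulate_points("まろにもひとつわけてたも"))
# prints: {'manga_protagonist_A': 5, 'period_drama_male': 0}
```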

The conversion processing device 3 then outputs the accumulated points or count values (point data 10) for each voice type. For example, it notifies the user that the voice type "manga protagonist A" has 5 points and the other voice types have 0 points. The user can thus learn which voice type (for example, which voice actor) is suitable for reading the character data 7 aloud.

In accordance with the user's instruction, the conversion processing device 3 also outputs the input character data 7 as voice data 9 in the voice type with the highest accumulated points. In this case, the conversion processing device 3 uses the voice sample DB 6 of the voice type "manga protagonist A (voice actor)." The user can thus obtain speech output (voice data 9) of the character data 7 in a voice type (for example, a voice actor) suitable for reading it aloud. Consequently, when the text contains expressions unique to a manga character, celebrity, voice actor, or the like, it can be output in the voice of that manga character, celebrity, or voice actor, that is, in a voice that matches the character data (text) being processed.
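Choosing the voice type with the highest accumulated points is a simple maximum over the point data. The sketch below uses a hypothetical function name, `pick_voice_type`. The description does not say what happens when no unique expression is found or when two voice types tie, so the handling of both cases here is an assumption.

```python
from typing import Dict, Optional


def pick_voice_type(totals: Dict[str, int]) -> Optional[str]:
    """Return the voice type with the highest accumulated points.

    Assumptions not stated in the text: if every total is zero, return
    None so the caller can fall back to a default voice; ties are broken
    by the order in which the voice types appear in the dictionary.
    """
    if not totals or max(totals.values()) == 0:
        return None
    return max(totals, key=totals.get)


print(pick_voice_type({"manga_protagonist_A": 5, "period_drama_male": 0}))
# prints: manga_protagonist_A
```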

As another example, when an e-mail (in its entirety) received on the mobile phone 1 serving as the input device 1 is selected as the character data 7 and an automatic voice actor selection mode is set, a voice actor matching the tone of the e-mail is chosen, for example from a list of voice actors, and reads the e-mail aloud.

FIG. 3(B) is another character information speech conversion processing flow and shows the processing of another character information speech conversion method according to the present invention.

In accordance with the user's input instruction, the input device 1 designates or inputs the character information to be converted to the conversion processing device 3 (step S21). In response, the conversion processing device 3 uses the voice determination DB 52 to extract unique expressions (character strings consisting of them) from the character information input in step S21, accumulates the predetermined expression points (points) for each voice type corresponding to those unique expressions (step S22), and outputs the accumulated point data 10 for each voice type (step S23). After that, if the user inputs an instruction, the conversion processing device 3 follows it and outputs the character information as voice data 9 in the voice type with the highest accumulated points (step S24).

The user can select either the expression conversion process or the voice determination process. For this purpose, the conversion processing device 3 displays, for example, a process selection screen (not shown) on the display screen of the display device 2 before the user inputs the character information to be converted from the input device 1. On this process selection screen, the user selects either the expression conversion process or the voice determination process. In response, the conversion processing device 3 starts whichever of the two processes described above was selected.

It is also possible to obtain the point data 10 for the input character data 7 as shown in FIG. 4 and then, as shown in FIG. 2, convert the character data 7 into text data 8 or voice data 9 containing the unique expressions of the voice type (voice actor, etc.) with the highest points, and output the result.
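The combined variant described above, which first determines the voice type and then rewrites the text into that voice type's unique expressions, can be sketched as follows. The table `COMBINED_DB` and the functions `score` and `determine_and_convert` are hypothetical. Merging both databases into one table, counting occurrences with `str.count`, and the 3 points for "おぬし" are simplifying assumptions for this illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical merged table:
# voice type -> {normal expression: (unique expression, points)}.
COMBINED_DB: Dict[str, Dict[str, Tuple[str, int]]] = {
    "manga_protagonist_A": {"おれ": ("まろ", 3), "くれ": ("たも", 2)},
    # The 3 points below are an assumed value, not taken from the text.
    "period_drama_male": {"あなた": ("おぬし", 3)},
}


def score(text: str) -> Dict[str, int]:
    """Accumulate points per voice type from the unique expressions found."""
    return {
        voice: sum(points * text.count(unique) for unique, points in table.values())
        for voice, table in COMBINED_DB.items()
    }


def determine_and_convert(text: str) -> Tuple[Optional[str], str]:
    """Pick the highest-scoring voice type, then rewrite the remaining
    normal expressions into that voice type's unique expressions."""
    totals = score(text)
    best = max(totals, key=totals.get)
    if totals[best] == 0:
        return None, text  # no unique expression found; leave text unchanged
    for normal, (unique, _points) in COMBINED_DB[best].items():
        text = text.replace(normal, unique)
    return best, text


print(determine_and_convert("まろにもひとつわけてくれ"))
# prints: ('manga_protagonist_A', 'まろにもひとつわけてたも')
```

In this example the input already contains "まろ", which scores 3 points for manga protagonist A, so the remaining normal expression "くれ" is rewritten to "たも" to match that voice.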

As can be seen from the above, the features of the embodiments and examples of the present invention are listed as follows.
(Supplementary Note 1) In a character information speech conversion method in a speech conversion device that converts character information and outputs speech information,
Enter text information that is electronic data,
Select and input the audio type,
Based on the type of speech, extract a converted character string consisting of a normal expression from the character information, and replace the part with a converted character string consisting of a unique expression,
A character information speech conversion method, wherein the converted character information is output as speech.
(Additional remark 2) The character information after the said conversion is output as text data. The character information audio | voice conversion method of Additional remark 1 characterized by the above-mentioned.
(Additional remark 3) The character information of Additional remark 1 characterized by converting the said character string to be converted into the said conversion character string by referring to the audio | voice conversion database which stores the said specific expression corresponding to the said normal expression. Voice conversion method.
(Appendix 4) Means for inputting character information that is electronic data;
Means for selecting and inputting the type of audio;
Means for extracting a converted character string consisting of a normal expression from the character information and replacing the part with a converted character string consisting of a unique expression;
And a means for outputting the character information after the conversion.
(Supplementary Note 5) A character information speech conversion program for realizing a character information speech conversion method, the program causing a computer to execute:
a process of inputting character information that is electronic data;
a process of selecting and inputting a type of speech;
a process of extracting from the character information a character string to be converted that consists of a normal expression, and replacing that portion with a conversion character string that consists of a unique expression; and
a process of outputting the converted character information.
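
The replacement step described in Supplementary Notes 1 to 3 can be illustrated with a minimal sketch. It is not the patent's implementation: the database contents, the speech type names, and the function name `convert_expressions` are hypothetical, and keying the database by speech type is an assumption made for illustration. The sketch returns the converted text only; synthesizing it as speech is outside its scope.

```python
import re

# Hypothetical speech conversion database (all entries are illustrative):
# speech type -> {normal expression: unique expression}.
CONVERSION_DB = {
    "pirate": {"hello": "ahoy", "friend": "matey"},
    "cowboy": {"hello": "howdy", "friend": "partner"},
}


def convert_expressions(text, speech_type, db=CONVERSION_DB):
    """Replace each normal expression in text with the unique expression
    registered for the selected speech type."""
    mapping = db.get(speech_type, {})
    if not mapping:
        # Unknown speech type: leave the text unchanged.
        return text
    # Try longer expressions first so that a short key cannot pre-empt
    # a longer key that starts at the same position.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A single pass avoids re-replacing text produced by an earlier replacement.
    return pattern.sub(lambda m: mapping[m.group(0)], text)
```

Under these assumptions, `convert_expressions("hello, friend", "pirate")` returns `"ahoy, matey"`. Matching is case-sensitive and substring-based, which keeps the sketch short; a practical system would likely need morphological analysis to find expression boundaries in Japanese text.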
(Supplementary Note 6) A character information speech conversion method in a speech conversion device that converts character information and outputs speech information, the method comprising:
inputting character information that is electronic data;
extracting a character string consisting of a unique expression from the character information;
accumulating, for the character information, predetermined points for each type of speech corresponding to the unique expression; and
outputting the accumulated points for each type of speech.
(Supplementary Note 7) The character information speech conversion method according to Supplementary Note 6, wherein the character information is output as speech using the type of speech having the highest accumulated points.
(Supplementary Note 8) The character information speech conversion method according to Supplementary Note 6, wherein the points are set higher the more characteristic the unique expression is of the type of speech.
(Supplementary Note 9) A character information speech conversion device comprising:
means for inputting character information that is electronic data;
means for extracting a character string consisting of a unique expression from the character information;
means for accumulating, for the character information, predetermined points for each type of speech corresponding to the unique expression; and
means for outputting the accumulated points for each type of speech.
(Supplementary Note 10) A character information speech conversion program for realizing a character information speech conversion method, the program causing a computer to execute:
a process of inputting character information that is electronic data;
a process of extracting a character string consisting of a unique expression from the character information;
a process of accumulating, for the character information, predetermined points for each type of speech corresponding to the unique expression; and
a process of outputting the accumulated points for each type of speech.
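
The point accumulation in Supplementary Notes 6 to 8 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the point table, the speech type names, and the function names are hypothetical. It also assumes that every occurrence of a unique expression adds its points, which the text does not specify.

```python
from collections import defaultdict

# Hypothetical point table (all entries are illustrative):
# unique expression -> (speech type, predetermined points).
# More characteristic expressions carry more points (Supplementary Note 8).
POINT_TABLE = {
    "ahoy": ("pirate", 3),
    "matey": ("pirate", 2),
    "howdy": ("cowboy", 3),
    "partner": ("cowboy", 1),
}


def accumulate_points(text, table=POINT_TABLE):
    """Accumulate the predetermined points per speech type for every
    unique expression found in text (case-insensitive substring match)."""
    totals = defaultdict(int)
    lowered = text.lower()
    for expression, (speech_type, points) in table.items():
        # Assumption: each occurrence of the expression adds its points.
        totals[speech_type] += points * lowered.count(expression)
    return dict(totals)


def select_speech_type(totals):
    """Return the speech type with the highest accumulated points,
    or None when no unique expression was found."""
    scored = {name: pts for name, pts in totals.items() if pts > 0}
    return max(scored, key=scored.get) if scored else None
```

With this table, `accumulate_points("Ahoy matey, ahoy!")` yields 8 points for `"pirate"` and 0 for `"cowboy"`, so `select_speech_type` picks `"pirate"` as the speech used to read the text aloud (Supplementary Note 7). Ties are resolved by table order here, which is an arbitrary choice of this sketch.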

According to the present invention, when text data is converted to speech and output by the character information speech conversion method, it can be output in a voice suited to the content of the text, so the speech can be reproduced with lines that feel more familiar to the user. The range of applications of the character information reading system can therefore be expanded.

A configuration diagram of the character information speech conversion method.
An explanatory diagram of the character information speech conversion processing, showing the expression conversion process.
A flowchart of the character information speech conversion processing.
An explanatory diagram of the character information speech conversion processing, showing the speech determination process.
An explanatory diagram of the character information speech conversion processing, showing the speech determination process.

Explanation of symbols

1 Input device
2 Display device
3 Conversion processing device
4 Output device
5 Conversion processing DB
6 Speech sample DB
7 Character data (character information)
8 Text data
9 Speech data

Claims

A character information speech conversion method in a speech conversion device that converts character information and outputs speech information, the method comprising:
inputting character information that is electronic data;
extracting a character string consisting of a unique expression from the character information;
accumulating, for the character information, predetermined points for each type of speech corresponding to the unique expression; and
outputting the accumulated points for each type of speech.

The character information speech conversion method according to claim 1, wherein the character information is output as speech using the type of speech having the highest accumulated points.

The character information speech conversion method according to claim 1, wherein the points are set higher the more characteristic the unique expression is of the type of speech.