JP6249760B2

JP6249760B2 - Text-to-speech device

Info

Publication number: JP6249760B2
Application number: JP2013266092A
Authority: JP
Inventors: 靖弘宮野; 昭夫田上; 本多　亮; 亮本多
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2013-08-28
Filing date: 2013-12-24
Publication date: 2017-12-20
Anticipated expiration: 2033-12-24
Also published as: JP2015064543A

Description

The present invention relates to a text-to-speech device that reads text aloud.

Text-to-speech devices with a TTS (Text To Speech) function, which converts text such as sentences or words contained in content (for example, dictionary content and book content) into speech and reads it aloud, have been developed. One example of such a device is an electronic dictionary equipped with a foreign-language TTS engine that converts foreign-language text, such as English, into speech. Another example is a text-to-speech device equipped with a Japanese TTS engine that converts Japanese text into speech, intended for learners of Japanese and for people who want the device to read text aloud to them.

Patent Document 1 discloses an example of a text-to-speech device equipped with a Japanese TTS engine. When a kanji character string to be read aloud is not registered in the speech conversion dictionary data unit, this device obtains the kana reading of the kanji character string by referring to the dictionary data unit used for kana-kanji conversion. It then plays back the speech corresponding to that kana in place of speech corresponding to the kanji character string.

JP 2000-194389 A (published July 14, 2000)

In conventional text-to-speech devices equipped with a Japanese TTS engine, including the device disclosed in Patent Document 1, when a kanji character string in the text has more than one reading, it is read aloud using only the single reading chosen by the Japanese TTS engine. Consequently, a kanji character string with multiple readings may be read aloud with a reading that the content creator did not intend. In other words, conventional text-to-speech devices have the problem that their accuracy in reading aloud text containing kanji character strings is low.

Text may also contain character strings in languages other than Japanese, such as English or Chinese. In that case, it is desirable that the character strings of each language be read aloud smoothly.

In general, however, a single TTS engine is not adapted to read aloud character strings in more than one language. As an exception, a Japanese TTS engine can read aloud English character strings as well as Japanese ones, but it cannot read aloud Chinese character strings. Conventional text-to-speech devices therefore have the problem that they cannot smoothly read aloud text in which multiple languages are mixed.

The present invention has been made in view of the above problems, and its object is to provide a text-to-speech device that can smoothly read aloud text in which multiple languages are mixed.

To solve the above problem, a text-to-speech device according to one aspect of the present invention reads text data aloud by converting text data to be displayed on a display unit into speech data and outputting the converted speech data. The device includes: a plurality of speech data generation units, each of which generates speech data for one or more languages; and a selection unit that determines the language types of the character strings within a selected range of the text data and selects, from the plurality of speech data generation units, one speech data generation unit capable of generating speech data for the character strings of the multiple languages contained in the text data.
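The selection step described above can be sketched as a subset test. This is only an illustration under stated assumptions: the engine names, the language codes, and the function `select_engine` are hypothetical, and the language determination itself is taken as given input rather than implemented.

```python
from typing import Dict, Iterable, Optional, Set


def select_engine(languages: Iterable[str],
                  engines: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first engine that covers every detected language, else None.

    `engines` maps a hypothetical engine name to the set of languages that
    engine can synthesize (an assumption made for this sketch).
    """
    needed = set(languages)
    for name, supported in engines.items():
        if needed <= supported:  # engine supports all languages in the selection
            return name
    return None


# Hypothetical engine table; the Japanese engine also handles English,
# matching the exception mentioned in the background section.
ENGINES = {
    "en_tts": {"en"},
    "ja_tts": {"ja", "en"},
    "zh_tts": {"zh"},
}
```

With this table, a selection containing Japanese and English resolves to `ja_tts`, while a selection mixing Japanese and Chinese has no single matching engine and returns `None`.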

According to one aspect of the present invention, text in which multiple languages are mixed can be read aloud smoothly.

A block diagram showing the configuration of the control unit included in an information processing apparatus according to an embodiment of the present invention.
A block diagram showing the configuration of the information processing apparatus according to an embodiment of the present invention.
(a) shows an example of content data according to an embodiment of the present invention, and (b) shows the data format of a kanji character string and its reading kana that are extracted from the content data and stored in the reading DB unit.
A flowchart showing the operation of the information processing apparatus according to an embodiment of the present invention during read-out data processing.
A display example of content on the display unit of the information processing apparatus according to an embodiment of the present invention.
Part of a flowchart showing the flow of read-out data processing executed by the control unit of the information processing apparatus according to an embodiment of the present invention.
Part of a flowchart showing the flow of read-out data processing executed by the control unit according to an embodiment of the present invention; it is the remainder of the flowchart shown in FIG. 6.
(a) and (b) are enlarged views of information displayed on the display unit of the information processing apparatus according to an embodiment of the present invention.
(a) and (b) are display examples of the display unit in a modification of the present invention.
(a) to (c) show display examples of text with ruby information according to an embodiment of the present invention; in (a) and (c) the ruby of the text is hidden, and in (b) the ruby is displayed above the kanji character strings in the text.
A block diagram showing the configuration of the control unit included in an information processing apparatus according to another embodiment of the present invention.
A display example of content on the display unit of the information processing apparatus according to another embodiment of the present invention.
Another display example of content on the display unit of the information processing apparatus according to another embodiment of the present invention.
Yet another display example of content on the display unit of the information processing apparatus according to another embodiment of the present invention.
A flowchart showing the flow of read-out data processing executed by the control unit of the information processing apparatus according to another embodiment of the present invention.

[Embodiment 1]
Hereinafter, an embodiment of the present invention will be described in detail.

[Configuration of Information Processing Apparatus 1]
The information processing apparatus 1 (text-to-speech device) according to the present embodiment reads aloud text contained in content. First, the configuration of the information processing apparatus 1 will be described with reference to FIG. 2, which is a block diagram showing that configuration.

As shown in FIG. 2, the information processing apparatus 1 includes a display unit 10, a touch panel 20, a TTS engine 30 (kanji character string speech data generation unit), a control unit 40, a storage unit 50, a temporary storage unit 60, and a speaker 70.

The control unit 40 controls, in an integrated manner, the units of the information processing apparatus 1 (including the display unit 10, the touch panel 20, the TTS engine 30, and the speaker 70).

The storage unit 50 includes a content storage unit 50A and a reading DB unit 50B. The content storage unit 50A stores content data such as an electronic dictionary or an electronic book. The content data includes image data, text data, and the like. In the reading DB unit 50B, the control unit 40 stores, as a database, data in which kanji character strings contained in the text data are associated with their readings. The text data may carry tag information that associates a kanji character string in the text data with the corresponding kana character string as a kanji character string and its reading. A specific example of the tag information is described later.

The control unit 40 temporarily stores text data in the temporary storage unit 60. More specifically, the control unit 40 acquires text data from the content stored in the content storage unit 50A and stores it in the temporary storage unit 60. Alternatively, the control unit 40 acquires, from all the text data displayed on the display unit 10, the text data in the range selected by the user and stores it in the temporary storage unit 60. The temporary storage unit 60 may be included in the storage unit 50.

Under the control of the control unit 40, the display unit 10 displays data to be presented to the user, for example, image data or text data contained in the content. Specifically, it displays the content data that the control unit 40 has acquired from the content storage unit 50A. The touch panel 20 is provided on the display unit 10; it detects operations that the user performs by touching the display unit 10 and outputs the resulting detection signal to the control unit 40. Based on the detection signal from the touch panel 20, the control unit 40 detects a touch at the position where text data is displayed.

The TTS engine 30 receives the data obtained when the control unit 40 applies conversion processing to the text data stored in the temporary storage unit 60. The TTS engine 30 converts the text data input from the control unit 40 into speech data and outputs the converted speech data to the control unit 40. The speaker 70 outputs sound based on the speech data that the control unit 40 has acquired from the TTS engine 30.

FIG. 3(a) shows an example of text data having tag information, and FIG. 3(b) shows the data format of the information accumulated in the reading DB unit 50B. In the text data shown in FIG. 3(a), tag information that is not displayed when the content data is played back (<kanji>, </kanji>, <yomi>, and </yomi>) is used to associate the kanji character string "雑歌" with the kana character string "ぞうか" (zōka) as a kanji character string and its reading. When processing the text data shown in FIG. 3(a), the control unit 40 uses this tag information to detect, from the text data, the start and end positions of the kanji character string "雑歌" and the kana character string "ぞうか". Then, as shown in FIG. 3(b), it associates the kana character string "ぞうか" as the reading with the start and end positions of the kanji character string "雑歌" and stores the result in the reading DB unit 50B.
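As an illustration of how tag information could be turned into reading DB entries, the following minimal sketch parses tagged text into plain text plus (start, end, reading) tuples. The exact markup of FIG. 3(a) is not reproduced in this text, so the layout `<kanji>…</kanji><yomi>…</yomi>`, the ASCII tag spelling, the exclusive end position, and the function name `build_reading_db` are all assumptions.

```python
import re
from typing import List, Tuple

# Assumed layout: a <kanji> element immediately followed by its <yomi> element.
TAG_PATTERN = re.compile(r"<kanji>(.*?)</kanji><yomi>(.*?)</yomi>")


def build_reading_db(tagged: str) -> Tuple[str, List[Tuple[int, int, str]]]:
    """Strip the tags and return (plain_text, entries).

    Each entry is (start, end, reading), where start/end index the kanji
    string in the plain text (end is exclusive in this sketch).
    """
    parts: List[str] = []
    entries: List[Tuple[int, int, str]] = []
    plain_len = 0
    cursor = 0
    for m in TAG_PATTERN.finditer(tagged):
        before = tagged[cursor:m.start()]   # untagged text before this pair
        parts.append(before)
        plain_len += len(before)
        kanji, yomi = m.group(1), m.group(2)
        entries.append((plain_len, plain_len + len(kanji), yomi))
        parts.append(kanji)                 # keep the kanji, drop the reading
        plain_len += len(kanji)
        cursor = m.end()
    parts.append(tagged[cursor:])
    return "".join(parts), entries
```

For the hypothetical input `"巻一<kanji>雑歌</kanji><yomi>ぞうか</yomi>の部"`, this yields the plain text `"巻一雑歌の部"` and one entry associating positions 2 to 4 with the reading "ぞうか".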

[Detailed Configuration of Control Unit 40]
Next, the detailed configuration of the control unit 40 will be described with reference to FIG. 1, which is a block diagram showing that configuration. As shown in FIG. 1, the control unit 40 includes a kanji extraction unit 42 (kanji character string extraction unit), a reading kana extraction unit 43 (kana character string extraction unit), a kanji-kana replacement unit 44 (kanji character string speech data generation unit), a TTS engine output unit 45, and a speaker output unit 46.

The kanji extraction unit 42 extracts kanji character strings contained in the text data. The reading kana extraction unit 43 extracts, from the text data, the reading kana associated in the text data with each kanji character string extracted by the kanji extraction unit 42. The reading kana extraction unit 43 includes a reading DB reference extraction unit 430, a parenthesized character string extraction unit 431, and a reading kana determination unit 432. The reading DB reference extraction unit 430 refers to the reading DB unit 50B and extracts from it the reading kana of the kanji character strings in the text data. Specifically, it acquires from the reading DB shown in FIG. 3(b) the start and end positions that indicate where a kanji character string is located in the text data, and extracts the reading associated with the kanji character string specified by those positions.

When a kana character string (a hiragana or katakana character string) enclosed in parentheses (an opening and a closing parenthesis) exists immediately after a kanji character string extracted by the kanji extraction unit 42, the parenthesized character string extraction unit 431 extracts that kana character string. For a kanji character string whose reading kana has already been extracted by the reading DB reference extraction unit 430, the parenthesized character string extraction unit 431 may skip the check for parentheses and a parenthesized kana character string immediately after the kanji character string. The reading kana determination unit 432 determines whether the parenthesized character string extracted by the parenthesized character string extraction unit 431 is the reading kana of the kanji character string immediately before the parentheses.

For example, when the character string "雑歌（ぞうか）" (see FIG. 5) exists in the text data, the parenthesized character string extraction unit 431 extracts the kana character string "ぞうか", which is enclosed in the parentheses "（）" immediately after the kanji character string "雑歌", as the reading kana of "雑歌". In other words, it extracts the kana character string ("ぞうか") corresponding to the kanji character string ("雑歌"). Instead of a kana character string enclosed in parentheses "（）", the reading kana extraction unit 43 may extract a kana character string enclosed in "＜＞", "≪≫", or other specific symbols. That is, the reading kana extraction unit 43 only needs to extract a kana character string enclosed between a predetermined first symbol and a predetermined second symbol corresponding to it. The first and second symbols may be the same symbol, a matched pair such as an opening and a closing parenthesis, or unrelated symbols (for example, $ and #). Which symbols serve as the first and second symbols, however, is defined in advance in the information processing apparatus 1.
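The delimiter-based extraction described above can be sketched with a regular expression whose first and second symbols are parameters. This is a minimal sketch, not the device's implementation: the Unicode ranges used to recognize kanji and kana, and the function name `extract_delimited_kana`, are assumptions.

```python
import re
from typing import List, Tuple

KANJI = "[\u4e00-\u9fff]"                     # assumed range: CJK unified ideographs
KANA = "[\u3041-\u3096\u30a1-\u30fa\u30fc]"   # hiragana, katakana, long-vowel mark


def extract_delimited_kana(text: str,
                           first_symbol: str = "（",
                           second_symbol: str = "）") -> List[Tuple[str, str]]:
    """Return (kanji_string, kana_string) pairs where the kana string is
    enclosed by the given symbols immediately after the kanji string."""
    pattern = re.compile(
        "(" + KANJI + "+)"
        + re.escape(first_symbol)
        + "(" + KANA + "+)"
        + re.escape(second_symbol)
    )
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]
```

Calling `extract_delimited_kana("雑歌$ぞうか#", "$", "#")` exercises the case of unrelated first and second symbols; because the symbols are passed through `re.escape`, characters such as `$` are matched literally.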

Based on the result of reading kana extraction by the reading kana extraction unit 43, the kanji-kana replacement unit 44 applies replacement processing to the text data stored in the temporary storage unit 60. Specifically, among the kanji character strings in the text data stored in the temporary storage unit 60, it replaces each kanji character string whose reading kana was extracted by the reading DB reference extraction unit 430 with that reading kana. Then, in the text data after this replacement, it replaces each kanji character string extracted by the kanji extraction unit 42 with the kana character string that the reading kana determination unit 432 has determined to be its reading kana. The TTS engine output unit 45 outputs the text data, after kanji-kana replacement by the kanji-kana replacement unit 44, to the TTS engine 30.
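The first replacement pass, which substitutes readings found in the reading DB, can be illustrated as follows. The sketch assumes that each entry is a (start, end, reading) tuple with an exclusive end position relative to the text being processed; the function name `replace_with_readings` is hypothetical.

```python
from typing import List, Tuple


def replace_with_readings(text: str,
                          entries: List[Tuple[int, int, str]]) -> str:
    """Replace each kanji span [start, end) with its registered reading.

    Entries are applied from the last span to the first so that earlier
    positions remain valid even when a reading differs in length from the
    kanji string it replaces.
    """
    result = text
    for start, end, reading in sorted(entries, reverse=True):
        result = result[:start] + reading + result[end:]
    return result
```

Processing the spans back to front is a design choice made here to avoid recomputing offsets; an implementation that builds the output left to right would work equally well.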

The TTS engine 30 converts the text data input from the TTS engine output unit 45 into speech data and then outputs the converted speech data to the speaker output unit 46. The speaker output unit 46 converts the speech data input from the TTS engine 30 into an electrical signal and outputs it to the speaker 70.

[Operation of Control Unit 40]
Next, the operation of the control unit 40 will be described, in particular its operation related to read-out data processing. In read-out data processing, the information processing apparatus 1 reads text data aloud. The following first outlines the operation of the control unit 40 up to the point where read-out data processing is executed, and then describes the read-out data processing in detail.

[Operation of Control Unit 40: Operation up to Execution of Read-Out Data Processing]
The operation of the control unit 40 up to the execution of read-out data processing will be outlined with reference to FIGS. 4 and 5. FIG. 4 is a flowchart showing the operation of the control unit 40 up to the execution of read-out data processing. FIG. 5 is a display example of content on the display unit 10. In the following description, a touch that is actually made on the touch panel 20 is described as a touch on the display unit 10.

As shown in FIG. 4, the control unit 40 first determines whether content is being displayed on the display unit 10 (S1). If content is being displayed (Yes in S1), the control unit 40 determines whether the user has touched all or part of the text displayed on the display unit 10 (S2). Hereinafter, the touched text, whether all or part of the displayed text, is called the selected text. In FIG. 5, the selected text "雑歌" is shown shaded on the display unit 10. The selected text may be selected with a touch pen P, as shown in FIG. 5, or with the user's finger. Alternatively, the user may select it by operating keys (not shown) provided on the information processing apparatus 1.

If content is not being displayed (No in S1), or if neither all nor part of the text has been touched (No in S2), the control unit 40 ends the operation without performing read-out data processing. If content is being displayed (Yes in S1) and all or part of the text contained in the content has been touched (Yes in S2), the control unit 40 displays a touch menu TM, as shown in FIG. 5 (S3). As the figure shows, the touch menu TM includes a "Read aloud" option that causes the information processing apparatus 1 to read the selected text aloud.

In addition to "Read aloud", the touch menu TM may include a "Marker" option that displays the selected text with a marker drawn over it (see FIG. 5). The touch menu TM may also include, for example, an option that runs a dictionary search for a term contained in the selected text.

The operation by which the user selects text is not limited to a touch operation. For example, a button may be displayed on the screen of the display unit 10, and when that button is pressed, an action that follows specifications or settings determined in advance in the information processing apparatus 1 (for example, selecting the entire text currently displayed) may be performed, with the selected text determined by that action.

After displaying the touch menu TM (after S3), the control unit 40 determines whether "Read aloud" has been selected from the touch menu TM (S4). If an option other than "Read aloud" has been selected (No in S4), the control unit 40 performs the action corresponding to the selected option and then ends the operation without performing read-out data processing.

If "Read aloud" has been selected (Yes in S4), the control unit 40 executes read-out data processing (S5). In read-out data processing S5, the control unit 40 performs kanji-kana replacement processing on the selected text and then outputs the selected text, with that replacement applied, to the TTS engine 30. Read-out data processing S5 is described in detail later. The TTS engine 30 converts the text data after kanji-kana replacement into speech data. The control unit 40 then acquires the speech data output from the TTS engine 30 and outputs it to the speaker 70, after which the processing ends.

Although a configuration that displays the touch menu TM has been described here, if only one predetermined action (namely, execution of the read-out processing) is triggered by the text selection operation, the read-out processing may be executed without displaying the touch menu TM.
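The decision flow of steps S1 to S5 can be summarized in a few lines. The sketch below is illustrative only: the function `next_action`, its return strings, the `READ_ALOUD` identifier, and the `menu_enabled` flag (modeling the case where the touch menu TM is skipped) are hypothetical names introduced for this example.

```python
from typing import Optional

READ_ALOUD = "read_aloud"  # hypothetical identifier for the "Read aloud" option


def next_action(content_displayed: bool,
                selected_text: str,
                menu_choice: Optional[str],
                menu_enabled: bool = True) -> str:
    """Return which branch of the S1-S5 flow would be taken."""
    if not content_displayed:          # S1: No
        return "end"
    if not selected_text:              # S2: No (nothing was touched)
        return "end"
    if not menu_enabled:               # only one action is defined: skip the menu
        return "read_out_data_processing"
    # S3: the touch menu TM would be displayed at this point.
    if menu_choice != READ_ALOUD:      # S4: No
        return "other_action"
    return "read_out_data_processing"  # S5
```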

[Operation of Control Unit 40: Details of Read-Out Data Processing S5]
Next, the read-out data processing S5 described above will be explained in detail with reference to FIGS. 6 and 7, which are flowcharts showing the flow of read-out data processing S5.

In read-out data processing S5, the control unit 40 first copies the selected text to the temporary storage unit 60 (S501). In the subsequent steps, the units of the control unit 40 operate on the selected text in the temporary storage unit 60, for example by replacing kanji character strings with kana character strings.

After step S501, the reading DB reference extraction unit 430 refers to the reading DB unit 50B and determines whether the selected text contains a kanji character string associated with a reading kana (S502). If it does (Yes in S502), the kanji-kana replacement unit 44 replaces that kanji character string with its reading kana (S503).

If the selected text contains no kanji character string associated with a reading kana (No in S502), or after step S503, the kanji extraction unit 42 determines whether the selected text contains a kanji character string (S504). If it does not (No in S504), the TTS engine output unit 45 outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511), and read-out data processing S5 ends.

If the selected text contains a kanji character string (Yes in S504), the parenthesized character string extraction unit 431 determines whether, within the selected text, a character string enclosed in parentheses (hereinafter called a parenthesized character string) exists immediately after the kanji character string (S505).

If no parenthesized character string exists immediately after the kanji character string within the selected text (No in S505), the parenthesized character string extraction unit 431 determines whether the selected text ends with a kanji character string (S506). If it does not (No in S506), the TTS engine output unit 45 outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511), and read-out data processing S5 ends.

If the selected text ends with a kanji character string (Yes in S506), the parenthesized character string extraction unit 431 determines whether a parenthesized character string exists immediately after the selected text (S507). If none exists (No in S507), the TTS engine output unit 45 outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511), and read-out data processing S5 ends.

If a parenthesized character string exists immediately after the kanji character string within the selected text (Yes in S505; the case shown as Ex1), or if a parenthesized character string exists immediately after selected text that ends with kanji (Yes in S507; the case shown as Ex2), the reading kana determination unit 432 determines whether the parenthesized character string is a kana character string consisting only of hiragana or only of katakana (S508). This determines whether the parenthesized character string is a written-out reading of the kanji character string immediately before it.

If the parenthesized character string contains characters other than hiragana and katakana (No in S508; the example shown in Ex3), the TTS engine output unit 45 outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511), and the read-out data processing S5 ends.

On the other hand, if the parenthesized character string is a kana character string consisting of hiragana or katakana (Yes in S508), the reading kana determination unit 432 determines whether the length of the parenthesized character string is within a predetermined range (S509, kana character string extraction step). The predetermined range may be up to three times the length of the kanji character string immediately before the parenthesis. In this way, when the parenthesized character string is too long to be the reading of the kanji character string immediately before the parenthesis, the reading kana determination unit 432 can judge that the parenthesized character string is not the reading of that kanji character string.
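The two checks in steps S508 and S509 can be sketched as a single predicate. This is a minimal illustration, not the implementation of the reading kana determination unit 432: the function names, the Unicode ranges used for hiragana and katakana, and the decision to accept the prolonged sound mark "ー" in either script are all assumptions made for the example. The factor of three is the example range given in the description.

```python
def is_kana_only(s: str) -> bool:
    """S508 (sketch): true if s is only hiragana or only katakana."""
    def is_hiragana(ch: str) -> bool:
        # Assumed range: U+3041 (ぁ) to U+3096 (ゖ), plus the prolonged sound mark.
        return "\u3041" <= ch <= "\u3096" or ch == "ー"

    def is_katakana(ch: str) -> bool:
        # Assumed range: U+30A1 (ァ) to U+30FA (ヺ), plus the prolonged sound mark.
        return "\u30a1" <= ch <= "\u30fa" or ch == "ー"

    return bool(s) and (all(map(is_hiragana, s)) or all(map(is_katakana, s)))


def is_plausible_reading(kanji: str, bracketed: str, max_ratio: int = 3) -> bool:
    """S508 + S509 (sketch): kana only, and at most max_ratio times as long as the kanji."""
    return is_kana_only(bracketed) and len(bracketed) <= max_ratio * len(kanji)
```

With these assumptions, "ぞうか" is accepted as a reading of "雑歌" (3 characters against a limit of 6), while a parenthesized string that mixes scripts, contains kanji, or exceeds the length limit is rejected and the selected text would be passed to the TTS engine unchanged.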

If the length of the parenthesized character string is not within the predetermined range (No in S509), the TTS engine output unit 45 outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511), and the read-out data processing S5 ends. On the other hand, if the length of the parenthesized character string is within the predetermined range (Yes in S509), the kanji-kana replacement unit 44 replaces the kanji character string immediately before the parenthesis with the parenthesized character string. In addition, if part or all of the character string from the opening parenthesis to the closing parenthesis is included in the selected text, the kanji-kana replacement unit 44 deletes that character string from the selected text (S510). The TTS engine output unit 45 then outputs the selected text in the temporary storage unit 60 to the TTS engine 30 (S511).

In step S510, if the kanji character string immediately before the parenthesis consists of a single kanji, the kanji-kana replacement unit 44 replaces that single kanji with the parenthesized character string. If the kanji character string immediately before the parenthesis consists of multiple kanji, the kanji-kana replacement unit 44 replaces those multiple kanji with the parenthesized character string. In other words, the kanji-kana replacement unit 44 traces back through the text from immediately before the opening parenthesis and replaces the contiguous run of characters that have kanji character codes with the parenthesized character string.

As a result, whether the selected text includes the parentheses, as in the example shown in Ex1 ("雑歌（ぞうか）"), or does not include them, as in the example shown in Ex2 ("雑歌"), the selected text "ぞうか" shown in Ex4 is input to the TTS engine 30. This completes the read-out data processing S5.
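The flow from S505 through S510 can be sketched as follows for the two cases Ex1 and Ex2. This is an illustrative sketch under several assumptions: the function name `prepare_for_tts`, the `following` parameter (the document text right after the selection), the regular expressions, and the use of the CJK Unified Ideographs range U+4E00 to U+9FFF as "kanji character codes" are not taken from the description. The sketch also replaces every matching kanji run in the selection and does not handle a selection that ends in the middle of the parentheses.

```python
import re

KANJI = r"[\u4e00-\u9fff]"                      # assumed kanji code range
KANA = r"[\u3041-\u3096ー]+|[\u30a1-\u30faー]+"  # hiragana only, or katakana only
INLINE = re.compile(rf"({KANJI}+)[（(]({KANA})[）)]")
LEADING = re.compile(rf"[（(]({KANA})[）)]")
TRAILING_KANJI = re.compile(rf"{KANJI}+$")


def prepare_for_tts(selected: str, following: str = "", max_ratio: int = 3) -> str:
    """Return the text that would be handed to the TTS engine (sketch)."""

    def substitute(match: "re.Match[str]") -> str:
        kanji, kana = match.group(1), match.group(2)
        # S509: keep the original text if the kana is too long to be a reading.
        return kana if len(kana) <= max_ratio * len(kanji) else match.group(0)

    # Ex1: the reading in parentheses is inside the selection (S505: Yes).
    result = INLINE.sub(substitute, selected)

    # Ex2: the selection ends with kanji and the reading follows it (S506, S507).
    tail = TRAILING_KANJI.search(result)
    head = LEADING.match(following)
    if tail and head and len(head.group(1)) <= max_ratio * len(tail.group(0)):
        result = result[: tail.start()] + head.group(1)
    return result
```

Under these assumptions, both `prepare_for_tts("雑歌（ぞうか）")` and `prepare_for_tts("雑歌", "（ぞうか）")` yield "ぞうか", which corresponds to Ex4, while a selection whose parentheses contain kanji is returned unchanged.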

Although not illustrated, after step S511 the TTS engine 30 that has received the selected text data converts it into audio data and outputs the audio data to the control unit 40 (audio data generation step). The control unit 40 outputs the audio data received from the TTS engine 30 to the speaker 70. The speaker 70 outputs sound based on the audio data received from the control unit 40. As a result, the kanji character string is read aloud with the reading intended by the content creator.

An example in which the information processing apparatus 1 reads aloud the selected text chosen by the user has been described here, but the present invention is not limited to this. If the information processing apparatus 1 is configured to read the entire text data aloud sequentially instead of the selected text, a configuration in which the information processing apparatus 1 recites the text can be realized.

[Modification of Read-out Data Processing S5]
The control unit 40 may perform only the kanji-kana replacement that uses the parenthesized character string, without performing the kanji-kana replacement that refers to the reading DB 50B. In this case, steps S502 and S503 are not executed in the read-out data processing S5. That is, step S504 is executed immediately after step S501.

In the read-out data processing S5, step S503 may be executed at a different timing, as long as it is executed before step S511, in which the selected text is output to the TTS engine 30. That is, step S503 may be executed at any point between step S504 and step S510, or may be executed immediately after step S510.

Furthermore, in the read-out data processing S5, in the case of Yes in step S509, that is, when the length of the parenthesized character string is within the predetermined range, the kanji-kana replacement unit 44 may output the parenthesized character string to the TTS engine output unit 45. In this configuration, the TTS engine output unit 45 outputs the parenthesized character string to the TTS engine 30 instead of the selected text. The TTS engine 30 converts the parenthesized character string into audio data and outputs it to the control unit 40. The control unit 40 outputs the audio data received from the TTS engine 30 to the speaker 70. As a result, the speaker 70 outputs the parenthesized character string in place of the kanji character string included in the selected text. This configuration also achieves the effect that the kanji character string is read aloud as the content creator intended.

[Comparison with Conventional Text-to-Speech Devices]
This section describes the effects specific to the information processing apparatus 1 according to the present embodiment. To that end, the operation of the information processing apparatus 1 is compared with the operation of a conventional text-to-speech device with reference to FIGS. 8(a) and 8(b). FIGS. 8(a) and 8(b) are both enlarged views of the display unit 10 of the information processing apparatus 1. In FIG. 8(a), the kanji character string "雑歌" is the selected text. In FIG. 8(b), the kanji character string ("雑歌"), together with the parentheses that follow it and the hiragana character string inside them ("（ぞうか）"), is the selected text.

As described above, in a conventional text-to-speech device, the TTS engine chooses the reading of a kanji character string in the text. Consequently, when a kanji character string has multiple readings (for example, "大神" can be read as "おおかみ" (ookami), "だいじん" (daijin), or "おおみわ" (oomiwa)), the TTS engine selects one of the readings arbitrarily, and a reading different from the one the content creator intended may be selected. For example, the content creator may have written the content intending "大神" to be read as "おおみわ", yet the TTS engine selects "おおかみ" as the reading of "大神".

In addition, in a conventional text-to-speech device, when the selected text contains both a kanji character string and its reading in kana, as shown in FIG. 8(b), both the kanji character string and its reading may be read aloud, contrary to the content creator's intention. For example, the content creator may have intended the selected text "雑歌（ぞうか）" to be read simply as "ぞうか", yet it is read aloud as "ぞうかぞうか".

In contrast, in the information processing apparatus 1 according to the present embodiment, as shown in FIGS. 8(a) and 8(b), when a reading ("ぞうか") enclosed between an opening parenthesis and a closing parenthesis exists immediately after a kanji character string ("雑歌"), the read-out data processing S5 causes that reading ("ぞうか") to be input to the TTS engine 30 in place of the character string from the kanji character string through the closing parenthesis ("雑歌（ぞうか）"). Therefore, unlike the conventional text-to-speech device, the information processing apparatus 1 can read the kanji character string aloud with the reading the content creator intended. Furthermore, when the selected text is "雑歌（ぞうか）", the information processing apparatus 1 obtains the character string "ぞうか" from the selected text in the read-out data processing S5 described above and outputs it to the TTS engine 30. The selected text is therefore read aloud as "ぞうか", as the content creator intended.

[Modification 1]
FIG. 9(a) shows a display example of the display unit 10 in one modification of the present embodiment. In Modification 1, as shown in FIG. 9(a), when the touched character string ("雑歌") is a kanji character string and parentheses (an opening parenthesis and a closing parenthesis) exist immediately after it, the control unit 40 sets the character string from the touched kanji character string through the closing parenthesis immediately after it ("雑歌（ぞうか）") as the selected text.

In the configuration of Modification 1, when "read aloud" by the TTS engine 30 is selected from the touch menu TM or the like, the control unit 40 passes the kana characters inside the parentheses ("ぞうか") to the TTS engine 30. On the other hand, when "dictionary search" is selected, the control unit 40 does not use the character string from the opening parenthesis to the closing parenthesis ("（ぞうか）") as part of the search keyword. That is, when dictionary search is selected, the control unit 40 uses only the touched character string ("雑歌") out of the selected text as the search keyword. This prevents the character string from the opening parenthesis to the closing parenthesis, which is not part of the touched character string, from needlessly narrowing the range searched by the dictionary search engine when dictionary search is selected after a character string is touched.
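The behavior of Modification 1 can be pictured as choosing a different part of the selected text depending on the menu action. The following sketch is only an illustration: the function name `text_for_action`, the action labels "read_aloud" and "dictionary_search", and the regular expression are assumptions, and the kana check is simplified to accept any mix of hiragana and katakana.

```python
import re

# Selected text of the form <touched string>（<kana reading>）; both full-width
# and ASCII parentheses are accepted in this sketch.
WITH_READING = re.compile(r"^(.*?)[（(]([\u3041-\u3096\u30a1-\u30faー]+)[）)]$")


def text_for_action(selected: str, action: str) -> str:
    """Pick the part of the selected text to use for the chosen menu action."""
    if action not in ("read_aloud", "dictionary_search"):
        raise ValueError(f"unknown action: {action}")
    match = WITH_READING.match(selected)
    if match is None:
        return selected  # no reading in parentheses; use the text as it is
    touched, kana = match.group(1), match.group(2)
    # Reading aloud uses the kana; dictionary search uses only the touched string.
    return kana if action == "read_aloud" else touched
```

For the selected text "雑歌（ぞうか）", this sketch returns "ぞうか" for reading aloud and "雑歌" for dictionary search, matching the behavior described above.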

[Modification 2]
FIG. 9(b) shows a display example of the display unit 10 in another modification of the present embodiment. In Modification 2, as shown in FIG. 9(b), the control unit 40 displays a pop-up menu PUM near the touched character string ("雑歌（ぞうか）"). The control unit 40 then lets the user choose, from the pop-up menu PUM, which part of the touched character string is to be the selected text.

In the configuration of Modification 2, the user can decide which character string becomes the selected text. This makes it possible to read aloud any range of characters within the touched character string, according to the user's wishes.

[Modification 3]
In the present embodiment, a configuration in which the text data read aloud by the information processing apparatus 1 is displayed on the display unit 10 has been described. In one modification, however, the text data read aloud by the information processing apparatus 1 need not be displayed on the display unit 10. For example, when the user selects a file in the content storage unit 50A, the information processing apparatus 1 may read the text data in that file aloud without displaying it.

[Embodiment 2]
Another embodiment of the present invention is described below with reference to FIGS. 10(a) to 10(c). For convenience of explanation, members having the same functions as members described in the preceding embodiment are given the same reference numerals, and their description is omitted.

Text data included in content may be provided in advance with ruby information that associates a kanji character string in the text data with the ruby (a hiragana or katakana character string) representing the reading of that kanji character string. FIGS. 10(a) to 10(c) show display examples of text with ruby information. FIGS. 10(a) and 10(c) are display examples in which the ruby Ru is hidden. FIG. 10(b) is a display example in which the ruby Ru is displayed above the kanji character string.

The control unit 80 provided in the information processing apparatus 2 of the present embodiment includes a reading kana extraction unit 83 (kana character string extraction unit) in place of the reading kana extraction unit 43 of the control unit 40 of the preceding embodiment. The storage unit 50 additionally stores ruby information. The rest of the configuration of the information processing apparatus 2 is the same as that of the information processing apparatus 1.

The reading kana extraction unit 43 according to Embodiment 1 extracts, when a hiragana or katakana character string enclosed in parentheses exists immediately after a kanji character string in the text data, that hiragana or katakana character string as the kana character string associated with the kanji character string. In contrast, the reading kana extraction unit 83 according to the present embodiment refers to the ruby information and extracts the ruby Ru of a kanji character string as the kana character string associated with that kanji character string.

[Operation of Control Unit 80]
The operation of the control unit 80 according to the present embodiment is described below with reference to FIG. 10(c). As shown in FIG. 10(c), the control unit 80 determines the character string touched with the touch pen P or the like ("下人") as the selected text. The control unit 40 then displays the touch menu TM for selecting an operation to perform on the touched character string.

When the "read aloud" option is selected from the touch menu TM, the reading kana extraction unit 83 refers to the ruby information and extracts the ruby Ru ("げにん") associated with the kanji character string ("下人") in the selected text. The kanji-kana replacement unit 44 replaces the kanji character string ("下人") in the selected text with the ruby Ru ("げにん") extracted by the reading kana extraction unit 83. The TTS engine output unit 45 outputs the selected text after the replacement by the kanji-kana replacement unit 44 ("げにん") to the TTS engine 30. After the selected text input to the TTS engine 30 is converted into audio data, sound is output from the speaker 70 based on that audio data.

The reading kana extraction unit 83 may switch the method of extracting the kana character string associated with a kanji character string between the method of Embodiment 1 and the method of the present embodiment, depending on whether the text has ruby information. Specifically, when ruby information is attached to the text, the reading kana extraction unit 83 extracts the ruby of the kanji character string as the associated kana character string based on the ruby information, as in the present embodiment. When ruby information is not attached to the text, the reading kana extraction unit 83 extracts the hiragana or katakana character string inside the parentheses immediately after the kanji character string as the associated kana character string, as in Embodiment 1.
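The switch between the two extraction methods can be sketched as below. The description does not specify how ruby information is stored, so the dictionary that maps a kanji character string to its ruby is purely an assumption for illustration, as are the function names `extract_reading` and `text_for_tts` and the `following` parameter (the text right after the kanji character string).

```python
import re
from typing import Dict, Optional

# A reading in parentheses: hiragana only, or katakana only.
PAREN_READING = re.compile(r"[（(]([\u3041-\u3096ー]+|[\u30a1-\u30faー]+)[）)]")


def extract_reading(kanji: str, following: str = "",
                    ruby_info: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the kana associated with the kanji string, or None if there is none."""
    if ruby_info is not None:
        # Text with ruby information: use the ruby (method of Embodiment 2).
        return ruby_info.get(kanji)
    # Text without ruby information: use the parenthesized kana that
    # immediately follows the kanji string (method of Embodiment 1).
    match = PAREN_READING.match(following)
    return match.group(1) if match else None


def text_for_tts(kanji: str, following: str = "",
                 ruby_info: Optional[Dict[str, str]] = None) -> str:
    """Replace the kanji string with its reading when one is found."""
    reading = extract_reading(kanji, following, ruby_info)
    return reading if reading is not None else kanji
```

With a hypothetical ruby table `{"下人": "げにん"}`, the selected text "下人" is replaced with "げにん"; without ruby information, "雑歌" followed by "（ぞうか）" is replaced with "ぞうか".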

[Embodiment 3]
Another embodiment of the present invention is described below with reference to FIGS. 11 to 15. For convenience of explanation, members having the same functions as members described in the preceding embodiments are given the same reference numerals, and their description is omitted.

A TTS engine for reading Japanese aloud (hereinafter, a Japanese TTS engine), such as the TTS engine 30 of Embodiment 1 (see FIG. 1), can generally output audio data for individual letters of the alphabet in addition to Japanese audio data. A Japanese TTS engine can therefore read aloud not only Japanese character strings but also English character strings. However, when reading English, a Japanese TTS engine outputs a combination of the audio data for each letter that makes up the English text. For that reason, a Japanese TTS engine cannot read English smoothly with natural, native-like pronunciation.

Moreover, a Japanese TTS engine cannot read aloud characters of languages other than Japanese and English (for example, Chinese).

The present embodiment therefore describes an information processing apparatus that includes a plurality of TTS engines, including a Japanese TTS engine, and is configured so that languages other than Japanese can also be read aloud smoothly. The information processing apparatus of the present embodiment selects one of the TTS engines based on the language of the character strings in the text, and reads the text aloud using the selected TTS engine. In particular, when reading Japanese text aloud using the Japanese TTS engine, the information processing apparatus of the present embodiment executes the read-out data processing described in Embodiment 1 (see FIGS. 6 and 7).

The information processing apparatus of the present embodiment may be the information processing apparatus 1 of Embodiment 1 or the information processing apparatus 2 of Embodiment 2.

[Configuration of the Information Processing Apparatus of the Present Embodiment]
The configuration of the information processing apparatus of the present embodiment is described with reference to FIG. 11. FIG. 11 is a block diagram showing the configuration of the control unit 940 provided in the information processing apparatus of the present embodiment. The information processing apparatus of the present embodiment has the configuration of the information processing apparatus 1 of Embodiment 1, with a control unit 940 in place of the control unit 40. In place of the TTS engine 30 of the information processing apparatus 1, it includes a Japanese TTS engine 931 (audio data generation unit, first audio data generation unit), an English TTS engine 932 (audio data generation unit), and a Chinese TTS engine 933 (audio data generation unit, second audio data generation unit). The Japanese TTS engine 931 can read Japanese and English (the alphabet) aloud. The English TTS engine 932 can read only English aloud. The Chinese TTS engine 933 can read Chinese and English (the alphabet) aloud. The Japanese TTS engine 931 has the same functions as the TTS engine 30 of Embodiments 1 and 2 and can execute the same processing as that described for the TTS engine 30 in Embodiments 1 and 2.

[Detailed Configuration of Control Unit 940]
As shown in FIG. 11, the control unit 940 includes a language determination unit 941 (selection unit) in addition to all of the components of the control unit 40 of Embodiment 1.

The language determination unit 941 acquires the text selected by the user, that is, the selected text, from the temporary storage unit 60, and determines the language of the character strings in the selected text. When character strings of multiple languages exist in the selected text, the language determination unit 941 executes a selection engine determination process that selects, from among the Japanese TTS engine 931, the English TTS engine 932, and the Chinese TTS engine 933, the TTS engine that will read the selected text aloud (hereinafter, the selection engine).

Specifically, in the selection engine determination process, the language determination unit 941 first determines what language the first character of the selected text belongs to, and holds information indicating that language (the language setting). Next, it determines what language the second character of the selected text belongs to, and updates the language setting from the information indicating the language of the first character to information indicating the language of the second character. In the same way, it determines the language of each character of the selected text in order, one character at a time, and updates the language setting based on each determination result.

When the language determination unit 941 has determined the language of the last character of the selected text and updated the language setting based on that result, it decides the selection engine according to the language setting at that point. For example, if the language setting at that point is Japanese, the language determination unit 941 selects the Japanese TTS engine 931 as the selection engine. If the language setting at that point is English or Chinese, the language determination unit 941 selects the English TTS engine 932 or the Chinese TTS engine 933, respectively, as the selection engine.

According to the selection engine determination process, for example, when the first part of the selected text is an English character string, the middle part is a Japanese character string, and the last part is a Chinese character string, the language determination unit 941 selects the Japanese TTS engine 931.

In this way, when character strings of multiple languages exist in the selected text, the language determination unit 941 selects, from among the Japanese TTS engine 931, the English TTS engine 932, and the Chinese TTS engine 933, one TTS engine that can read aloud character strings of multiple languages contained in the selected text.

As noted above, the Japanese TTS engine 931, the English TTS engine 932, and the Chinese TTS engine 933 each have characters that they cannot read aloud. Such characters are referred to here as external characters. For example, for the Japanese TTS engine 931, characters of languages other than Japanese and English (for example, Chinese) are external characters. For the Chinese TTS engine 933, characters of languages other than Chinese and English (for example, Japanese) are external characters.

Accordingly, when the selected text contains an external character, the language determination unit 941 sets the character string from the first character of the selected text to the character immediately before the external character as the new selected text. In other words, the language determination unit 941 updates the selected text so that the selection engine can read the entire updated selected text aloud. Alternatively, the language determination unit 941 may set the character string from the first character of the selected text to the character immediately before a line break as the new selected text.

When the language determination unit 941 has selected the Japanese TTS engine 931 as the selection engine, it outputs the updated selected text to the kanji extraction unit 42. When it has selected the English TTS engine 932 or the Chinese TTS engine 933 as the selection engine, it outputs the updated selected text and the language setting to the TTS engine output unit 45.

The TTS engine output unit 45 outputs the selected text received from the language determination unit 941 to the English TTS engine 932 or the Chinese TTS engine 933, according to the language setting received from the language determination unit 941. Specifically, when the language setting is English, the TTS engine output unit 45 outputs the selected text to the English TTS engine 932. When the language setting is Chinese, it outputs the selected text to the Chinese TTS engine 933. The processing executed by the kanji extraction unit 42 upon receiving the selected text was described in Embodiment 1, so its description is omitted here.

［選択エンジンの決定方法］ [Method for determining the selection engine]
ここでは、言語判断部９４１が、複数のＴＴＳエンジンの中から、選択エンジンを決定する方法を具体的に説明する。 Here, a method by which the language determination unit 941 determines the selection engine from among the plurality of TTS engines is described in detail.

言語判断部９４１は、日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、および中国語ＴＴＳエンジン９３３の中から、最も長い選択テキストの範囲を読み上げることができるＴＴＳエンジンを、選択エンジンとして選択する。 The language determination unit 941 selects a TTS engine that can read out the longest selected text range from the Japanese TTS engine 931, the English TTS engine 932, and the Chinese TTS engine 933 as the selection engine.

例えば、選択テキストが日本語および英語を含む場合、言語判断部９４１は、選択エンジンとして、英語ＴＴＳエンジン９３２あるいは中国語ＴＴＳエンジン９３３ではなく、日本語ＴＴＳエンジン９３１を選択する。なぜならば、英語ＴＴＳエンジン９３２および中国語ＴＴＳエンジン９３３は、選択テキスト中の英語のみを読み上げることができる。一方、日本語ＴＴＳエンジン９３１は、選択テキスト中の日本語および英語の両方を読み上げることができる。従って、日本語ＴＴＳエンジン９３１は、英語ＴＴＳエンジン９３２および中国語ＴＴＳエンジン９３３よりも、長い選択テキストの範囲を読み上げることができるためである。 For example, when the selected text includes Japanese and English, the language determination unit 941 selects the Japanese TTS engine 931 instead of the English TTS engine 932 or the Chinese TTS engine 933 as the selection engine. This is because the English TTS engine 932 and the Chinese TTS engine 933 can read only the English in the selected text. On the other hand, the Japanese TTS engine 931 can read both Japanese and English in the selected text. Therefore, the Japanese TTS engine 931 can read out a longer selected text range than the English TTS engine 932 and the Chinese TTS engine 933.

同様に、選択テキストが中国語および英語を含む場合、言語判断部９４１は、選択エンジンとして、日本語ＴＴＳエンジン９３１あるいは英語ＴＴＳエンジン９３２ではなく、中国語ＴＴＳエンジン９３３を選択する。 Similarly, when the selected text includes Chinese and English, the language determination unit 941 selects the Chinese TTS engine 933 instead of the Japanese TTS engine 931 or the English TTS engine 932 as the selection engine.

また、選択テキストが日本語および中国語を含む場合、言語判断部９４１は、選択テキストにおいて、他方の言語よりも前に出現する言語を読み上げることができるＴＴＳエンジンを、選択エンジンとして選択する。例えば、選択テキストにおいて、日本語、中国語がこの順番で出現する場合、言語判断部９４１は、日本語ＴＴＳエンジン９３１を、選択エンジンとして選択する。 When the selected text includes Japanese and Chinese, the language determination unit 941 selects, as the selection engine, a TTS engine that can read a language that appears before the other language in the selected text. For example, when Japanese and Chinese appear in this order in the selected text, the language determination unit 941 selects the Japanese TTS engine 931 as the selection engine.

なお、選択テキストが、英語を含むが、日本語および中国語を含まない場合、言語判断部９４１は、選択エンジンとして、英語ＴＴＳエンジン９３２を選択する。なぜならば、英語ＴＴＳエンジン９３２は、日本語ＴＴＳエンジン９３１および中国語ＴＴＳエンジン９３３よりも、英語を自然かつスムーズに読み上げることができるためである。 When the selected text includes English but includes neither Japanese nor Chinese, the language determination unit 941 selects the English TTS engine 932 as the selection engine. This is because the English TTS engine 932 can read English more naturally and smoothly than the Japanese TTS engine 931 and the Chinese TTS engine 933.

選択エンジンの決定方法は、上述した方法に限定されない。例えば、別の構成では、言語判断部９４１は、選択テキストにおける日本語の部分、英語の部分、中国語の部分を読み上げる選択エンジンとして、それぞれ、日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、中国語ＴＴＳエンジン９３３を選択してもよい。 The method for determining the selection engine is not limited to the method described above. For example, in another configuration, the language determination unit 941 may select the Japanese TTS engine 931, the English TTS engine 932, and the Chinese TTS engine 933 as the selection engines that read out the Japanese part, the English part, and the Chinese part of the selected text, respectively.

この構成の場合、選択テキスト中の日本語、英語、中国語が全て読み上げられる。さらに、選択テキストにおいて、日本語、英語、中国語の各部分が、その言語を読み上げるために適合されたＴＴＳエンジンでそれぞれ読み上げられるので、上記の各部分が自然かつスムーズに読み上げられる。 In this configuration, Japanese, English, and Chinese in the selected text are all read out. Furthermore, in the selected text, each part of Japanese, English, and Chinese is read out by a TTS engine adapted to read out the language, so that each of the above parts is read out naturally and smoothly.
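The alternative configuration above, in which each part of the text is read by the engine for its own language, could be sketched as a segmentation step. Everything here is an assumption for illustration: the function names, the toy character sets, and the choice to attach language-neutral characters (symbols) to the segment that precedes them.

```python
JA_CHARS = set("こんにちは")  # toy data for the demo only
ZH_CHARS = set("你好")        # toy data for the demo only


def toy_classify(ch):
    """Hypothetical classifier: 'ja', 'en', 'zh', or None for neutral symbols."""
    if ch in JA_CHARS:
        return "ja"
    if ch in ZH_CHARS:
        return "zh"
    if ch.isascii() and ch.isalpha():
        return "en"
    return None


def split_by_language(text, classify=toy_classify):
    """Split `text` into (language, substring) runs, one run per engine call."""
    segments = []  # each item is [language, characters]
    for ch in text:
        lang = classify(ch)
        if not segments:
            segments.append([lang, ch])
        elif lang is None or lang == segments[-1][0]:
            segments[-1][1] += ch      # neutral or same language: extend the run
        elif segments[-1][0] is None:
            segments[-1][0] = lang     # leading symbols adopt the first language
            segments[-1][1] += ch
        else:
            segments.append([lang, ch])  # language changed: start a new run
    return [(lang, chars) for lang, chars in segments]


print(split_by_language("こんにちは/Hi!/你好"))
```

Each resulting run would then be handed to the engine for its language, so no part of the selection is skipped.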

［読み上げの具体例］ [Specific example of reading aloud]
図１２〜図１４を用いて、本実施形態の情報処理装置によって、選択テキストが読み上げられる具体例を説明する。図１２〜図１４は、それぞれ、本実施形態の情報処理装置が備えた表示部１０（図２参照）を示す図である。上記の各図では、表示部１０の画面に表示されたテキストの一部がタッチペンＰで選択されており、この選択されたテキスト（選択テキスト）に対して実行することができる処理を示すタッチメニューＴＭが表示されている。 A specific example in which the selected text is read out by the information processing apparatus of the present embodiment will be described with reference to FIGS. 12 to 14. FIGS. 12 to 14 each show the display unit 10 (see FIG. 2) of the information processing apparatus of the present embodiment. In each figure, part of the text displayed on the screen of the display unit 10 has been selected with the touch pen P, and a touch menu TM showing the processes that can be executed on the selected text is displayed.

図１２に示す選択テキスト Selected text shown in Figure 12

は、日本語、英語、中国語および記号で構成されている。言語判断部９４１は、この選択テキストを読み上げる選択エンジンとして、日本語ＴＴＳエンジン９３１を選択する。選択エンジンが読み上げる文字列の始端は「今日（こんにち）は」であり、終端は「Ｈｉ！」である。図１２に示す選択テキストにおいて、「Ｈｉ！」の後ろの中国語は、日本語ＴＴＳエンジン９３１にとって外字であるので、読み上げられない。 is composed of Japanese, English, Chinese, and symbols. The language determination unit 941 selects the Japanese TTS engine 931 as the selection engine that reads out this selected text. The character string read out by the selection engine begins with 「今日（こんにち）は」 (konnichiwa) and ends with “Hi!”. In the selected text shown in FIG. 12, the Chinese after “Hi!” is an external character for the Japanese TTS engine 931 and is therefore not read out.

ここで、日本語ＴＴＳエンジン９３１は、日本語の文字列「今日（こんにち）は」を読み上げる際、前記実施形態１で説明した読み上げデータ処理（図６および図７参照）を実行する。そして、漢字文字列「今日」の代わりに、仮名文字列「こんにち」を読み上げる。 Here, when reading out the Japanese character string 「今日（こんにち）は」, the Japanese TTS engine 931 executes the reading data processing described in Embodiment 1 (see FIGS. 6 and 7). It then reads out the kana character string 「こんにち」 (konnichi) instead of the kanji character string 「今日」.

図１３に示す選択テキスト「Ｈｅｌｌｏ！／Ｈｉ！」は、英語および記号で構成されている。言語判断部９４１は、この選択テキストを読み上げる選択エンジンとして、英語ＴＴＳエンジン９３２を選択する。選択エンジンが読み上げる文字列の始端は「Ｈｅｌｌｏ！」であり、終端は「Ｈｉ！」である。 The selected text “Hello! / Hi!” Shown in FIG. 13 is composed of English and symbols. The language determination unit 941 selects the English TTS engine 932 as a selection engine that reads out the selected text. The beginning of the character string read out by the selection engine is “Hello!” And the end is “Hi!”.

図１４に示す選択テキスト Selected text shown in Figure 14

は、英語、中国語および記号で構成されている。言語判断部９４１は、この選択テキストを読み上げる選択エンジンとして、中国語ＴＴＳエンジン９３３を選択する。選択エンジンが読み上げる文字列の始端は「Ｈｅｌｌｏ！」であり、終端は、 is composed of English, Chinese, and symbols. The language determination unit 941 selects the Chinese TTS engine 933 as the selection engine that reads out this selected text. The character string read out by the selection engine begins with “Hello!”, and its end is

である。

［選択エンジン決定処理の詳細］ [Details of the selection engine determination process]
図１５を用いて、言語判断部９４１が実行する選択エンジン決定処理を説明する。図１５は、選択エンジン決定処理の流れを示すフローチャートである。ここで、言語判断部９４１は、選択エンジン決定処理を実行する前に、一時記憶部６１から、選択テキストを予め取得している。また、初期段階では、言語設定は英語であり、変数ｎは１であるとする。 The selection engine determination process executed by the language determination unit 941 will be described with reference to FIG. 15. FIG. 15 is a flowchart showing the flow of the selection engine determination process. Here, the language determination unit 941 has acquired the selected text from the temporary storage unit 61 before executing the selection engine determination process. In the initial stage, the language setting is English and the variable n is 1.

選択エンジン決定処理では、言語判断部９４１は、まず、選択テキストのｎ番目の文字（あるいは記号）が、選択テキストの最後の文字または判定終了記号であるか否かを判定する（Ｓ１００）。ここで、判定終了記号とは、外字であってもよいし、あるいは改行記号であってもよい。 In the selection engine determination process, the language determination unit 941 first determines whether the nth character (or symbol) of the selected text is the last character of the selected text or the determination end symbol (S100). Here, the determination end symbol may be an external character or a line feed symbol.

選択テキストのｎ番目の文字が、選択テキストの最後の文字または判定終了記号である場合（Ｓ１００でＹｅｓ）、言語判断部９４１は、その時点の言語設定に基づいて、選択エンジンを決定するとともに、選択テキストのｎ番目の文字、または、ｎ番目の文字の直前の文字を、選択エンジンが読み上げる文字列の終端として決定する（Ｓ６００）。そして、選択エンジン決定処理を終了する。 When the nth character of the selected text is the last character of the selected text or the determination end symbol (Yes in S100), the language determination unit 941 determines the selection engine based on the current language setting, and The nth character of the selected text or the character immediately before the nth character is determined as the end of the character string read out by the selection engine (S600). Then, the selection engine determination process ends.

一方、選択テキストのｎ番目の文字が、選択テキストの最後の文字または判定終了記号ではない場合（Ｓ１００でＮｏ）、言語判断部９４１は、選択テキストのｎ番目の文字が、アルファベットまたは記号（判定終了記号以外の記号）であるか否かを判定する（Ｓ２００）。選択テキストのｎ番目の文字が、アルファベットまたは記号である場合（Ｓ２００でＹｅｓ）、言語判断部９４１は、変数ｎに１を加算する（Ｓ５００）。そして、選択エンジン決定処理は、次の文字が選択テキストの最後の文字または終了判定文字であるか否かを判定するステップＳ１００の処理に戻る。 On the other hand, when the nth character of the selected text is neither the last character of the selected text nor the determination end symbol (No in S100), the language determination unit 941 determines whether the nth character of the selected text is an alphabetic character or a symbol (a symbol other than the determination end symbol) (S200). When the nth character of the selected text is an alphabetic character or a symbol (Yes in S200), the language determination unit 941 adds 1 to the variable n (S500). The selection engine determination process then returns to step S100, which determines whether the next character is the last character of the selected text or the end determination character.

一方、選択テキストのｎ番目の文字が、アルファベットまたは記号ではない場合（Ｓ２００でＮｏ）、言語判断部９４１は、ｎ番目の文字が日本語であるか否かを判定する（Ｓ３００）。ｎ番目の文字が日本語である場合（Ｓ３００でＹｅｓ）、言語判断部９４１は、その時点の言語設定が何語（Ａ：日本語、Ｂ：英語、Ｃ：それ以外の言語）であるかを判定する（Ｓ３１０）。 On the other hand, when the nth character of the selected text is not an alphabetic character or a symbol (No in S200), the language determination unit 941 determines whether the nth character is Japanese (S300). When the nth character is Japanese (Yes in S300), the language determination unit 941 determines which language the current language setting is (A: Japanese, B: English, C: any other language) (S310).

ステップＳ３１０の時点の言語設定が日本語である場合（Ｓ３１０でＡ）、言語判断部９４１は、変数ｎに１を加算する（Ｓ５００）。そして、選択エンジン決定処理は、次の文字が選択テキストの最後の文字または終了判定文字であるか否かを判定するステップＳ１００の処理に戻る。 If the language setting at step S310 is Japanese (A in S310), the language determination unit 941 adds 1 to the variable n (S500). Then, the selection engine determination process returns to the process of step S100 for determining whether or not the next character is the last character or the end determination character of the selected text.

また、ステップＳ３１０の時点の言語設定が英語である場合（Ｓ３１０でＢ）、言語判断部９４１は、言語設定を日本語に変更する（Ｓ３２０）。その後、言語判断部９４１は、変数ｎに１を加算する（Ｓ５００）。そして、選択エンジン決定処理は、次の文字が選択テキストの最後の文字または終了判定文字であるか否かを判定するステップＳ１００の処理に戻る。 If the language setting at step S310 is English (B in S310), the language determination unit 941 changes the language setting to Japanese (S320). Thereafter, the language determination unit 941 adds 1 to the variable n (S500). Then, the selection engine determination process returns to the process of step S100 for determining whether or not the next character is the last character or the end determination character of the selected text.

また、ステップＳ３１０の時点の言語設定が日本語および英語以外の言語である場合（Ｓ３１０でＣ）、その時点の言語設定に基づいて、選択エンジンを決定するとともに、選択テキストのｎ番目の文字の直前の文字を、選択エンジンが読み上げる文字列の終端として決定する（Ｓ６００）。そして、選択エンジン決定処理を終了する。 If the language setting at the time of step S310 is a language other than Japanese and English (C in S310), the selection engine is determined based on the language setting at that time, and the nth character of the selected text The immediately preceding character is determined as the end of the character string read out by the selection engine (S600). Then, the selection engine determination process ends.

ステップＳ３００において、ｎ番目の文字が日本語ではない場合（Ｓ３００でＮｏ）、言語判断部９４１は、ｎ番目の文字が中国語であるか否かを判定する（Ｓ４００）。ｎ番目の文字が中国語である場合（Ｓ４００でＹｅｓ）、言語判断部９４１は、ステップＳ４００の時点の言語設定が何語（Ａ：中国語、Ｂ：英語、Ｃ：それ以外の言語）であるかを判定する（Ｓ４１０）。 In step S300, when the nth character is not Japanese (No in S300), the language determination unit 941 determines whether the nth character is Chinese (S400). When the nth character is Chinese (Yes in S400), the language determination unit 941 determines which language the language setting at the time of step S400 is (A: Chinese, B: English, C: any other language) (S410).

ステップＳ４１０の時点の言語設定が中国語である場合（Ｓ４１０でＡ）、言語判断部９４１は、変数ｎに１を加算する（Ｓ５００）。そして、選択エンジン決定処理は、次の文字が選択テキストの最後の文字または終了判定文字であるか否かを判定するステップＳ１００の処理に戻る。 When the language setting at the time of step S410 is Chinese (A in S410), the language determination unit 941 adds 1 to the variable n (S500). Then, the selection engine determination process returns to the process of step S100 for determining whether or not the next character is the last character or the end determination character of the selected text.

また、ステップＳ４１０の時点の言語設定が英語である場合（Ｓ４１０でＢ）、言語判断部９４１は、言語設定を中国語に変更する（Ｓ４２０）。その後、言語判断部９４１は、変数ｎに１を加算する（Ｓ５００）。そして、選択エンジン決定処理は、次の文字が選択テキストの最後の文字または終了判定文字であるか否かを判定するステップＳ１００の処理に戻る。 If the language setting at the time of step S410 is English (B in S410), the language determination unit 941 changes the language setting to Chinese (S420). Thereafter, the language determination unit 941 adds 1 to the variable n (S500). Then, the selection engine determination process returns to the process of step S100 for determining whether or not the next character is the last character or the end determination character of the selected text.

また、ステップＳ４１０の時点の言語設定が中国語および英語以外の言語である場合（Ｓ４１０でＣ）、その時点の言語設定に基づいて、選択エンジンを決定するとともに、選択テキストのｎ番目の文字の直前の文字を、選択エンジンが読み上げる文字列の終端として決定する（Ｓ６００）。そして、選択エンジン決定処理を終了する。 If the language setting at the time of step S410 is a language other than Chinese and English (C in S410), the selection engine is determined based on the language setting at that time and the nth character of the selected text The immediately preceding character is determined as the end of the character string read out by the selection engine (S600). Then, the selection engine determination process ends.
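The flow of steps S100 to S600 can be traced with a short sketch. It rests on several assumptions that the text does not settle: the classifier `toy_classify` and its character sets are invented for the demo (this section does not say how a character is judged to be Japanese or Chinese), indices are 0-based rather than starting at 1, the end of the read range includes the last character but excludes a determination end symbol, and a character that is neither Japanese nor Chinese at S400 is rejected with an error.

```python
JA_CHARS = set("今日こんにちは")  # toy data for the demo only
ZH_CHARS = set("你好")            # toy data for the demo only


def toy_classify(ch):
    """Hypothetical classifier returning 'end', 'ja', 'zh', or 'symbol'."""
    if ch == "\n":
        return "end"       # determination end symbol (line feed in this demo)
    if ch in JA_CHARS:
        return "ja"
    if ch in ZH_CHARS:
        return "zh"
    return "symbol"        # alphabetic characters and ordinary symbols


def decide_engine(text, classify=toy_classify):
    """Return (language, end); text[:end] is the span the engine reads."""
    setting = "en"         # initial language setting is English
    n = 0                  # 0-based here; the source starts n at 1
    while n < len(text):
        kind = classify(text[n])
        # S100: determination end symbol or last character -> S600
        if kind == "end":
            return setting, n          # stop just before the end symbol
        if n == len(text) - 1:
            return setting, n + 1      # include the last character
        # S200: alphabet or symbol -> S500
        if kind == "symbol":
            n += 1
            continue
        if kind not in ("ja", "zh"):
            raise ValueError(f"unclassified character: {text[n]!r}")
        # S310 / S410: compare with the current language setting
        if setting == kind:            # branch A -> S500
            n += 1
        elif setting == "en":          # branch B -> S320 / S420, then S500
            setting = kind
            n += 1
        else:                          # branch C -> S600
            return setting, n          # stop just before the conflicting character
    return setting, n                  # empty selection


text = "今日（こんにち）はHi!你好"   # hypothetical text loosely modeled on FIG. 12
lang, end = decide_engine(text)
print(lang, text[:end])
```

Because the language setting only ever moves away from English once, the first Japanese or Chinese character fixes the engine, which is consistent with the rule that the language appearing first wins when both Japanese and Chinese are present.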

なお、言語判断部９４１が選択エンジンを決定する方法は、上述した方法に限られない。例えば、言語判断部９４１は、選択テキストの最初の文字を読み上げることができるＴＴＳエンジンのうち、所定の基準より多くの文字列を読み上げることができる１または複数のＴＴＳエンジンの中から、１つのＴＴＳを選択してもよい。例えば、所定の基準は、選択テキストに含まれる文字列数の７割であってよい。この構成の場合、言語判断部９４１は、７割以上の文字列を読み上げるＴＴＳエンジンのいずれか１つを選択する。 Note that the method by which the language determination unit 941 determines the selection engine is not limited to the method described above. For example, among the TTS engines that can read out the first character of the selected text, the language determination unit 941 may select one TTS engine from the one or more TTS engines that can read out more of the character string than a predetermined criterion. For example, the predetermined criterion may be 70% of the number of character strings included in the selected text. In this configuration, the language determination unit 941 selects any one of the TTS engines that can read out 70% or more of the character string.
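The 70% criterion can be illustrated with a small coverage calculation. This is a sketch under stated assumptions: the capability table `CAPABILITIES` reflects the earlier statement that the Japanese and Chinese engines can also read English, coverage is counted per readable character (symbols are ignored), and the toy classifier and all names are invented for the demo.

```python
# Assumed capability table: which languages each engine can read.
CAPABILITIES = {
    "ja": {"ja", "en"},
    "en": {"en"},
    "zh": {"zh", "en"},
}


def toy_classify(ch):
    """Hypothetical classifier: 'ja', 'en', 'zh', or None for symbols."""
    if "\u3040" <= ch <= "\u30ff":      # hiragana and katakana
        return "ja"
    if ch in "你好":                     # toy data for the demo only
        return "zh"
    if ch.isascii() and ch.isalpha():
        return "en"
    return None


def candidates_by_coverage(text, classify=toy_classify, threshold=0.7):
    """List engines that can read the first readable character and cover
    at least `threshold` of the readable characters."""
    readable = [lang for lang in map(classify, text) if lang is not None]
    if not readable:
        return []
    first = readable[0]
    result = []
    for engine, caps in CAPABILITIES.items():
        if first not in caps:
            continue                    # cannot read the first character
        covered = sum(1 for lang in readable if lang in caps)
        if covered / len(readable) >= threshold:
            result.append(engine)
    return result


print(candidates_by_coverage("Hi你好你好"))
```

The function returns every engine that meets the criterion; choosing one of them is left to the caller, since the text only says that any one may be selected.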

あるいは、言語判断部９４１は、選択テキスト中の着目した文字（例えば、読み上げ可能な最初の文字）に対応したＴＴＳエンジンのうち、着目した文字以降の文字列から、より多くの文字列の音声データを生成できるＴＴＳエンジンを選択してもよい。ここで、読み上げ可能な文字とは、記号などの読み上げ不可能な文字ではなく、アルファベット、漢字などの読み上げ可能な文字を意味する。 Alternatively, among the TTS engines corresponding to a character of interest in the selected text (for example, the first readable character), the language determination unit 941 may select the TTS engine that can generate voice data for more of the character string following the character of interest. Here, a readable character means a character that can be read out, such as an alphabetic character or a kanji, as opposed to a character that cannot be read out, such as a symbol.

〔実施形態４〕 [Embodiment 4]
情報処理装置１および２が備える各制御ブロック（特に、漢字抽出部４２、読み仮名抽出部４３・８３、漢字仮名置換部４４）は、集積回路（ＩＣチップ）等に形成された論理回路（ハードウェア）によって実現してもよいし、ＣＰＵ（Central Processing Unit）を用いてソフトウェアによって実現してもよい。 Each control block of the information processing apparatuses 1 and 2 (in particular, the kanji extraction unit 42, the reading kana extraction units 43 and 83, and the kanji kana replacement unit 44) may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU (Central Processing Unit).

後者の場合、情報処理装置１および２は、各機能を実現するソフトウェアであるプログラムの命令を実行するＣＰＵ、上記プログラムおよび各種データがコンピュータ（またはＣＰＵ）で読み取り可能に記録されたＲＯＭ（Read Only Memory）または記憶装置（これらを「記録媒体」と称する）、上記プログラムを展開するＲＡＭ（Random Access Memory）などを備えている。そして、コンピュータ（またはＣＰＵ）が上記プログラムを上記記録媒体から読み取って実行することにより、本発明の目的が達成される。上記記録媒体としては、「一時的でない有形の媒体」、たとえば、テープ、ディスク、カード、半導体メモリ、プログラマブルな論理回路などを用いることができる。また、上記プログラムは、該プログラムを伝送可能な任意の伝送媒体（通信ネットワークや放送波等）を介して上記コンピュータに供給されてもよい。なお、本発明は、上記プログラムが電子的な伝送によって具現化された、搬送波に埋め込まれたデータ信号の形態でも実現され得る。 In the latter case, the information processing apparatuses 1 and 2 include a CPU that executes instructions of a program that is software that realizes each function, and a ROM (Read Only) in which the program and various data are recorded so as to be readable by a computer (or CPU). Memory) or a storage device (these are referred to as “recording media”), a RAM (Random Access Memory) for expanding the program, and the like. And the objective of this invention is achieved when a computer (or CPU) reads the said program from the said recording medium and runs it. As the recording medium, a “non-temporary tangible medium” such as a tape, a disk, a card, a semiconductor memory, a programmable logic circuit, or the like can be used. The program may be supplied to the computer via an arbitrary transmission medium (such as a communication network or a broadcast wave) that can transmit the program. The present invention can also be realized in the form of a data signal embedded in a carrier wave in which the program is embodied by electronic transmission.

〔まとめ〕 [Summary]
本発明の態様１に係るテキスト読み上げ装置（情報処理装置１、２）は、表示部（１０）に表示するためのテキストデータを音声データに変換し、変換した音声データを出力することによって、上記テキストデータを読み上げるテキスト読み上げ装置であって、１つ以上の言語の音声データをそれぞれ生成する複数の音声データ生成部（日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、中国語ＴＴＳエンジン９３３）と、選択された上記テキストデータの範囲内の文字列の言語の種類を判定し、上記複数の音声データ生成部から、上記テキストデータに含まれる複数種類の言語の文字列の音声データを生成できる１つの音声データ生成部を選択する選択部（言語判断部９４１）と、を備えている。 The text-to-speech device (information processing devices 1 and 2) according to aspect 1 of the present invention is a text-to-speech device that reads out text data to be displayed on the display unit (10) by converting the text data into voice data and outputting the converted voice data. The device includes a plurality of voice data generation units (Japanese TTS engine 931, English TTS engine 932, Chinese TTS engine 933) that each generate voice data in one or more languages, and a selection unit (language determination unit 941) that determines the language type of the character string within the selected range of the text data and selects, from the plurality of voice data generation units, one voice data generation unit that can generate voice data for the character strings of the plurality of types of languages included in the text data.

上記の構成によれば、選択されたテキストデータの範囲内に含まれる文字列の言語の種類に基づき、複数の音声データ生成部の中から、テキストデータに含まれる言語に関し、複数種類の言語の音声データを生成することができる１つの音声データ生成部が選択される。 According to the above configuration, based on the language type of the character string included in the range of the selected text data, the language included in the text data is selected from the plurality of speech data generation units. One voice data generation unit capable of generating voice data is selected.

そのため、選択されたテキストデータの範囲内に複数の言語の文字列が含まれる場合であっても、テキストデータの複数種類の言語の読み上げに適合した音声データ生成部を用いて、テキストデータ中の文字列の音声データが生成される。従って、複数種類の言語の文字列を含むテキストデータをスムーズに読み上げることができる。 Therefore, even in the case where character strings of a plurality of languages are included in the range of the selected text data, a voice data generation unit suitable for reading out a plurality of types of text data is used, Sound data of a character string is generated. Accordingly, text data including character strings of a plurality of types of languages can be read out smoothly.

本発明の態様２に係るテキスト読み上げ装置（情報処理装置１、２）は、上記態様１において、上記選択部（言語判断部９４１）は、選択された上記テキストデータ中の着目する文字列の音声データを生成できる音声データ生成部から、選択された上記テキストデータにおいて、より多くの文字列の音声データを生成できる音声データ生成部（日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、中国語ＴＴＳエンジン９３３）を選択してもよい。 In the text-to-speech apparatus (information processing apparatuses 1 and 2) according to aspect 2 of the present invention, in the aspect 1, the selection unit (language determination unit 941) is a voice of a character string of interest in the selected text data. A voice data generator (Japanese TTS engine 931, English TTS engine 932, Chinese TTS engine 933) that can generate voice data of more character strings in the selected text data from the voice data generator that can generate data. ) May be selected.

上記の構成によれば、選択された上記テキストデータ中の着目する文字列の音声データを生成でき、かつ、選択された上記テキストデータにおいて、より多くの文字列の音声データを生成できる音声データ生成部を選択することができる。 According to the above configuration, it is possible to select a voice data generation unit that can generate voice data for the character string of interest in the selected text data and that can generate voice data for more of the character strings in the selected text data.

上記の構成において、着目する文字列は、選択されたテキストデータの先頭から見て、音声データを生成できる最初の文字であってもよい。 In the above configuration, the character string of interest may be the first character that can generate voice data when viewed from the beginning of the selected text data.

この場合、選択されたテキストデータの先頭から見て、音声データを生成できる最初の文字（例えば、アルファベットまたは漢字）に対応した音声データ生成部、すなわち、音声データを生成可能な最初の文字の音声データを生成することができる音声データ生成部が選択される。 In this case, as viewed from the beginning of the selected text data, the voice data generation unit corresponding to the first character (for example, alphabet or kanji) that can generate voice data, that is, the first character voice that can generate voice data An audio data generation unit capable of generating data is selected.

また、先頭から見て音声データを生成可能な最初の文字に対応する音声データ生成部が複数ある場合、それらの音声データ生成部のうち、最初の文字以降の文字列において、より多くの文字列の音声データを生成可能な音声データ生成部を選択することができる。 In addition, when there are a plurality of voice data generation units corresponding to the first character capable of generating voice data when viewed from the beginning, among the voice data generation units, more character strings in the character strings after the first character It is possible to select an audio data generation unit capable of generating the audio data.

本発明の態様３に係るテキスト読み上げ装置（情報処理装置１、２）は、上記態様１または２において、上記複数の音声データ生成部（日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、中国語ＴＴＳエンジン９３３）は、第１の言語および第２の言語の音声データを生成する第１音声データ生成部と、第１の言語および第３の言語の音声データを生成する第２音声データ生成部とを含んでおり、上記選択部は、選択された上記テキストデータの範囲に第１の言語が含まれる場合において、(i)さらに第２の言語が含まれる場合、上記第１音声データ生成部を選択し、また、(ii)さらに第３の言語が含まれる場合、上記第２音声データ生成部を選択してもよい。 The text-to-speech apparatus (information processing apparatus 1 or 2) according to aspect 3 of the present invention is the above-described aspect 1 or 2, wherein the plurality of voice data generation units (Japanese TTS engine 931, English TTS engine 932, Chinese TTS engine) 933) includes a first audio data generation unit that generates audio data of the first language and the second language, and a second audio data generation unit that generates audio data of the first language and the third language. And the selection unit selects the first speech data generation unit when the first language is included in the range of the selected text data, and (i) when the second language is further included. In addition, (ii) when the third language is further included, the second audio data generation unit may be selected.

上記の構成によれば、第１音声データ生成部によって、第１の言語および第２の言語の文字列の音声データが生成される一方、第２音声データ生成部によって、第１の言語および第３の言語の文字列の音声データが生成される。 According to the above configuration, the first voice data generation unit generates the voice data of the character strings of the first language and the second language, while the second voice data generation unit generates the first language and the second language. The voice data of the character string of the 3 languages is generated.

そのため、第１の言語および第２の言語を含むテキストデータ、および、第１の言語および第３の言語を含むテキストデータをスムーズに読み上げることができる。 Therefore, text data including the first language and the second language, and text data including the first language and the third language can be read out smoothly.

本発明の態様４に係るテキスト読み上げ装置（情報処理装置１、２）は、上記態様１〜３のいずれかにおいて、選択された上記テキストデータの範囲内に、上記選択部（言語判断部９４１）により選択された上記音声データ生成部（日本語ＴＴＳエンジン９３１、英語ＴＴＳエンジン９３２、中国語ＴＴＳエンジン９３３）には音声データを生成することができない文字列が存在する場合、上記選択部は、その文字列の音声データを生成できる上記音声データ生成部をさらに選択してもよい。 In the text-to-speech device (information processing devices 1 and 2) according to aspect 4 of the present invention, in any of aspects 1 to 3, when the selected range of the text data contains a character string for which the voice data generation unit (Japanese TTS engine 931, English TTS engine 932, Chinese TTS engine 933) selected by the selection unit (language determination unit 941) cannot generate voice data, the selection unit may further select a voice data generation unit that can generate voice data for that character string.

上記の構成によれば、選択されたテキストデータの範囲内に、選択部により選択された音声データ生成部には音声データを生成することができない文字列が存在する場合であっても、その文字列の音声データを生成する音声データ生成部が新たに選択される。そのため、選択されたテキストデータにおいて、より広い範囲の文字列が読み上げられる。 According to the above configuration, even if there is a character string that cannot be generated in the voice data generation unit selected by the selection unit within the range of the selected text data, A sound data generation unit that generates sound data of the column is newly selected. Therefore, a wider range of character strings is read out in the selected text data.

例えば、選択されたテキストデータに英語（第１の言語）、日本語（第２の言語）、および中国語（第３の言語）の文字列が含まれる場合、各言語の文字列の読み上げに適合した音声データ生成部によって、各言語の文字列が読み上げられる。従って、３種類以上の言語の文字列を含むテキストをスムーズに読み上げることができる。 For example, when the selected text data includes English (first language), Japanese (second language), and Chinese (third language) character strings, the character string of each language is read out. A character string of each language is read out by the adapted voice data generation unit. Accordingly, it is possible to smoothly read a text including a character string of three or more languages.

本発明の態様５に係るテキスト読み上げ装置（情報処理装置１、２）は、表示部（１０）に表示するためのテキストデータを音声データに変換し、変換した音声データを出力することによって、上記テキストデータを読み上げるテキスト読み上げ装置であって、漢字文字列および仮名文字列を含む上記テキストデータを上記表示部に表示している時に、選択された該テキストデータの範囲内の上記漢字文字列に関連付けられた上記仮名文字列を、該テキストデータから読み仮名として抽出する仮名文字列抽出部（読み仮名抽出部４３、８３）と、上記漢字文字列の代わりに、上記仮名文字列抽出部によって抽出された上記仮名文字列に基づいて、上記漢字文字列を読み上げるための音声データを生成する漢字文字列音声データ生成部（ＴＴＳエンジン３０および漢字仮名置換部４４）と、を備えている。 The text-to-speech device (information processing devices 1 and 2) according to aspect 5 of the present invention is a text-to-speech device that reads out text data to be displayed on the display unit (10) by converting the text data into voice data and outputting the converted voice data. The device includes a kana character string extraction unit (reading kana extraction units 43 and 83) that, while the text data including a kanji character string and a kana character string is displayed on the display unit, extracts from the text data, as a reading kana, the kana character string associated with the kanji character string within the selected range of the text data, and a kanji character string voice data generation unit (TTS engine 30 and kanji kana replacement unit 44) that generates voice data for reading out the kanji character string based on the kana character string extracted by the kana character string extraction unit, instead of on the kanji character string itself.

上記の構成によれば、選択されたテキストデータ中の範囲内の漢字文字列が読み上げられる場合、まず、上記漢字文字列と関連付けられている仮名文字列が、上記テキストデータ中から読み仮名として抽出される。ここで、仮名文字抽出部は、例えば、漢字文字列の直後に存在する括弧で囲まれた平仮名または片仮名を、上記漢字文字列の読み仮名として指定されている仮名文字列として抽出してもよい。 According to the above configuration, when a kanji character string within the selected range of the text data is read out, the kana character string associated with that kanji character string is first extracted from the text data as a reading kana. Here, the kana character extraction unit may, for example, extract hiragana or katakana enclosed in parentheses immediately after the kanji character string as the kana character string designated as the reading kana of that kanji character string.

また、当該抽出された上記仮名文字列が音声データに変換される。そして、上記漢字文字列については、上記仮名文字列から変換された音声データに基づいて読み上げが行われる。言い換えれば、上記漢字文字列については、上記仮名文字列が示す読みで、読み上げが行われる。 Further, the extracted kana character string is converted into voice data. The kanji character string is read out based on the voice data converted from the kana character string. In other words, the kanji character string is read out by the reading indicated by the kana character string.

これにより、テキストデータ中の漢字文字列に複数の読みが存在する場合であっても、上記テキストデータ以外を参照することなく、上記漢字文字列をより正しい読み（すなわち、コンテンツの作成者が意図する読み）で読み上げることができる。その結果、漢字文字列を含むテキストの読み上げ精度を向上させることができる。 As a result, even when there are multiple readings in the kanji character string in the text data, the kanji character string is read more correctly (ie, the content creator intends to read the kanji character string without referring to other than the text data). Reading). As a result, it is possible to improve the reading accuracy of text including a kanji character string.

本発明の態様６に係るテキスト読み上げ装置（情報処理装置１）は、上記態様５において、上記仮名文字列抽出部（読み仮名抽出部４３）は、上記漢字文字列の直後に存在する所定の第１の記号と、該第１の記号の後に存在する所定の第２の記号とによって囲まれた仮名文字列を抽出してもよい。 The text-to-speech apparatus (information processing apparatus 1) according to aspect 6 of the present invention is the above-described aspect 5, wherein the kana character string extraction unit (reading kana extraction unit 43) is a predetermined first character that exists immediately after the kanji character string. A kana character string surrounded by one symbol and a predetermined second symbol existing after the first symbol may be extracted.

上記の構成によれば、漢字文字列の直後に存在する第１の記号および第２の記号に囲まれた仮名文字列が、上記漢字文字列に関連付けられた仮名文字列として抽出される。例えば、第１の記号および第２の記号が括弧開きおよび括弧閉じ「（）」であり、テキストが「大神（おおみわ）」である場合、「大神」という漢字文字列の直後に存在する括弧「（）」に囲まれた「おおみわ」という仮名文字列が、上記漢字文字列に関連付けられた仮名文字列として抽出される。そして、「大神」という漢字文字列の代わりに、「おおみわ」という仮名文字列が読み上げられる。 According to the above configuration, the kana character string enclosed by the first symbol and the second symbol immediately after the kanji character string is extracted as the kana character string associated with that kanji character string. For example, when the first and second symbols are the opening and closing parentheses 「（）」 and the text is 「大神（おおみわ）」, the kana character string 「おおみわ」 (oomiwa) enclosed in the parentheses immediately after the kanji character string 「大神」 is extracted as the kana character string associated with that kanji character string. Then, the kana character string 「おおみわ」 is read out instead of the kanji character string 「大神」.

ここで、漢字文字列の直後に存在する括弧に囲まれた仮名文字列は、上記漢字文字列の読み仮名である可能性が高い。例えば、上述した「大神（おおみわ）」というテキストでは、「おおみわ」という仮名文字列が、「大神」という漢字文字列の読み仮名になっている。この例の場合、「大神」という複数の読みを有する漢字文字列を、「おおかみ」でも「だいじん」でもなく、コンテンツの作成者の意図する通りに「おおみわ」と読み上げることができる。また、この例以外のテキストであっても、漢字文字列の直後に存在する括弧に囲まれた仮名文字列が、上記漢字文字列の読み仮名になっているテキストであれば、上記漢字文字列を正しく読み上げることができる。なお、所定の第１および第２の記号は、括弧開きおよび括弧閉じ「（）」以外に、例えば、「＜＞」、または「≪≫」であってもよい。あるいは、第１の記号と第２の記号とは、同じ記号であってもよいし、互いに無関係の記号（例えば、＄と＃）であってもよい。 Here, a kana character string enclosed in parentheses immediately after a kanji character string is highly likely to be the reading kana of that kanji character string. For example, in the text 「大神（おおみわ）」 mentioned above, the kana character string 「おおみわ」 (oomiwa) is the reading kana of the kanji character string 「大神」. In this example, the kanji character string 「大神」, which has multiple readings, can be read out as 「おおみわ」 (oomiwa), as the content creator intended, rather than as 「おおかみ」 (ookami) or 「だいじん」 (daijin). For any other text as well, the kanji character string can be read out correctly as long as the kana character string enclosed in parentheses immediately after it is its reading kana. Besides the opening and closing parentheses 「（）」, the predetermined first and second symbols may be, for example, 「＜＞」 or 「≪≫」. Alternatively, the first symbol and the second symbol may be the same symbol, or may be mutually unrelated symbols (for example, ＄ and ＃).
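The parenthesized-reading extraction of aspect 6 can be sketched with a regular expression. This is an illustration under assumptions, not the device's actual processing: kanji are approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), kana by the Hiragana and Katakana blocks, the delimiters are fixed to full-width parentheses, and the function name `replace_kanji_with_reading` is invented here.

```python
import re

# Assumed character ranges: CJK Unified Ideographs for kanji,
# Hiragana and Katakana blocks for kana.
KANJI = r"[\u4e00-\u9fff]+"
KANA = r"[\u3040-\u309f\u30a0-\u30ff]+"

# A kanji run immediately followed by kana inside full-width parentheses.
READING = re.compile(rf"({KANJI})（({KANA})）")


def replace_kanji_with_reading(text):
    """Replace each 'kanji（kana）' pair with the kana reading alone,
    so a downstream engine speaks the reading given in the text."""
    return READING.sub(lambda m: m.group(2), text)


print(replace_kanji_with_reading("大神（おおみわ）神社"))
print(replace_kanji_with_reading("今日（こんにち）は"))
```

Kanji without a parenthesized reading are left untouched, so they would still be handled by the engine's own dictionary. Supporting other delimiter pairs such as 「＜＞」 would only require changing the two literal symbols in the pattern.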

本発明の態様７に係るテキスト読み上げ装置（情報処理装置１）は、上記態様５または６において、上記テキストデータは、該テキストデータ中の漢字とその読み仮名とを関連付けるタグ情報を有しており、上記仮名文字列抽出部（読み仮名抽出部４３）は、上記タグ情報を参照して、上記テキストデータ中の上記漢字文字列の読み仮名を抽出してもよい。 In the text-to-speech device (information processing apparatus 1) according to aspect 7 of the present invention, in the above-described aspect 5 or 6, the text data has tag information that associates kanji in the text data with the reading kana. The kana character string extraction unit (reading kana extraction unit 43) may extract the reading kana of the kanji character string in the text data with reference to the tag information.

According to the above configuration, the kana reading of the kanji character string in the text data is extracted based on the tag information and is read aloud in place of the kanji character string. This allows the text-to-speech device to read aloud the kanji character string in the text data with the correct reading.

In the text-to-speech device (information processing device 2) according to aspect 8 of the present invention, in aspect 5 above, the text data may have ruby information indicating the ruby attached to the kanji character string, and the kana character string extraction unit (reading kana extraction unit 83) may refer to the ruby information and extract the ruby of the kanji character string as the kana character string associated with that kanji character string.

According to the above configuration, the ruby of the kanji character string in the text data is extracted based on the ruby information included in the text data. This allows the kanji in the text data to be read aloud with the reading intended by the content creator, because ruby information is, in many cases, created by the content creator.
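As a rough illustration of extracting ruby from ruby information, the sketch below assumes that the ruby information takes the form of HTML-style `<ruby>` and `<rt>` markup. The description does not specify the actual data format, so the markup, the class name `RubyExtractor`, and the function `extract_ruby` are hypothetical.

```python
from html.parser import HTMLParser


class RubyExtractor(HTMLParser):
    """Collect (base text, ruby text) pairs from <ruby>base<rt>ruby</rt></ruby>."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._base = None  # base text of the <ruby> element being parsed
        self._rt = None    # ruby text of the <rt> element being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "ruby":
            self._base = ""
        elif tag == "rt" and self._base is not None:
            self._rt = ""

    def handle_data(self, data):
        if self._rt is not None:
            self._rt += data
        elif self._base is not None:
            self._base += data
        # Text outside any <ruby> element is ignored.

    def handle_endtag(self, tag):
        if tag == "rt" and self._rt is not None:
            self.pairs.append((self._base, self._rt))
            self._base, self._rt = "", None
        elif tag == "ruby":
            self._base = self._rt = None


def extract_ruby(markup):
    """Return the list of (kanji string, ruby) pairs found in `markup`."""
    parser = RubyExtractor()
    parser.feed(markup)
    parser.close()
    return parser.pairs


print(extract_ruby("<ruby>大神<rt>おおみわ</rt></ruby>神社に参拝する"))
```

Each returned pair gives a kanji character string and the kana character string to read in its place. The sketch does not handle `<rp>` fallback parentheses.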

The text-to-speech device (information processing devices 1 and 2) according to aspect 9 of the present invention, in any of aspects 5 to 8 above, may further include a kanji character string extraction unit (kanji character string extraction unit 42) that extracts the kanji character string from the text data, and the kana character string extraction unit (reading kana extraction units 43 and 83) may extract the kana character string associated with the kanji character string extracted by the kanji character string extraction unit.

According to the above configuration, the kana character string associated with the kanji character string in the text data is read aloud in place of the kanji character string extracted by the kanji character string extraction unit. This configuration can be applied, for example, when the text-to-speech device recites text data. When reciting text data, the text-to-speech device reads aloud the character strings in the text data in order from the beginning. During recitation, when the character string to be read next is a kanji character string, the kanji character string extraction unit extracts it. The text-to-speech device then reads aloud the kana reading of the kanji character string in place of the kanji character string itself. As a result, even when the text data contains a kanji character string having multiple readings, the text-to-speech device can recite the text data with the correct reading (that is, the reading intended by the content creator).
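The recitation flow described above can be sketched as follows, under stated assumptions: readings are given in full-width parentheses directly after the kanji character string, kanji and kana are recognized by Unicode range, and a kanji character string with no associated kana is passed through unchanged for the TTS engine to handle. The generator name `speakable_segments` is hypothetical.

```python
import re

# A kanji run, optionally followed by its kana reading in full-width parentheses.
TOKEN = re.compile(
    r"(?P<kanji>[\u4E00-\u9FFF\u3005]+)"
    r"(?:（(?P<kana>[\u3041-\u3096\u30A1-\u30FA\u30FC]+)）)?"
)


def speakable_segments(text):
    """Yield, in reading order, the strings to hand to a TTS engine."""
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            yield text[pos:m.start()]  # non-kanji text is read as written
        # Read the associated kana when present; otherwise fall back to
        # the kanji string itself (an assumption of this sketch).
        yield m.group("kana") or m.group("kanji")
        pos = m.end()
    if pos < len(text):
        yield text[pos:]


print("".join(speakable_segments("今日は大神（おおみわ）へ行く")))
```

Walking the text from the beginning and substituting only when the next string is a kanji character string follows the order of operations given for recitation.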

The text-to-speech device (information processing devices 1 and 2) according to aspect 10 of the present invention, in aspect 5 above, may further include the display unit, and the kana character string extraction unit (reading kana extraction units 43 and 83) may extract the kana character string associated with the kanji character string selected by the user from the displayed text data.

According to the above configuration, the user selects a kanji character string in the text. Then, in place of the kanji character string selected by the user, the kana character string associated with that kanji character string in the text data is read aloud. As described above, when the device is configured to extract the kana reading of the kanji character string as the kana character string associated with it, the kana reading is read aloud in place of the kanji character string.

As a result, when the user selects a kanji character string to be read aloud, the text-to-speech device can read it aloud with the correct reading even if the kanji character string has multiple readings.
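The selection case can be sketched in a few lines. The sketch assumes that the selection is given as character offsets into the displayed text and that the associated kana follows the selection in full-width parentheses. The function name `text_to_read` and this offset-based interface are hypothetical.

```python
import re

# Kana reading in full-width parentheses (assumed notation and Unicode ranges).
KANA_IN_PARENS = re.compile(r"（([\u3041-\u3096\u30A1-\u30FA\u30FC]+)）")


def text_to_read(text, start, end):
    """Return the string to read aloud for the selection text[start:end]."""
    selected = text[start:end]
    # Look for a parenthesized kana string starting exactly where the selection ends.
    m = KANA_IN_PARENS.match(text, end)
    return m.group(1) if m else selected


displayed = "大神（おおみわ）神社"
print(text_to_read(displayed, 0, 2))   # selection 「大神」
print(text_to_read(displayed, 8, 10))  # selection 「神社」, no reading attached
```

When no kana character string is associated with the selection, the sketch returns the selected text unchanged. That fallback is a choice made for the example.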

A control method for a text-to-speech device according to aspect 11 of the present invention is a control method for a text-to-speech device that reads text data aloud by converting text data to be displayed on a display unit into voice data and outputting the converted voice data. The method includes: a kana character string extraction step of extracting from the text data, as a kana reading, the kana character string associated with the kanji character string within a selected range of the text data while the text data, which includes a kanji character string and a kana character string, is displayed on the display unit; and a voice data generation step of generating voice data for reading the kanji character string aloud based on the kana character string extracted in the kana character string extraction step, in place of the kanji character string.

According to the above configuration, the same effects as those of the text-to-speech device according to aspect 5 can be obtained.

The information processing device (1, 2) according to each aspect of the present invention may be realized by a computer. In this case, a control program for the text-to-speech device that realizes the text-to-speech device on the computer by causing the computer to operate as each unit of the text-to-speech device, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in the respective embodiments.

The present invention can be suitably used for various text-to-speech devices that read text aloud, such as electronic dictionaries, electronic book terminals, and tablet terminals.

1, 2 Information processing device (text-to-speech device)
30 TTS engine (kanji character string voice data generation unit)
42 Kanji extraction unit (kanji character string extraction unit)
43, 83 Reading kana extraction unit (kana character string extraction unit)
44 Kanji-kana replacement unit (kanji character string voice data generation unit)
931 Japanese TTS engine (voice data generation unit, first voice data generation unit)
932 English TTS engine (voice data generation unit)
933 Chinese TTS engine (voice data generation unit, second voice data generation unit)
941 Language determination unit (selection unit)

Claims

A text-to-speech device that reads text data aloud by converting text data to be displayed on a display unit into voice data and outputting the converted voice data, the device comprising:
a plurality of voice data generation units that respectively generate voice data in the above languages, including a first voice data generation unit that generates voice data of a first language and a second language, and a second voice data generation unit that generates voice data of the first language and a third language; and
a selection unit that determines the language type of the character strings within a selected range of the text data and selects, from the plurality of voice data generation units, a voice data generation unit capable of generating voice data of the character strings of the plurality of types of languages included in the text data,
wherein, when the first language is included in the selected range of the text data, the selection unit selects (i) the first voice data generation unit when the second language is further included, and (ii) the second voice data generation unit when the third language is further included.

The text-to-speech device according to claim 1, wherein the selection unit selects, from among the voice data generation units capable of generating voice data of a character string of interest within the selected range of the text data, a voice data generation unit capable of generating voice data of more of the character strings within the selected range of the text data.

The text-to-speech device according to claim 1 or 2, wherein, when the selected range of the text data contains a character string for which the voice data generation unit selected by the selection unit cannot generate voice data, the selection unit further selects another voice data generation unit capable of generating voice data of that character string.