JP5445244B2 - Speech synthesis apparatus, speech synthesis method, and speech synthesis program

Info

Publication number: JP5445244B2
Application number: JP2010054846A
Authority: JP
Inventors: 健太郎村瀬; 伸之片江; 拓也野田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2010-03-11
Filing date: 2010-03-11
Publication date: 2014-03-19
Anticipated expiration: 2030-03-11
Also published as: JP2011191332A

Description

The present invention relates to a method of using a user dictionary in speech synthesis.

In speech synthesis technology, which takes text as input and generates speech that reads it aloud, raising the rate at which text is read correctly requires registering many words in a language dictionary that maps each word's notation to its reading. However, as the number of registered words grows, so does the number of heteronyms (words written identically but read differently), and these cause reading errors.

Techniques have therefore been disclosed that start from a basic dictionary of words appearing frequently in Japanese and, as needed, add a user dictionary that each user manages independently. A method of switching among specialized dictionaries that collect the technical terms of each field has also been disclosed, as has a method of using other people's user dictionaries (see, for example, Patent Documents 1 to 5).

Patent Document 1: Japanese Patent No. 3894479
Patent Document 2: JP 2007-206975 A
Patent Document 3: Japanese Patent No. 3220096
Patent Document 4: JP H11-327871 A
Patent Document 5: JP 2007-80019 A

However, these methods have the following problems. First, using a user dictionary requires the user to register a fair number of words personally, and this registration work is a heavy burden on the user.

When specialized dictionaries for each field are prepared in advance, new words cannot be handled, and it is practically impossible to prepare in advance specialized dictionaries that cover the entire, vast Japanese vocabulary. Furthermore, when specialized dictionaries are used, the specialized dictionary for the appropriate field cannot be selected unless the user and the dictionary's creator classify fields in the same way.

In the method that uses other people's user dictionaries, a plurality of other people's user dictionaries are searched for an unknown word of a given notation, and the reading registered most often is imported into the basic dictionary as the reading of that unknown word. Because this method can register only one reading per notation, it cannot handle heteronyms.

Another method compares one's own user dictionary with other people's user dictionaries and imports into one's own dictionary the words that a highly similar dictionary contains but one's own lacks. The problem with this method is that user dictionaries are managed independently by individuals, so the probability that they contain similar words is low. For example, suppose Mr. A is interested in politics, economics, and baseball, frequently synthesizes texts on those topics, and has registered related words in his user dictionary. Suppose Mr. B is interested in entertainment, weather, and baseball, and has registered words related to those topics. Comparing the words registered in the two dictionaries, only the baseball portion is similar; the portions on entertainment and weather are not. The proportion of dissimilar data is therefore large, and Mr. A and Mr. B cannot share their user dictionaries. Conversely, even if both had registered only baseball-related words and the similarity were high, that very similarity would leave few words for Mr. A to import, so sharing would yield little benefit. Thus the probability that dictionaries can be shared is low, and even when they can be shared the benefit is small.

Accordingly, an object of the present invention is to improve the rate of correct readings even for unknown words that are not registered in one's own user dictionary, by making effective use of other people's user dictionaries.

To achieve the above object, the speech synthesis apparatus disclosed below can access a basic dictionary and a user dictionary of a user, the user dictionary storing headwords, readings of the headwords, and surrounding text related to the headwords in association with one another. The apparatus comprises: a text input unit that accepts input of a synthesis target text; an interface unit that makes it possible to refer to the user dictionaries of a plurality of other persons, each of which stores headwords, readings of the headwords, and surrounding text related to the headwords in association with one another; an unknown word extraction unit that extracts, from the synthesis target text accepted by the text input unit, an unknown word contained in neither the user's user dictionary nor the basic dictionary; a surrounding text extraction unit that extracts surrounding text related to the unknown word from the synthesis target text containing the unknown word extracted by the unknown word extraction unit; a headword extraction unit that refers to the other persons' user dictionaries via the interface unit and extracts from them a headword matching the unknown word extracted by the unknown word extraction unit; a matching degree calculation unit that calculates the degree of matching between the surrounding text associated with the headword extracted from the other persons' user dictionaries by the headword extraction unit and the surrounding text extracted from the synthesis target text by the surrounding text extraction unit; and a reading determination unit that determines, based on the degree of matching calculated by the matching degree calculation unit, whether to adopt the reading of the headword extracted from the other persons' user dictionaries by the headword extraction unit as the reading of the unknown word extracted from the synthesis target text by the unknown word extraction unit.

According to the above configuration, making effective use of other people's user dictionaries improves the rate of correct readings even for unknown words that are not registered in one's own user dictionary.

FIG. 1 is a block diagram showing an example of the overall configuration of the speech synthesis apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a flowchart showing an example of the operation of the speech synthesis apparatus according to Embodiment 1 of the present invention.
FIG. 3 is a diagram showing examples of various data.
FIG. 4 is a block diagram showing an example of the configuration of the speech synthesis apparatus according to Embodiment 2 of the present invention.
FIG. 5 is a block diagram showing an example of the overall configuration of the speech synthesis apparatus according to Embodiment 3 of the present invention.

[Embodiment 1]
FIG. 1 is a block diagram showing the overall configuration of a speech synthesis apparatus 100 according to Embodiment 1 of the present invention. In FIG. 1, the speech synthesis apparatus 100 includes a basic dictionary 110, a user dictionary 120, a user dictionary interface unit 130, a text input unit 140, a language processing unit 150, a waveform processing unit 160, and a speech output unit 170.

The basic dictionary 110 stores the basic words needed for speech synthesis. The basic dictionary may also be called a standard dictionary. The user dictionary 120 is a user-specific dictionary in which the user registers words as needed. The other persons' user dictionaries 121, 122, and 123 are dictionaries specific to other users, in which those users independently register words as needed. The basic dictionary 110 stores, for example, the notations of words that appear frequently in Japanese (the headwords), the readings of those headwords, their accents, and the like.

The user dictionary 120 stores the notations of words registered by the user (the headwords), the readings and accents of those headwords, and surrounding text related to each headword. The other persons' user dictionaries 121, 122, and 123 store the notations of words registered by the other users (the headwords), the readings and accents of those headwords, and surrounding text related to each headword. The basic dictionary 110 may also store surrounding text related to its headwords.
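The entry layout described above can be pictured with a small data structure. The following is a minimal sketch rather than the apparatus's actual storage format: the names `Entry` and `UserDictionary`, the separate optional `accent` field, the choice to allow several entries per headword, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Entry:
    """One registered word (hypothetical layout)."""
    headword: str                      # notation of the word
    reading: str                       # how the word is read aloud
    accent: Optional[str] = None       # accent information, if kept separately
    surrounding_text: Set[str] = field(default_factory=set)  # related terms


class UserDictionary:
    """A per-user dictionary. Assumption: a headword may have several entries."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._entries: Dict[str, List[Entry]] = {}

    def register(self, entry: Entry) -> None:
        self._entries.setdefault(entry.headword, []).append(entry)

    def lookup(self, headword: str) -> List[Entry]:
        # Returns an empty list when this user has not registered the headword.
        return list(self._entries.get(headword, []))


# Illustrative entry only; the values are made up for this sketch.
sample = UserDictionary(owner="example-user")
sample.register(Entry("IC", "interchange", surrounding_text={"expressway", "toll"}))
```

Keeping the surrounding text on the entry itself, rather than on the dictionary as a whole, is what later allows a single entry to be compared against the context of an unknown word.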

The speech synthesis apparatus 100 can exchange user dictionary information between different users via the user dictionary interface unit 130 and the interface units 131 to 133 of the other persons' user dictionaries.

The text input unit 140 accepts input of the text to be synthesized. For example, the text may be typed by the user on a keyboard, read electronically through a drive for media such as CDs or flexible disks, read by OCR with a scanner or similar device, received electronically over a wired or wireless network, or supplied by any combination of these.

The input text is sent to the language processing unit 150, which analyzes the text and outputs its reading, accent, and the like. The language processing unit 150 includes a morphological analysis unit 151, an unknown word extraction unit 152, a surrounding text extraction unit 153, a headword extraction unit 154, a feature level setting unit 155, a matching degree calculation unit 156, and a reading determination unit 157.

The morphological analysis unit 151 performs morphological analysis using the basic dictionary 110 and the user dictionary 120, and determines readings, parts of speech, and the like.

From the result of decomposing the text into morphemes, the unknown word extraction unit 152 extracts as an unknown word any morpheme sequence in the input text that is judged to be registered in neither the basic dictionary 110 nor the user dictionary 120.

From the result of the morphological analysis, the surrounding text extraction unit 153 extracts words in the input text that are related to the unknown word as surrounding text. The surrounding text extraction unit 153 can morphologically analyze the sentence containing the unknown word, or sentences near it, and extract related surrounding text based on the attributes of each word obtained from the analysis. The attributes of a word include its grammatical properties, such as its part of speech. As an extraction method, only specific words may be extracted, for example only words of a specific part of speech, or phrases containing a specific part of speech may be extracted. The extracted surrounding text is not limited to single words; it may be, for example, a whole sentence or a phrase containing several words.

The extraction processing in the unknown word extraction unit 152 and the surrounding text extraction unit 153 can also be performed without morphological analysis. For example, runs of consecutive kanji, katakana, or alphabetic characters may simply be treated as surrounding text and extracted, and any such run that is registered in neither the basic dictionary 110 nor the user dictionary 120 may then be extracted as an unknown word.
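The analysis-free variant just described can be sketched with a regular expression. This is only an illustration under stated assumptions: the function names are hypothetical, the Unicode ranges are one plausible reading of "kanji, katakana, or alphabetic characters", and each run is assumed to consist of a single script, which the text does not specify.

```python
import re
from typing import List, Set

# Assumption of this sketch: one run = consecutive characters of a single script.
RUN_PATTERN = re.compile(
    r"[\u4E00-\u9FFF]+"                      # kanji (CJK Unified Ideographs)
    r"|[\u30A1-\u30FA\u30FC]+"               # katakana, including the long-vowel mark
    r"|[A-Za-z\uFF21-\uFF3A\uFF41-\uFF5A]+"  # half-width and full-width Latin letters
)


def extract_surrounding_candidates(text: str) -> List[str]:
    """Return every script run in the text, in order of appearance."""
    return RUN_PATTERN.findall(text)


def extract_unknown_words(text: str, known_words: Set[str]) -> List[str]:
    """Return the runs found in neither dictionary (passed in as known_words)."""
    unknown: List[str] = []
    for run in extract_surrounding_candidates(text):
        if run not in known_words and run not in unknown:
            unknown.append(run)
    return unknown
```

Because hiragana breaks a run, a word such as 出入り口 is split into fragments by this heuristic, which is one reason the morphological analysis route generally gives cleaner surrounding text.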

The headword extraction unit 154 refers to the other persons' user dictionaries via the user dictionary interface unit 130 and the interface units 131 to 133 of those dictionaries, and extracts from them any headword that matches the unknown word extracted by the unknown word extraction unit 152.

The feature level setting unit 155 sets a feature level, according to attribute, for the surrounding text associated with the headword extracted by the headword extraction unit 154. In other words, it assigns an attribute-dependent feature level to the surrounding text stored in association with the headword that matches the unknown word. The feature level can be, for example, a value indicating how strongly the text relates to the context of the input text. The feature level setting unit 155 can set feature levels by referring, for example, to an accessible table that stores feature levels in advance in association with surrounding text attributes. The higher the feature level, the more likely the text is to indicate the context of the passage. For example, the table may assign feature level 5 to text registered in the user dictionary, 4 to proper nouns, 3 to compound nouns, and 2 to other common nouns, so that the feature level of each extracted piece of surrounding text can be set by table lookup.
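The table lookup described above might look like the following sketch. The level values 5, 4, 3, and 2 come from the example in the text; the attribute key names, the function name, and the default of 0 for attributes missing from the table are assumptions for illustration.

```python
from typing import Dict

# Example levels from the text; the attribute key names are hypothetical.
FEATURE_LEVEL_TABLE: Dict[str, int] = {
    "user_dictionary_word": 5,
    "proper_noun": 4,
    "compound_noun": 3,
    "common_noun": 2,
}


def set_feature_levels(
    attributes: Dict[str, str],
    table: Dict[str, int] = FEATURE_LEVEL_TABLE,
    default: int = 0,
) -> Dict[str, int]:
    """Map each surrounding-text term to a feature level via its attribute.

    `attributes` maps term -> attribute name. Attributes absent from the
    table receive `default` (an assumption of this sketch).
    """
    return {term: table.get(attr, default) for term, attr in attributes.items()}


levels = set_feature_levels(
    {
        "expressway": "user_dictionary_word",
        "ordinary road": "compound_noun",
        "toll": "common_noun",
    }
)
```

Keeping the levels in a table rather than in code means the weighting can be retuned without touching the matching logic.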

The matching degree calculation unit 156 calculates the degree of matching between the surrounding text stored in association with the headword that matches the unknown word and the surrounding text extracted from the input text. For example, the degree of matching may be obtained by summing the feature levels of the surrounding text items that the two sets have in common. In this example the sum of the feature levels is used as the degree of matching, but the degree of matching is not limited to this. Any representative value that expresses the degree of matching may be used, such as the integrated value, average, maximum, or minimum of the feature levels.
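As a rough illustration of the calculation, the sketch below scores the overlap between two sets of surrounding text using the sum and three of the alternative representative values named above. The function name, the `aggregate` parameter, the English placeholder terms, the sample levels, and the choice to return 0 when nothing matches are assumptions, not details given in the text.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Set

# Representative values named in the text; "sum" is the one used in the example.
AGGREGATES: Dict[str, Callable[[Sequence[int]], float]] = {
    "sum": sum,
    "mean": mean,
    "max": max,
    "min": min,
}


def matching_degree(
    input_levels: Dict[str, int],
    entry_terms: Set[str],
    aggregate: str = "sum",
) -> float:
    """Aggregate the feature levels of terms present in both sets.

    `input_levels` maps each surrounding-text term of the input text to its
    feature level; `entry_terms` is the surrounding text stored with the
    candidate headword. Returns 0 when no term matches (an assumption).
    """
    matched = [level for term, level in input_levels.items() if term in entry_terms]
    if not matched:
        return 0
    return AGGREGATES[aggregate](matched)


input_levels = {"expressway": 4, "ordinary road": 2, "entrance/exit": 1}
entry_terms = {"ETC only", "expressway", "ordinary road", "toll"}
```

With the sum, entries that share many high-level terms score highest; the maximum or minimum instead make the score depend on a single shared term, which may or may not suit a given dictionary.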

The feature level setting unit 155 is optional, not essential. Without it, the degree of matching can be calculated, for example, from the number of surrounding text items extracted from the input text that match surrounding text items extracted from another person's user dictionary.

Based on the degree of matching calculated by the matching degree calculation unit 156, the reading determination unit 157 determines whether to adopt the reading of the headword extracted from another person's user dictionary as the reading of the unknown word extracted from the input text.

For example, the reading determination unit 157 takes the headword, extracted from the other persons' user dictionaries, whose associated surrounding text had the highest degree of matching with the surrounding text extracted from the input text, as calculated by the matching degree calculation unit 156, and adopts that headword's reading as the reading of the unknown word. In addition to the reading, the reading determination unit 157 may also determine the accent and other necessary information.

The waveform processing unit 160 generates synthesized speech data according to the reading and accent information output from the language processing unit 150. Although not shown, the waveform processing unit 160 may have a waveform dictionary for synthesizing speech. For example, the waveform processing unit 160 can generate synthesized speech by concatenating speech segments from the waveform dictionary while adjusting the pitch to produce the target accent through digital signal processing such as the PSOLA (Pitch Synchronous Overlap Add) method.

The speech output unit 170 converts the speech data generated by the waveform processing unit 160 into the required audio format and outputs it.

In the above description, the speech synthesis apparatus 100 includes the basic dictionary 110, the user dictionary 120, the user dictionary interface unit 130, the text input unit 140, the language processing unit 150, the waveform processing unit 160, and the speech output unit 170, but its configuration is not limited to this. For example, the speech synthesis apparatus 100 may reside on a server connected to a network. In that case, the text input unit 140 may be configured to receive text entered at a user terminal connected to the network, and the speech output unit 170 may be configured to transmit the synthesized speech data to that user terminal over the network. The server may also store the other persons' user dictionaries and, when it accepts text input from a plurality of users, synthesize speech for each input text using the respective user's dictionary. The functional units of the speech synthesis apparatus 100 may also be distributed across a plurality of computers.

The operation of the speech synthesis apparatus according to Embodiment 1 of the present invention is described below with reference to the flowchart of FIG. 2 and the data examples of FIG. 3. Assume that the user dictionaries of the user, Mr. A, and Mr. B contain, as shown in FIG. 3(c), FIG. 3(a), and FIG. 3(b) respectively, words related to the texts each of them usually has read aloud.

First, assume that the synthesis target text shown in FIG. 3(d), which describes expressway interchanges, is input to the text input unit 140 (step S201).

Next, the morphological analysis unit 151 performs morphological analysis on the input synthesis target text (step S202).

From the result of the morphological analysis, the unknown word extraction unit 152 extracts as unknown words the words registered in neither the basic dictionary nor the user dictionary (step S203). Here, assume that "IC" is extracted as an unknown word.

Next, the surrounding text extraction unit 153 extracts surrounding text from the sentence containing the unknown word "IC" or from the sentences before and after it. Here, a configuration that extracts nouns as surrounding text is described as an example. Assume that the nouns in the sentence containing "IC", namely "ordinary road" (一般道路), "expressway" (高速道路), and "entrance/exit" (出入り口), are extracted as surrounding text (step S204).

Next, the headword extraction unit 154 refers to the other persons' user dictionaries via the user dictionary interface unit 130 and the interface units 131 to 133 of those dictionaries, and extracts from them the headwords that match the unknown word extracted by the unknown word extraction unit 152, together with the surrounding text associated with each headword (step S205). Here, searching Mr. A's user dictionary for a headword corresponding to "IC" first yields the entry "IC: インターチェ’ンジ" (interchange) with the surrounding text "ETC only" (ＥＴＣ専用), "expressway" (高速道路), "ordinary road" (一般道路), and "toll" (料金). Searching Mr. B's user dictionary next yields the entry "IC: アイシ’ー" (the letters I-C read out) with the surrounding text "integrated circuit" (集積回路), "element" (素子), "semiconductor" (半導体), and "electronic circuit" (電子回路).

Next, the feature level setting unit 155 sets a feature level, according to attribute, both for the surrounding text associated with the headword that the headword extraction unit 154 extracted as matching the unknown word and for the surrounding text extracted by the surrounding text extraction unit 153 (step S206). In this example, words registered in the user's own user dictionary receive the highest feature level, 4; proper nouns registered in the basic dictionary receive 3; compound nouns receive 2; and other nouns receive 1. Accordingly, "expressway" (高速道路), which is already registered in the user's own user dictionary, has feature level 4; "ordinary road" (一般道路), a compound word, has feature level 2; and "entrance/exit" (出入り口), a common noun, has feature level 1.

Frequency information may also be used to set feature levels. For example, suppose the adjacent sentences containing the unknown word "IC" read as follows: "'IC' refers to an entrance/exit connecting an ordinary road and an expressway. ICs are generally installed on expressways about every 10 km. A point where one expressway connects to another expressway is called a JC." In this case "expressway" (高速道路) is extracted four times as surrounding text of "IC", and this occurrence count may be used directly as the feature level.
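The frequency-based alternative can be checked with a few lines of code. In this sketch the function name is hypothetical and plain substring counting stands in for counting terms extracted by the surrounding text extraction unit; the passage is the example from the text in its original Japanese.

```python
from typing import Dict, Iterable


def frequency_feature_levels(context: str, terms: Iterable[str]) -> Dict[str, int]:
    """Use each term's occurrence count in the context as its feature level.

    str.count counts non-overlapping occurrences of the substring.
    """
    return {term: context.count(term) for term in terms}


# The example passage from the text (original Japanese).
CONTEXT = (
    "「ＩＣ」とは、一般道路と高速道路を繋ぐ出入り口のこと。"
    "一般的に高速道路には約１０ｋｍ毎にICが設置されている。"
    "高速道路と高速道路がつながる地点はJCと呼ばれる。"
)

levels = frequency_feature_levels(CONTEXT, ["高速道路", "一般道路", "出入り口"])
```

Here 高速道路 (expressway) occurs once in each of the first two sentences and twice in the third, giving the count of four mentioned above.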

Next, the matching degree calculation unit 156 calculates the degree of matching between the surrounding text stored in association with the headword that matches the unknown word and the surrounding text extracted from the input text (step S207). Here it compares the surrounding text already extracted from the input text, namely "expressway", "ordinary road", and "entrance/exit", with the surrounding text stored with the headword in Mr. A's user dictionary, namely "ETC only", "expressway", "ordinary road", and "toll". The degree of matching is calculated, for example, as the sum of the feature levels of the matching items: 4 for "expressway" plus 2 for "ordinary road", so the degree of matching with Mr. A is 6. For Mr. B, the degree of matching with the surrounding text extracted from the input text is calculated in the same way. Because no surrounding text matches at all, the degree of matching with Mr. B is 0.

Next, the reading determination unit 157 takes the headword, extracted from the other persons' user dictionaries, whose associated surrounding text had the highest degree of matching with the surrounding text extracted from the input text, and adopts its reading as the reading of the unknown word. Here the degree of matching with Mr. A is 6 and that with Mr. B is 0, so Mr. A's reading, "IC: インターチェ’ンジ", is adopted (step S208). Finally, the reading determination unit 157 combines the adopted reading with the portions whose readings were already established by the morphological analysis and assigns the reading and accent to the whole text (step S209).
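Steps S207 and S208 of the worked example can be reproduced with the data quoted above. This is a minimal sketch under stated assumptions: `Candidate`, `score`, `decide_reading`, and the `min_degree` threshold are hypothetical names, and rejecting every candidate when no surrounding text matches is one possible interpretation of deciding "whether" to adopt a reading.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Candidate(NamedTuple):
    owner: str                    # whose user dictionary the entry came from
    reading: str                  # reading registered for the headword
    surrounding_text: Set[str]    # surrounding text stored with the headword


def score(input_levels: Dict[str, int], candidate: Candidate) -> int:
    """Sum the feature levels of the terms shared with the candidate (S207)."""
    return sum(
        level
        for term, level in input_levels.items()
        if term in candidate.surrounding_text
    )


def decide_reading(
    input_levels: Dict[str, int],
    candidates: List[Candidate],
    min_degree: int = 1,          # hypothetical threshold, not from the text
) -> Optional[Candidate]:
    """Pick the candidate with the highest degree of matching (S208)."""
    best: Optional[Candidate] = None
    best_degree = 0
    for candidate in candidates:
        degree = score(input_levels, candidate)
        if degree > best_degree:
            best, best_degree = candidate, degree
    return best if best_degree >= min_degree else None


# Feature levels set in step S206 for the input text's surrounding text.
INPUT_LEVELS = {"高速道路": 4, "一般道路": 2, "出入り口": 1}
MR_A = Candidate("A", "インターチェ’ンジ", {"ＥＴＣ専用", "高速道路", "一般道路", "料金"})
MR_B = Candidate("B", "アイシ’ー", {"集積回路", "素子", "半導体", "電子回路"})

chosen = decide_reading(INPUT_LEVELS, [MR_A, MR_B])
```

In this sketch a tie is resolved in favor of the candidate seen first, which is an arbitrary choice; the text does not say how ties should be handled.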

By searching other people's user dictionaries with not only the unknown word but also the surrounding text likely to be used in discussing the same topic, the above method can retrieve from those dictionaries a reading for the unknown word that was used in a similar context. This mitigates the heteronym problem. In addition, because the speech synthesis apparatus of this embodiment compares other people's user dictionaries against only the unknown word and surrounding text contained in the text to be synthesized, rather than against the user's whole dictionary, it can find and use a dictionary in which an appropriate reading is registered even when the user dictionaries themselves have low similarity.

Furthermore, the present invention improves the reliability of the degree of matching by assigning feature levels to the surrounding text. For example, assigning high feature levels to words in the user dictionary and to proper nouns gives more weight to the surrounding text that characterizes the context containing the unknown word, which raises the probability of selecting an appropriate user dictionary and thus improves reading accuracy.

In short, readings can be obtained correctly for words that are not registered in the user's own user dictionary. Words that users have registered, and words that appear as unknown words, are words that the basic dictionary of frequent Japanese words could not cover. They therefore include many proper nouns and technical terms, and they characterize the content of the synthesis target text. Comparing these words with the words registered in other people's user dictionaries thus makes it possible to identify other users who have already had text of similar content read aloud. Moreover, because such a user dictionary was prepared for reading aloud text of similar content, it is highly likely that the correct reading is registered in it, which avoids the heteronym problem. Unlike the conventional use of field-specific dictionaries, there is no need to estimate a specific field from the text to be synthesized, so the effects of field classification and field estimation errors need not be considered. Finally, because the latest user dictionaries, which users continually maintain, can be used, new words can also be handled.

[Embodiment 2]
The speech synthesizer according to the present embodiment may further include an unknown word conversion unit that applies a predetermined conversion to the notation of the unknown word extracted from the synthesis target text by the unknown word extraction unit 152, and generates an unknown word that can be regarded as identical to the original. In other words, the unknown word conversion unit converts the notation of the extracted unknown word into a notation that can be regarded as the same. For example, conversion rules are recorded in advance, and the unknown word conversion unit performs the conversion using these rules. The conversion rules may be, for example, referable data that defines the correspondence between the notation before conversion and the notation after conversion, or they may be defined by a predetermined conversion program. The speech synthesizer may likewise further include a surrounding text conversion unit that applies a predetermined conversion to the notation of the surrounding text extracted from the synthesis target text by the surrounding text extraction unit 153, and generates surrounding text that can be regarded as identical to the original. The surrounding text conversion unit converts the notation of the extracted surrounding text into a notation that can be regarded as the same.

This makes it possible to handle cases where the notation of the unknown word and/or the surrounding text differs but the readings can be regarded as identical. For example, this configuration is effective when the unknown word is written "出入り口" and the headword registered in another person's user dictionary is written "出入口" (both read "deiriguchi").

The matching degree calculation unit of the speech synthesizer according to Embodiment 2 of the present invention is configured as shown in the block diagram of FIG. 4. Compared with the configuration of Embodiment 1, the matching degree calculation unit 156 includes a headword matching degree determination unit 180 and a surrounding text matching degree determination unit 190. The headword matching degree determination unit 180 includes a reading expansion unit 181, a specific character deletion unit 182, and a character conversion unit 183. The surrounding text matching degree determination unit 190 includes a reading expansion unit 191, a specific character deletion unit 192, and a character conversion unit 193.

The headword matching degree determination unit 180 determines the degree of matching between the unknown word and a headword registered in another person's user dictionary. The surrounding text matching degree determination unit 190 determines the degree of matching between the surrounding text of the synthesis target text and the surrounding text registered in another person's user dictionary.

The reading expansion units 181 and 191 generate text in which the kanji of the unknown word, the user dictionary headword, and/or the surrounding text are converted to hiragana. The specific character deletion units 182 and 192 generate text from which okurigana and symbols such as hyphens and periods have been deleted. The character conversion units 183 and 193 generate converted text by converting between uppercase and lowercase letters, between hiragana and katakana, between Roman and Arabic numerals, and so on.

If the text produced by these conversions completely matches a headword in another person's user dictionary, the unknown word and that headword are regarded as matching even though their notations differ.
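The three conversions and the complete-match rule can be sketched as follows. This is a minimal illustration, not the patented implementation: the `READINGS` table is a hypothetical stand-in for a real kanji-to-kana converter, okurigana deletion is approximated crudely as "hiragana directly after a kanji", and Roman-to-Arabic numeral conversion is omitted. All function names are assumptions introduced for this example.

```python
import re

# Hypothetical reading table standing in for a real kanji-to-kana converter.
READINGS = {"出入口": "でいりぐち", "出入り口": "でいりぐち"}

KANJI = "[\u4e00-\u9fff]"
HIRAGANA = "[\u3041-\u3096]"


def expand_reading(text):
    """Reading expansion: replace a known kanji word with its hiragana reading."""
    return READINGS.get(text, text)


def delete_specific_chars(text):
    """Delete hyphens and periods, then hiragana that directly follows a kanji.

    Treating every such hiragana run as okurigana is a crude assumption.
    """
    text = re.sub(r"[-.]", "", text)
    return re.sub(f"(?<={KANJI}){HIRAGANA}+", "", text)


def convert_chars(text):
    """Character conversion: lowercase Latin letters, map katakana to hiragana."""
    out = []
    for ch in text.lower():
        if "\u30a1" <= ch <= "\u30f6":
            ch = chr(ord(ch) - 0x60)  # katakana block sits 0x60 above hiragana
        out.append(ch)
    return "".join(out)


def variants(text):
    """All notations regarded as the same as `text`, including `text` itself."""
    return {text, expand_reading(text), delete_specific_chars(text), convert_chars(text)}


def headwords_match(unknown_word, headword):
    """True if any converted form of the unknown word equals the headword exactly."""
    return headword in variants(unknown_word)
```

Here only the unknown word is converted and the headword is compared as registered; applying `variants` to both sides and testing for a non-empty intersection would give the alternative in which both are converted.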

Next, when the headwords are regarded as matching, the surrounding text is compared in the same way as the headwords. In this comparison as well, surrounding text whose notation does not match but which matches after reading expansion, specific character deletion, or character conversion is regarded as the same surrounding text. The feature level addition unit 194 then adds up the feature levels to calculate the degree of matching.

The above describes an example in which the converted unknown word of the synthesis target text is compared with the unconverted headword of another person's user dictionary, but the comparison is not limited to this. The converted unknown word may instead be compared with a converted headword of another person's user dictionary. Similarly, the above describes an example in which the converted surrounding text of the synthesis target text is compared with the unconverted surrounding text of another person's user dictionary, but the converted surrounding text of the synthesis target text may instead be compared with converted surrounding text of another person's user dictionary.

[Embodiment 3]
The overall configuration of the speech synthesizer according to Embodiment 3 of the present invention is shown in the block diagram of FIG. 5.

The speech synthesizer according to the present embodiment may further include a user dictionary update unit 200 that, when the degree of matching calculated by the matching degree calculation unit 156 is equal to or greater than a predetermined value, additionally registers the headword of the other person's user dictionary and the reading of that headword in the user's own user dictionary. The user dictionary update unit 200 may also be configured to additionally register, in the user's own user dictionary, those pieces of surrounding text related to the additionally registered headword whose feature level is equal to or greater than a predetermined value.

This allows the contents of other people's user dictionaries to be imported into the user's own user dictionary. As the user's own user dictionary grows, it becomes unnecessary to refer to other people's user dictionaries, which enables faster processing.

Compared with the configuration of Embodiment 1, the speech synthesizer according to Embodiment 3 of the present invention further includes the user dictionary update unit 200, which includes a validity determination unit 210, a registered surrounding text determination unit 220, and a registration unit 230. The user dictionary update unit 200 additionally registers, in the user's own user dictionary, the reading from another person's user dictionary that was determined by the reading determination unit 157.

For example, assume that the text shown in FIG. 3(e) is input, and that "IC" is extracted from it as the unknown word and "半導体" (semiconductor), "回路" (circuit), "集積回路" (integrated circuit), and "電子回路" (electronic circuit) are extracted as the surrounding text. The matching degree calculation unit 156 calculates the degree of matching between the surrounding text already extracted from the input text and the surrounding text in each other person's user dictionary. The calculation is the same as in Embodiment 1: for example, the matching degree calculation unit 156 sums the feature levels of the matching items. No surrounding text matches for user A, so the degree of matching with user A is 0. For user B, the degree of matching is the sum of 1 for "半導体", 4 for "集積回路", and 2 for "電子回路", which is 7. The reading determination unit 157 is assumed to adopt "アイシー" (the letter-by-letter reading of "IC"), which is user B's reading for the headword.
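The arithmetic in this example can be reproduced with a short sketch. It assumes that each entry in another person's user dictionary stores its surrounding text together with a feature level, which is one possible reading of the example. User B's values are those given above; user A's reading and surrounding text are hypothetical placeholders chosen only so that nothing matches, and the function names are introduced for illustration.

```python
def matching_degree(input_surrounding, entry_surrounding):
    """Sum the feature levels of surrounding text that appears on both sides."""
    return sum(
        level
        for text, level in entry_surrounding.items()
        if text in input_surrounding
    )


def choose_reading(input_surrounding, candidates):
    """Return (owner, reading, degree) for the candidate with the highest degree."""
    best = max(
        candidates,
        key=lambda c: matching_degree(input_surrounding, c["surrounding"]),
    )
    degree = matching_degree(input_surrounding, best["surrounding"])
    return best["owner"], best["reading"], degree


# Surrounding text extracted from the input text for the unknown word "IC".
input_surrounding = {"半導体", "回路", "集積回路", "電子回路"}

candidates = [
    # User A: hypothetical entry with no overlapping surrounding text.
    {"owner": "A", "reading": "インターチェンジ",
     "surrounding": {"高速道路": 3, "出口": 1}},
    # User B: feature levels taken from the example in the text.
    {"owner": "B", "reading": "アイシー",
     "surrounding": {"半導体": 1, "集積回路": 4, "電子回路": 2}},
]

owner, reading, degree = choose_reading(input_surrounding, candidates)
print(owner, reading, degree)
```

With these inputs the degree of matching is 0 for user A and 1 + 4 + 2 = 7 for user B, so user B's reading is selected.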

The validity determination unit 210 of the user dictionary update unit 200 then determines whether the degree of matching is equal to or greater than a threshold. If it is, the reading is likely to be the correct one for similar text, so the user dictionary update unit 200 decides to additionally register user B's reading of the headword, "アイシー", in the user's own dictionary.

Next, the registered surrounding text determination unit 220 selects and adopts, from the surrounding text in the other person's user dictionary and in the user's own user dictionary, the items whose feature level is equal to or greater than a fixed value. Adopting surrounding text with a high feature level, even when it does not appear in the text the user entered, improves the ability to handle varied input. The registered surrounding text determination unit 220 may be omitted from the configuration of the user dictionary update unit 200.

Finally, the registration unit 230 additionally registers an entry in the user's own user dictionary, with the unknown word as the headword, the reading from the other person's user dictionary as the reading, and the surrounding text selected by the registered surrounding text determination unit 220 as the surrounding text. The feature levels may be registered at the same time.
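The three steps of the user dictionary update (validity determination, selection of surrounding text, registration) can be sketched as one function. This is an illustrative sketch only: the dictionary layout, the function name, and the default values `threshold=5` and `min_level=2` are assumptions, since the description does not give concrete thresholds.

```python
def update_user_dictionary(user_dict, unknown_word, reading, degree,
                           other_surrounding, own_surrounding=None,
                           threshold=5, min_level=2):
    """Register a reading borrowed from another person's user dictionary.

    Returns True if an entry was registered, False otherwise.
    """
    # Validity determination: reject readings with a low degree of matching.
    if degree < threshold:
        return False

    # Surrounding text selection: pool both sources, keeping the higher
    # feature level when the same text appears in both (an assumption).
    pooled = dict(own_surrounding or {})
    for text, level in other_surrounding.items():
        pooled[text] = max(level, pooled.get(text, 0))
    selected = {t: lv for t, lv in pooled.items() if lv >= min_level}

    # Registration: headword, reading, surrounding text, and feature levels.
    user_dict[unknown_word] = {"reading": reading, "surrounding": selected}
    return True


user_dict = {}
registered = update_user_dictionary(
    user_dict,
    unknown_word="IC",
    reading="アイシー",
    degree=7,
    other_surrounding={"半導体": 1, "集積回路": 4, "電子回路": 2},
)
print(registered, user_dict)
```

Under these assumed thresholds, the entry for "IC" is registered with "集積回路" and "電子回路" as surrounding text, while "半導体" (feature level 1) is left out.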

The configurations described in the above embodiments are merely specific examples and do not limit the technical scope of the present invention. Any configuration may be adopted as long as it achieves the effects of the present invention.

Embodiments of the present invention also include the case where a software program that implements the above-described embodiments (in the embodiments, a program corresponding to the flowchart shown in FIG. 2) is supplied to an apparatus, and the computer of that apparatus reads and executes the supplied program. Accordingly, the program itself, installed in a computer in order to implement the functional processing described in the embodiments, is also an embodiment of the present invention. That is, a program for implementing the functional processing of the present invention is included as one aspect of the embodiments, as is a medium on which such a program is recorded.

100 Speech synthesizer
110 Basic dictionary
120 User dictionary
121 Other person's user dictionary
122 Other person's user dictionary
123 Other person's user dictionary
130 User dictionary interface unit
131 User dictionary interface unit
132 User dictionary interface unit
133 User dictionary interface unit
140 Text input unit
150 Language processing unit
151 Morphological analysis unit
152 Unknown word extraction unit
153 Surrounding text extraction unit
154 Headword extraction unit
155 Feature level setting unit
156 Matching degree calculation unit
157 Reading determination unit
160 Waveform processing unit
170 Speech output unit
180 Headword matching degree determination unit
181 Reading expansion unit
182 Specific character deletion unit
183 Character conversion unit
190 Surrounding text matching degree determination unit
191 Reading expansion unit
192 Specific character deletion unit
193 Character conversion unit
194 Feature level addition unit
200 User dictionary update unit
210 Validity determination unit
220 Registered surrounding text determination unit
230 Registration unit

Claims

A speech synthesizer capable of accessing a basic dictionary and a user dictionary of a user, the user dictionary storing a headword, a reading of the headword, and surrounding text related to the headword in association with one another, the speech synthesizer comprising:
a text input unit that accepts input of synthesis target text;
an interface unit that enables reference to a plurality of other people's user dictionaries, each storing a headword, a reading of the headword, and surrounding text related to the headword in association with one another;
an unknown word extraction unit that extracts, from the synthesis target text accepted by the text input unit, an unknown word that is included in neither the basic dictionary nor the user dictionary of the user;
a surrounding text extraction unit that extracts surrounding text related to the unknown word from the synthesis target text containing the unknown word extracted by the unknown word extraction unit;
a headword extraction unit that refers to the other people's user dictionaries via the interface unit and extracts, from the other people's user dictionaries, a headword that matches the unknown word extracted by the unknown word extraction unit;
a matching degree calculation unit that calculates a degree of matching between the surrounding text associated, in the other people's user dictionaries, with the headword extracted by the headword extraction unit and the surrounding text extracted from the synthesis target text by the surrounding text extraction unit; and
a reading determination unit that determines, based on the degree of matching calculated by the matching degree calculation unit, whether to use the reading, from the other people's user dictionaries, of the headword extracted by the headword extraction unit as the reading of the unknown word extracted from the synthesis target text by the unknown word extraction unit.

The speech synthesizer according to claim 1, further comprising a feature level setting unit that sets, for the surrounding text extracted by the surrounding text extraction unit, a feature level according to an attribute of the surrounding text,
wherein the matching degree calculation unit calculates the degree of matching using the feature level of the surrounding text set by the feature level setting unit.

The speech synthesizer according to claim 2, wherein the matching degree calculation unit calculates, as the degree of matching, the sum of the feature levels of the surrounding text for the headword extracted from the other people's user dictionaries by the headword extraction unit, and
the reading determination unit determines, as the reading of the unknown word, the reading of the headword for which the sum of the feature levels is highest.

The speech synthesizer according to claim 1, further comprising an unknown word conversion unit that applies a predetermined conversion to the notation of the unknown word extracted from the synthesis target text by the unknown word extraction unit and generates an unknown word that can be regarded as identical to the unknown word,
wherein the headword extraction unit refers to the other people's user dictionaries via the interface unit and extracts, from the other people's user dictionaries, a headword that matches the unknown word generated by the conversion of the unknown word conversion unit, and
the matching degree calculation unit calculates the degree of matching between the surrounding text associated, in the other people's user dictionaries, with the headword extracted by the headword extraction unit and the surrounding text extracted from the synthesis target text by the surrounding text extraction unit.

The speech synthesizer according to claim 1, further comprising a surrounding text conversion unit that applies a predetermined conversion to the notation of the surrounding text extracted from the synthesis target text by the surrounding text extraction unit and generates surrounding text that can be regarded as identical to the surrounding text,
wherein the matching degree calculation unit also calculates the degree of matching between the surrounding text associated, in the other people's user dictionaries, with the headword extracted by the headword extraction unit and the surrounding text generated by the conversion of the surrounding text conversion unit.

The speech synthesizer according to claim 1, further comprising a user dictionary update unit that, when the degree of matching calculated by the matching degree calculation unit is equal to or greater than a predetermined value, additionally registers the headword of the other person's user dictionary and the reading of the headword in the user dictionary of the user.

The speech synthesizer according to claim 6, wherein the user dictionary update unit further additionally registers, in the user dictionary of the user, surrounding text that is related to the additionally registered headword and whose feature level is equal to or greater than a predetermined value.

A speech synthesis method in which a computer capable of accessing a basic dictionary and a user dictionary of a user, the user dictionary storing a headword, a reading of the headword, and surrounding text related to the headword in association with one another, executes:
a text input step of accepting input of synthesis target text;
an unknown word extraction step of extracting, from the synthesis target text accepted in the text input step, an unknown word that is included in neither the basic dictionary nor the user dictionary of the user;
a surrounding text extraction step of extracting surrounding text related to the unknown word from the synthesis target text containing the unknown word extracted in the unknown word extraction step;
a headword extraction step of referring to other people's user dictionaries via an interface unit and extracting, from the other people's user dictionaries, a headword that matches the unknown word extracted in the unknown word extraction step;
a matching degree calculation step of calculating a degree of matching between the surrounding text associated, in the other people's user dictionaries, with the headword extracted in the headword extraction step and the surrounding text extracted from the synthesis target text in the surrounding text extraction step; and
a reading determination step of determining, based on the degree of matching calculated in the matching degree calculation step, whether to use the reading, from the other people's user dictionaries, of the headword extracted by the headword extraction unit as the reading of the unknown word extracted from the synthesis target text in the unknown word extraction step.

A speech synthesis program that causes a computer capable of accessing a basic dictionary and a user dictionary of a user, the user dictionary storing a headword, a reading of the headword, and surrounding text related to the headword in association with one another, to execute:
a text input step of accepting input of synthesis target text;
an unknown word extraction step of extracting, from the synthesis target text accepted in the text input step, an unknown word that is included in neither the basic dictionary nor the user dictionary of the user;
a surrounding text extraction step of extracting surrounding text related to the unknown word from the synthesis target text containing the unknown word extracted in the unknown word extraction step;
a headword extraction step of referring to other people's user dictionaries via an interface unit and extracting, from the other people's user dictionaries, a headword that matches the unknown word extracted in the unknown word extraction step;
a matching degree calculation step of calculating a degree of matching between the surrounding text associated, in the other people's user dictionaries, with the headword extracted in the headword extraction step and the surrounding text extracted from the synthesis target text in the surrounding text extraction step; and
a reading determination step of determining, based on the degree of matching calculated in the matching degree calculation step, whether to use the reading, from the other people's user dictionaries, of the headword extracted by the headword extraction unit as the reading of the unknown word extracted from the synthesis target text in the unknown word extraction step.