JP2011133658A

JP2011133658A - Device, method and program for synthesizing audio

Info

Publication number: JP2011133658A
Application number: JP2009293029A
Authority: JP
Inventors: Kentaro Murase; 健太郎村瀬
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-12-24
Filing date: 2009-12-24
Publication date: 2011-07-07
Anticipated expiration: 2029-12-24
Also published as: JP5325086B2

Abstract

PROBLEM TO BE SOLVED: To improve the correct reading rate even for unknown words not registered in a user's own user dictionary, by making effective use of other persons' user dictionaries without preparing field-specific dictionaries in advance.

SOLUTION: A speech synthesizer includes: a user registered word extraction unit that extracts user registered words contained in a user dictionary from a synthesis target text whose input has been accepted by a text input unit; a content rate calculation unit that calculates, for each of a plurality of other persons' user dictionaries accessed via an interface, the content rate at which the extracted user registered words are contained in that dictionary; a reference user dictionary selection unit that selects, from among the other persons' user dictionaries, the user dictionaries to be used, based on the content rate calculated for each of them; and a reading determination unit that determines readings of unknown words contained in neither a basic dictionary nor the user dictionary, using the selected user dictionaries of other persons.

COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a method of using user dictionaries in speech synthesis.

In speech synthesis technology, which takes text as input and generates speech that reads it aloud, raising the correct reading rate requires registering many words in a language dictionary that maps each word's notation to its reading. However, as the number of registered words grows, so does the number of homographs (words with the same notation but different readings), which causes reading errors.

For this reason, techniques have been disclosed that start from a basic dictionary of words that appear frequently in Japanese and, as needed, add a user dictionary managed individually by each user. Methods that switch among specialized dictionaries of technical terms for each field have also been disclosed, as have methods that use other persons' user dictionaries (see, for example, Patent Documents 1 to 5).

Patent Document 1: Japanese Patent No. 3894479
Patent Document 2: JP 2007-206975 A
Patent Document 3: Japanese Patent No. 3220096
Patent Document 4: JP H11-327871 A
Patent Document 5: JP 2007-80019 A

However, these methods have the following problems. First, a user dictionary is only useful once the user has personally registered a substantial number of words, so the registration work places a heavy burden on the user.

Second, when field-specific specialized dictionaries are prepared in advance, they cannot keep up with new words, and it is practically impossible to prepare in advance specialized dictionaries that cover the entire, vast Japanese vocabulary. Furthermore, unless the user and the creators of the specialized dictionaries classify fields in the same way, the user cannot select the specialized dictionary for the appropriate field.

In one method that uses other persons' user dictionaries, several other persons' user dictionaries are searched for an unknown word with a given notation, and a reading with a high registration rate is imported into the basic dictionary as the reading of that unknown word. Because this method can register only one reading per notation, it cannot handle homographs.

In another method, the similarity between one's own user dictionary and each other person's user dictionary is computed, and words that appear in a highly similar dictionary but not in one's own are imported into one's own dictionary. The problem is that user dictionaries are managed independently by individuals, so the probability that they contain similar words is low. For example, suppose that Mr. A is interested in politics, economics, and baseball, frequently synthesizes text on those topics, and has registered related words in his user dictionary, while Mr. B is interested in entertainment, weather, and baseball and has registered words related to those topics. Comparing the words registered in the two user dictionaries, only the baseball portion is similar; two thirds of the entries are dissimilar, so the dictionaries cannot be shared. Conversely, even if both Mr. A and Mr. B had registered only baseball-related words and the similarity were high, that very similarity would mean that few words remain to be imported into Mr. A's dictionary, so sharing would bring little benefit. Thus, the probability that dictionaries can be shared is low, and even when they can be shared the benefit is small.

Accordingly, an object of the present invention is to improve the correct reading rate even for unknown words that are not registered in one's own user dictionary, by making effective use of other persons' user dictionaries.

To achieve the above object, the speech synthesizer disclosed below includes: a text input unit that accepts input of a synthesis target text; a user registered word extraction unit that extracts, from the synthesis target text accepted by the text input unit, user registered words contained in the user dictionary; a content rate calculation unit that calculates the content rate at which the user registered words extracted by the user registered word extraction unit are contained in other persons' user dictionaries; a reference user dictionary selection unit that selects, from the other persons' user dictionaries, the user dictionaries to be used, based on the content rate calculated by the content rate calculation unit; and a reading determination unit that determines the readings of unknown words contained in neither the basic dictionary nor the user dictionary, using the user dictionaries selected by the reference user dictionary selection unit.

With the above configuration, making effective use of other persons' user dictionaries improves the correct reading rate even for unknown words that are not registered in one's own user dictionary.

FIG. 1 is a block diagram showing the overall configuration of the speech synthesizer according to Embodiment 1 of the present invention.
FIG. 2 is a flowchart showing the operation of the speech synthesizer according to Embodiment 1 of the present invention.
FIG. 3 is a diagram showing examples of various data.
FIG. 4 is a block diagram showing the overall configuration of the speech synthesizer according to Embodiments 2 and 3 of the present invention.
FIG. 5 is a block diagram showing the overall configuration of the speech synthesizer according to Embodiment 4 of the present invention.

[Embodiment 1]
FIG. 1 is a block diagram showing the overall configuration of a speech synthesizer 100 according to Embodiment 1 of the present invention. In FIG. 1, the speech synthesizer 100 includes a basic dictionary 110, a user dictionary 120, a user dictionary interface unit 130, a text input unit 140, a language processing unit 150, a waveform processing unit 160, and a speech output unit 170.

The basic dictionary 110 stores the basic words needed for speech synthesis. The user dictionary 120 is a user-specific dictionary in which the user registers words as needed. The other persons' user dictionaries 121, 122, and 123 are dictionaries specific to other users, in which each of those users independently registers words as needed. The basic dictionary 110 stores, for example, the notation (headword), reading, accent, and so on of words that appear frequently in Japanese. The user dictionary 120 stores the notation (headword), reading, accent, and so on of words registered by the user. The other persons' user dictionaries 121, 122, and 123 store the same information for words registered by the other users. User dictionary information can be exchanged between different users via the user dictionary interface unit 130 and the interface units 131 to 133 of the other persons' user dictionaries.
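As a rough illustration of the dictionary contents described above, each entry can be modelled as a record holding a notation, a reading, and an accent. The sketch below is only one possible representation: the class name `Entry`, the helper `make_dictionary`, the integer accent encoding, and the sample readings are all assumptions introduced for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    notation: str  # headword as it appears in text
    reading: str   # pronunciation, written here in katakana
    accent: int    # accent position; this integer encoding is an assumption


def make_dictionary(entries):
    """Index entries by notation so that lookup by headword is a dict access."""
    return {entry.notation: entry for entry in entries}


# Hypothetical contents for the user's own dictionary (user dictionary 120).
user_dictionary = make_dictionary([
    Entry("RAM", "ラム", 1),
    Entry("ROM", "ロム", 1),
])

print(user_dictionary["ROM"].reading)
```

Keying each dictionary by notation keeps the later membership checks (is this word a headword of that dictionary?) simple, at the cost of allowing only one reading per notation within a single dictionary.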

The text input unit 140 accepts input of the text to be synthesized. For example, it may be configured so that the user types text on a keyboard, so that text is input electronically via a drive that reads media such as CDs or flexible disks, so that text read by OCR with a scanner or similar device is input, so that text is received electronically over a wired or wireless network, or as any combination of these.

The input text is sent to the language processing unit 150, which analyzes the reading, accent, and so on of the input text and outputs the result. The language processing unit 150 includes a morphological analysis unit 151, an unknown word extraction unit 152, a user registered word extraction unit 153, a content rate calculation unit 154, a reference user dictionary selection unit 155, and a reading determination unit 156.

The morphological analysis unit 151 performs morphological analysis using the basic dictionary 110 and the user dictionary 120 and determines readings. From the result of splitting the text into words by morphological analysis, the unknown word extraction unit 152 extracts as unknown words those words in the input text that were judged to be registered in neither the basic dictionary 110 nor the user dictionary 120. From the same result, the user registered word extraction unit 153 extracts the words in the input text that were registered in the user dictionary 120. The unknown word extraction unit 152 and the user registered word extraction unit 153 may instead perform extraction without morphological analysis, by simply comparing the text against the dictionaries.

The content rate calculation unit 154 calculates the proportion of the unknown words and user registered words that are contained in each of the other persons' user dictionaries 121 to 123. The user dictionary interface unit (UDIC-IF) 130 obtains the other persons' user dictionary information over a communication line and passes it to the content rate calculation unit 154. In addition, when another person's speech synthesizer issues a request to reference user dictionary information, the user dictionary interface unit 130 provides the information contained in the user's own user dictionary 120.

Based on the content rate, which indicates how many of the unknown words and user registered words extracted by the unknown word extraction unit 152 and the user registered word extraction unit 153 are contained in each other person's user dictionary, the reference user dictionary selection unit 155 selects user dictionaries with a high content rate from among the other persons' user dictionaries as the user dictionaries to be used. For example, it selects every other person's user dictionary whose content rate is at or above a predetermined value. Alternatively, it may select the single dictionary with the highest content rate, or a predetermined number of dictionaries in descending order of content rate.
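The three selection policies named for the reference user dictionary selection unit 155 (threshold, single best, top N) can be sketched as follows. The function names, the dictionary labels, and the sample content rates are hypothetical and serve only to show how the policies differ.

```python
def select_by_threshold(rates, threshold):
    """Every dictionary whose content rate is at or above the threshold."""
    return [name for name, rate in rates.items() if rate >= threshold]


def select_best(rates):
    """The single dictionary with the highest content rate."""
    return max(rates, key=rates.get)


def select_top_n(rates, n):
    """The n dictionaries with the highest content rates, best first."""
    return sorted(rates, key=rates.get, reverse=True)[:n]


# Hypothetical content rates for three other persons' user dictionaries.
rates = {"A": 0.25, "B": 1.0, "C": 0.0}

print(select_by_threshold(rates, 0.5))  # ['B']
print(select_best(rates))               # B
print(select_top_n(rates, 2))           # ['B', 'A']
```

The threshold policy can return no dictionary at all, while the top-N policy always returns something even when every rate is low, so the choice depends on whether a poor match is preferable to no match.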

The reading determination unit 156 determines the reading, accent, and so on of the input text using the word information in the basic dictionary 110, the user dictionary 120, and the other persons' user dictionaries selected by the reference user dictionary selection unit 155. Two example methods are: (1) compare each word extracted as an unknown word by the unknown word extraction unit 152 with the notations in the selected other persons' user dictionaries, and assign the reading of the headword whose notation matches; or (2) redo the morphological analysis using the words in the basic dictionary 110, the user dictionary 120, and the selected other persons' user dictionaries. In addition to the reading, the reading determination unit 156 may also determine the accent and any other required information.

The waveform processing unit 160 generates synthesized speech data according to the reading and accent information output from the language processing unit 150. Although not illustrated, the waveform processing unit 160 may have a waveform dictionary for synthesizing speech. For example, the waveform processing unit 160 can generate synthesized speech by concatenating speech segments from the waveform dictionary while adjusting the pitch to produce the target accent, using digital signal processing such as the PSOLA (Pitch Synchronous Overlap Add) method, which is one of the linear prediction analysis methods.

The speech output unit 170 converts the speech data generated by the waveform processing unit 160 into the required audio format and outputs it.

In the above description, the speech synthesizer 100 includes the basic dictionary 110, the user dictionary 120, the user dictionary interface unit 130, the text input unit 140, the language processing unit 150, the waveform processing unit 160, and the speech output unit 170, but its configuration is not limited to this. For example, the speech synthesizer 100 may reside on a server connected to a network. In that case, the text input unit 140 can be configured to receive text entered at a user terminal connected to the network, and the speech output unit 170 can be configured to send the synthesized speech data to that user terminal over the network. The server may also store the other persons' user dictionaries so that, when it accepts text input from multiple users, it can synthesize speech for each user's input text using that user's own user dictionary. The functional units of the speech synthesizer 100 may also be distributed across multiple computers.

The operation of the speech synthesizer according to Embodiment 1 is described below with reference to the flowchart in FIG. 2 and the data examples in FIG. 3. It is assumed that the user dictionaries of the user, Mr. A, Mr. B, and Mr. C, shown in FIG. 3(e), (a), (b), and (c) respectively, each contain registered words related to the texts their owners usually have read aloud.

First, assume that the synthesis target text shown in FIG. 3(d), which describes IC cards, is input to the text input unit 140 (step S201).

Next, the morphological analysis unit 151 performs morphological analysis on the input synthesis target text (step S202).

From the result of splitting the text into words by morphological analysis, the user registered word extraction unit 153 extracts, as user registered words, the portions where words registered in the user dictionary are used (step S203). Here, "RAM" and "ROM", which are registered in the user's own user dictionary shown in FIG. 3(e), are extracted as user registered words.

The unknown word extraction unit 152 extracts, as unknown words, the portions that the morphological analysis found in neither the basic dictionary nor the user dictionary (step S204). Here, assume that "IC" and "EEPROM" are detected as unknown words. "EEPROM" stands for "Electrically Erasable and Programmable Read Only Memory", a type of ROM whose contents can be rewritten electrically.
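Steps S203 and S204 can be approximated without a morphological analyzer by comparing tokens directly against the dictionaries, which the description allows as an alternative. The sketch below is a simplification under stated assumptions: tokens are taken to be runs of ASCII letters and digits, the sample sentence is hypothetical (it is not the text of FIG. 3(d)), and the basic dictionary contents and readings are made up for the example.

```python
import re


def extract_words(text, basic_dict, user_dict):
    """Return (user_registered, unknown) words in order of first appearance."""
    # Assumption: a candidate word is a run of ASCII letters or digits.
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    registered, unknown = [], []
    for token in dict.fromkeys(tokens):  # drop repeats, keep order
        if token in user_dict:
            registered.append(token)      # step S203
        elif token not in basic_dict:
            unknown.append(token)         # step S204
    return registered, unknown


# Hypothetical dictionaries and input text.
basic_dict = {"An", "card", "contains", "and", "memory"}
user_dict = {"RAM": "ラム", "ROM": "ロム"}
text = "An IC card contains RAM, ROM and EEPROM memory."

registered, unknown = extract_words(text, basic_dict, user_dict)
print(registered)  # ['RAM', 'ROM']
print(unknown)     # ['IC', 'EEPROM']
```

Real Japanese text has no spaces between words, so a production system would rely on the morphological analysis unit 151 for segmentation; the regular expression here only stands in for that step.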

Next, the content rate calculation unit 154 calculates the content rate at which the user registered words and unknown words extracted by the user registered word extraction unit 153 and the unknown word extraction unit 152 are contained in each other person's dictionary (step S205). Compared against Mr. A's user dictionary shown in FIG. 3(a), only one of the four words in total, namely the unknown word "IC", is contained (the four words being the user registered words "RAM" and "ROM" and the unknown words "IC" and "EEPROM"), so the content rate is 25%. Mr. B's user dictionary shown in FIG. 3(b), in contrast, contains both user registered words and both unknown words, that is, all four words, so the content rate is 100%. This suggests that Mr. B has very likely already had text in the same field as the current synthesis target text read aloud. Mr. C's user dictionary contains none of the four words, so its content rate is 0%.
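The content rate in step S205 is the fraction of the extracted words that appear as headwords in a given other person's dictionary. The sketch below reproduces the 25%, 100%, and 0% figures from the example; the function name `content_rate` is an assumption, and the extra headwords placed in Mr. A's and Mr. C's dictionaries ("GDP", "J-POP") are hypothetical fillers, since the actual contents of FIG. 3 are not reproduced here.

```python
def content_rate(words, other_dict):
    """Fraction of `words` that are headwords of `other_dict`."""
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in other_dict)
    return hits / len(words)


# User registered words plus unknown words extracted from the input text.
words = ["RAM", "ROM", "IC", "EEPROM"]

# Headword sets of the other persons' user dictionaries (partly hypothetical).
others = {
    "A": {"IC", "GDP"},
    "B": {"RAM", "ROM", "IC", "EEPROM"},
    "C": {"J-POP"},
}

rates = {name: content_rate(words, heads) for name, heads in others.items()}
print(rates)  # {'A': 0.25, 'B': 1.0, 'C': 0.0}
```

Note that the denominator is the number of words extracted from the input text, not the size of the other person's dictionary, so a large dictionary is not penalized for covering other topics as well.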

The reference user dictionary selection unit 155 selects the other persons' user dictionaries to be used based on the content rate (step S206). Here, it decides to use Mr. B's user dictionary, which has the highest content rate at 100%, and not to use the dictionaries of Mr. A and Mr. C, whose content rates are low.

In this example, Mr. B's dictionary, which has the highest content rate, contained all of the unknown words. If the dictionary with the highest content rate (the first dictionary) does not contain all of the unknown words, second and third dictionaries that contain the unknown words missing from the first may be selected, for example from among the other persons' user dictionaries whose content rate is at or above a certain threshold, so that the selected set of dictionaries covers as many unknown words as possible.
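One way to read the multi-dictionary selection described above is as a greedy pass over the dictionaries in descending order of content rate: the best dictionary is always taken, and each further dictionary is added only if its content rate meets the threshold and it covers an unknown word that is still missing. This is an interpretation, not the patent's stated algorithm; the function name, the dictionary labels P, Q, R, the word "NFC", and the rates are all hypothetical.

```python
def select_covering(unknown, others, rates, threshold):
    """Greedily pick dictionaries until the unknown words are covered.

    Returns (selected dictionary names, unknown words still uncovered).
    """
    ranked = sorted(others, key=lambda name: rates[name], reverse=True)
    selected = []
    remaining = set(unknown)
    for index, name in enumerate(ranked):
        if not remaining:
            break
        # Dictionaries after the first must meet the threshold; because the
        # list is sorted, nothing further down can meet it either.
        if index > 0 and rates[name] < threshold:
            break
        covered = remaining & others[name]
        if index == 0 or covered:
            selected.append(name)
            remaining -= covered
    return selected, remaining


# Hypothetical data: no single dictionary covers every unknown word.
unknown = ["IC", "EEPROM", "NFC"]
others = {
    "P": {"IC", "EEPROM", "RAM"},
    "Q": {"NFC", "ROM"},
    "R": {"NFC"},
}
rates = {"P": 0.8, "Q": 0.4, "R": 0.1}

selected, remaining = select_covering(unknown, others, rates, threshold=0.3)
print(selected)   # ['P', 'Q']
print(remaining)  # set()
```

Processing candidates in rate order favors dictionaries that are more likely to match the topic of the input text, which matters for homographs, over dictionaries that merely happen to contain a missing word.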

For each portion extracted as an unknown word in step S204, the reading determination unit 156 consults the other persons' user dictionaries selected in step S206 and, if a usable headword exists, uses it to determine the reading, accent, and so on of that portion. Here, for the unknown words "IC" and "EEPROM", it uses the reading and accent information for "IC" and "EEPROM" in Mr. B's user dictionary to determine the reading and accent of the input text (step S207).
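Step S207, in its simplest form (method (1) of the reading determination unit 156), is a lookup of each unknown word in the selected dictionaries. In the sketch below the function name, the tuple layout `(reading, accent)`, and the concrete readings and accent values are assumptions made for illustration.

```python
def determine_readings(unknown, selected_dicts):
    """Map each unknown word to the first matching (reading, accent) entry.

    `selected_dicts` is a list of dictionaries in priority order. Words that
    appear in none of them are left out of the result.
    """
    readings = {}
    for word in unknown:
        for dictionary in selected_dicts:
            if word in dictionary:
                readings[word] = dictionary[word]
                break
    return readings


# Hypothetical entries of Mr. B's user dictionary: notation -> (reading, accent).
b_dict = {
    "RAM": ("ラム", 1),
    "ROM": ("ロム", 1),
    "IC": ("アイシー", 3),
    "EEPROM": ("イーイーピーロム", 5),
}

readings = determine_readings(["IC", "EEPROM"], [b_dict])
print(readings["IC"][0])      # アイシー
print(readings["EEPROM"][0])  # イーイーピーロム
```

Words that remain without a reading after this lookup would still need a fallback, such as a letter-by-letter reading; the description does not specify one, so none is shown.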

The above method yields correct readings for words that are not registered in one's own user dictionary. User registered words and unknown words are words that the basic dictionary, which holds words appearing frequently in Japanese, could not cover, so they include many proper nouns and technical terms and characterize the content of the synthesis target text. Comparing these words with the words registered in other persons' user dictionaries therefore makes it possible to identify other users who have already had text with similar content read aloud. Moreover, because such a user dictionary was prepared for reading text with similar content, the readings registered in it are likely to be the correct ones, which avoids the homograph problem. Unlike the conventional use of field-specific dictionaries, there is no need to estimate a specific field from the synthesis target text, so the effects of field classification and of field estimation errors need not be considered. New words can also be handled, because the method draws on up-to-date user dictionaries that users continually maintain. Although this embodiment has been described as calculating the content rate of both user registered words and unknown words, the content rate of only one of the two may be calculated instead.

[Embodiment 2]
The speech synthesizer according to this embodiment further includes a specialized dictionary 180, in which the technical terms of each specialized field are registered, and the language processing unit 150 further includes a specialized dictionary registered word extraction unit 190 that extracts, from the synthesis target text accepted by the text input unit 140, the specialized dictionary registered words contained in the specialized dictionary 180. In addition, the content rate calculation unit 154 of Embodiment 2 includes a weighted content rate calculation unit 200. It applies weights according to word type, such as the user registered words extracted by the user registered word extraction unit 153, the unknown words extracted by the unknown word extraction unit 152, and the specialized dictionary registered words extracted by the specialized dictionary registered word extraction unit 190, and calculates the content rate at which these words are contained in the other persons' user dictionaries 121, 122, and 123. Based on the weighted content rate of the user registered words, unknown words, and specialized dictionary registered words calculated by the weighted content rate calculation unit 200, the reference user dictionary selection unit 155 selects, as the user dictionaries to be used, the other persons' user dictionaries whose weighted content rate is at or above a predetermined value.

FIG. 4 is a block diagram showing the overall configuration of the speech synthesizer according to Embodiment 2 of the present invention. In addition to the configuration of Embodiment 1, it includes the specialized dictionary 180 and the specialized dictionary registered word extraction unit 190, and the content rate calculation unit 154 includes the weighted content rate calculation unit 200.

When calculating the content rate, the weighted content rate calculation unit 200 gives greater weight to specific types of words, such as user registered words, unknown words, or proper nouns, or to specific parts of speech. For example, suppose that one user registered word, one unknown word, and one specialized dictionary registered word are extracted from the synthesis target text, that Mr. A's user dictionary contains the user registered word and the unknown word, and that Mr. B's user dictionary contains the unknown word and the specialized dictionary registered word. Calculated without weighting, the content rate is 2/3 (about 67%) for both dictionaries, so the two cannot be distinguished. Now suppose that the weighted content rate calculation unit 200 assigns a greater weight to specialized dictionary registered words. Only one specialized dictionary registered word is actually contained, but if its weight is set to 2, it is counted as two words in the numerator, while the denominator remains the total number of extracted user registered words, unknown words, and specialized dictionary registered words, which is 3. Mr. B's weighted content rate is therefore the one unknown word plus the specialized dictionary registered word counted twice, divided by 3: (1 + 2)/3 = 100%, whereas Mr. A's remains 2/3. Although this example classifies words as user registered words, unknown words, and specialized dictionary registered words, words can also be classified by part of speech, such as proper noun or common noun. Word types may also combine dictionary type and part of speech, for example common nouns among user registered words and proper nouns among user registered words.
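The weighted calculation in this example can be written down directly: each contained word contributes its type's weight to the numerator, while the denominator stays at the unweighted count of extracted words. The sketch below follows that reading; the function name, the type labels, and the placeholder word names are hypothetical.

```python
def weighted_content_rate(words, other_dict, weights):
    """Weighted content rate of `words` in `other_dict`.

    `words` is a list of (notation, word_type) pairs. `weights` maps a word
    type to its weight; types not listed default to 1.0. The denominator is
    the plain number of extracted words, as in the example in the text.
    """
    if not words:
        return 0.0
    weighted_hits = sum(
        weights.get(word_type, 1.0)
        for notation, word_type in words
        if notation in other_dict
    )
    return weighted_hits / len(words)


# One word of each type, with placeholder names.
words = [
    ("user_word", "user"),
    ("unknown_word", "unknown"),
    ("special_word", "specialized"),
]
a_dict = {"user_word", "unknown_word"}
b_dict = {"unknown_word", "special_word"}
weights = {"specialized": 2.0}  # specialized dictionary words count double

rate_a = weighted_content_rate(words, a_dict, weights)
rate_b = weighted_content_rate(words, b_dict, weights)
print(round(rate_a, 3), rate_b)  # 0.667 1.0
```

Because only the numerator is weighted, this rate can exceed 100% when a dictionary contains every extracted word; if that is undesirable, one alternative (not described in the source) is to weight the denominator as well.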

The reference user dictionary selection unit 155 selects the user dictionary to be used from among the other persons' user dictionaries, based on the content rates of the user-registered words, unknown words, and proper-noun dictionary registered words after the weighted content rate calculation unit 200 has weighted them according to the specific word types (user-registered word, unknown word, specialized-dictionary registered word). For example, another person's user dictionary whose weighted content rate is at or above a predetermined value is selected as a user dictionary to be used. Alternatively, the other person's user dictionary with the highest weighted content rate may be chosen as the user dictionary to be used.
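The two selection policies mentioned here, a threshold and a single best dictionary, could look roughly like the following. The function names, the dictionary labels, and the rate values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def select_by_threshold(rates: Dict[str, float], threshold: float) -> List[str]:
    """Return every dictionary whose rate is at or above the threshold,
    ordered from highest to lowest rate."""
    chosen = [name for name, rate in rates.items() if rate >= threshold]
    return sorted(chosen, key=lambda name: -rates[name])


def select_best(rates: Dict[str, float]) -> Optional[str]:
    """Return the single dictionary with the highest rate, if any."""
    return max(rates, key=rates.get) if rates else None


# Hypothetical weighted content rates for three other persons' dictionaries.
rates = {"A": 0.66, "B": 1.0, "C": 0.2}

usable = select_by_threshold(rates, 0.5)
best = select_best(rates)
```

The threshold policy can return several dictionaries, which is the situation Embodiment 4 addresses when their readings disagree.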

The reading determination unit 156 determines the reading, accent, and so on of the input text using the word information registered in the basic dictionary 110, the user dictionary 120, the specialized dictionary 180, and the other person's user dictionary that the reference user dictionary selection unit 155 has selected for use.

As a result, even when texts with similar content are read aloud, setting a heavier weight for specialized-dictionary registered words places emphasis on the content rate of more specialized words, so that a suitable user dictionary of another person can be found while reading errors caused by homographs (words written identically but read differently) are prevented with high reliability. Setting a heavier weight for unknown words instead makes it possible to use a user dictionary that contains more of the unknown words, which raises the rate at which readings are assigned to unknown words. The correct reading rate can thus be improved in a way suited to the purpose.

[Embodiment 3]
The overall configuration of the speech synthesizer according to Embodiment 3 of the present invention is, as in Embodiment 2, that of the block diagram shown in FIG. 4.

In Embodiment 3, the weighted content rate calculation unit 200 applies different weights according to word type and part of speech when calculating the content rates of user-registered words, unknown words, and specialized-dictionary registered words. For example, the weights are ordered specialized-dictionary registered word > user-registered proper noun > unknown word > user-registered common noun, so that specialized-dictionary registered words are weighted most heavily.

The reference user dictionary selection unit 155 selects, from among the plurality of other persons' user dictionaries, those that can be used, based on the content rates of the user-registered words, unknown words, and specialized-dictionary registered words after the weighted content rate calculation unit 200 has weighted them in the order specialized-dictionary registered word > user-registered proper noun > unknown word > user-registered common noun. For example, another person's user dictionary whose content rate is at or above a predetermined value is selected as a user dictionary to be used. Alternatively, the other person's user dictionary with the highest content rate may be chosen. In addition to weighting by word type and part of speech as described above, the weighting may also reflect the appearance frequency (number of occurrences) of each word itself. For example, if the word 適時打 ("timely hit") is extracted as an unknown word from a synthesis target text about baseball and appears several times in that text, its weight may be increased according to its appearance frequency.
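The specification does not give a formula for frequency weighting. One possible reading, shown below purely as an assumption, is to let each word count as many times as it occurs in the text, in both numerator and denominator, so that the rate stays between 0 and 1. The function name and the second example word are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def frequency_weighted_rate(tokens: Iterable[str], other_dictionary: Set[str]) -> float:
    """Share of extracted word occurrences covered by the other dictionary.

    Assumption: a word that occurs n times is given weight n.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for word, n in counts.items() if word in other_dictionary)
    return covered / total


# "適時打" occurs three times in the text, "犠打" once (illustrative data).
tokens = ["適時打", "適時打", "適時打", "犠打"]

rate_with_frequent_word = frequency_weighted_rate(tokens, {"適時打"})
rate_with_rare_word = frequency_weighted_rate(tokens, {"犠打"})
```

Under this reading, a dictionary that knows the frequently occurring word scores 0.75, while one that knows only the rare word scores 0.25, even though each contains exactly one of the two distinct words.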

The reading determination unit 156 determines the reading, accent, and so on of the input text using the words in the basic dictionary 110, the user dictionary 120, the specialized dictionary 180, and the other person's user dictionary that the reference user dictionary selection unit 155 has selected for use.

This makes it possible to emphasize the content rates of specialized-dictionary registered words and user-dictionary registered words, which are likely to characterize the content of the text to be synthesized. Even when texts with similar content are read aloud, a suitable user dictionary of another person can be found while the problem of homographs with different readings is avoided with high probability. Although Embodiments 2 and 3 calculate content rates for all three of user-dictionary registered words, unknown words, and specialized-dictionary registered words, the content rate may instead be calculated for any one of the three.

[Embodiment 4]
In the speech synthesizer according to this embodiment, the reference user dictionary selection unit 155 selects a plurality of user dictionaries to be used. When the reading of an unknown word differs among those dictionaries, the reading determination unit determines the reading and accent of the unknown word using the dictionary with the highest content rate among them.

FIG. 5 shows the configuration of the reference user dictionary selection unit 155 and the reading determination unit 156 of the speech synthesizer 100 according to Embodiment 4 of the present invention. Compared with Embodiments 2 and 3, the reference user dictionary selection unit 155 further includes a content rate comparison unit 157, a homograph (same notation, different reading) detection unit 158, and a priority determination unit 159.

The content rate comparison unit 157 compares the content rates that the content rate calculation unit 154 has calculated for each of the other persons' dictionaries, and selects as user dictionaries to be used those whose content rate is at or above a certain value.

The homograph detection unit 158 checks whether any user dictionary registers a different reading for the headword of an unknown word or of a word registered in the user's own dictionary. The check is made among the other persons' user dictionaries that the content rate comparison unit selected as having at least the given content rate, or between the user's own user dictionary and the selected other persons' user dictionaries.
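A minimal sketch of this check follows, assuming each user dictionary is a simple mapping from headword to reading. The function name `find_conflicting_readings` and the sample entries are hypothetical; a real dictionary entry would also carry accent and part-of-speech information.

```python
from typing import Dict, Iterable

# Assumed representation: one mapping per user dictionary, headword -> reading.
UserDictionary = Dict[str, str]


def find_conflicting_readings(
    headwords: Iterable[str],
    dictionaries: Dict[str, UserDictionary],
) -> Dict[str, Dict[str, str]]:
    """For each headword registered with more than one distinct reading,
    return which dictionary registers which reading."""
    conflicts: Dict[str, Dict[str, str]] = {}
    for word in headwords:
        readings = {
            name: entries[word]
            for name, entries in dictionaries.items()
            if word in entries
        }
        if len(set(readings.values())) > 1:
            conflicts[word] = readings
    return conflicts


# Illustrative data: two dictionaries disagree on the reading of "市場".
dictionaries = {
    "A": {"市場": "しじょう"},
    "B": {"市場": "いちば"},
    "C": {"適時打": "てきじだ"},
}

conflicts = find_conflicting_readings(["市場", "適時打"], dictionaries)
```

Passing the user's own dictionary as one more entry of `dictionaries` covers the second case in the text, the comparison between one's own dictionary and the selected ones.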

When a plurality of user dictionaries register different readings for the headword of an unknown word, the priority determination unit 159 assigns priorities that decide which dictionary's reading is used, according to the content rate of user-registered words alone or of unknown words alone, or according to the number of the unknown words registered in each other person's user dictionary. For example, a high priority may be given to another person's user dictionary with a high content rate of user-registered words. Alternatively, a high priority may be given to another person's user dictionary in which many of the unknown words are registered as headwords.

With priorities set in this way, the former policy emphasizes reducing reading errors for homographs, while the latter emphasizes raising unknown-word coverage, that is, the proportion of unknown words that can be read.
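The two priority policies can be illustrated with a small resolver that picks, among the dictionaries containing a word, the one with the highest priority score. The function name, the scores, and the sample readings are assumptions made for this sketch.

```python
from typing import Dict, Optional

UserDictionary = Dict[str, str]  # assumed: headword -> reading


def resolve_reading(
    word: str,
    dictionaries: Dict[str, UserDictionary],
    priority: Dict[str, float],
) -> Optional[str]:
    """Take the reading from the highest-priority dictionary that has the word."""
    candidates = [name for name, entries in dictionaries.items() if word in entries]
    if not candidates:
        return None
    best = max(candidates, key=lambda name: priority.get(name, 0.0))
    return dictionaries[best][word]


dictionaries = {"A": {"市場": "しじょう"}, "B": {"市場": "いちば"}}

# Policy 1: priority = content rate of user-registered words (illustrative values).
by_user_word_rate = {"A": 0.8, "B": 0.5}
# Policy 2: priority = number of the unknown words registered (illustrative values).
by_unknown_count = {"A": 1, "B": 4}

reading_policy_1 = resolve_reading("市場", dictionaries, by_user_word_rate)
reading_policy_2 = resolve_reading("市場", dictionaries, by_unknown_count)
```

The same conflict resolves differently under the two policies, which is the trade-off between homograph accuracy and unknown-word coverage described in the text.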

The reading determination unit 156 consists of a temporary dictionary creation unit 161, a morphological analysis unit 162, and an accent determination unit 163. It performs morphological analysis using the words in the basic dictionary 110, the user dictionary 120, and the other persons' user dictionaries to be used, and determines readings, accents, and so on.

Once the other person's user dictionary to be used has been decided, the temporary dictionary creation unit 161 adds that dictionary to the user's own basic dictionary and user dictionary to create a temporary dictionary for the synthesis target text.
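As a rough sketch, building the temporary dictionary amounts to merging mappings. The specification does not state which entry wins when the same headword appears in more than one source, so the precedence used below (own user dictionary, then basic dictionary, then the selected dictionaries in priority order) is an assumption, as are the function name and the sample entries.

```python
from typing import Dict, List

Dictionary = Dict[str, str]  # assumed: headword -> reading


def build_temporary_dictionary(
    basic: Dictionary,
    user: Dictionary,
    selected_others: List[Dictionary],
) -> Dictionary:
    """Merge the selected dictionaries into one lookup table for this text.

    `selected_others` is ordered from highest to lowest priority.
    Assumed precedence: user > basic > selected_others[0] > selected_others[1] ...
    """
    temporary: Dictionary = {}
    # Apply lowest-priority sources first so later updates override them.
    for other in reversed(selected_others):
        temporary.update(other)
    temporary.update(basic)
    temporary.update(user)
    return temporary


basic = {"市場": "しじょう"}
user = {"富士通": "ふじつう"}
others = [{"適時打": "てきじだ", "市場": "いちば"}]

temporary = build_temporary_dictionary(basic, user, others)
```

The inputs are left unmodified, so the merged table can be discarded after the text has been synthesized, in keeping with the temporary nature of the dictionary.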

The morphological analysis unit 162 performs morphological analysis on the synthesis target text again, this time using the temporary dictionary, and assigns readings.

The accent determination unit 163 performs accent combination and related processing with reference to accent determination rules and the like, and determines the final accents.

Unlike Embodiment 1, morphological analysis is performed again on the synthesis target text using a language dictionary with which the unknown words can be read. Compared with analysis using the initial language dictionary, which does not contain those words, this handles phenomena such as reading changes in compound words and accent combination, and so improves the accuracy of morphological analysis. Analysis with a language dictionary that lacks the unknown words can locate the unknown-word portions but cannot handle accent combination or reading changes involving the words before and after them. The configuration of this embodiment improves accuracy in those portions.

As described above, according to this embodiment, the correct reading rate can be improved even for unknown words not registered in one's own user dictionary, by making effective use of other persons' user dictionaries without preparing field-specific dictionaries in advance. The problem of homographs with different readings can be handled, and readings can be analyzed correctly for unknown words not registered in one's own user dictionary. The effort of registering words in a user dictionary is reduced while the probability of correct reading is raised. New words can also be handled.

The configurations described in the above embodiments are merely specific examples and do not limit the technical scope of the present invention. Any configuration may be adopted as long as the effects of the present invention are obtained.

Embodiments of the present invention also include the case where a software program that implements the embodiments described above (in the embodiments, a program corresponding to the flowchart shown in FIG. 2) is supplied to an apparatus and a computer of that apparatus reads and executes the supplied program. Accordingly, the program itself, installed on a computer in order to implement the functional processing described in the embodiments, is also an embodiment of the present invention. In other words, a program for implementing the functional processing of the present invention is included as one aspect of the embodiments.

The following appendices are further disclosed with respect to Embodiments 1 to 4 above.

(Appendix 1)
A speech synthesizer having a basic dictionary and a user dictionary specific to a user, comprising:
a text input unit that accepts input of a synthesis target text;
an interface unit capable of referring to a plurality of other persons' user dictionaries;
a user-registered word extraction unit that extracts user-registered words contained in the user dictionary from the synthesis target text accepted by the text input unit;
a content rate calculation unit that calculates, via the interface unit and for each of the plurality of other persons' user dictionaries, the content rate at which the user-registered words extracted by the user-registered word extraction unit are contained in that dictionary;
a reference user dictionary selection unit that selects, from the other persons' user dictionaries, the other person's user dictionary to be used, based on the content rate calculated for each of the plurality of other persons' user dictionaries; and
a reading determination unit that determines the reading of an unknown word contained in neither the basic dictionary nor the user dictionary, using the other person's user dictionary selected for use by the reference user dictionary selection unit.

(Appendix 2)
The speech synthesizer according to Appendix 1, further comprising an unknown word extraction unit that extracts, from the synthesis target text, unknown words contained in neither the basic dictionary nor the user's user dictionary,
wherein the content rate calculation unit calculates, via the interface unit and for each of the plurality of other persons' user dictionaries, the content rate at which the unknown words extracted by the unknown word extraction unit are contained in that dictionary.

(Appendix 3)
The speech synthesizer according to Appendix 2, wherein the content rate calculation unit calculates the content rate at which the user-registered words extracted by the user-registered word extraction unit and the unknown words extracted by the unknown word extraction unit are contained in another person's user dictionary, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and unknown words calculated for each of the plurality of other persons' user dictionaries.

(Appendix 4)
The speech synthesizer according to Appendix 1, further comprising:
a specialized dictionary in which specialized words for each specialized field are registered; and
a specialized-dictionary registered word extraction unit that extracts specialized-dictionary registered words contained in the specialized dictionary from the synthesis target text accepted by the text input unit,
wherein the content rate calculation unit calculates the content rate at which the user-registered words extracted by the user-registered word extraction unit and the specialized-dictionary registered words extracted by the specialized-dictionary registered word extraction unit are contained in another person's user dictionary, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and specialized-dictionary registered words calculated by the content rate calculation unit.

(Appendix 5)
The speech synthesizer according to Appendix 1, further comprising:
an unknown word extraction unit that extracts, from the synthesis target text, unknown words contained in neither the basic dictionary nor the user's user dictionary;
a specialized dictionary in which specialized words for each specialized field are registered; and
a specialized-dictionary registered word extraction unit that extracts specialized-dictionary registered words contained in the specialized dictionary from the synthesis target text accepted by the text input unit,
wherein the content rate calculation unit calculates the content rate of inclusion in another person's user dictionary based on at least one of the user-registered words extracted by the user-registered word extraction unit, the unknown words extracted by the unknown word extraction unit, and the dictionary registered words extracted by the specialized-dictionary registered word extraction unit, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate calculated by the content rate calculation unit.

(Appendix 6)
The speech synthesizer according to any one of Appendices 1 to 5, wherein the content rate calculation unit calculates the content rate with weighting according to word type, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate after weighting by the content rate calculation unit.

(Appendix 7)
The speech synthesizer according to any one of Appendices 1 to 6, wherein the reference user dictionary selection unit selects a plurality of other persons' user dictionaries to be used, and
when the reading of an unknown word differs among the plurality of other persons' user dictionaries to be used, the reading determination unit determines the reading of the unknown word using, among the selected user dictionaries to be used, the other person's user dictionary with the highest content rate.

(Appendix 8)
A speech synthesis method using a basic dictionary and a user dictionary specific to a user, comprising:
a text input step in which a computer accepts input of a synthesis target text;
a user-registered word extraction step in which the computer extracts user-registered words contained in the user dictionary from the synthesis target text accepted in the text input step;
a content rate calculation step in which the computer calculates, via an interface unit and for each of a plurality of other persons' user dictionaries, the content rate at which the user-registered words extracted in the user-registered word extraction step are contained in that dictionary;
a reference user dictionary selection step in which the computer selects, from the other persons' user dictionaries, the other person's user dictionary to be used, based on the content rate calculated for each of the plurality of other persons' user dictionaries; and
a reading determination step in which the computer determines the reading of an unknown word contained in neither the basic dictionary nor the user dictionary, using the other person's user dictionary selected for use by the reference user dictionary selection unit.

(Appendix 9)
The speech synthesis method according to Appendix 8, further comprising an unknown word extraction step in which the computer extracts, from the synthesis target text, unknown words contained in neither the basic dictionary nor the user's user dictionary,
wherein in the content rate calculation step, the content rate at which the unknown words extracted in the unknown word extraction step are contained is calculated via the interface unit for each of the plurality of other persons' user dictionaries.

(Appendix 10)
A speech synthesis program using a basic dictionary and a user dictionary specific to a user, the program causing a computer to execute:
a text input step of accepting input of a synthesis target text;
a user-registered word extraction step of extracting user-registered words contained in the user dictionary from the synthesis target text accepted in the text input step;
a content rate calculation step of calculating, via an interface unit and for each of a plurality of other persons' user dictionaries, the content rate at which the user-registered words extracted in the user-registered word extraction step are contained in that dictionary;
a reference user dictionary selection step of selecting, from the other persons' user dictionaries, the other person's user dictionary to be used, based on the content rate calculated for each of the plurality of other persons' user dictionaries; and
a reading determination step of determining the reading of an unknown word contained in neither the basic dictionary nor the user dictionary, using the other person's user dictionary selected for use by the reference user dictionary selection unit.

(Appendix 11)
The speech synthesis program according to Appendix 10, further causing the computer to execute an unknown word extraction step of extracting, from the synthesis target text, unknown words contained in neither the basic dictionary nor the user's user dictionary,
wherein in the content rate calculation step, the content rate at which the unknown words extracted in the unknown word extraction step are contained is calculated via the interface unit for each of the plurality of other persons' user dictionaries.

(Appendix 12)
The speech synthesis program according to Appendix 11, wherein in the content rate calculation step, the content rate at which the user-registered words extracted in the user-registered word extraction step and the unknown words extracted in the unknown word extraction step are contained in another person's user dictionary is calculated, and
in the reference user dictionary selection step, the other person's user dictionary to be used is selected from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and unknown words calculated for each of the plurality of other persons' user dictionaries.

(Appendix 13)
The speech synthesis program according to Appendix 11, further causing the computer to execute a specialized-dictionary registered word extraction step of extracting, from the synthesis target text accepted by the text input unit, specialized-dictionary registered words contained in a specialized dictionary in which specialized words for each specialized field are registered,
wherein in the content rate calculation step, the content rate at which the user-registered words extracted in the user-registered word extraction step and the specialized-dictionary registered words extracted in the specialized-dictionary registered word extraction step are contained in another person's user dictionary is calculated, and
in the reference user dictionary selection step, the other person's user dictionary to be used is selected from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and specialized-dictionary registered words calculated in the content rate calculation step.

100 speech synthesizer
110 basic dictionary
120 user dictionary
130 user dictionary interface unit
140 text input unit
150 language processing unit
151 morphological analysis unit
152 unknown word extraction unit
153 user-registered word extraction unit
154 content rate calculation unit
155 reference user dictionary selection unit
156 reading determination unit
157 content rate comparison unit
158 homograph (same notation, different reading) detection unit
159 priority determination unit
160 waveform processing unit
161 temporary dictionary creation unit
162 morphological analysis unit
163 accent determination unit
170 speech output unit
180 specialized dictionary
190 specialized-dictionary registered word extraction unit
200 weighted content rate calculation unit

Claims

A speech synthesizer having a basic dictionary and a user dictionary specific to a user, comprising:
a text input unit that accepts input of a synthesis target text;
an interface unit capable of referring to a plurality of other persons' user dictionaries;
a user-registered word extraction unit that extracts user-registered words contained in the user dictionary from the synthesis target text accepted by the text input unit;
a content rate calculation unit that calculates, via the interface unit and for each of the plurality of other persons' user dictionaries, the content rate at which the user-registered words extracted by the user-registered word extraction unit are contained in that dictionary;
a reference user dictionary selection unit that selects, from the other persons' user dictionaries, the other person's user dictionary to be used, based on the content rate calculated for each of the plurality of other persons' user dictionaries; and
a reading determination unit that determines the reading of an unknown word contained in neither the basic dictionary nor the user dictionary, using the other person's user dictionary selected for use by the reference user dictionary selection unit.

The speech synthesizer according to claim 1, further comprising an unknown word extraction unit that extracts, from the synthesis target text, unknown words contained in neither the basic dictionary nor the user's user dictionary,
wherein the content rate calculation unit calculates, via the interface unit and for each of the plurality of other persons' user dictionaries, the content rate at which the unknown words extracted by the unknown word extraction unit are contained in that dictionary.

The speech synthesizer according to claim 2, wherein the content rate calculation unit calculates the content rate at which the user-registered words extracted by the user-registered word extraction unit and the unknown words extracted by the unknown word extraction unit are contained in another person's user dictionary, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and unknown words calculated for each of the plurality of other persons' user dictionaries.

The speech synthesizer according to claim 1, further comprising a specialized dictionary in which specialized words for each specialized field are registered, and
a specialized-dictionary registered word extraction unit that extracts specialized-dictionary registered words contained in the specialized dictionary from the synthesis target text accepted by the text input unit,
wherein the content rate calculation unit calculates the content rate at which the user-registered words extracted by the user-registered word extraction unit and the specialized-dictionary registered words extracted by the specialized-dictionary registered word extraction unit are contained in another person's user dictionary, and
the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries, based on the content rate of user-registered words and specialized-dictionary registered words calculated by the content rate calculation unit.

An unknown word extraction unit that extracts unknown words that are not included in the basic dictionary or the user dictionary of the user from the text to be synthesized;
A specialized dictionary in which specialized words for each specialized field are registered,
A specialized dictionary registration word extraction unit that extracts a specialized dictionary registration word included in the specialized dictionary from the synthesis target text accepted by the text input unit;
The content rate calculation unit calculates the content rate at which at least one of the user registration words extracted by the user registration word extraction unit, the unknown words extracted by the unknown word extraction unit, and the specialized dictionary registration words extracted by the specialized dictionary registration word extraction unit is included in each other person's user dictionary,
The speech synthesizer according to claim 1, wherein the reference user dictionary selection unit selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries based on the content rate calculated by the content rate calculation unit.
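
One possible reading of "at least one of" the three word categories is to pool whichever categories are available into a single word set and compute one content rate over it. The sketch below follows that reading; pooling, the ranking helper, and all names are assumptions for illustration rather than anything the claim prescribes.

```python
from typing import Dict, Iterable, List, Tuple

Dictionary = Dict[str, str]  # assumed layout: surface form -> reading


def extract_registered_words(tokens: Iterable[str],
                             dictionary: Dictionary) -> List[str]:
    """Return the distinct tokens registered in the given dictionary, in text order."""
    seen = set()
    found = []
    for token in tokens:
        if token in dictionary and token not in seen:
            seen.add(token)
            found.append(token)
    return found


def pooled_content_rate(other_dict: Dictionary,
                        user_words: Iterable[str] = (),
                        unknown_words: Iterable[str] = (),
                        specialized_words: Iterable[str] = ()) -> float:
    """Content rate over the union of whichever word categories were supplied."""
    pooled = set(user_words) | set(unknown_words) | set(specialized_words)
    if not pooled:
        raise ValueError("at least one word category must be non-empty")
    return sum(1 for word in pooled if word in other_dict) / len(pooled)


def rank_dictionaries(other_dicts: Dict[str, Dictionary],
                      **word_sets: Iterable[str]) -> List[Tuple[str, float]]:
    """Return (name, rate) pairs sorted from highest to lowest content rate."""
    scored = [(name, pooled_content_rate(other, **word_sets))
              for name, other in other_dicts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Pooling treats every category equally; keeping the categories separate and weighting them would be another reasonable design under the same claim language.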

A speech synthesis method that uses a basic dictionary and a user-specific user dictionary,
A text input step in which the computer accepts input of the text to be synthesized;
A user-registered word extracting step in which the computer extracts a user-registered word included in a user dictionary from the synthesis target text received in the text input step;
A content rate calculation step in which the computer calculates, via the interface unit, the content rate at which the user registration words extracted in the user registration word extraction step are included in each of a plurality of other persons' user dictionaries;
A reference user dictionary selection step in which the computer selects the other person's user dictionary to be used from the plurality of other persons' user dictionaries based on the content rate calculated for each of the plurality of other persons' user dictionaries;
A speech synthesis method including a reading determination step in which the computer determines the reading of an unknown word that is not included in either the basic dictionary or the user dictionary, using the other person's user dictionary to be used that was selected in the reference user dictionary selection step.
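
The steps of the method claim above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: tokenization is assumed to have been done already, dictionaries are plain mappings from surface form to reading, the user's own dictionary is assumed to take priority over the basic dictionary, the single dictionary with the highest content rate is selected, and a word found nowhere is passed through unchanged. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Dictionary = Dict[str, str]  # assumed layout: surface form -> reading


def determine_readings(tokens: List[str],
                       basic_dict: Dictionary,
                       user_dict: Dictionary,
                       other_dicts: Dict[str, Dictionary]
                       ) -> Tuple[Optional[str], List[str]]:
    """Return (name of the selected other dictionary or None, reading per token)."""
    # User registration word extraction step.
    user_words = {token for token in tokens if token in user_dict}

    # Content rate calculation step (assumed: fraction of distinct words shared).
    rates = {}
    for name, other in other_dicts.items():
        if user_words:
            shared = sum(1 for word in user_words if word in other)
            rates[name] = shared / len(user_words)
        else:
            rates[name] = 0.0

    # Reference user dictionary selection step (assumed: highest non-zero rate).
    selected: Optional[str] = None
    if rates:
        best = max(rates, key=rates.get)
        if rates[best] > 0.0:
            selected = best

    # Reading determination step.
    readings = []
    for token in tokens:
        if token in user_dict:
            readings.append(user_dict[token])
        elif token in basic_dict:
            readings.append(basic_dict[token])
        elif selected is not None and token in other_dicts[selected]:
            readings.append(other_dicts[selected][token])  # unknown word resolved
        else:
            readings.append(token)  # assumed fallback: pass the surface form through
    return selected, readings
```

Only unknown words are looked up in the selected dictionary, so another person's entries never override a reading the user or the basic dictionary already provides.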

A speech synthesis program that uses a basic dictionary and a user-specific user dictionary,
On the computer,
A text input step that accepts input of text to be synthesized;
A user registration word extraction step for extracting a user registration word included in the user dictionary from the synthesis target text accepted in the text input step;
A content rate calculation step for calculating, via the interface unit, the content rate at which the user registration words extracted in the user registration word extraction step are included in each of a plurality of other persons' user dictionaries;
A reference user dictionary selection step for selecting the other person's user dictionary to be used from the plurality of other persons' user dictionaries based on the content rate calculated for each of the plurality of other persons' user dictionaries;
A speech synthesis program for executing a reading determination step for determining the reading of an unknown word that is not included in either the basic dictionary or the user dictionary, using the other person's user dictionary to be used that was selected in the reference user dictionary selection step.

On the computer,
Further, an unknown word extraction step for extracting, from the synthesis target text, unknown words that are not included in either the basic dictionary or the user's own user dictionary is executed,
In the content rate calculation step, the content rate at which the user registration words extracted in the user registration word extraction step and the unknown words extracted in the unknown word extraction step are included in each other person's user dictionary is calculated,
The speech synthesis program according to claim 7, wherein, in the reference user dictionary selection step, the other person's user dictionary to be used is selected from the plurality of other persons' user dictionaries based on the content rates of the user registration words and the unknown words calculated for each of the plurality of other persons' user dictionaries.

On the computer,
Further, a specialized dictionary registration word extraction step is executed for extracting, from the synthesis target text accepted in the text input step, specialized dictionary registration words included in a specialized dictionary in which specialized words for each specialized field are registered,
In the content rate calculation step, the content rate at which the user registration words extracted in the user registration word extraction step and the specialized dictionary registration words extracted in the specialized dictionary registration word extraction step are included in each other person's user dictionary is calculated,
The speech synthesis program according to claim 7, wherein, in the reference user dictionary selection step, the other person's user dictionary to be used is selected from the plurality of other persons' user dictionaries based on the content rates of the user registration words and the specialized dictionary registration words calculated in the content rate calculation step.