WO2010113396A1 - Device, method, program for reading determination, computer readable medium therefor, and voice synthesis device - Google Patents


Info

Publication number
WO2010113396A1
Authority
WO
WIPO (PCT)
Prior art keywords
reading
word
information
reading determination
word set
Application number
PCT/JP2010/001753
Other languages
French (fr)
Japanese (ja)
Inventor
近藤玲史
安藤真一
Original Assignee
日本電気株式会社
Application filed by 日本電気株式会社 (NEC Corporation)
Priority to JP2011506983A priority Critical patent/JP5533853B2/en
Publication of WO2010113396A1 publication Critical patent/WO2010113396A1/en


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00: Speech synthesis; Text to speech systems
    • G10L 13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Definitions

  • The present invention relates to a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium therefor, for determining the reading of a word that has a plurality of reading candidates.
  • In particular, the present invention relates to a reading determination device, speech synthesizer, reading determination method, reading determination program, and computer-readable medium that can determine the reading of a word easily and appropriately.
  • As a method of determining how to read a given sentence, it is widely known, for example, to define in advance in a dictionary the "readings" (reading kana, accent information, etc.) of characters and words, and to determine the reading of the entire sentence based on the readings defined in the dictionary while checking the grammatical connections between the words in the sentence (for example, Non-Patent Documents 1 and 2).
  • A method of giving a sentence a more appropriate reading by considering phonological rules such as rendaku (sequential voicing) and devoicing is also known.
  • In the morphological analysis shown in Non-Patent Document 2, part-of-speech relationships are used as the rules. However, in Japanese, for example, there are many words (same-notation different-pronunciation word sets) that share the same notation and the same part of speech but have multiple readings depending on the field of use and the meaning.
  • For example, "市場" (noun) has two readings, "ichiba" and "shijo", and "黒子" (noun) has two readings, "hokuro" (mole) and "kurogo" (kabuki stagehand).
  • Furthermore, "磯" (noun) is read "iso", but it takes a flat accent when used as a common noun and a head-high accent when used as a personal name. Such differences are therefore also important when performing speech synthesis or the like. In order to give a sentence a correct reading, it is desirable to select appropriately among these multiple readings (including accents).
  • The present invention has been made to solve such problems, and its main object is to provide a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium that can determine how to read a word easily and appropriately.
  • One aspect of the present invention for achieving the above object is a reading determination device for determining how to read a word that has a plurality of reading candidates, comprising: word set generation means for generating, for each reading candidate, a word set consisting of a plurality of element words similar to that reading candidate; a corpus database that stores corpus information including a plurality of example sentences; feature amount calculation means for calculating, based on the corpus information stored in the corpus database, feature amounts for the plurality of element words of each word set generated by the word set generation means; and reading determination information generation means for generating reading determination information in which the feature amounts for the plurality of element words of each word set, as calculated by the feature amount calculation means, are associated with the corresponding reading candidates.
  • Another aspect of the present invention for achieving the above object is a reading determination method for determining how to read a word that has a plurality of reading candidates, in which a word set consisting of a plurality of element words similar to each reading candidate is generated; feature amounts for the plurality of element words of each generated word set are calculated based on corpus information including a plurality of example sentences; and reading determination information is generated in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidates.
  • A further aspect of the present invention for achieving the above object is a non-transitory computer-readable medium storing a reading determination program for determining how to read a word that has a plurality of reading candidates, the program causing a computer to execute: a process of generating, for each reading candidate, a word set consisting of a plurality of element words similar to that reading candidate; a process of calculating, based on corpus information including a plurality of example sentences, feature amounts for the plurality of element words of each generated word set; and a process of generating reading determination information in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidates.
  • According to the present invention, it is possible to provide a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium therefor that can determine how to read a word easily and appropriately.
  • FIG. 1 is a functional block diagram of a reading determination apparatus according to an embodiment of the present invention.
  • The reading determination apparatus 10 according to this embodiment is an apparatus for determining how to read a word that has a plurality of reading candidates.
  • The reading determination device 10 includes a word set generation unit 14 that generates word sets, a corpus database 15 that stores corpus information, a feature amount calculation unit 16 that calculates feature amounts of element words, and a reading determination information generation unit 17 that generates reading determination information.
  • The word set generation unit 14 generates, for each reading candidate, a word set composed of a plurality of element words similar to that candidate.
  • The corpus database 15 stores corpus information including a plurality of example sentences.
  • The feature amount calculation means 16 calculates, based on the corpus information stored in the corpus database 15, the feature amounts for the plurality of element words of each word set generated by the word set generation means 14.
  • The reading determination information generation unit 17 generates reading determination information in which the feature amounts for the plurality of element words of each word set, as calculated by the feature amount calculation unit 16, are associated with the corresponding reading candidates.
  • FIG. 2 is a block diagram showing an example of the schematic system configuration of the reading determination device according to the first embodiment of the present invention.
  • The reading determination apparatus 10 according to this embodiment includes a same-notation different-pronunciation word set generation unit 11, a reading candidate DB (database) 12, a thesaurus DB (database) 13, a word set generation unit 14, a corpus DB (database) 15, a context vector generation unit 16, and a reading determination information generation unit 17.
  • Based on an input word entered by the user, the same-notation different-pronunciation word set generation unit 11 acquires from the reading candidate DB 12 a same-notation different-pronunciation word set consisting of a plurality of reading candidates for that word (reading candidate 1, reading candidate 2, ..., reading candidate M) and the word meanings corresponding to each (meaning 1, meaning 2, ..., meaning M).
  • Here, the phoneme string used when uttering a given text is referred to as the "reading".
  • In the following description, a sequence of Japanese syllables is used as the phoneme string, but an arbitrary phonetic symbol sequence, such as one using the International Phonetic Alphabet (IPA), can be used independently of the language.
  • Furthermore, the "reading" may also include, in association with the phoneme string, accent information (accent position, breaks, and the like) indicating how the phoneme string is to be uttered, and supplementary reading information such as vowel devoicing.
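  • As a concrete illustration of the data involved, a "reading" in the above sense could be represented as a small record combining the phoneme string with optional accent and devoicing information. The following is a minimal Python sketch; the class and field names are our own assumptions for illustration, not part of the patent.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Reading:
    """One reading candidate: a phoneme string plus optional auxiliary information."""
    phonemes: str                 # e.g. Japanese syllables, or IPA symbols
    accent_position: int = 0      # 0 = flat (heiban) accent, 1 = head-high, ...
    devoiced_vowels: List[int] = field(default_factory=list)  # indices of devoiced vowels

# "iso" as a common noun (flat accent) vs. as a personal name (head-high accent)
iso_common_noun = Reading(phonemes="i-so", accent_position=0)
iso_person_name = Reading(phonemes="i-so", accent_position=1)
```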
  • The same-notation different-pronunciation word set generation unit 11 outputs the plurality of reading candidates acquired for the input word to the word set generation unit 14.
  • The reading candidate DB 12 stores a plurality of entries in which a word, its part of speech, its reading, and its category on the thesaurus are associated as a set, as shown for example in FIG. 3.
  • For example, based on the input word "黒子" (kuroko), the same-notation different-pronunciation word set generation unit 11 searches the entries stored in the reading candidate DB 12 and acquires the matching ones; in this case it acquires two entries (number of readings M = 2): reading candidate 1 = "hokuro" (mole; category: body stain) and reading candidate 2 = "kurogo" (category: kabuki assistant).
  • The thesaurus DB 13 stores thesaurus dictionary information in which words are classified and systematized according to relationships among them, such as hypernym/hyponym, part/whole, synonym, and near-synonym relationships, as shown for example in FIG. 4.
  • The word set generation unit (word set generation means) 14 generates, for each of the plurality of reading candidates received from the same-notation different-pronunciation word set generation unit 11, a word set composed of a plurality of element words similar to that candidate, based on the thesaurus dictionary information in the thesaurus DB 13.
  • Here, the element words similar to a reading candidate include, for example, words belonging to the same category as the reading candidate on the thesaurus, that is, words that are synonymous in a broad sense; a minimal code sketch of this construction follows the example below.
  • For example, for the input word "黒子" the following two word sets are generated:
    Word set 1 = {element word 1-1: stain, ..., element word 1-6: lentigo}; category: {body stain}; number of element words N1 = 6.
    Word set 2 = {element word 2-1: kuroko, element word 2-2: black, element word 2-3: black, element word 2-4: guardianship, element word 2-5: black tool}; category: {kabuki assistant}; number of element words N2 = 5.
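  • The word-set construction described above can be sketched as follows, assuming a toy thesaurus that simply maps a category name to its member words (the actual thesaurus of FIG. 4 is richer, with hierarchical relations); all data here is illustrative.

```python
# Toy thesaurus: category -> member words (illustrative stand-in for FIG. 4).
THESAURUS = {
    "body stain": ["stain", "birthmark", "freckle", "blemish", "spot", "lentigo"],
    "kabuki assistant": ["kuroko", "guardianship", "black tool", "stagehand"],
}

# Reading candidates for the input word, each tagged with its thesaurus category,
# as they would come from the reading candidate DB 12.
READING_CANDIDATES = [
    {"reading": "hokuro", "category": "body stain"},
    {"reading": "kurogo", "category": "kabuki assistant"},
]

def generate_word_sets(candidates, thesaurus):
    """For each reading candidate, collect the element words of its category."""
    return {c["reading"]: list(thesaurus.get(c["category"], [])) for c in candidates}

word_sets = generate_word_sets(READING_CANDIDATES, THESAURUS)
# {'hokuro': ['stain', ..., 'lentigo'], 'kurogo': ['kuroko', ..., 'stagehand']}
```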
  • In the above, the word set generation unit 14 uses the thesaurus dictionary information in the thesaurus DB 13 to generate the word sets for the reading candidates; however, it is not limited to this, and synonym dictionary information, for example, can be used instead.
  • Furthermore, the word set generation unit 14 can determine whether words belong to the same category by calculating case frames. In this case, the reading candidate DB 12 can hold corresponding case frames (case frame 1, ..., case frame M) instead of the word meanings (meaning 1, ..., meaning M).
  • the word set generation unit 14 outputs the generated word set to the context vector generation unit 16.
  • the corpus DB 15 stores corpus information including a plurality of example sentences (text data).
  • The example sentences in the corpus information need not have readings assigned to them.
  • the context vector generation unit 16 calculates context vectors for a plurality of element words of the word set generated by the word set generation unit 14 based on the corpus information stored in the corpus DB 15.
  • Specifically, for each element word of the word sets generated by the word set generation unit 14, the context vector generation unit 16 first extracts from the corpus information in the corpus DB 15 the example sentences in which that element word is used.
  • the context vector generation unit 16 calculates the context vector of the element word using each extracted example sentence.
  • Here, the context vector is a feature amount that represents, for example, the distribution of the words surrounding the corresponding word.
  • For example, in Non-Patent Document 2 (Iwanami Lectures, Software Science 15, Natural Language Processing, pp. 421-424), a document vector is known whose coefficient for an index word T(i) is 1 when T(i) appears in the document and 0 when it does not.
  • The context vector in this embodiment can be defined similarly: in a vector space whose dimensions correspond to all the independent words contained in the example sentences of the corpus information, the coefficients may be calculated by focusing on the independent words around the corresponding element word.
  • In the following, the description focuses on the context vector as the feature amount expressing the distribution of the words around the corresponding element word, but the present invention is not limited to this.
  • A feature quantity other than a vector representation may also be used, for example the sum, computed from the corpus information, of the appearance probabilities of the words around the corresponding element word.
  • Hereinafter, the term "context vector" is used to include such feature quantities other than vector representations.
  • When calculating the context vector, words that are more characteristic may be extracted using grammatical knowledge, and the context vector may be compressed by general methods such as synonym merging or dimensionality reduction over the element words, improving the utilization efficiency of the context vector space. Regardless of whether these methods are used, storing context vectors requires a smaller storage capacity than storing word bigrams, so it can be realized at a low additional cost.
  • When the corpus information contains a plurality of example sentences using the corresponding element word, the context vector generation unit 16 performs the same processing on all of the example sentences, or on a predetermined number of them, to calculate a plurality of context vectors. On the other hand, when the corpus information contains no example sentence using the corresponding element word, the context vector generation unit 16 does not calculate a context vector for that element word.
  • The context vector generation unit 16 outputs the context vectors calculated as described above for the plurality of element words of each word set to the reading determination information generation unit 17.
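  • One plausible realization of this extraction and counting step, assuming a simple bag-of-words context vector over a window of surrounding words (the patent leaves the exact coefficient calculation open):

```python
from collections import Counter

def context_vector(element_word, sentences, window=3):
    """Count the words occurring within `window` positions of element_word
    over all example sentences that contain it (a bag-of-words context vector)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()  # assumes whitespace-tokenized example sentences
        for i, token in enumerate(tokens):
            if token == element_word:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts  # empty if no example sentence uses the element word

corpus = ["she had a small mole on her cheek", "the mole on the skin was removed"]
print(context_vector("mole", corpus))
```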
  • The reading determination information generation unit (reading determination information generation means) 17 generates reading determination information in which the context vectors for the plurality of element words of each word set, as calculated by the context vector generation unit 16, are associated with the reading candidates. For example, for each reading candidate, the reading determination information generation unit 17 calculates a representative average context vector by taking the arithmetic mean of the corresponding context vectors in the context vector space, and generates reading determination information in which each reading candidate is associated with its representative average context vector.
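  • A minimal sketch of this averaging step, assuming each context vector has already been mapped into a common dense vector space (the use of numpy arrays is an assumption of this sketch):

```python
import numpy as np

def build_reading_determination_info(vectors_per_candidate):
    """vectors_per_candidate: reading candidate -> list of context vectors
    (one per element word or example sentence). Returns the reading
    determination information: candidate -> representative average vector."""
    return {
        reading: np.mean(np.stack(vectors), axis=0)
        for reading, vectors in vectors_per_candidate.items()
        if vectors  # element words without example sentences contribute nothing
    }

info = build_reading_determination_info({
    "hokuro": [np.array([1.0, 0.0, 2.0]), np.array([3.0, 0.0, 0.0])],
    "kurogo": [np.array([0.0, 4.0, 0.0])],
})
```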
  • The reading determination device 10 is composed mainly of a microcomputer including, for example, a CPU (Central Processing Unit) that performs control and arithmetic processing, a ROM (Read Only Memory) that stores the control programs and arithmetic programs executed by the CPU, and a RAM (Random Access Memory) that temporarily stores processing data and the like. The same-notation different-pronunciation word set generation unit 11, the word set generation unit 14, the context vector generation unit 16, the reading determination information generation unit 17, and the later-described reading determination unit 21 can be realized, for example, by programs stored in the ROM and executed by the CPU.
  • FIG. 5 is a flowchart showing an example of a processing flow of the reading determination apparatus according to the first embodiment of the present invention.
  • First, based on an input word entered by the user, the same-notation different-pronunciation word set generation unit 11 acquires a plurality of reading candidates for the input word from the reading candidate DB 12 (step S101), and outputs the acquired reading candidates to the word set generation unit 14.
  • Next, the word set generation unit 14 generates, for each of the reading candidates from the same-notation different-pronunciation word set generation unit 11, a word set composed of a plurality of similar element words, based on the thesaurus dictionary information in the thesaurus DB 13 (step S102), and outputs the generated word sets to the context vector generation unit 16.
  • the context vector generation unit 16 calculates context vectors for a plurality of element words of the word set generated by the word set generation unit 14 based on the corpus information stored in the corpus DB 15 (step S103). The calculated context vector is output to the reading determination information generation unit 17.
  • the reading determination information generation unit 17 calculates a representative average context vector obtained by arithmetically averaging a plurality of corresponding context vectors in the context vector space for each reading candidate (step S104). Then, the reading determination information generation unit 17 generates reading determination information in which each reading candidate is associated with the calculated average context vector (step S105).
  • As described above, in this embodiment, a word set composed of element words similar to each of the plurality of reading candidates is generated, an average context vector for each reading candidate is calculated using the corpus information, and reading determination information in which each reading candidate is associated with its average context vector is then generated.
  • By using word sets similar to the reading candidates together with the corpus information in this way, the information amount of the reading determination information can be effectively increased and its accuracy improved. Therefore, more appropriate and more accurate reading determination information can be acquired, and how to read a word can in turn be determined easily and appropriately using this reading determination information.
  • That is, compared with the amount of information that can be obtained from the corpus information for the same-notation different-pronunciation word set alone, the reading determination information generated using the word sets together with the corpus information has a larger amount of information and higher accuracy.
  • Moreover, it is not necessary to prepare a large amount of learning corpus information with correct readings assigned, or to write rules for a large number of same-notation different-pronunciation word sets; this approach is therefore superior in that the information amount of the reading determination information can be increased efficiently and the accuracy of the readings can be improved.
  • Furthermore, since the reading is estimated based on information such as word similarity, an improvement in the estimation accuracy can be expected to improve the reading accuracy.
  • FIG. 6 is a block diagram showing a schematic system configuration of the reading determination apparatus according to the second embodiment of the present invention.
  • In addition to the configuration of the reading determination device 10 according to the first embodiment, the reading determination device 20 according to this embodiment further includes a reading determination unit 21 that determines the reading of an input word based on the reading determination information generated by the reading determination information generation unit 17, and an output device 22 that outputs the determined reading.
  • As the output device 22, for example, a display device, a printer, or an audio output device can be used.
  • the reading determination device 20 according to the present embodiment can determine how to read an input word online, for example.
  • FIG. 7 is a flowchart showing an example of a processing flow of the reading determination apparatus according to the second embodiment of the present invention. For example, information specifying an input sentence and an input word in the input sentence is input to the context vector generation unit 16 (step S201).
  • the context vector generation unit 16 calculates a context vector for the specified input word in the input sentence as in the first embodiment (step S202), and outputs it to the reading determination unit 21.
  • The reading determination unit 21 determines the reading of the specified input word in the input sentence based on the context vector calculated by the context vector generation unit 16 and the average context vectors of the reading determination information generated by the reading determination information generation unit 17 (step S203).
  • the reading determination information is a plurality of sets of information in which an average context vector and a reading candidate are associated with each other.
  • For example, from among the plurality of average context vectors in the reading determination information, the reading determination unit 21 selects the reading candidate corresponding to the average context vector with the smallest cosine distance (highest similarity) to the context vector of the input word, and determines it to be the reading of the input word.
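  • This selection amounts to a nearest-neighbor search under cosine similarity; a sketch under the same numpy assumptions as above:

```python
import numpy as np

def determine_reading(input_vector, reading_determination_info):
    """Return the reading candidate whose average context vector is most
    similar (smallest cosine distance) to the input word's context vector."""
    def cosine_similarity(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b) / denom if denom else 0.0
    return max(reading_determination_info,
               key=lambda r: cosine_similarity(input_vector, reading_determination_info[r]))
```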
  • the reading determination unit 21 outputs the determined input word reading to the output device 22.
  • the output device 22 outputs how to read the input word output from the reading determination unit 21 by, for example, screen display, print display, voice, or the like (step S204).
  • In this way, the most similar reading is selected based on the features of each reading candidate as represented by its average context vector, and thus a more appropriate reading can be determined.
  • FIG. 8 is a block diagram showing a schematic system configuration of a reading determination apparatus according to the third embodiment of the present invention.
  • In addition to the configuration described above, the reading determination device 30 according to the third embodiment includes an example sentence word acquisition unit 31 that acquires example sentence information for input words from the corpus DB 15, and a reading DB 32 that stores the readings of words determined by the reading determination unit 21.
  • the reading determination device 30 according to the present embodiment can determine how to read an input word offline, for example.
  • the example sentence word acquisition unit 31 acquires a plurality of example sentences including the input word from the corpus DB 15, extracts information specifying the input word from each example sentence, and outputs the information to the context vector generation unit 16.
  • the context vector generation unit 16 calculates the context vector for the input word of each example sentence using the information of each example sentence from the example sentence word acquisition unit 31, and outputs it to the reading determination unit 21.
  • For each example sentence, the reading determination unit 21 selects, from among the plurality of average context vectors in the reading determination information generated by the reading determination information generation unit 17, the reading candidate corresponding to the average context vector with the smallest cosine distance to that example sentence's context vector, and determines it to be the reading of the input word in that example sentence. The reading determination unit 21 then stores the determined reading of the input word for each example sentence in the reading DB 32. Further, the reading determination unit 21 selects the statistically most frequent reading from among the plurality of readings stored in the reading DB 32, determines it to be the reading of the corresponding input word, and outputs it to the output device 22.
  • As described above, in this embodiment, a plurality of context vectors for the input word are generated using the corpus information, a reading is determined for each, and the readings are stored in the reading DB 32; the reading of the input word can then be determined statistically based on the frequencies of the readings stored in the reading DB 32.
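  • The final statistical choice is, in effect, a majority vote over the per-example-sentence decisions accumulated in the reading DB 32; a minimal sketch:

```python
from collections import Counter

def most_frequent_reading(readings_from_db):
    """Pick the statistically most frequent reading among the stored decisions."""
    return Counter(readings_from_db).most_common(1)[0][0]

print(most_frequent_reading(["ichiba", "shijo", "ichiba", "ichiba"]))  # ichiba
```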
  • In the first to third embodiments described above, the reading determination information generation unit 17 generates reading determination information in which each reading candidate is associated with one representative average context vector. In the reading determination device 40 according to the fourth embodiment, by contrast, the reading determination information generation unit 47 generates reading determination information in which each reading candidate is associated with a plurality of context vectors.
  • That is, this reading determination information is a collection, equal in number to the context vectors, of combinations of each context vector calculated by the context vector generation unit 16 and its corresponding reading candidate.
  • By contrast, the reading determination information of the first to third embodiments is a collection, equal in number to the reading candidates, of each average context vector and its corresponding reading candidate. The reading determination information according to the present embodiment therefore has a larger amount of information and higher accuracy.
  • That is, the reading determination information generation unit 47 outputs, as the reading determination information, the combinations of all the context vectors generated by the context vector generation unit 16 with the reading candidates corresponding to the respective context vectors.
  • In the reading determination device 40 according to the fourth embodiment, the other configurations are substantially the same as in the reading determination devices 10, 20, and 30 according to the first to third embodiments. The same parts are therefore denoted by the same reference numerals, and detailed description thereof is omitted.
  • In this case, the reading determination unit 21 may calculate the similarity, using the cosine distance or the like, between the context vector obtained from the input sentence and input word and each of the context vectors in the reading determination information. The reading determination unit 21 then determines the reading candidate corresponding to the context vector with the highest similarity among all the context vectors of the reading determination information to be the reading of the input word.
  • Similarly, the reading determination unit 21 may calculate the similarity between the context vector obtained from each example sentence in the corpus DB 15 and each of the context vectors in the reading determination information. The reading determination unit 21 then determines the reading candidate corresponding to the context vector with the highest similarity to each example sentence's context vector to be the reading of the input word in that example sentence, and outputs it to the reading DB 32.
  • In the first to fourth embodiments, the word set generation unit 14 generates, as the word set of element words similar to each reading candidate, a word set of element words belonging to the same category on the thesaurus.
  • In this embodiment, the word set generation unit 54 also generates word sets including element words that belong to adjacent categories at the same level of the thesaurus hierarchy.
  • the other configuration is substantially the same as that of the reading determination device 10 according to the first embodiment. Therefore, the same parts are denoted by the same reference numerals, and detailed description thereof is omitted.
  • the element words having the similar relationship include, in addition to the element words belonging to the same category on the thesaurus, the element words belonging to the adjacent category of the same hierarchy on the thesaurus.
  • the adjacent category refers to the closest category using, for example, the degree of relationship between categories defined on the thesaurus.
  • For example, the word set generation unit 54 extracts element words that belong to the same category as the category of reading candidate 1 "hokuro" (mole), and to nearby categories, on the thesaurus shown in FIG. 4. Specifically, following the inter-category similarity defined in advance in the thesaurus, the word set generation unit 54 selects the "body surface state" category from among the categories at the same level as the "body stain" category, such as the "body surface state" and "body color" categories. The word set generation unit 54 then extracts the element words belonging to the "body stain" and "body surface state" categories, and generates word set 1.
  • In this way, word sets that also include element words belonging to adjacent categories at the same level of the thesaurus hierarchy are generated, so word sets containing a broader range of synonyms can be generated.
  • In the embodiments described above, the word set generation unit generates, as the word sets of element words similar to the reading candidates, word sets of element words belonging to the same category on the thesaurus.
  • In this embodiment, the word set generation unit 64 may generate word sets that also include element words belonging to categories at higher and/or lower levels of the thesaurus hierarchy. By generating word sets that include broader synonyms along the hypernym/hyponym relations of a concept in this way, the information amount of the reading determination information can be effectively increased and its accuracy improved.
  • the other configuration is substantially the same as that of the reading determination device 10 according to the first embodiment. Therefore, the same parts are denoted by the same reference numerals, and detailed description thereof is omitted.
  • The word set generation unit 64 can control the degree of similarity by choosing how many levels above and/or below are targeted on the thesaurus. That is, in this embodiment the element words regarded as similar also include element words belonging to categories within a preset number of levels above and/or below on the thesaurus, and the word set generation unit 64 generates the word sets accordingly.
  • For example, the word set generation unit 64 extracts the element words that belong to the same category as the category of reading candidate 1 "hokuro" (mole), together with the element words that belong to categories within a preset number of levels above and below it. For instance, it extracts the element words belonging to the categories one level above and one level below, and generates word set 1.
  • As described above, in the reading determination device 60, word sets are generated that include, in addition to the element words belonging to the same category on the thesaurus, element words belonging to categories within a preset number of levels above and/or below on the thesaurus; word sets containing a broader range of synonyms can thereby be generated.
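  • The hierarchy-based expansion can be sketched with a toy tree-structured thesaurus; the category names and tree shape below are illustrative assumptions, not the patent's FIG. 4.

```python
# Toy tree thesaurus: category -> (parent category, member words).
TREE = {
    "body surface":       (None,           ["skin"]),
    "body stain":         ("body surface", ["stain", "lentigo"]),
    "body surface state": ("body surface", ["rough skin"]),
    "pigmented spot":     ("body stain",   ["dark spot"]),
}

def expand_word_set(category, levels_up=1, levels_down=1):
    """Collect element words from `category` and from categories up to a preset
    number of levels above and below it in the thesaurus hierarchy."""
    words = list(TREE[category][1])
    parent = category
    for _ in range(levels_up):                      # walk up the hierarchy
        parent = TREE[parent][0]
        if parent is None:
            break
        words.extend(TREE[parent][1])
    frontier = [category]
    for _ in range(levels_down):                    # walk down, breadth-first
        frontier = [c for c, (p, _) in TREE.items() if p in frontier]
        for child in frontier:
            words.extend(TREE[child][1])
    return words

print(expand_word_set("body stain"))  # category itself + one level up + one down
```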
  • FIG. 9 is a block diagram showing a schematic system configuration of a reading determination apparatus according to the seventh embodiment of the present invention.
  • In addition to the configuration described above, the reading determination apparatus 70 according to the seventh embodiment of the present invention includes an element word deletion unit 71 that detects and deletes overlapping element words, and a context vector deletion unit 72 that detects and deletes overlapping context vectors.
  • the element word deletion unit 71 detects overlapping element words among the word sets for the plurality of reading candidates 1 to M generated by the word set generation unit 14.
  • the above-described overlapping element word indicates, for example, a case where at least one set of element words overlaps. Then, the element word deletion unit 71 deletes one of the overlapping element words from the word set, and outputs the deleted word set to the context vector generation unit 16. On the other hand, the element word deletion unit 71 outputs a word set that does not include duplicated element words to the context vector generation unit 16 as it is.
  • Overlapping element words have the same context vector. For this reason, if overlapping element words exist between the word sets of a plurality of reading candidates, the degree of duplication in the reading determination information generated from those element words also increases. Therefore, by removing overlapping element words in advance, the degree of separation in the reading determination information can be increased, and the accuracy of the reading determination information can be improved.
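  • A sketch of this de-duplication step: an element word shared between the word sets of different reading candidates is kept in only the first word set in which it occurs (which of the duplicates to delete is left open by the description).

```python
def delete_overlapping_element_words(word_sets):
    """word_sets: reading candidate -> list of element words. Deletes later
    duplicates so that no element word is shared between two word sets."""
    seen = set()
    deduplicated = {}
    for reading, words in word_sets.items():
        kept = [w for w in words if w not in seen]
        seen.update(kept)
        deduplicated[reading] = kept
    return deduplicated

print(delete_overlapping_element_words(
    {"hokuro": ["stain", "spot"], "kurogo": ["spot", "kuroko"]}))
```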
  • The context vector deletion unit 72 detects identical context vectors across the word sets 1 to M from among the context vectors generated by the context vector generation unit 16 for the element words of the plurality of word sets 1 to M, deletes one of each identical pair, and outputs the result to the reading determination information generation unit 17.
  • the same context vector refers to a case where at least one set of context vectors is the same, for example.
  • If identical context vectors exist across the word sets, the degree of duplication in the reading determination information generated from those context vectors also increases. Therefore, by removing overlapping context vectors in advance, the degree of separation in the reading determination information can be increased, and the accuracy of the reading determination information can be improved.
  • Alternatively, the context vector deletion unit 72 may detect, from among the context vectors generated by the context vector generation unit 16 for the element words of the plurality of word sets 1 to M, pairs of context vectors that are closer to each other than a predetermined distance and therefore similar, and delete one vector of each detected pair. That is, the context vector deletion unit 72 judges two context vectors to be close when the distance between them is smaller than the predetermined distance, deletes one of each detected pair of mutually close context vectors, and outputs the result to the reading determination information generation unit 17.
  • If mutually close context vectors exist, the degree of duplication in the reading determination information generated from those context vectors also increases. Therefore, by removing close context vectors in advance, the degree of separation in the reading determination information can be increased, and the accuracy of the reading determination information can be improved.
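  • A sketch of removing mutually close context vectors, using Euclidean distance against a preset threshold (the description leaves the distance measure and the threshold value open):

```python
import numpy as np

def delete_close_context_vectors(labeled_vectors, threshold=0.1):
    """labeled_vectors: list of (reading_candidate, context_vector) pairs.
    Drops any vector lying closer than `threshold` to an already kept vector."""
    kept = []
    for reading, vector in labeled_vectors:
        if all(np.linalg.norm(vector - kv) >= threshold for _, kv in kept):
            kept.append((reading, vector))
    return kept
```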
  • Furthermore, the context vector deletion unit 72 may multiply the context vectors generated by the context vector generation unit 16 by weighting factors that enhance their characteristic features. The context vector deletion unit 72 may then detect the above-described close context vectors using the weighted context vectors, and delete one of each detected pair.
  • For example, let b(i) be the importance of each word V(i) in the context vector. As the importance b(i), for example, the tf-idf score, a measure indicating how characteristic a word is when it appears in the corpus information, may be used. The tf-idf value is calculated from two indices: tf (term frequency) and idf (inverse document frequency). A weighting coefficient is set according to the importance b(i) of each word V(i), and the context vector D is multiplied by these weighting coefficients, so that differences in highly characteristic words are emphasized and differences in less characteristic words are suppressed. This makes it possible to perform similarity calculations that better reflect the characteristics of the corpus information.
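  • A sketch of this weighting, approximating the importance b(i) by a plain tf-idf score computed over the corpus documents (the exact tf-idf variant is an assumption of this sketch):

```python
import math
from collections import Counter

def idf_weights(documents):
    """idf(i) = log(N / df(i)), one common tf-idf variant."""
    document_frequency = Counter()
    for doc in documents:
        document_frequency.update(set(doc.split()))
    n = len(documents)
    return {w: math.log(n / df) for w, df in document_frequency.items()}

def weight_context_vector(context_counts, idf):
    """Multiply each coefficient by the word's importance b(i) (here: idf),
    emphasizing characteristic words and damping ubiquitous ones."""
    return {w: tf * idf.get(w, 0.0) for w, tf in context_counts.items()}

docs = ["a mole on the skin", "the stock market rose", "fish market by the sea"]
print(weight_context_vector(Counter("the market".split()), idf_weights(docs)))
```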
  • FIG. 10 is a block diagram showing a schematic system configuration of a speech synthesizer according to the eighth embodiment of the present invention.
  • The speech synthesizer 80 according to the eighth embodiment includes a morpheme analysis unit 81 that performs morphological analysis of an input sentence, the reading determination device 20 according to the second embodiment, and a voice generation unit 82 that generates synthesized speech.
  • the morpheme analysis unit 81 performs morpheme analysis on the input sentence, divides the input sentence into morphemes, extracts independent words from the plurality of morphemes, and outputs them to the reading determination device 20.
  • Based on the information on how to read the input sentence output from the reading determination device 20, the voice generation unit 82 generates a synthesized speech waveform for the input sentence using, for example, a waveform-concatenation speech synthesis method.
  • the reading information used in the speech synthesis includes, for example, not only a phoneme string but also information on an accent position.
  • For example, the noun "tani" can be uttered with a head-high accent when used as a person's name, and with a flat accent when used to mean a valley (the opposite of a mountain), so the two can be distinguished in speech.
  • FIG. 11 is a flowchart showing an example of the processing flow of the speech synthesizer according to the eighth embodiment of the present invention.
  • When an input sentence is input, the morpheme analysis unit 81 performs morphological analysis on it (step S302), divides the input sentence into a plurality of morphemes, and extracts the independent words. The morpheme analysis unit 81 then outputs the extracted independent words, together with the input sentence, to the reading determination device 20 as input words.
  • the reading determination device 20 performs the above-described reading determination process based on the input sentence and the input word from the morphological analysis unit 81 (step S303), and determines the reading for all independent words (step S304).
  • Then, information on how to read the entire input sentence is generated (step S305).
  • the reading determination device 20 outputs information on how to read the generated input sentence to the voice generation unit 82.
  • the voice generation unit 82 generates a synthesized voice waveform based on the information on how to read the input sentence from the reading judgment device 20 (step S306), and outputs the voice of the generated synthesized voice waveform (step S307).
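  • The flow of steps S302 to S307 could be sketched as the following pipeline; morphological_analysis, determine_reading_for, and synthesize_waveform are hypothetical placeholders standing in for the morpheme analysis unit 81, the reading determination device 20, and the voice generation unit 82, respectively.

```python
def text_to_speech(input_sentence, morphological_analysis,
                   determine_reading_for, synthesize_waveform):
    """Pipeline sketch of FIG. 11: analyze, determine readings, synthesize."""
    morphemes = morphological_analysis(input_sentence)            # step S302
    # `is_independent` is an assumed attribute marking independent (content) words.
    independent_words = [m for m in morphemes if m.is_independent]
    readings = [determine_reading_for(word, input_sentence)       # steps S303-S304
                for word in independent_words]
    sentence_reading = " ".join(readings)                         # step S305
    return synthesize_waveform(sentence_reading)                  # steps S306-S307
```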
  • the present invention is not limited to the above-described embodiment, and can be appropriately changed without departing from the spirit of the present invention.
  • the present invention has been described as a hardware configuration, but the present invention is not limited to this.
  • the present invention can also realize arbitrary processing by causing a CPU to execute a computer program.
  • The program can be stored and supplied to a computer using various types of non-transitory computer-readable media, which include various types of tangible storage media. Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)).
  • The program may also be supplied to the computer by various types of transitory computer-readable media. Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves.
  • Transitory computer-readable media can supply the program to the computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.
  • the present invention can be applied to, for example, a reading determination device that determines an appropriate reading for a word or a sentence.
  • 10 Reading determination device; 11 Same-notation different-pronunciation word set generation unit; 12 Reading candidate DB; 13 Thesaurus DB; 14 Word set generation unit; 15 Corpus DB; 16 Context vector generation unit; 17 Reading determination information generation unit; 21 Reading determination unit; 22 Output device; 31 Example sentence word acquisition unit; 32 Reading DB; 71 Element word deletion unit; 72 Context vector deletion unit; 80 Speech synthesizer

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A reading determination device (10) determines a reading of a word which has a plurality of reading candidates. The reading determination device (10) is also provided with a word set generation means (14) which generates word sets comprising a plurality of element words similar to the reading candidates, respectively; a corpus database (15) which stores corpus information containing a plurality of example sentences; a feature amount calculation means for calculating feature amounts for the respective plurality of element words of the word sets generated by the word set generation means (14) on the basis of the corpus information stored in the corpus database (15); and a reading determination information generation means (17) which generates reading determination information wherein the feature amounts for the plurality of element words of the word sets calculated by the feature amount calculation means are correlated to the respective reading candidates.

Description

[Name of Invention Determined by ISA Based on Rule 37.2] Reading Determination Device, Method, Program, Computer-Readable Medium Therefor, and Speech Synthesizer
 The present invention relates to a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium therefor, for determining the reading of a word that has a plurality of reading candidates, and in particular to a reading determination device, speech synthesizer, reading determination method, reading determination program, and computer-readable medium that can determine the reading of a word easily and appropriately.
 As a method of determining how to read a given sentence, it is widely known, for example, to define in advance in a dictionary the "readings" (reading kana, accent information, etc.) of characters and words, and to determine the reading of the entire sentence based on the readings defined in the dictionary while checking the grammatical connections between the words in the sentence (for example, Non-Patent Documents 1 and 2). Furthermore, a method of giving a sentence a more appropriate reading by considering phonological rules such as rendaku (sequential voicing) and devoicing is also known.
 According to the morphological analysis shown in Non-Patent Document 2 above, part-of-speech relationships are used as the rules. However, in Japanese, for example, there are many words (same-notation different-pronunciation word sets) that share the same notation and the same part of speech but have a plurality of readings depending on the field of use and the meaning. Specifically, "市場" (noun) has two readings, "ichiba" and "shijo", and "黒子" (noun) has two readings, "hokuro" and "kurogo". In addition, "磯" (noun) is read "iso", but it takes a flat accent when used as a common noun and a head-high accent when used as a personal name. Such differences are therefore also important when, for example, performing speech synthesis. In order to give a sentence a correct reading, it is desirable to select appropriately among these multiple readings (including accents).
 In order to make the above selection, for example, when "魚" (fish) immediately precedes "市場" (market), the reading can be taken to be "ichiba", and similarly, when "株式" (stock) immediately precedes it, the reading can be taken to be "shijo". That is, one can statistically examine how sentences containing the chain of "fish" and "market" are read, extract the most frequently used reading, "uo-ichiba", from among multiple candidate readings such as "sakana-ichiba", "sakana-shijo", "uo-ichiba", and "uo-shijo", and take that reading to be correct. This method is based on the idea of determining the reading according to the learned frequency of word bigrams, for example, and the accuracy of reading can be improved by defining an appropriate bigram set.
 However, a large number of such same-notation different-pronunciation word sets exist, and the number of words that can adjoin them is also very large. For this reason, creating an appropriate bigram set requires a large amount of learning corpus data that includes correct readings, and obtaining such a large learning corpus is not very practical. Alternatively, instead of learning word bigrams, the characteristic function words and other words that adjoin each same-notation different-pronunciation word set could be described in advance as rules and used during analysis. However, it is practically difficult to describe such rules for all same-notation different-pronunciation word sets.
 The present invention has been made to solve such problems, and its main object is to provide a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium that can determine the reading of a word easily and appropriately.
 One aspect of the present invention for achieving the above object is a reading determination device for determining how to read a word that has a plurality of reading candidates, comprising: word set generation means for generating, for each reading candidate, a word set consisting of a plurality of element words similar to that reading candidate; a corpus database that stores corpus information including a plurality of example sentences; feature amount calculation means for calculating, based on the corpus information stored in the corpus database, feature amounts for the plurality of element words of each word set generated by the word set generation means; and reading determination information generation means for generating reading determination information in which the feature amounts for the plurality of element words of each word set, as calculated by the feature amount calculation means, are associated with the corresponding reading candidates.
 Another aspect of the present invention for achieving the above object is a reading determination method for determining how to read a word that has a plurality of reading candidates, comprising: generating, for each reading candidate, a word set consisting of a plurality of element words similar to that reading candidate; calculating, based on corpus information including a plurality of example sentences, feature amounts for the plurality of element words of each generated word set; and generating reading determination information in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidates.
 Furthermore, one aspect of the present invention for achieving the above object is a non-transitory computer-readable medium storing a reading determination program for determining how to read a word that has a plurality of reading candidates, the program causing a computer to execute: a process of generating, for each reading candidate, a word set consisting of a plurality of element words similar to that reading candidate; a process of calculating, based on corpus information including a plurality of example sentences, feature amounts for the plurality of element words of each generated word set; and a process of generating reading determination information in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidates.
 According to the present invention, it is possible to provide a reading determination device, a speech synthesizer, a reading determination method, a reading determination program, and a computer-readable medium therefor that can determine the reading of a word easily and appropriately.
FIG. 1 is a functional block diagram of a reading determination apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the schematic system configuration of the reading determination apparatus according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of a plurality of entries, stored in the reading candidate DB, in which words, parts of speech, readings, and thesaurus categories are associated as a set.
FIG. 4 is a diagram showing an example of the thesaurus dictionary information stored in the thesaurus DB.
FIG. 5 is a flowchart showing an example of the processing flow of the reading determination apparatus according to the first embodiment of the present invention.
FIG. 6 is a block diagram showing the schematic system configuration of the reading determination apparatus according to the second embodiment of the present invention.
FIG. 7 is a flowchart showing an example of the processing flow of the reading determination apparatus according to the second embodiment of the present invention.
FIG. 8 is a block diagram showing the schematic system configuration of the reading determination apparatus according to the third embodiment of the present invention.
FIG. 9 is a block diagram showing the schematic system configuration of the reading determination apparatus according to the seventh embodiment of the present invention.
FIG. 10 is a block diagram showing the schematic system configuration of the speech synthesizer according to the eighth embodiment of the present invention.
FIG. 11 is a flowchart showing an example of the processing flow of the speech synthesizer according to the eighth embodiment of the present invention.
 Embodiments of the present invention will now be described with reference to the drawings. FIG. 1 is a functional block diagram of a reading determination apparatus according to an embodiment of the present invention. The reading determination apparatus 10 according to this embodiment is an apparatus for determining how to read a word that has a plurality of reading candidates. The reading determination device 10 includes word set generation means 14 that generates word sets, a corpus database 15 that stores corpus information, feature amount calculation means 16 that calculates feature amounts of element words, and reading determination information generation means 17 that generates reading determination information.

 The word set generation means 14 generates, for each reading candidate, a word set consisting of a plurality of element words similar to that candidate. The corpus database 15 stores corpus information including a plurality of example sentences. The feature amount calculation means 16 calculates, based on the corpus information stored in the corpus database 15, feature amounts for the plurality of element words of each word set generated by the word set generation means 14. The reading determination information generation means 17 generates reading determination information in which the feature amounts for the plurality of element words of each word set, as calculated by the feature amount calculation means 16, are associated with the corresponding reading candidates. By using word sets similar to the reading candidates together with the corpus information in this way, the information amount of the reading determination information can be effectively increased and its accuracy improved. Therefore, more appropriate and more accurate reading determination information can be acquired, and the reading of a word can in turn be determined easily and appropriately using this reading determination information.

(First embodiment)
 FIG. 2 is a block diagram showing an example of the schematic system configuration of the reading determination device according to the first embodiment of the present invention. The reading determination apparatus 10 according to this embodiment includes a same-notation different-pronunciation word set generation unit 11, a reading candidate DB (database) 12, a thesaurus DB (database) 13, a word set generation unit 14, a corpus DB (database) 15, a context vector generation unit 16, and a reading determination information generation unit 17.
Based on a word input by the user, the same-notation different-pronunciation word set generation unit 11 acquires from the reading candidate DB 12 a same-notation different-pronunciation word set consisting of a plurality of reading candidates for the input word (reading candidate 1, reading candidate 2, ..., reading candidate M) and the word meanings corresponding to each (meaning 1, meaning 2, ..., meaning M).
Here, the phoneme string used when uttering a given text is referred to as a "reading". In the following description, a sequence of Japanese syllables is used as the phoneme string, but an arbitrary phonetic symbol string such as the International Phonetic Alphabet (IPA) can also be used, independently of the language. Furthermore, in addition to the phoneme string itself, a "reading" may include accent information (accent position, phrase boundaries, etc.) indicating how the phoneme string is to be uttered, as well as supplementary reading information such as vowel devoicing.
The same-notation different-pronunciation word set generation unit 11 outputs the acquired plurality of reading candidates for the input word to the word set generation unit 14.
The reading candidate DB 12 stores a plurality of entries, each associating a word, a part of speech, a reading, and a thesaurus category as one set, as shown in FIG. 3, for example. For example, based on the input word "黒子", the same-notation different-pronunciation word set generation unit 11 searches the entries stored in the reading candidate DB 12 and retrieves the matching ones. In this case, it obtains two entries from the reading candidate DB 12 (number of readings M = 2): reading candidate 1 = "ほくろ" (hokuro; category: body stain) and reading candidate 2 = "くろご" (kurogo; category: kabuki stage assistant).
The thesaurus DB 13 stores thesaurus dictionary information in which words are classified and systematized according to superordinate/subordinate relations, part/whole relations, synonymy, near-synonymy, and the like, as shown in FIG. 4, for example.
The word set generation unit (word set generation means) 14 generates, for each of the reading candidates received from the same-notation different-pronunciation word set generation unit 11, a word set composed of a plurality of element words similar to that candidate, based on the thesaurus dictionary information in the thesaurus DB 13. Here, the element words similar to a reading candidate include, for example, words belonging to the same thesaurus category as the reading candidate, that is, words that are synonymous in a broad sense.
Based on the thesaurus dictionary information in the thesaurus DB 13, the word set generation unit 14 extracts, for example, the words belonging to the same category as reading candidate 1 = "ほくろ" on the thesaurus shown in FIG. 4, and generates word set 1 for that candidate. Here, word set 1 = {element word 1-1: 染み, element word 1-2: しみ, element word 1-3: 黒子, element word 1-4: ほくろ, element word 1-5: ホクロ, element word 1-6: lentigo}, the category = {body stain}, and the number of element words N1 = 6.
Similarly, the word set generation unit 14 generates word set 2 for reading candidate 2 = "くろご". Here, word set 2 = {element word 2-1: 黒子, element word 2-2: くろこ, element word 2-3: くろご, element word 2-4: 後見, element word 2-5: 黒具}, the category = {kabuki stage assistant}, and the number of element words N2 = 5.
Although the word set generation unit 14 generates the word sets of the reading candidates using the thesaurus dictionary information in the thesaurus DB 13, this is not restrictive; arbitrary dictionary information, such as synonym dictionary information, may be used instead. Further, as disclosed in Non-Patent Document 2, the word set generation unit 14 can also determine whether each word belongs to the same category by computing case frames. The reading candidate DB 12 may then hold corresponding case frames (case frame 1, ..., case frame N) instead of the word meanings (meaning 1, ..., meaning N). The word set generation unit 14 outputs the generated word sets to the context vector generation unit 16.
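As an illustration of this word set generation, the following minimal Python sketch looks up same-category words in a toy thesaurus. The data layout (a category name mapped to its member words) and the helper name generate_word_set are assumptions made for exposition, not the schema of the actual thesaurus DB 13.

# A toy thesaurus mapping a category name to its member words; illustrative
# data only, drawn from the running "黒子" example in the text.
THESAURUS = {
    "body stain": ["染み", "しみ", "黒子", "ほくろ", "ホクロ", "lentigo"],
    "kabuki stage assistant": ["黒子", "くろこ", "くろご", "後見", "黒具"],
}

def generate_word_set(category: str) -> list[str]:
    """Return the element words belonging to the given thesaurus category."""
    return THESAURUS.get(category, [])

word_set_1 = generate_word_set("body stain")              # N1 = 6 element words
word_set_2 = generate_word_set("kabuki stage assistant")  # N2 = 5 element words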
The corpus DB 15 stores corpus information including a plurality of example sentences (text data). The example sentences in the corpus information need not be annotated with readings.
The context vector generation unit 16 calculates a context vector for each of the element words of the word sets generated by the word set generation unit 14, based on the corpus information stored in the corpus DB 15. For each element word of a word set, the context vector generation unit 16 first extracts, from the corpus information in the corpus DB 15, example sentences in which that element word is used. It then calculates the context vector of the element word from each extracted example sentence. Here, the context vector is used as a feature amount that represents, for example, the distribution of the words surrounding the word in question.
In the vector space model described in Non-Patent Document 2 (Iwanami Lecture Series, Software Science 15: Natural Language Processing, pp. 421-424), a document vector is known whose coefficient is 1 if the index word T(i) appears in the document as a whole and 0 if it does not. Similarly to this document vector, the context vector in the present embodiment may select its coefficients by focusing on the content words surrounding the element word in question, in a vector space whose axes are all the content words contained in the example sentences of the corpus information.
For example, suppose that the context vector generation unit 16 extracts from the corpus information in the corpus DB 15 an example sentence S = {T21 T32 T52 T7 T42 T64 T73 T12} in which the element word T7 is used. The context vector generation unit 16 can then calculate the context vector extracted from the two words on each side (wl = 2) of the element word T7 by the following formula:

  D(T7; S) = Σ(i=1..t) a(i)·V(i)
           = [0 … 0 V(32) 0 … 0 V(42) 0 … 0 V(52) 0 … 0 V(64) 0 … 0]
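As a concrete check of this formula, the following Python sketch computes D(T7; S) for the example sentence above, under the assumption that each index word T(i) corresponds to a one-hot basis vector V(i) over a vocabulary of t words and that the coefficient a(i) is 1 inside the window. The function name and the vocabulary size are illustrative.

import numpy as np

def context_vector(sentence: list[int], target_pos: int, t: int, wl: int = 2) -> np.ndarray:
    """Sum the basis vectors V(i) of the wl words before and after the target word."""
    d = np.zeros(t)
    for pos in range(max(0, target_pos - wl), min(len(sentence), target_pos + wl + 1)):
        if pos != target_pos:
            d[sentence[pos]] += 1.0  # coefficient a(i) = 1 inside the window
    return d

# Example sentence S = {T21 T32 T52 T7 T42 T64 T73 T12}; T7 sits at index 3.
S = [21, 32, 52, 7, 42, 64, 73, 12]
D = context_vector(S, target_pos=3, t=100)
# D is nonzero exactly at components 32, 42, 52 and 64, matching the formula above.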
In the present embodiment, the description focuses on the context vector as the feature amount expressing the distribution of the words surrounding the element word in question, but this is not restrictive; feature amounts other than vector representations may also be used, such as the sum of the occurrence probabilities, in the corpus information, of the words surrounding the element word. In the following description, such non-vector feature amounts are also subsumed under the term "context vector".
When calculating a context vector, in order to reduce the ambiguity that arises when a word has multiple meanings at the same level, it is also useful to decide which of the multiple occurrences of the same word in the thesaurus is the relevant one and to assign a category accordingly. This identification can be performed using, for example, the case frames described in Non-Patent Document 2 (Iwanami Lecture Series, Software Science 15: Natural Language Processing, pp. 235-240).
Furthermore, when calculating a context vector, grammatical knowledge may be used to extract more characteristic words, and the context vector may be compressed by general techniques such as synonym merging or dimensionality reduction over the element words, thereby improving the utilization efficiency of the context vector space. Whether or not these techniques are used, storing context vectors requires less storage capacity than storing, for example, word bigrams, and can therefore be realized at low additional cost.
Furthermore, when a plurality of example sentences using the element word in question exist in the corpus information of the corpus DB 15, the context vector generation unit 16 performs the same processing on all of them, or on a predetermined number of them, and calculates a plurality of context vectors. On the other hand, when no example sentence using the element word exists in the corpus information, the context vector generation unit 16 does not calculate a context vector for that element word. The context vector generation unit 16 outputs the context vectors calculated as described above for the element words of each word set to the reading determination information generation unit 17.
The reading determination information generation unit (reading determination information generation means) 17 generates reading determination information that associates each reading candidate with the context vectors calculated by the context vector generation unit 16 for the element words of its word set. For example, for each reading candidate, the reading determination information generation unit 17 calculates a representative average context vector (representative context vector) by taking the arithmetic mean of the corresponding context vectors in the context vector space. It then generates reading determination information that associates each reading candidate with its calculated representative average context vector (arithmetic mean).
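A minimal sketch of this averaging step follows, assuming the context vectors have already been grouped by reading candidate; that intermediate mapping is a form chosen here for illustration, not a structure defined in the text.

import numpy as np

def build_reading_determination_info(
        vectors_by_candidate: dict[str, list[np.ndarray]]) -> dict[str, np.ndarray]:
    """Associate each reading candidate with the arithmetic mean of the context
    vectors computed for the element words of its word set."""
    return {reading: np.mean(np.stack(vectors), axis=0)
            for reading, vectors in vectors_by_candidate.items()}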
The reading determination device 10 is implemented in hardware mainly as a microcomputer comprising, for example, a CPU (Central Processing Unit) that performs control processing, arithmetic processing, and the like; a ROM (Read Only Memory) that stores the control programs and arithmetic programs executed by the CPU; and a RAM (Random Access Memory) that temporarily stores processing data and the like. The same-notation different-pronunciation word set generation unit 11, the word set generation unit 14, the context vector generation unit 16, the reading determination information generation unit 17, and the reading determination unit 21 described later can be realized, for example, as programs stored in the ROM and executed by the CPU.
FIG. 5 is a flowchart showing an example of the processing flow of the reading determination device according to the first embodiment of the present invention. First, based on the word input by the user, the same-notation different-pronunciation word set generation unit 11 acquires a plurality of reading candidates for the input word from the reading candidate DB 12 (step S101), and outputs the acquired reading candidates to the word set generation unit 14.
Next, the word set generation unit 14 generates, for each of the reading candidates received from the same-notation different-pronunciation word set generation unit 11, a word set composed of a plurality of similar element words, based on the thesaurus dictionary information in the thesaurus DB 13 (step S102), and outputs the generated word sets to the context vector generation unit 16.
Thereafter, the context vector generation unit 16 calculates context vectors for the element words of the word sets generated by the word set generation unit 14, based on the corpus information stored in the corpus DB 15 (step S103), and outputs the calculated context vectors to the reading determination information generation unit 17.
Further, for each reading candidate, the reading determination information generation unit 17 calculates a representative average context vector by taking the arithmetic mean of the corresponding context vectors in the context vector space (step S104). The reading determination information generation unit 17 then generates reading determination information that associates each reading candidate with its calculated average context vector (step S105).
As described above, the reading determination device 10 according to the first embodiment generates, for each reading candidate, a word set of similar element words, calculates an average context vector for each reading candidate using the corpus information, and generates reading determination information that associates each reading candidate with its average context vector. By using word sets similar to the reading candidates together with the corpus information in this way, the amount of information in the reading determination information can be effectively increased and its accuracy improved. Accordingly, more appropriate and highly accurate reading determination information can be obtained, and the reading of a word can be determined easily and appropriately using this information.
Note that even if a large amount of corpus information (a learning corpus) annotated with the correct readings of input words could be prepared, the reading determination information generated in this embodiment from the word sets and the corpus information would still carry more information, and be more accurate, than the information obtainable from such a corpus about any given same-notation different-pronunciation word set. Moreover, the present embodiment requires neither preparing a large learning corpus annotated with correct readings nor writing rules for a large number of same-notation different-pronunciation word sets, and is therefore superior in that the amount of information in the reading determination information can be increased efficiently and the accuracy of reading determination improved. In addition, since the present embodiment estimates readings based on information such as word similarity, improvements in the accuracy of such estimation can in turn be expected to improve the accuracy of the readings.

(Second Embodiment)
FIG. 6 is a block diagram showing the schematic system configuration of the reading determination device according to the second embodiment of the present invention. In addition to the configuration of the reading determination device 10 according to the first embodiment, the reading determination device 20 according to the second embodiment further includes a reading determination unit 21 that determines the reading of the input word based on the reading determination information generated by the reading determination information generation unit 17, and an output device 22 that outputs the determined reading. As the output device 22, for example, a display device, a printer, or an audio output device can be used. The reading determination device 20 according to the present embodiment can determine the reading of an input word online, for example.
The other components of the reading determination device 20 according to the second embodiment are substantially the same as those of the reading determination device 10 according to the first embodiment. Accordingly, the same reference numerals are given to the same parts, and detailed description thereof is omitted.
FIG. 7 is a flowchart showing an example of the processing flow of the reading determination device according to the second embodiment of the present invention. First, an input sentence and information specifying an input word within that sentence are supplied to the context vector generation unit 16 (step S201).
Next, the context vector generation unit 16 calculates a context vector for the specified input word in the input sentence, as in the first embodiment (step S202), and outputs it to the reading determination unit 21.
Thereafter, the reading determination unit 21 determines the reading of the input word in the input sentence based on the context vector calculated by the context vector generation unit 16 and the average context vectors in the reading determination information generated by the reading determination information generation unit 17 (step S203).
Here, according to the first embodiment, the reading determination information consists of a plurality of pairs, each associating an average context vector with a reading candidate. The reading determination unit 21, for example, selects, among the average context vectors in the reading determination information, the one with the smallest cosine distance to (i.e., the highest similarity with) the context vector of the input word, and determines the corresponding reading candidate to be the reading of the input word. The reading determination unit 21 outputs the determined reading of the input word to the output device 22. The output device 22 outputs the reading of the input word received from the reading determination unit 21 by, for example, screen display, printout, or speech (step S204).
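A sketch of this selection rule follows, assuming the reading determination information is held as a mapping from each reading candidate to its average context vector; the function names are illustrative. Since the smallest cosine distance corresponds to the largest cosine similarity, the sketch maximizes similarity.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0.0 else 0.0

def determine_reading(input_vec: np.ndarray,
                      reading_info: dict[str, np.ndarray]) -> str:
    """Pick the reading candidate whose average context vector is most similar
    to the context vector of the input word."""
    return max(reading_info,
               key=lambda r: cosine_similarity(input_vec, reading_info[r]))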
As described above, according to the reading determination device 20 of the second embodiment, the reading most similar to the input context is selected from among the characteristics of the reading candidates represented by their average context vectors, so a more appropriate reading can be determined.

(Third Embodiment)
FIG. 8 is a block diagram showing the schematic system configuration of the reading determination device according to the third embodiment of the present invention. In addition to the configuration of the reading determination device 20 according to the second embodiment, the reading determination device 30 according to the third embodiment further includes an example sentence word acquisition unit 31 that acquires example sentence information for the input word from the corpus DB 15, and a reading DB 32 that stores the readings determined by the reading determination unit 21. The reading determination device 30 according to the present embodiment can determine the reading of an input word offline, for example.
The other components of the reading determination device 30 according to the third embodiment are substantially the same as those of the reading determination device 20 according to the second embodiment. Accordingly, the same reference numerals are given to the same parts, and detailed description thereof is omitted.
For example, the example sentence word acquisition unit 31 acquires a plurality of example sentences containing the input word from the corpus DB 15, extracts from each example sentence the information specifying the input word, and outputs it to the context vector generation unit 16. Using the information on each example sentence received from the example sentence word acquisition unit 31, the context vector generation unit 16 calculates a context vector for the input word in each example sentence and outputs it to the reading determination unit 21.
For each example sentence, the reading determination unit 21 selects, among the average context vectors in the reading determination information generated by the reading determination information generation unit 17, the one with the smallest cosine distance to the context vector of that example sentence, and determines the corresponding reading candidate to be the reading of the input word in that sentence. The reading determination unit 21 then outputs the reading of the input word determined for each example sentence to the reading DB 32, where it is stored. Furthermore, the reading determination unit 21 selects, from the plurality of readings stored in the reading DB 32, the single statistically most frequent reading, determines it to be the reading of the input word, and outputs the determined reading to the output device 22.
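Assuming the reading DB 32 can be reduced, for this purpose, to the list of per-sentence readings, the final statistical selection is a simple majority count, as in the following sketch:

from collections import Counter

def select_by_frequency(per_sentence_readings: list[str]) -> str:
    """Return the reading judged most often across the example sentences."""
    return Counter(per_sentence_readings).most_common(1)[0][0]

# e.g. select_by_frequency(["hokuro", "hokuro", "kurogo"]) -> "hokuro"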
As described above, the reading determination device 30 according to the third embodiment generates a plurality of context vectors for the input word using the corpus information, determines a reading for each, and accumulates them in the reading DB 32. The reading of the input word can then be determined statistically, based on the frequencies of the readings accumulated in the reading DB 32.

(Fourth Embodiment)
In the reading determination devices 10, 20, and 30 according to the first to third embodiments, the reading determination information generation unit 17 generates reading determination information that associates each reading candidate with one representative average context vector. In the reading determination device 40 according to the fourth embodiment, by contrast, the reading determination information generation unit 47 generates reading determination information that associates each reading candidate with a plurality of context vectors.
In this case, the reading determination information is a collection of pairs, one per context vector, each consisting of a context vector calculated by the context vector generation unit 16 and the corresponding reading candidate. By contrast, the reading determination information of the first to third embodiments is a collection of pairs, one per reading candidate, each consisting of an average context vector and the corresponding reading candidate. The reading determination information according to the present embodiment therefore carries a larger amount of information and achieves higher accuracy.
As described above, the reading determination information generation unit 47 outputs to the reading determination unit 21 reading determination information that pairs every context vector generated by the context vector generation unit 16 with the reading candidate corresponding to it.
The other components of the reading determination device 40 according to the fourth embodiment are substantially the same as those of the reading determination devices 10, 20, and 30 according to the first to third embodiments. Accordingly, the same reference numerals are given to the same parts, and detailed description thereof is omitted.
In the present embodiment, as in the second embodiment, the reading determination unit 21 may calculate similarities, such as cosine distances, between the context vector obtained from the input sentence and input word and every context vector in the reading determination information. The reading determination unit 21 then determines the reading candidate corresponding to the context vector with the highest similarity, among all context vectors in the reading determination information, to be the reading of the input word.
Also in the present embodiment, as in the third embodiment, the reading determination unit 21 may calculate the similarity between the context vector obtained from each example sentence in the corpus DB 15 and every context vector in the reading determination information. The reading determination unit 21 then determines the reading candidate corresponding to the context vector most similar to the context vector of each example sentence, among all context vectors in the reading determination information, to be the reading of the input word in that example sentence, and outputs it to the reading DB 32.
As described above, according to the reading determination device 40 of the fourth embodiment, even in cases where computing a single representative context vector, as in the first to third embodiments, would not separate the readings sufficiently, an appropriate reading can still be determined using the similarity of the context vectors of the individual element words.

(Fifth Embodiment)
In the reading determination device 10 according to the first embodiment, the word set generation unit 14 generates, as the word set of element words similar to each reading candidate, a set containing the element words that belong to the same thesaurus category. In the reading determination device 50 according to the fifth embodiment, the word set generation unit 54 generates word sets that also contain element words belonging to neighboring categories at the same level of the thesaurus. By generating word sets containing a broader range of near-synonyms in this way, the amount of information in the reading determination information can be effectively increased and its accuracy improved.
The other components of the reading determination device 50 according to the fifth embodiment are substantially the same as those of the reading determination device 10 according to the first embodiment. Accordingly, the same reference numerals are given to the same parts, and detailed description thereof is omitted.
The word set generation unit 54 may control the degree of similarity by changing how neighboring categories are selected. For example, as in the first embodiment, the same-notation different-pronunciation word set generation unit 11 acquires from the reading candidate DB 12, for the input word "黒子", two entries: reading candidate 1 = "ほくろ" (category: body stain) and reading candidate 2 = "くろご" (category: kabuki stage assistant). For each of reading candidates 1 and 2, the word set generation unit 54 then generates a word set composed of element words that stand in a similarity relation to that candidate.
Here, as described above, the element words in such a similarity relation include, in addition to the element words belonging to the same thesaurus category, element words belonging to neighboring categories at the same level of the thesaurus. A neighboring category here means, for example, the single closest category according to the degree of relatedness between categories defined on the thesaurus. In FIG. 4, one level above and one level below the category containing reading candidate 1 = "ほくろ" are shown.
On the thesaurus shown in FIG. 4, for example, the word set generation unit 54 extracts the element words belonging to the same category as reading candidate 1 "ほくろ" and to its neighboring categories. From the "body surface condition" category and the "body color" category, which lie at the same level as the "body stain" category, the word set generation unit 54 selects the "body surface condition" category according to the inter-category similarity defined in advance in the thesaurus. It then extracts the element words belonging to the "body stain" and "body surface condition" categories and generates word set 1.
Here, word set 1 = {element word 1-1: 染み, element word 1-2: しみ, element word 1-3: 黒子, element word 1-4: ほくろ, element word 1-5: ホクロ, element word 1-6: lentigo, element word 1-7: にきび, element word 1-8: 吹出物, element word 1-9: 毛孔}, the category = {body stain}, and the number of element words N1 = 9.
As described above, the reading determination device 50 according to the fifth embodiment generates word sets that contain not only the element words belonging to the same thesaurus category but also element words belonging to neighboring categories at the same level of the thesaurus, and can thereby generate word sets containing a broader range of near-synonyms.

(Sixth Embodiment)
In the reading determination device 10 according to the first embodiment, the word set generation unit 14 generates, as the word set of element words similar to each reading candidate, a set containing the element words that belong to the same thesaurus category. In the reading determination device 60 according to the sixth embodiment, the word set generation unit 64 may generate word sets that also contain element words belonging to categories at higher and/or lower levels of the thesaurus. By generating word sets containing a broader range of near-synonyms, covering the superordinate and subordinate relations between concepts, the amount of information in the reading determination information can be effectively increased and its accuracy improved.
The other components of the reading determination device 60 according to the sixth embodiment are substantially the same as those of the reading determination device 10 according to the first embodiment. Accordingly, the same reference numerals are given to the same parts, and detailed description thereof is omitted.
The word set generation unit 64 can control the degree of similarity by how many higher and/or lower levels of the thesaurus it includes in the target range.
For example, as in the first embodiment, the same-notation different-pronunciation word set generation unit 11 acquires from the reading candidate DB 12, for the input word "黒子", two entries: reading candidate 1 = "ほくろ" (category: body stain) and reading candidate 2 = "くろご" (category: kabuki stage assistant). For each of reading candidates 1 and 2, the word set generation unit 64 then generates a word set composed of element words that stand in a similarity relation to that candidate.
Here, as described above, the element words in such a similarity relation include, in addition to the element words belonging to the same thesaurus category, element words belonging to categories up to a preset number of levels above and/or below it on the thesaurus.
On the thesaurus shown in FIG. 4, for example, the word set generation unit 64 extracts the element words of the same category as the one to which reading candidate 1 "ほくろ" belongs, as well as the element words belonging to the categories up to the preset number of levels above and below it.
Here, one level above the "body stain" category there is a single category, "body surface", and one level below it there are two categories, "body stain color" and "body stain shape". The word set generation unit 64 therefore extracts the element words belonging to these categories one level above and one level below, and generates word set 1.
Here, word set 1 = {element word 1-1: 染み, element word 1-2: しみ, element word 1-3: 黒子, element word 1-4: ほくろ, element word 1-5: ホクロ, element word 1-6: lentigo, element word 1-7: 色, element word 1-8: 染み, element word 1-9: 皺, element word 1-10: 赤, element word 1-11: 黒, element word 1-12: 灰色, element word 1-13: 丸, element word 1-14: 点, element word 1-15: 三角}, the category = {body stain}, and the number of element words N1 = 15.
As described above, the reading determination device 60 according to the sixth embodiment generates word sets that contain not only the element words belonging to the same thesaurus category but also element words belonging to categories up to a preset number of levels above and/or below it on the thesaurus, and can thereby generate word sets containing a broader range of near-synonyms.

(Seventh Embodiment)
FIG. 9 is a block diagram showing the schematic system configuration of the reading determination device according to the seventh embodiment of the present invention. In addition to the configuration of the reading determination device 10 according to the first embodiment, the reading determination device 70 according to the seventh embodiment further includes an element word deletion unit 71 that detects and deletes duplicated element words, and a context vector deletion unit 72 that detects and deletes duplicated context vectors.
The element word deletion unit 71 detects element words duplicated between the word sets generated by the word set generation unit 14 for the reading candidates 1 to M. A duplicated element word here refers, for example, to a case in which at least one pair of element words coincides. The element word deletion unit 71 deletes one element word of each duplicated pair from its word set and outputs the pruned word sets to the context vector generation unit 16. Word sets containing no duplicated element words are output to the context vector generation unit 16 unchanged.
Here, duplicated element words yield identical context vectors. Therefore, when element words duplicated between the word sets of different reading candidates exist, the degree of overlap in the reading determination information generated from those element words also increases. By removing duplicated element words in advance, the degree of separation in the reading determination information can be raised and the accuracy of the reading determination information increased.
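A sketch of this duplicate removal follows. The text deletes one element word of each duplicated pair; as one simple interpretation chosen here for illustration, the sketch removes a duplicated word from every word set it appears in, which likewise prevents it from blurring the boundary between readings.

def remove_duplicate_elements(word_sets: dict[str, set[str]]) -> dict[str, set[str]]:
    """Drop any element word that appears in more than one candidate's word set.
    (Illustrative policy: the text deletes one of each duplicated pair.)"""
    counts: dict[str, int] = {}
    for words in word_sets.values():
        for w in words:
            counts[w] = counts.get(w, 0) + 1
    duplicated = {w for w, n in counts.items() if n > 1}
    return {reading: words - duplicated for reading, words in word_sets.items()}

# In the running example, "黒子" belongs to both word sets and would be removed.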
The context vector deletion unit 72 detects, among the context vectors generated by the context vector generation unit 16 for the element words of the word sets 1 to M, context vectors that are identical across different word sets, deletes one of each identical pair, and outputs the rest to the reading determination information generation unit 17. Identical context vectors here refer, for example, to a case in which at least one pair of context vectors coincides.
Here, when duplicated context vectors exist between the word sets 1 to M, the degree of overlap in the reading determination information generated from the context vectors also increases. By removing duplicated context vectors in advance, the degree of separation in the reading determination information can be raised and its accuracy increased.
The context vector deletion unit 72 may also detect, among the context vectors generated by the context vector generation unit 16 for the element words of the word sets 1 to M, mutually similar context vectors that are closer to each other than a predetermined distance, and delete one of each such pair.
For example, the context vector deletion unit 72 judges two vectors to be closer than the predetermined distance when their cosine distance in the context vector space is smaller than a predetermined threshold ε. The context vector deletion unit 72 deletes one of each detected pair of mutually close context vectors and outputs the rest to the reading determination information generation unit 17.
Here, when mutually close context vectors exist between the word sets 1 to M, the degree of overlap in the reading determination information generated from the context vectors also increases. By removing mutually close context vectors in advance, the degree of separation in the reading determination information can be raised and its accuracy increased.
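A sketch of this pruning follows, assuming cosine distance is defined as 1 - cosine similarity and adopting the illustrative policy of deleting the vector that belongs to the second of the two word sets compared; the threshold value is likewise illustrative.

import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 1.0 - (float(a @ b) / denom if denom > 0.0 else 0.0)

def prune_close_vectors(vecs_a: list[np.ndarray],
                        vecs_b: list[np.ndarray],
                        eps: float = 0.1) -> list[np.ndarray]:
    """Drop each vector in vecs_b lying within cosine distance eps of some vector in vecs_a."""
    return [v for v in vecs_b
            if all(cosine_distance(u, v) >= eps for u in vecs_a)]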
The context vector deletion unit 72 may also multiply the context vectors generated by the context vector generation unit 16 by weighting coefficients that emphasize their characteristic components. The context vector deletion unit 72 may then detect the mutually close context vectors described above using the weighted context vectors, and delete the detected vectors.
For example, for the context vector D = Σ(i=1..t) a(i)·V(i), let b(i) be the importance of each element word V(i). For the importance b(i), the tf-idf score may be used, for example, a measure indicating how characteristic a word is when it appears in the corpus information; the tf-idf value is calculated from the two indices tf (term frequency) and idf (inverse document frequency). Weighting coefficients are then set according to the importance b(i) of each word V(i). As a result, when the similarity of two context vectors is computed, multiplying the context vector D by these weighting coefficients emphasizes the differences on highly characteristic words and shrinks the differences on weakly characteristic words. A similarity computation that better reflects the characteristics of the corpus information can therefore be performed.
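A sketch of this weighting follows, assuming the importance b(i) = tf(i)·idf(i) is available per index word; the array names follow the formula above and are illustrative.

import numpy as np

def weight_context_vector(d: np.ndarray, tf: np.ndarray, idf: np.ndarray) -> np.ndarray:
    """Scale each component of D by the importance b(i) = tf(i) * idf(i)."""
    return d * (tf * idf)

# Components for highly characteristic words are amplified, so differences on
# those words dominate the subsequent cosine similarity computation.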
(Eighth Embodiment)
FIG. 10 is a block diagram showing the schematic system configuration of the speech synthesizer according to the eighth embodiment of the present invention. The speech synthesizer 80 according to the eighth embodiment includes a morphological analysis unit 81 that performs morphological analysis of an input sentence, the reading determination device 20 according to the second embodiment, and a speech generation unit 82 that generates synthesized speech.
The morphological analysis unit 81 performs morphological analysis on the input sentence, thereby dividing it into morphemes, extracts the content words among the morphemes, and outputs them to the reading determination device 20. Based on the reading information for the input sentence output by the reading determination device 20, the speech generation unit 82 generates a synthesized speech waveform for the input sentence using, for example, a waveform concatenation speech synthesis method. The reading information used in this speech synthesis includes, for example, not merely the phoneme string but also accent position information. This makes it possible, for example, to utter the noun "谷" (tani) with a head-high accent pattern when used as a personal name and with a flat accent pattern when used in the sense opposite to "mountain".
FIG. 11 is a flowchart showing an example of the processing flow of the speech synthesizer according to the eighth embodiment of the present invention. When an input sentence is supplied to the morphological analysis unit 81 (step S301), the morphological analysis unit 81 performs morphological analysis on it (step S302), divides the input sentence into a plurality of morphemes, and extracts the content words. The morphological analysis unit 81 then outputs the extracted content words, as input words, to the reading determination device 20 together with the input sentence. Next, the reading determination device 20 performs the reading determination processing described above based on the input sentence and input words received from the morphological analysis unit 81 (step S303), finalizes the readings of all content words (step S304), and generates reading information for the input sentence (step S305). The reading determination device 20 outputs the generated reading information for the input sentence to the speech generation unit 82. The speech generation unit 82 generates a synthesized speech waveform based on the reading information for the input sentence received from the reading determination device 20 (step S306), and outputs the speech of the generated synthesized waveform (step S307).
The present invention is not limited to the embodiments described above, and can be modified as appropriate without departing from its spirit. Although the present invention has been described above as a hardware configuration, it is not limited thereto; any of the processing can also be realized by causing a CPU to execute a computer program.
The program can be stored using various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electric signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path such as an electric wire or optical fiber, or via a wireless communication path.
The present invention is applicable, for example, to a reading determination device that determines an appropriate reading for a word or a sentence.
This application claims priority based on Japanese Patent Application No. 2009-084920, filed on March 31, 2009, the entire disclosure of which is incorporated herein.
10 reading determination device
11 same-notation different-pronunciation word set generation unit
12 reading candidate DB
13 thesaurus DB
14 word set generation unit
15 corpus DB
16 context vector generation unit
17 reading determination information generation unit
21 reading determination unit
22 output device
31 example sentence word acquisition unit
32 reading DB
71 element word deletion unit
72 context vector deletion unit
80 speech synthesizer

Claims (23)

1. A reading determination device for determining the reading of a word having a plurality of reading candidates, comprising:
word set generation means for generating, for each of the reading candidates, a word set composed of a plurality of element words similar to that reading candidate;
a corpus database that stores corpus information including a plurality of example sentences;
feature amount calculation means for calculating, based on the corpus information stored in the corpus database, feature amounts for the plurality of element words of each word set generated by the word set generation means; and
reading determination information generation means for generating reading determination information that associates the feature amounts calculated by the feature amount calculation means for the plurality of element words of each word set with the corresponding reading candidate.
2. The reading determination device according to claim 1, wherein the feature amount calculation means includes a context vector generation unit that calculates, as the feature amounts, context vectors for the plurality of element words of each word set generated by the word set generation means, based on the corpus information stored in the corpus database.
  3.  請求項2記載の読み方判断装置であって、
     前記読み方判断情報生成手段は、前記文脈ベクトル生成部により算出された前記単語集合の複数の要素単語に対する文脈ベクトルに基づいて、代表となる文脈ベクトルを算出し、該代表の文脈ベクトルと、前記読み方候補とを夫々関連付けた読み方判断情報を生成する、ことを特徴とする読み方判断装置。
    A reading judgment device according to claim 2,
    The reading determination information generating means calculates a representative context vector based on the context vectors for a plurality of element words of the word set calculated by the context vector generation unit, the representative context vector, and the reading A reading determination device characterized by generating reading determination information associated with each candidate.
  4.  請求項3記載の読み方判断装置であって、
     前記読み方判断情報生成手段は、前記文脈ベクトル生成部により算出された前記単語集合の複数の要素単語に対する文脈ベクトルの平均値を、前記代表の文脈ベクトルとして算出する、ことを特徴とする読み方判断装置。
    A reading judgment device according to claim 3,
    The reading determination information generating means calculates, as the representative context vector, an average value of context vectors for a plurality of element words of the word set calculated by the context vector generation unit. .
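Claims 3 and 4 reduce the element-word vectors of each word set to a single representative vector. A minimal sketch, assuming sparse dictionary-backed context vectors (the claims do not fix a representation):

```python
from collections import Counter

def representative_vector(context_vectors):
    """Element-wise average of sparse context vectors (claim 4)."""
    total = Counter()
    for vec in context_vectors:
        total.update(vec)          # Counter.update adds counts key by key
    n = len(context_vectors)
    return {term: value / n for term, value in total.items()} if n else {}
```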
  5.  The reading determination device according to any one of claims 1 to 4, wherein the element words of each word set include synonyms of the corresponding reading candidate.
  6.  The reading determination device according to any one of claims 1 to 5, wherein the element words of each word set include words belonging to categories that are superordinate and subordinate, on a thesaurus, to the category to which the corresponding reading candidate belongs.
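Claims 5 and 6 describe where the element words come from. A sketch under an assumed thesaurus layout (each category name mapped to its words, parent, and children; the actual thesaurus dictionary information may be organized quite differently):

```python
def build_word_set(candidate, thesaurus):
    """Collect element words for one reading candidate: its synonyms (claim 5)
    plus the words of the categories directly above and below its own
    category (claim 6).

    thesaurus: category name -> {"words": [...], "parent": str | None,
                                 "children": [str, ...]}   (assumed layout)"""
    category = next(name for name, node in thesaurus.items()
                    if candidate in node["words"])   # raises if absent
    node = thesaurus[category]
    words = set(node["words"])                       # synonyms (same category)
    if node["parent"] is not None:                   # superordinate category
        words |= set(thesaurus[node["parent"]]["words"])
    for child in node["children"]:                   # subordinate categories
        words |= set(thesaurus[child]["words"])
    words.discard(candidate)
    return words
```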
  7.  The reading determination device according to any one of claims 1 to 6, further comprising separation processing means for performing processing that increases the degree of separation between the reading candidates.
  8.  The reading determination device according to claim 7, wherein the separation processing means includes a word deletion unit that detects element words duplicated between the word sets and deletes one of each duplicated pair.
  9.  The reading determination device according to claim 7 or 8, wherein the separation processing means includes a context vector deletion unit that detects identical context vectors among the context vectors corresponding to the element words of different word sets and deletes one of each identical pair.
  10.  The reading determination device according to any one of claims 7 to 9, wherein the separation processing means includes a context vector deletion unit that detects mutually similar context vectors among the context vectors corresponding to the element words of different word sets and deletes one of each similar pair.
  11.  The reading determination device according to claim 9, wherein the context vector deletion unit sets a weight coefficient according to the importance of each element word, and detects the mutually similar context vectors based on the context vectors multiplied by the weight coefficients.
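The separation processing of claims 7 to 11 can be sketched as follows. The sparse dict vectors, the similarity threshold, and the policy of always deleting from the later word set are choices made for this example only; every element word is assumed to already have a context vector.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def increase_separation(word_sets, vectors, weights=None, threshold=0.9):
    """Sketch of the separation processing of claims 7-11.

    word_sets: reading candidate -> set of element words
    vectors:   element word -> sparse context vector
    weights:   element word -> importance coefficient (claim 11); optional"""
    def scaled(w):
        c = (weights or {}).get(w, 1.0)
        return {k: c * x for k, x in vectors[w].items()}

    readings = list(word_sets)
    for i, r1 in enumerate(readings):
        for r2 in readings[i + 1:]:
            # Claim 8: delete element words duplicated across word sets.
            for w in word_sets[r1] & word_sets[r2]:
                word_sets[r2].discard(w)
            # Claims 9-10: delete one of each pair of identical or mutually
            # similar cross-set context vectors.
            for w2 in list(word_sets[r2]):
                if any(cosine(scaled(w1), scaled(w2)) >= threshold
                       for w1 in word_sets[r1]):
                    word_sets[r2].discard(w2)
    return word_sets
```

Setting threshold to 1.0 (up to rounding) corresponds to deleting only identical vectors as in claim 9, while a lower threshold covers the "mutually similar" case of claim 10.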
  12.  The reading determination device according to any one of claims 1 to 10, further comprising reading determination means for determining the reading of a word based on the reading determination information generated by the reading determination information generation means.
  13.  The reading determination device according to claim 11, wherein the reading determination means determines, as the reading of an input word, the reading candidate corresponding to the representative context vector that has a high degree of similarity to the context vector of the input word among the plurality of representative context vectors in the reading determination information.
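A sketch of the determination step of claim 13, with context vectors again held as dicts:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def determine_reading(input_vector, representatives):
    """Claim 13: pick the reading candidate whose representative context
    vector is most similar to the input word's context vector.

    representatives: reading candidate -> representative context vector."""
    return max(representatives,
               key=lambda reading: cosine(input_vector, representatives[reading]))
```

The argmax resolves ties arbitrarily; a deployed system might fall back to a dictionary default reading when no representative vector is sufficiently similar.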
  14.  The reading determination device according to claim 12, further comprising example sentence word acquisition means for acquiring, from a corpus database, a plurality of pieces of example sentence information including the word, wherein
     the context vector generation means generates a context vector from each piece of example sentence information acquired by the example sentence word acquisition means, and
     the reading determination means determines the reading of the word in the example sentence information corresponding to each context vector, and decides, as the reading of the word, the most frequent reading among the plurality of determined readings.
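Claim 14 aggregates per-example decisions by frequency. A sketch reusing determine_reading from the previous example:

```python
from collections import Counter

def reading_by_majority_vote(example_vectors, representatives):
    """Claim 14: determine the reading in every example sentence containing
    the word, then adopt the most frequent per-example reading as the word's
    reading. example_vectors holds one context vector per example sentence."""
    votes = Counter(determine_reading(vec, representatives)
                    for vec in example_vectors)
    return votes.most_common(1)[0][0]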
  15.  The reading determination device according to any one of claims 1 to 14, further comprising a thesaurus database that stores thesaurus dictionary information, wherein the word set generation means generates each word set composed of a plurality of element words similar to a reading candidate based on the thesaurus dictionary information in the thesaurus database.
  16.  A speech synthesis device comprising the reading determination device according to any one of claims 1 to 15, the speech synthesis device synthesizing speech based on the reading of the word determined by the reading determination device.
  17.  The speech synthesis device according to claim 16, comprising:
     morphological analysis means for performing morphological analysis on an input sentence and dividing the input sentence into morphemes;
     the reading determination device, which determines the readings of the morphemes obtained by the morphological analysis means; and
     a speech generation unit that synthesizes speech based on the reading of the input sentence determined by the reading determination device.
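The speech synthesis pipeline of claim 17, reduced to its data flow; the three callables stand in for the morphological analysis means, the reading determination device, and the speech generation unit, none of which is implemented here:

```python
def text_to_speech(sentence, analyze_morphemes, determine_reading, generate_waveform):
    """Claim 17 pipeline: split the input sentence into morphemes, determine a
    reading for each morpheme, then synthesize audio from the readings."""
    morphemes = analyze_morphemes(sentence)
    readings = [determine_reading(m) for m in morphemes]
    return generate_waveform(readings)
```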
  18.  A reading determination method for determining the reading of a word having a plurality of reading candidates, the method comprising:
     generating, for each of the reading candidates, a word set composed of a plurality of element words similar to that reading candidate;
     calculating feature amounts for the plurality of element words of each generated word set, based on corpus information including a plurality of example sentences; and
     generating reading determination information in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidate.
  19.  The reading determination method according to claim 18, wherein context vectors for the plurality of element words of each generated word set are calculated as the feature amounts based on the corpus information.
  20.  The reading determination method according to claim 19, wherein a representative context vector is calculated from the calculated context vectors for the plurality of element words of each word set, and reading determination information is generated in which each representative context vector is associated with the corresponding reading candidate.
  21.  The reading determination method according to any one of claims 18 to 20, further comprising performing processing that increases the degree of separation between the reading candidates.
  22.  The reading determination method according to any one of claims 18 to 21, further comprising determining the reading of a word based on the generated reading determination information.
  23.  A non-transitory computer readable medium storing a reading determination program for determining the reading of a word having a plurality of reading candidates, the program causing a computer to execute:
     a process of generating, for each of the reading candidates, a word set composed of a plurality of element words similar to that reading candidate;
     a process of calculating feature amounts for the plurality of element words of each generated word set, based on corpus information including a plurality of example sentences; and
     a process of generating reading determination information in which the calculated feature amounts for the plurality of element words of each word set are associated with the corresponding reading candidate.
PCT/JP2010/001753 2009-03-31 2010-03-11 Device, method, program for reading determination, computer readable medium therefore, and voice synthesis device WO2010113396A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011506983A JP5533853B2 (en) 2009-03-31 2010-03-11 Reading judgment device, method, program, and speech synthesizer

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009-084920 2009-03-31
JP2009084920 2009-03-31

Publications (1)

Publication Number Publication Date
WO2010113396A1 true

Family

ID=42827715

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/001753 WO2010113396A1 (en) 2009-03-31 2010-03-11 Device, method, program for reading determination, computer readable medium therefore, and voice synthesis device

Country Status (2)

Country Link
JP (1) JP5533853B2 (en)
WO (1) WO2010113396A1 (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5146979B2 (en) * 2006-06-02 2013-02-20 株式会社国際電気通信基礎技術研究所 Ambiguity resolution device and computer program in natural language

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06119144A * 1992-10-02 1994-04-28 Toshiba Corp Document read-aloud device
JPH07114572A (en) * 1993-10-18 1995-05-02 Sharp Corp Document classifying device
JPH1115497A (en) * 1997-06-19 1999-01-22 Fujitsu Ltd Name reading-out speech synthesis device
JP2008134750A (en) * 2006-11-28 2008-06-12 Nippon Telegr & Teleph Corp <Ntt> Data classifier, data classification method, data classification program, and recording medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KIYOTAKA UCHIMOTO ET AL.: "Doshi no Goiteki Chishiki Kakutoku ni Okeru Ruigigo no Yorei o Mochiita Tagisei no Ruibetsu", INFORMATION PROCESSING SOCIETY OF JAPAN KENKYU HOKOKU, INFORMATION PROCESSING SOCIETY OF JAPAN, vol. 94, no. 47, 28 May 1994 (1994-05-28), pages 105 - 112 *
YOSHIKI TANBA ET AL.: "Tango Vector o Mochiita Tagigo no Imi Suitei -Kyoki Vector to Teigi Kyori Vector no Hikaku", INFORMATION PROCESSING SOCIETY OF JAPAN KENKYU HOKOKU, INFORMATION PROCESSING SOCIETY OF JAPAN, vol. 94, no. 63, 22 July 1994 (1994-07-22), pages 49 - 56 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019008477A (en) * 2017-06-22 2019-01-17 富士通株式会社 Discrimination program, discrimination device and discrimination method
JP2020052819A (en) * 2018-09-27 2020-04-02 大日本印刷株式会社 Information processing apparatus, information processing method, and program
JP7115187B2 (en) 2018-09-27 2022-08-09 大日本印刷株式会社 Information processing device, information processing method and program

Also Published As

Publication number Publication date
JPWO2010113396A1 (en) 2012-10-04
JP5533853B2 (en) 2014-06-25

Similar Documents

Publication Publication Date Title
TWI539441B (en) Speech recognition method and electronic apparatus
Sridhar et al. Exploiting acoustic and syntactic features for automatic prosody labeling in a maximum entropy framework
Qian et al. Automatic prosody prediction and detection with conditional random field (crf) models
Tachbelie et al. Using different acoustic, lexical and language modeling units for ASR of an under-resourced language–Amharic
KR101735195B1 (en) Method, system and recording medium for converting grapheme to phoneme based on prosodic information
JP5524138B2 (en) Synonym dictionary generating apparatus, method and program thereof
Gelas et al. Quality assessment of crowdsourcing transcriptions for African languages
Sangeetha et al. Speech translation system for english to dravidian languages
Wu et al. Automatic generation of synthesis units and prosodic information for Chinese concatenative synthesis
Cucu et al. SMT-based ASR domain adaptation methods for under-resourced languages: Application to Romanian
Suzuki et al. Accent sandhi estimation of Tokyo dialect of Japanese using conditional random fields
KR20080045413A (en) Method for predicting phrase break using static/dynamic feature and text-to-speech system and method based on the same
Cole et al. Corpus phonology with speech resources
Zhang et al. A study on functional loads of phonetic contrasts under context based on mutual information of Chinese text and phonemes
Kayte et al. A text-to-speech synthesis for Marathi language using festival and Festvox
Wutiwiwatchai et al. Thai text-to-speech synthesis: a review
Wang et al. RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion
JP5533853B2 (en) Reading judgment device, method, program, and speech synthesizer
Chu et al. A concatenative Mandarin TTS system without prosody model and prosody modification.
Sitaram et al. Text to speech in new languages without a standardized orthography
Pellegrini et al. Automatic word decompounding for asr in a morphologically rich language: Application to amharic
Prahallad Automatic building of synthetic voices from audio books
Anto et al. Text to speech synthesis system for English to Malayalam translation
Akinwonmi Development of a prosodic read speech syllabic corpus of the Yoruba language
JPH0962286A (en) Voice synthesizer and the method thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10758182

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2011506983

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10758182

Country of ref document: EP

Kind code of ref document: A1