JP2010197411A

JP2010197411A - Language model update device for voice recognition device, and voice recognition device

Info

Publication number: JP2010197411A
Application number: JP2009038611A
Authority: JP
Inventors: Koji Okabe; 浩司岡部; Ryosuke Isotani; 亮輔磯谷; Toru Iwazawa; 透岩沢; Takeshi Hanazawa; 健花沢; Seiya Osada; 誠也長田; Takenori Tsujikawa; 剛範辻川; Fumihiro Adachi; 史博安達; Takayuki Arakawa; 隆行荒川
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-02-20
Filing date: 2009-02-20
Publication date: 2010-09-09

Abstract

PROBLEM TO BE SOLVED: To provide a language model update device for voice recognition devices that can improve the recognition rate of a specified word for which improvement is desired, while suppressing degradation of the recognition rate of similar words.

SOLUTION: The language model update device 100 for voice recognition devices comprises: a keyword storage means 101 for storing a keyword; an influence degree calculation means 105 for calculating the degree of influence on a similar word caused by improving the recognition rate of the keyword; a language score adjusting means 104 for adjusting, based on the influence degree, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word; and a language score update means 106 for giving at least one of the adjusted bonus to the language score of the keyword and the adjusted penalty to the language score of the similar word. The language score of a language model 107 is thereby updated.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a language model update device for a speech recognition device, a speech recognition device, a language model update method for a speech recognition device, a speech recognition method, a computer program, and a recording medium.

Speech recognition devices that recognize a speaker's speech have been developed, and some have been put into practical use. Prior art documents relating to such speech recognition devices include, for example, Patent Documents 1 and 2.

The speech recognition device described in Patent Document 1 recognizes input speech using a language model. It searches the language model for the path with the maximum probability value for the word string of the input speech and outputs that word string as the recognition result. The device also has a word registration function that allows a user to register words that are not yet registered in the recognition dictionary.

In the speech recognition device described in Patent Document 2, a word usage-frequency score is added, at a prescribed ratio, to an acoustic score that indicates how acoustically close the speech signal is to an entry in the word dictionary. A distance value is calculated in this way, and the candidate with the smallest distance value is output as the result.

Among these conventional speech recognition devices, those that have a word registration function for registering words not yet in the recognition dictionary, such as the device described in Patent Document 1, have the following problem. The word registration function in such a conventional device simply assigns a notation and a reading to the word, so the language score after registration cannot be set appropriately. As a result, a word other than the registered word added by the user may be misrecognized as the registered word. The cause of this problem is that a conventional speech recognition device can improve the recognition rate of a specific word but cannot prevent an excessive decrease in the recognition rate of similar words.

JP 2007-226091 A
Japanese Patent Laid-Open No. 10-097285

Accordingly, an object of the present invention is to provide a language model update device for a speech recognition device, a speech recognition device, and the like that, in a speech recognition device having a word registration function, can both improve the recognition rate of a specific word whose recognition rate is to be increased (also referred to as a "keyword" in the present invention) and suppress a decrease in the recognition rate of similar words.

The language model update device for a speech recognition device of the present invention includes a keyword storage means, an influence degree calculation means, a language score adjustment means, and a language score update means, wherein
the keyword storage means stores a keyword,
the influence degree calculation means calculates the degree of influence on a similar word caused by increasing the recognition rate of the keyword,
the language score adjustment means adjusts, based on the degree of influence, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word, and
the language score update means updates the language score of a language model by performing at least one of giving the bonus adjusted by the language score adjustment means to the language score of the keyword and giving the penalty adjusted by the language score adjustment means to the language score of the similar word.

The speech recognition device of the present invention includes
the language model update device for a speech recognition device of the present invention,
a speech input means for inputting speech,
a feature quantity extraction means for extracting a feature quantity of the input speech,
a language model, and
a speech recognition means, wherein
the language model update device updates the language model, and
the speech recognition means recognizes the speech from the feature quantity extracted by the feature quantity extraction means, using the updated language model.

The method of the present invention for updating a language model for a speech recognition device uses a keyword storage means, an influence degree calculation means, a language score adjustment means, and a language score update means, and includes
a keyword storage step in which the keyword storage means stores a keyword,
an influence degree calculation step in which the influence degree calculation means calculates the degree of influence on a similar word caused by increasing the recognition rate of the keyword,
a language score adjustment step in which the language score adjustment means adjusts, based on the degree of influence, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word, and
a language score update step in which the language score update means updates the language score of a language model by performing at least one of giving the bonus adjusted by the language score adjustment means to the language score of the keyword and giving the penalty adjusted by the language score adjustment means to the language score of the similar word.

The speech recognition method of the present invention uses the speech recognition device of the present invention and includes
a step of updating the language model by the method of the present invention for updating a language model for a speech recognition device,
a feature quantity extraction step in which the feature quantity extraction means extracts a feature quantity of the speech input through the speech input means, and
a speech recognition step in which the speech recognition means recognizes the speech from the feature quantity extracted by the feature quantity extraction means, using the language model.

The computer program of the present invention is a computer program that can execute, on a computer, the language model update method for a speech recognition device of the present invention or the speech recognition method of the present invention.

The storage medium of the present invention is a recording medium that stores the computer program of the present invention.

According to the language model update device for a speech recognition device, the speech recognition device, and the like of the present invention, a speech recognition device can both improve the recognition rate of a specific word whose recognition rate is to be increased and suppress a decrease in the recognition rate of similar words.

FIG. 1 is a block diagram showing the configuration of Embodiment 1 of the present invention.
FIG. 2 is a flowchart of the language model update process of Embodiment 1 of the present invention.
FIG. 3 is a block diagram showing the configuration of Embodiment 2 of the present invention.
FIG. 4 is a flowchart of the speech recognition process of Embodiment 2 of the present invention.
FIG. 5 is a block diagram showing the configuration of Embodiment 3 of the present invention.
FIG. 6 is a flowchart of the language model update process of Embodiment 3 of the present invention.
FIG. 7A is a block diagram showing the configuration of Embodiment 4 of the present invention.
FIG. 7B is a diagram illustrating the relationship between the similarity of a similar word and the penalty value.
FIG. 8 is a block diagram showing the configuration of Embodiment 6 of the present invention.
FIG. 9A is a block diagram showing the configuration of Embodiment 7 of the present invention.
FIG. 9B is a diagram illustrating the relationship between the degree of influence input by the user and the degree of penalty reduction.
FIG. 10 is a flowchart of the language model update process of Embodiment 7 of the present invention.
FIG. 11 is a block diagram showing the configuration of Embodiment 8 of the present invention.
FIG. 12 is a diagram illustrating the structure of the language model used in Embodiment 8 of the present invention.

In the present invention, as described above, a "keyword" means a word whose recognition rate in the speech recognition device is to be increased, and includes, for example, a word newly added to the recognition dictionary used for the language model. A "similar word" means a word that is acoustically similar to the keyword. A "language score" means information that quantifies the linguistic likelihood of each word in a word string, such as the likelihood of a connection between words or the likelihood of a word's position in the word string. Such a language score includes, for example, information such as the connection probability value recorded for each word in the language model. In the present invention, a "bonus" means an amount added to the language score of the keyword in order to increase the rate at which the speech recognition device recognizes the keyword. A "penalty" means an amount deducted from the language score of the similar word in order to reduce the rate of substitution errors and insertion errors involving the similar word in the speech recognition device.

As described above, the language model update device for a speech recognition device of the present invention includes a keyword storage means, an influence degree calculation means, a language score adjustment means, and a language score update means, wherein
the keyword storage means stores a keyword,
the influence degree calculation means calculates the degree of influence on a similar word caused by increasing the recognition rate of the keyword,
the language score adjustment means adjusts, based on the degree of influence, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word, and
the language score update means updates the language score of a language model by performing at least one of giving the bonus adjusted by the language score adjustment means to the language score of the keyword and giving the penalty adjusted by the language score adjustment means to the language score of the similar word.

With the language model update device for a speech recognition device of the present invention, the keyword is recognized preferentially by the speech recognition device while the recognition rate of words similar to the keyword is kept from decreasing excessively. Specifically, the language score update means can give the bonus or the penalty to the language score of at least one of the keyword and the similar word in the language model for the speech recognition device. The language score update means can thereby create a difference between the language scores of the keyword and the similar word so that, for example, the speech recognition means recognizes the keyword preferentially. If a uniform penalty were simply given to the language scores of the similar words in consideration of only their similarity to the keyword, the recognition rate of the keyword would improve. For some similar words, however, the recognition rate in the speech recognition device could then decrease excessively. As a specific example, suppose that the word "wagashi" (和菓子, Japanese confectionery) is added to the dictionary as the keyword and a penalty is given to its similar words. A large penalty then falls on "watashi" (私, "I"), which differs acoustically by only one phoneme. The recognition rate of "wagashi" improves, but at the cost of a worse recognition rate for the frequently occurring word "watashi". Similar words thus include words that need to be used frequently in conversation. If the recognition rate of such highly important similar words decreases excessively, the overall recognition rate deteriorates. In the language model update device for a speech recognition device of the present invention, however, at least one of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word can be adjusted based on the degree of influence. This makes it possible both to improve the recognition rate of the keyword and to suppress an excessive decrease in the recognition rate of the similar words. That is, according to the present invention, the recognition rate of the keyword is increased and the similar words are also recognized appropriately. How the keyword and the similar words are identified in the present invention is described later.

In the language model update device for a speech recognition device of the present invention, the keyword storage means stores, for example, a keyword specified by a user. The keyword storage means can, for example, extract the keyword from a recognition dictionary used in the speech recognition device and store it, but is not limited to this. For example, when a word corresponding to the keyword is not registered in the recognition dictionary, the keyword storage means is preferably able to newly store the word in the recognition dictionary. The keyword storage means is also preferably able to store the word corresponding to the keyword. With such a keyword storage means, even a word that is not in the recognition dictionary can be recognized as a keyword by the speech recognition device. By using the keyword storage means, the language model update device can, for example, identify the keyword from a word that the user has input to the keyword storage means, or from a document containing the word, and perform the language model update process.

In the language model update device for a speech recognition device of the present invention, the similar words can be identified using, for example, a similarity measurement means. The similarity measurement means measures the degree of similarity between the keyword and a candidate similar word. As such a means, for example, one that refers to the recognition dictionary and measures the acoustic similarity between the keyword and each word in the recognition dictionary can be used. As a measure of acoustic similarity, for example, the phoneme edit distance or an inter-phoneme distance calculated from Gaussian mixture models (GMMs) of the phonemes can be used. Based on the similarity measured by the similarity measurement means, the language model update device can, for example, extract words with high similarity from the recognition dictionary as similar words. The similar words can be extracted from the recognition dictionary using, for example, a similar word extraction means.
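The phoneme edit distance mentioned above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the toy dictionary, its phoneme transcriptions, and the `max_distance` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if pa == pb else 1)
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def extract_similar_words(
    keyword_phonemes: Sequence[str],
    dictionary: Dict[str, List[str]],
    max_distance: int,
) -> List[Tuple[str, int]]:
    """Return (word, distance) pairs within max_distance, closest first."""
    scored = [
        (word, phoneme_edit_distance(keyword_phonemes, phonemes))
        for word, phonemes in dictionary.items()
    ]
    similar = [(word, dist) for word, dist in scored if dist <= max_distance]
    return sorted(similar, key=lambda item: (item[1], item[0]))


# Hypothetical recognition dictionary: word -> phoneme sequence.
RECOGNITION_DICTIONARY = {
    "watashi": ["w", "a", "t", "a", "sh", "i"],
    "takashi": ["t", "a", "k", "a", "sh", "i"],
    "sakura": ["s", "a", "k", "u", "r", "a"],
}
WAGASHI = ["w", "a", "g", "a", "sh", "i"]  # the keyword to be added

similar_words = extract_similar_words(WAGASHI, RECOGNITION_DICTIONARY, max_distance=2)
```

A smaller distance corresponds to a higher acoustic similarity, so thresholding the distance plays the role of extracting words with high similarity. A GMM-based inter-phoneme distance could replace the unit substitution cost, but that is not shown here.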

In the language model update device for a speech recognition device of the present invention, the influence degree calculation means calculates the degree of influence that the similar word receives when the recognition rate of the keyword in the speech recognition device is increased. For example, the influence degree calculation means calculates the degree of influence that the similar word receives when a language score is newly given to the keyword in the language model. The degree of influence means, for example, the extent to which a similar word is affected by newly giving a language score to the keyword in the language model. For example, when the similar word is used frequently in conversation, increasing the recognition rate of the keyword affects it more than it affects a similar word that is used infrequently in conversation. The influence degree calculation means can calculate the degree of influence based on various kinds of information about the similar word. In the language model update device of the present invention, for example, information about the similar words may be stored in a separate storage means, and the influence degree calculation means can be configured to calculate the degree of influence using that information. The information is, for example, the importance of the similar word or how easily it is recognized; more specifically, it is, for example, the appearance frequency of the similar word, the recognition rate of the similar word, or judgment information provided by the user. For example, when the information is the appearance frequency of the similar word, the influence degree calculation means can increase the degree of influence as the appearance frequency increases. When the information is the recognition rate of the similar word, the influence degree calculation means can increase the degree of influence as the recognition rate decreases.
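One way to picture the two monotonic relationships described above (higher appearance frequency gives higher influence, lower recognition rate gives higher influence) is the sketch below. The text does not specify a formula, so the weighted sum, the `freq_weight` parameter, and the normalisation of frequencies by the most frequent word are assumptions chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def relative_frequencies(recognized_words: Iterable[str]) -> Dict[str, float]:
    """Appearance frequency of each word in a recognition history,
    scaled so that the most frequent word has frequency 1.0."""
    counts = Counter(recognized_words)
    if not counts:
        return {}
    top = max(counts.values())
    return {word: n / top for word, n in counts.items()}


def influence_degree(
    appearance_freq: float,
    recognition_rate: float,
    freq_weight: float = 0.5,
) -> float:
    """Hypothetical influence degree in [0, 1].

    Grows with the similar word's appearance frequency and grows as its
    recognition rate falls, matching the tendencies stated in the text.
    """
    for name, value in (
        ("appearance_freq", appearance_freq),
        ("recognition_rate", recognition_rate),
        ("freq_weight", freq_weight),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return freq_weight * appearance_freq + (1.0 - freq_weight) * (1.0 - recognition_rate)


history = ["watashi", "watashi", "takashi"]  # hypothetical past recognition results
frequencies = relative_frequencies(history)
watashi_influence = influence_degree(frequencies["watashi"], recognition_rate=0.9)
takashi_influence = influence_degree(frequencies["takashi"], recognition_rate=0.9)
```

With equal recognition rates, the more frequent word "watashi" receives the larger influence degree, so it would later be protected more strongly from a penalty.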

In the language model update device for a speech recognition device of the present invention, the language score adjustment means adjusts, based on the degree of influence, either or both of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word. It is sufficient to adjust either the bonus or the penalty, but adjusting both is preferable in order to increase the overall recognition rate. For example, the language score adjustment means can decrease the size of the bonus or the penalty as the degree of influence increases. In this way, the characteristics of each similar word are reflected in how strongly its recognition rate is suppressed relative to the keyword, the recognition rate can be suppressed appropriately for each similar word, and the overall recognition rate can be improved.
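The adjustment rule above (a larger degree of influence gives a smaller bonus or penalty) could look like the following sketch. The linear scaling and the assumption that the influence degree lies in [0, 1] are illustrative choices, not something the text prescribes.

```python
def adjust_amount(base_amount: float, influence: float) -> float:
    """Shrink a base bonus or penalty linearly as the influence degree grows.

    influence is clamped to [0, 1]; 0 leaves the amount unchanged and
    1 removes it entirely. The linear form is an assumption.
    """
    clamped = min(max(influence, 0.0), 1.0)
    return base_amount * (1.0 - clamped)


BASE_PENALTY = 2.0  # hypothetical penalty before adjustment
penalty_for_frequent_word = adjust_amount(BASE_PENALTY, influence=0.75)
penalty_for_rare_word = adjust_amount(BASE_PENALTY, influence=0.0)
```

Here a frequently used similar word with influence 0.75 keeps only a quarter of the base penalty, while a similar word with no measurable influence receives the full penalty.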

In the language model update device for a speech recognition device of the present invention, the language score update means performs either or both of giving the adjusted bonus to the language score of the keyword and giving the adjusted penalty to the language score of the similar word. It is sufficient for the language score update means to be able to perform either the giving of the bonus or the giving of the penalty, but performing both is preferable in order to increase the overall recognition rate. The overall recognition rate can thereby be improved, as described above. When the language score update means both gives the adjusted bonus to the language score of the keyword and gives the adjusted penalty to the language score of the similar word, the change in each language score is smaller than when only one of the two is performed, and deterioration of the overall recognition rate can be suppressed. The method by which the language score update means gives either or both of the bonus and the penalty to the language scores in the language model is not particularly limited. Any method may be used as long as the speech recognition means that uses the language model can recognize the language score of the keyword with the bonus applied and the language score of the similar word with the penalty applied. For example, the language model update device of the present invention may change the language model by applying the bonus or the penalty to the language scores stored in the language model. Alternatively, it may leave the language model unchanged and give the language scores in a way that lets the speech recognition means recognize the scores with the bonus or the penalty applied.
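The two update styles described above (rewriting the stored scores, or leaving the model untouched and exposing adjusted scores to the recognizer) can be sketched as follows. The log-domain scores, the word-level score table, and the `ScoreOverlay` class are hypothetical simplifications introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


def update_in_place(
    scores: Dict[str, float],
    keyword: str,
    bonus: float,
    penalties: Dict[str, float],
) -> None:
    """Style 1: rewrite the language scores stored in the model itself."""
    scores[keyword] += bonus
    for word, penalty in penalties.items():
        scores[word] -= penalty


@dataclass
class ScoreOverlay:
    """Style 2: keep the model unchanged and apply adjustments on lookup."""

    base: Dict[str, float]
    delta: Dict[str, float] = field(default_factory=dict)

    def add_bonus(self, word: str, bonus: float) -> None:
        self.delta[word] = self.delta.get(word, 0.0) + bonus

    def add_penalty(self, word: str, penalty: float) -> None:
        self.delta[word] = self.delta.get(word, 0.0) - penalty

    def score(self, word: str) -> float:
        return self.base[word] + self.delta.get(word, 0.0)


# Hypothetical log-domain language scores (higher means more likely).
model_scores = {"wagashi": -8.0, "watashi": -2.0}

overlay = ScoreOverlay(base=model_scores)
overlay.add_bonus("wagashi", 3.0)
overlay.add_penalty("watashi", 0.5)
overlay_view = (overlay.score("wagashi"), overlay.score("watashi"))
scores_before_rewrite = dict(model_scores)  # the overlay left the model untouched

update_in_place(model_scores, "wagashi", bonus=3.0, penalties={"watashi": 0.5})
```

Both styles present the same adjusted scores to a recognizer. The overlay keeps the original model recoverable, whereas the in-place update avoids an extra lookup at recognition time.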

In the language model update device for a speech recognition device of the present invention, as described above, the influence degree calculation means can calculate the degree of influence based on, for example, at least one of the appearance frequency of the similar word, the recognition rate of the similar word, and judgment information provided by the user. The appearance frequency of the similar word can be calculated, for example, from the past recognition history of the speech recognition device. As the recognition rate, for example, the actual recognition rate of each similar word in the speech recognition device can be used. A recognition rate calculated by similarity rate prediction, in which pseudo speech data is created by speech synthesis or feature quantity synthesis and then subjected to speech recognition, can also be used. The judgment information provided by the user is, for example, information indicating the user's judged value of the degree of influence. The similarity calculation means can, for example, increase the degree of influence as the appearance frequency of the similar word increases. The similarity calculation means can also increase the degree of influence as the recognition rate of the similar word decreases.

The language model update device for a speech recognition device of the present invention can be applied to a language model used in a speech recognition device. The language model to which it can be applied is not particularly limited as long as it is used in a speech recognition device. The language model is data that expresses a statistical criterion for calculating the linguistic likelihood of a word string, and includes, for example, various kinds of information about words. Examples of such information include data that describes, from the beginning of a sentence to its end, the probability that a word in the sentence is followed by a given next word and the probability that a word appears at the beginning of a sentence. Such a language model may be, for example, an N-gram language model, or a model modified or extended from an N-gram model. In the present invention, the language model is preferably one whose contents can be modified as necessary, for example one to which a language score can be newly added and in which the language score of a keyword can be updated.
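As a concrete picture of the kind of information such a model holds, the sketch below estimates bigram (N = 2) connection probabilities, including sentence-start and sentence-end transitions, from a tiny corpus. The corpus, the `<s>` and `</s>` markers, and the unsmoothed maximum-likelihood estimate are assumptions made for illustration; a practical model would normally add smoothing.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

BOS, EOS = "<s>", "</s>"  # sentence-start and sentence-end markers

Bigram = Tuple[str, str]


def train_bigram(sentences: Sequence[Sequence[str]]) -> Dict[Bigram, float]:
    """Maximum-likelihood estimate of P(next | previous), without smoothing."""
    pair_counts: Counter = Counter()
    context_counts: Counter = Counter()
    for words in sentences:
        tokens = [BOS, *words, EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            pair_counts[(prev, nxt)] += 1
            context_counts[prev] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def sentence_log_prob(model: Dict[Bigram, float], words: Sequence[str]) -> float:
    """Sum of log connection probabilities; -inf if any bigram is unseen."""
    tokens = [BOS, *words, EOS]
    total = 0.0
    for pair in zip(tokens, tokens[1:]):
        probability = model.get(pair, 0.0)
        if probability == 0.0:
            return float("-inf")
        total += math.log(probability)
    return total


# Hypothetical training corpus of word strings.
corpus: List[List[str]] = [
    ["watashi", "wa", "taberu"],
    ["watashi", "wa", "nomu"],
    ["wagashi", "wo", "taberu"],
]
bigram_model = train_bigram(corpus)
seen_sentence_score = sentence_log_prob(bigram_model, ["watashi", "wa", "taberu"])
unseen_sentence_score = sentence_log_prob(bigram_model, ["watashi", "wo", "nomu"])
```

In this toy model, the log probabilities play the role of language scores: a bonus or penalty would be added to or subtracted from the entries involving the keyword or its similar words.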

本発明の音声認識装置用言語モデル更新装置は、音声認識装置において使用する。本発明の音声認識装置用言語モデル更新装置は、例えば、前記本発明の音声認識装置の一部として、前記音声認識装置に含まれていてもよいし、または、別の装置として構成されてもよい。 The language model update device for a speech recognition device of the present invention is used in a speech recognition device. The language model update device may, for example, be included in the speech recognition device of the present invention as a part of that device, or may be configured as a separate device.

前記本発明の音声認識装置において、前記音声入力手段は、話者の音声を取得できる手段である。前記音声入力手段において、例えば、ユーザは、音声を入力することができる。このような音声入力手段は、例えば、マイクロフォンである。 In the voice recognition apparatus of the present invention, the voice input means is means capable of acquiring a voice of a speaker. In the voice input means, for example, the user can input voice. Such voice input means is, for example, a microphone.

前記本発明の音声認識装置において、前記特徴量抽出手段は、例えば、前記音声入力手段において入力された音声の特徴を示す情報（本発明において、特徴量という）を抽出する。前記特徴量は、前記音声の特徴を示す情報である。このような特徴量は、後述する音声認識手段において候補としての単語列を検出できる情報であれば特に制限されない。前記特徴量としては、例えば、ケプストラム、スペクトルピッチ、パワー等、前記音声の音声波形に基づき抽出された情報を挙げることができる。前記特徴量は、例えば、スペクトル分析等の音響分析を用いて抽出できる。 In the speech recognition apparatus of the present invention, the feature amount extraction unit extracts, for example, information (referred to as a feature amount in the present invention) indicating the feature of the speech input by the speech input unit. The feature amount is information indicating the feature of the voice. Such a feature amount is not particularly limited as long as it is information that can detect a word string as a candidate in a speech recognition unit to be described later. Examples of the feature amount include information extracted based on the speech waveform of the speech, such as a cepstrum, a spectrum pitch, and power. The feature amount can be extracted using acoustic analysis such as spectrum analysis.

前記本発明の音声認識装置において、前記音声認識手段は、前記特徴量抽出手段により抽出された前記特徴量から、前記言語モデルを用いて前記音声を単語列として認識する。前記音声認識手段は、例えば、前記言語モデルにおける前記言語スコアから単語列の確からしさを算出し、例えば、前記確からしさが最も高い単語列を前記音声に対応する候補の単語列として検出し、出力できる。前記確からしさは、例えば、前記言語スコアを合算することにより算出することができる。前記音声認識手段は、また、前記言語モデルに音響モデルを併用して音声認識を行うこともできる。前記音響モデルとしては、音声認識装置に通常使用される音響モデル、例えば隠れマルコフモデルを使用できる。前記音声認識手段は、必要に応じて音声認識の精度を高めるための他のモデルを使用できる。このような他のモデルには、例えば、文法モデルが挙げられる。 In the speech recognition device of the present invention, the speech recognition means recognizes the speech as a word string from the feature amount extracted by the feature amount extraction means, using the language model. The speech recognition means can, for example, calculate the likelihood of a word string from the language scores in the language model, detect the word string with the highest likelihood as the candidate word string corresponding to the speech, and output it. The likelihood can be calculated, for example, by summing the language scores. The speech recognition means can also perform speech recognition using an acoustic model in combination with the language model. As the acoustic model, an acoustic model normally used in speech recognition devices, for example a hidden Markov model, can be used. The speech recognition means can use other models for improving the accuracy of speech recognition as required. An example of such another model is a grammar model.
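
The selection of the most likely word string by summing language scores might look like the following minimal sketch. The function name `best_word_string`, the per-word score table, and the `unknown_score` fallback are assumptions for this example; acoustic scores, which a practical recognizer would combine with the language scores, are deliberately left out.

```python
def best_word_string(candidates, language_scores, unknown_score=-10.0):
    """Return the candidate word string whose summed language score is highest."""

    def likelihood(words):
        # The likelihood of a word string is the sum of its language scores.
        return sum(language_scores.get(word, unknown_score) for word in words)

    return max(candidates, key=likelihood)
```

For instance, with scores `{"call": -1.0, "tanaka": -3.0, "tanada": -2.0}`, the candidate `("call", "tanada")` is selected over `("call", "tanaka")`, which is why a bonus on a keyword or a penalty on a similar word can change the recognition result.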

前記本発明の音声認識装置によれば、前記キーワードを優先的に前記音声認識手段において認識できるようにすることができるだけでなく、つぎのことが可能である。すなわち、本発明の音声認識装置では、前記言語スコア調整手段において、前記影響度に基づき、前記キーワードの言語スコアに与える前記ボーナスおよび前記類似語の言語スコアに与える前記ペナルティの双方またはいずれかを調整できる。そのため、前記キーワードの認識率の向上と前記類似語の認識率の過剰低下の抑制の両方を実現できる。 According to the speech recognition device of the present invention, not only can the keyword be recognized preferentially by the speech recognition means, but the following is also possible. That is, in the speech recognition device of the present invention, the language score adjustment means can adjust either or both of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word, based on the influence degree. Therefore, both an improvement in the recognition rate of the keyword and suppression of an excessive decrease in the recognition rate of the similar word can be achieved.

次に、本発明の音声認識装置用言語モデルの更新方法について説明する。
本発明の音声認識装置用言語モデルの更新方法は、前述のとおり、キーワード記憶手段と、影響度算出手段と、言語スコア調整手段と、言語スコア更新手段とを用い、
前記キーワード記憶手段が、キーワードを記憶するキーワード記憶ステップ（ａ）と、
前記影響度算出手段が、前記キーワードの認識率を高めることによる類似語に対する影響度を算出する影響度算出ステップ（ｂ）と、
前記言語スコア調整手段が、前記影響度に基づき、前記キーワードの言語スコアに与えるボーナスおよび前記類似語の言語スコアに与えるペナルティの少なくとも一方を調整する言語スコア調整ステップ（ｃ）と、
前記言語スコア更新手段が、前記キーワードの言語スコアに対する前記言語スコア調整手段において調整した前記ボーナスの付与および前記類似語の言語スコアに対する前記言語スコア調整手段において調整したペナルティの付与の少なくとも一方を行うことにより、言語モデルの言語スコアを更新する言語スコア更新ステップ（ｄ）を含むことを特徴とする。 Next, a method for updating a language model for a speech recognition apparatus according to the present invention will be described.
As described above, the language model update method for a speech recognition apparatus of the present invention uses a keyword storage unit, an influence degree calculation unit, a language score adjustment unit, and a language score update unit.
A keyword storage step (a) in which the keyword storage means stores a keyword;
An influence degree calculating step (b) in which the influence degree calculating means calculates an influence degree on a similar word by increasing the recognition rate of the keyword;
A language score adjusting step (c) in which the language score adjusting means adjusts at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word based on the degree of influence;
The language score update means performs at least one of the provision of the bonus adjusted by the language score adjustment means for the language score of the keyword and the provision of the penalty adjusted by the language score adjustment means for the language score of the similar word. Thus, a language score update step (d) for updating the language score of the language model is included.

前記本発明の音声認識装置用言語モデル更新方法によれば、言語モデルにおいて、前記キーワードが、音声認識装置により優先的に認識され、かつ、前記キーワードの類似語の認識率を過剰に低下させることのないように前記言語モデルを更新することができる。すなわち、本発明の音声認識装置用言語モデル更新方法では、前記（ｃ）の言語スコア調整ステップにより、前記キーワードの言語スコアに与える前記ボーナスおよび前記類似語の言語スコアに与える前記ペナルティの双方またはいずれかを、前記影響度算出手段により算出した前記影響度に応じて調整できる。そのため、本発明の音声認識装置用言語モデル更新方法によれば、前記キーワードの認識率の向上と前記類似語における認識率の過剰低下の抑制の両方を実現できる。 According to the language model update method for a speech recognition device of the present invention, the keyword is recognized by the speech recognition device with priority in the language model, and the recognition rate of similar words of the keyword is excessively reduced. The language model can be updated so that there is no problem. That is, in the language model update method for a speech recognition apparatus of the present invention, both or either of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word by the language score adjustment step of (c). This can be adjusted according to the influence degree calculated by the influence degree calculating means. Therefore, according to the language model update method for a speech recognition apparatus of the present invention, it is possible to realize both the improvement of the keyword recognition rate and the suppression of the excessive decrease in the recognition rate of the similar words.

前記本発明の音声認識装置用言語モデル更新方法は、例えば、前記本発明の音声認識装置用言語モデル更新装置を用いて実施することができる。すなわち、前記本発明の音声認識装置用言語モデル更新方法は、例えば、前記本発明の音声認識装置用言語モデル更新装置における前記言語スコア更新手段と、前記影響度算出手段と、前記言語スコア調整手段と、前記言語スコア更新手段を用いて実施することができる。これら手段の構成および機能については、前記本発明の音声認識装置用言語モデル更新装置についての説明で記載したとおりである。 The language model update method for a speech recognition device according to the present invention can be implemented using, for example, the language model update device for a speech recognition device according to the present invention. That is, the method can be implemented using, for example, the language score update means, the influence degree calculation means, the language score adjustment means, and the language score update means of that language model update device. The configurations and functions of these means are as described above for the language model update device for a speech recognition device of the present invention.

前記本発明の音声認識装置用言語モデル更新方法において、前記ステップ（ｃ）において、前記言語スコア調整手段は、前記影響度増加に従い、前記ボーナスまたは前記ペナルティの大きさを減少させることが好ましい。これにより、前記キーワードに対する前記類似語の認識率の抑制の程度に、各類似語の特性を反映でき、類似語ごとに適切に認識率を抑制でき、全体としての認識率を向上させることができる。 In the language model update method for a speech recognition device according to the present invention, in the step (c), the language score adjustment means preferably decreases the magnitude of the bonus or the penalty as the influence degree increases. In this way, the characteristics of each similar word are reflected in how strongly its recognition rate is suppressed relative to the keyword, the recognition rate can be suppressed appropriately for each similar word, and the overall recognition rate can be improved.
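
The preferred behavior of step (c) can be illustrated by the following sketch. The linear scaling by `1.0 - influence` and the function name `adjust_bonus_and_penalty` are assumptions for this example; any rule under which the magnitudes shrink as the influence degree grows would fit the description equally well.

```python
def adjust_bonus_and_penalty(base_bonus, base_penalty, influence):
    """Shrink the bonus and the penalty as the influence degree increases.

    influence is expected in [0, 1]; values outside that range are clamped.
    """
    influence = min(max(influence, 0.0), 1.0)
    # Assumption: a simple linear scale. Influence 0 keeps the full amounts,
    # influence 1 removes both the bonus and the penalty.
    scale = 1.0 - influence
    return base_bonus * scale, base_penalty * scale
```

With a base bonus of 2.0 and a base penalty of 1.0, an influence degree of 0.25 yields a bonus of 1.5 and a penalty of 0.75, so a similar word with a high influence degree is suppressed less strongly.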

前記本発明の音声認識装置用言語モデル更新方法では、前記（ｄ）の言語スコア更新ステップにおいて、前記言語スコア更新手段は、前記キーワードの言語スコアに対する前記調整したボーナスの付与および前記類似語の言語スコアに対する前記調整したペナルティの付与の双方またはいずれかを行う。これにより、前述のとおり全体としての認識率を向上させることができる。前記言語スコア更新手段が、前記キーワードの言語スコアへの前記調整したボーナスの付与と前記類似語の言語スコアへの前記調整したペナルティの付与の両方を行うことにより、いずれか一方の処理のみを行う場合よりも、言語スコアの変化が小さくなり、全体としての認識率の悪化を抑制できる。前記言語スコア更新手段において、前記言語スコアに前記ボーナスおよび前記ペナルティの双方またはいずれか一方を付与し、前記言語モデルを更新する方法は、特に制限されない。すなわち、このような方法は、前記言語モデルを使用する音声認識装置が、キーワード記憶前の言語スコアに前記ボーナスまたは前記ペナルティを付与した後の言語スコアを認識できるような方法であればどのような方法でもよい。例えば、本発明の音声認識装置用言語モデル更新装置は、言語モデルに記録されている言語スコアに前記ボーナスまたは前記ペナルティを付与してもよいし、言語モデルは変更せずに音声認識装置が前記ボーナスまたは前記ペナルティを付与した後の言語スコアを認識できる他の方法で付与してもよい。 In the language model update method for a speech recognition device according to the present invention, in the language score update step (d), the language score update means performs either or both of giving the adjusted bonus to the language score of the keyword and giving the adjusted penalty to the language score of the similar word. As described above, this can improve the overall recognition rate. When the language score update means performs both the giving of the adjusted bonus to the language score of the keyword and the giving of the adjusted penalty to the language score of the similar word, the change in the language score is smaller than when only one of the two is performed, and deterioration of the overall recognition rate can be suppressed. The method by which the language score update means gives either or both of the bonus and the penalty to the language score and updates the language model is not particularly limited. That is, any method may be used as long as the speech recognition device using the language model can recognize the language score obtained by giving the bonus or the penalty to the language score held before the keyword was stored. For example, the language model update device for a speech recognition device according to the present invention may give the bonus or the penalty to the language score recorded in the language model, or may leave the language model unchanged and give the bonus or the penalty by another method that allows the speech recognition device to recognize the language score after the bonus or the penalty has been given.
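
The two update styles mentioned above, rewriting the recorded language scores or leaving the language model unchanged and supplying the adjusted scores by other means, might be sketched as follows. The names `update_in_place` and `ScoreOverlay`, and the convention that a positive adjustment is a bonus and a negative adjustment is a penalty, are assumptions for illustration.

```python
def update_in_place(model_scores, adjustments):
    """Style 1: rewrite the language scores recorded in the language model."""
    for word, delta in adjustments.items():
        model_scores[word] = model_scores[word] + delta


class ScoreOverlay:
    """Style 2: keep the language model untouched and adjust scores at lookup time."""

    def __init__(self, model_scores, adjustments):
        self._model_scores = model_scores
        self._adjustments = dict(adjustments)

    def score(self, word):
        # The recognizer sees the adjusted score; the stored score is unchanged.
        return self._model_scores[word] + self._adjustments.get(word, 0.0)
```

Both styles present the same adjusted language score to the recognizer. The overlay style makes it easy to discard a keyword later, since the original scores are never overwritten.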

前記本発明の音声認識装置用言語モデル更新方法では、前記（ｂ）のステップで、例えば、前記影響度算出手段において、前記類似語の出現頻度、前記類似語の認識率およびユーザによる判断情報の少なくとも一つに基づき前記影響度を算出できる。前記類似語の出現頻度、前記類似語の認識率および前記ユーザによる判断情報については、前記本発明の音声認識装置用言語モデル更新装置について説明したとおりである。 In the language model update method for a speech recognition apparatus according to the present invention, in the step (b), for example, in the influence calculation means, the appearance frequency of the similar words, the recognition rate of the similar words, and the judgment information by the user The influence degree can be calculated based on at least one. The appearance frequency of the similar word, the recognition rate of the similar word, and the determination information by the user are as described for the language model update device for a speech recognition device of the present invention.

前記本発明の音声認識装置用言語モデル更新方法では、前記（ｂ）のステップで、前記影響度算出手段において、例えば、前記類似語の出現頻度増加に従い、前記影響度を増加させることができる。 In the language model update method for a speech recognition device according to the present invention, in the step (b), the influence degree calculation means can, for example, increase the influence degree as the appearance frequency of the similar word increases.

前記本発明の音声認識装置用言語モデル更新方法では、前記（ｂ）のステップで、前記影響度算出手段において、前記類似語の認識率減少に従い、前記影響度を増加させることができる。 In the language model updating method for a speech recognition apparatus according to the present invention, in the step (b), the influence calculation means can increase the influence according to the recognition rate decrease of the similar words.

前記本発明の音声認識装置用言語モデル更新方法を用いて、本発明の音声認識方法を実施することができる。
本発明の音声認識方法は、前述のとおり、前記本発明の音声認識装置を用い、
前記本発明の音声認識装置用言語モデルの更新方法により前記言語モデルを更新するステップと、
前記特徴量抽出手段が、前記音声入力手段において入力された音声の特徴量を抽出する特徴量抽出ステップ（ｅ）と、
前記音声認識手段が、前記特徴量抽出手段により抽出された前記特徴量から前記言語モデルを用いて前記音声を認識する音声認識ステップ（ｆ）とを含むことを特徴とする。 The speech recognition method of the present invention can be implemented using the language model update method for speech recognition apparatus of the present invention.
As described above, the speech recognition method of the present invention uses the speech recognition device of the present invention,
Updating the language model by the language model updating method for a speech recognition apparatus of the present invention;
A feature quantity extraction step (e) in which the feature quantity extraction means extracts a feature quantity of the voice input by the voice input means;
The voice recognition means includes a voice recognition step (f) for recognizing the voice using the language model from the feature quantity extracted by the feature quantity extraction means.

前記本発明の音声認識方法により、前記キーワードを優先的に認識できるだけでなく、つぎのことが可能である。すなわち、本発明の音声認識方法では、前記（ｃ）のステップで、前記キーワードの言語スコアに与える前記ボーナスおよび前記類似語の言語スコアに与える前記ペナルティの双方またはいずれかを、前記影響度に応じて調整できる。そのため、前記キーワードの認識率の向上と前記類似語の認識率の過剰低下の抑制を両立できる。 The speech recognition method of the present invention not only allows the keyword to be recognized preferentially, but also makes the following possible. That is, in the speech recognition method of the present invention, in the step (c), either or both of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word can be adjusted according to the influence degree. Therefore, it is possible to achieve both an improvement in the recognition rate of the keyword and suppression of an excessive decrease in the recognition rate of the similar word.

前記本発明の音声認識方法で使用する前記音声入力手段、特徴量抽出手段、音声認識手段および言語モデルとしては、前記本発明の音声認識装置における前記音声入力手段、特徴量抽出手段、音声認識手段および言語モデルをそれぞれ使用できる。このような前記音声入力手段、特徴量抽出手段、音声認識手段および言語モデルそれぞれの構成および機能は、前記本発明の音声認識装置についての説明で記載したとおりである。なお、前記本発明の音声認識方法では、音声認識の精度を高めるための他の手段をさらに用いてもよい。このような他の手段の構成および機能は、前記本発明の音声認識装置についての説明で記載したとおりである。 The speech input means, feature amount extraction means, speech recognition means, and language model used in the speech recognition method of the present invention include the speech input means, feature amount extraction means, speech recognition means in the speech recognition apparatus of the present invention. And language model can be used respectively. The configurations and functions of the voice input unit, the feature amount extraction unit, the voice recognition unit, and the language model are as described in the description of the voice recognition device of the present invention. The voice recognition method of the present invention may further use other means for improving the accuracy of voice recognition. The configuration and function of such other means are as described in the description of the speech recognition apparatus of the present invention.

なお、本発明の音声認識装置用言語モデル更新方法は、前記ステップ（ａ）から（ｄ）以外のステップをさらに含んでいてもよいし、含んでいなくてもよい。本発明の音声認識装置用言語モデル更新方法において、各ステップを行う順序も、本発明の前記効果が得られる限り、特に制限されない。また、本発明の音声認識方法は、前記ステップ（ａ）から（ｆ）以外のステップをさらに含んでいてもよいし、含んでいなくてもよい。本発明の音声認識方法において、各ステップを行う順序も、本発明の前記効果が得られる限り、特に制限されない。 The language model updating method for a speech recognition apparatus according to the present invention may further include steps other than the steps (a) to (d), or may not include the steps. In the language model updating method for a speech recognition apparatus of the present invention, the order in which the steps are performed is not particularly limited as long as the effects of the present invention are obtained. Moreover, the speech recognition method of the present invention may or may not include steps other than the steps (a) to (f). In the speech recognition method of the present invention, the order in which the steps are performed is not particularly limited as long as the effects of the present invention are obtained.

本発明のコンピュータプログラムは、前記本発明の音声認識装置用言語モデルの更新方法および前記本発明の音声認識方法の少なくとも一方をコンピュータ上で実行可能なコンピュータプログラムである。
本発明の記録媒体は、前記本発明のコンピュータプログラムを格納した記録媒体である。 The computer program of the present invention is a computer program capable of executing on a computer at least one of the language model update method for a speech recognition apparatus of the present invention and the speech recognition method of the present invention.
The recording medium of the present invention is a recording medium storing the computer program of the present invention.

以下、図面を参照しながら本発明のさらに具体的な実施形態について説明する。ただし、本発明は、以下の実施形態に限定されない。
［実施形態１］
図１のブロック図に、本実施形態の音声認識装置用言語モデル更新装置（以下、単に「言語モデル更新装置」という）の構成を示す。図示のように、本実施形態の言語モデル更新装置１００は、キーワード記憶手段１０１と、影響度算出手段１０５と、類似語抽出手段１０２と、言語スコア調整手段１０４と、言語スコア更新手段１０６とを含む。本実施形態の言語モデル更新装置１００は、音声認識装置において使用される言語モデル１０７および認識辞書１０８と連絡する。前記言語モデル１０７は、少なくとも認識辞書１０８に保存されている各単語の言語スコアを記録する。前記認識辞書１０８は、認識対象である単語の表記および読みの情報を含む。本実施形態の言語モデル更新装置１００は、キーワードを類似語よりも優先的に認識することができるように、前記言語モデル１０７において、前記キーワードの言語スコアにボーナスを付与でき、また、前記キーワードの類似語の言語スコアにペナルティを付与できる。すなわち、本実施形態の言語モデル更新装置１００は、言語モデル１０７を更新できる。本実施形態の言語モデル更新装置１００は、例えば、少なくともＣＰＵ（Central Processing Unit）と、ＲＯＭ（Read Only Memory）と、ＲＡＭ（Random Access Memory）とから構成し、前記各手段による処理の制御を、前記ＣＰＵが行うようにすることで実現できる。すなわち、例えば、前記ＣＰＵに前記各手段の各機能を提供するコンピュータプログラムを組み込むことで、前記言語モデル更新装置１００における前記各手段を構築でき、本発明の言語モデル更新装置１００を実現できる。また、本実施形態の言語モデル更新装置１００は、その動作を、前記各手段の機能を実現するコンピュータプログラムを組み込んだ、ＬＳＩ（Large Scale Integration）等のハードウェア部品からなる回路部品を実装して実現することもできる。なお、このようなコンピュータプログラムは、これを格納した記録媒体、例えば、ＨＤＤ、ＦＤ、ＣＤ−ＲＯＭ（ＣＤ−Ｒ、ＣＤ−ＲＷ）、ＭＯ、ＤＶＤ、メモリーカード等の形態で利用することもできる。図を参照して、本実施形態の言語モデル更新装置１００の動作を説明する。 Hereinafter, more specific embodiments of the present invention will be described with reference to the drawings. However, the present invention is not limited to the following embodiments.
[Embodiment 1]
The block diagram of FIG. 1 shows the configuration of the language model update device for a speech recognition device (hereinafter simply referred to as the “language model update device”) of the present embodiment. As shown in the figure, the language model update device 100 of the present embodiment includes a keyword storage unit 101, an influence degree calculation unit 105, a similar word extraction unit 102, a language score adjustment unit 104, and a language score update unit 106. The language model update device 100 communicates with a language model 107 and a recognition dictionary 108 used in the speech recognition device. The language model 107 records at least the language score of each word stored in the recognition dictionary 108. The recognition dictionary 108 contains the notation and reading of each word to be recognized. So that the keyword is recognized in preference to its similar words, the language model update device 100 can give a bonus to the language score of the keyword in the language model 107 and can give a penalty to the language score of a similar word of the keyword. That is, the language model update device 100 can update the language model 107. The language model update device 100 can be realized, for example, by a configuration including at least a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory), in which the CPU controls the processing performed by each unit. That is, for example, by installing on the CPU a computer program that provides the functions of each unit, the units of the language model update device 100 can be constructed and the language model update device 100 of the present invention can be realized. The operation of the language model update device 100 can also be realized by mounting circuit components made up of hardware such as an LSI (Large Scale Integration) that incorporates a computer program realizing the functions of each unit. Such a computer program can also be used in the form of a recording medium storing it, for example an HDD, FD, CD-ROM (CD-R, CD-RW), MO, DVD, or memory card. The operation of the language model update device 100 of the present embodiment is described below with reference to the drawings.

ユーザが、言語モデル１０７を使用する音声認識装置に特定の単語をキーワードとして認識させたい場合、本実施形態では、下記（１−１）から（１−５）の処理により、前記言語モデル１０７を更新することができる。 When the user wants the speech recognition device that uses the language model 107 to recognize a specific word as a keyword, in the present embodiment the language model 107 can be updated by the following processes (1-1) to (1-5).

（１−１）キーワード記憶処理
まず、ユーザが、前記キーワード記憶手段１０１に対し、認識させたい単語（以下、キーワードという）または前記キーワードを含む文章を入力する。この入力に基づき、前記キーワード記憶手段１０１が、前記認識辞書１０８から前記キーワードを抽出してこれを記憶する。本実施形態では、前記キーワード記憶手段１０１は、例えば、前記キーワードが前記認識辞書１０８に登録されていない場合、そのまま前記キーワードを記憶することができる。次いで、前記類似語抽出手段１０２が、下記（１−２）の類似語抽出処理を行う。 (1-1) Keyword Storage Processing First, a user inputs a word (hereinafter referred to as a keyword) to be recognized or a sentence including the keyword to the keyword storage unit 101. Based on this input, the keyword storage means 101 extracts the keyword from the recognition dictionary 108 and stores it. In the present embodiment, for example, when the keyword is not registered in the recognition dictionary 108, the keyword storage unit 101 can store the keyword as it is. Next, the similar word extraction means 102 performs the similar word extraction process (1-2) below.

（１−２）類似語抽出処理
前記類似語抽出手段１０２が、前記キーワードと一定以上の類似度を持った単語を前記キーワードの類似語として前記認識辞書１０８から抽出する。次いで、前記影響度算出手段１０５が、下記（１−３）の影響度算出処理を行う。 (1-2) Similar Word Extraction Processing The similar word extraction unit 102 extracts a word having a certain degree of similarity with the keyword from the recognition dictionary 108 as a similar word of the keyword. Next, the influence degree calculation means 105 performs the influence degree calculation process (1-3) below.
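
The similar word extraction process (1-2) might be sketched as follows. This passage does not state how similarity is measured, so the use of edit distance between readings, the normalization to a value in [0, 1], the threshold of 0.6, and all function names below are assumptions made purely for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Map edit distance to a similarity in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def extract_similar_words(keyword, keyword_reading, dictionary, threshold=0.6):
    """Return dictionary words whose reading is at least `threshold` similar.

    dictionary maps each word's notation to its reading.
    """
    return [
        word for word, reading in dictionary.items()
        if word != keyword and similarity(keyword_reading, reading) >= threshold
    ]
```

Here the dictionary stands in for the recognition dictionary 108, which holds the notation and reading of each word to be recognized.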

（１−３）影響度算出処理
前記影響度算出手段１０５が、前記言語モデル１０７において前記キーワードの言語スコアを高めることに伴う前記類似語の認識率に対する影響度を算出する。次いで、言語スコア調整手段１０４が、下記（１−４）の言語スコア調整処理を行う。 (1-3) Influence Degree Calculation Processing The influence degree calculation means 105 calculates the influence degree on the recognition rate of the similar word that results from increasing the language score of the keyword in the language model 107. Next, the language score adjustment means 104 performs the language score adjustment process (1-4) below.

（１−４）言語スコア調整処理
言語スコア調整手段１０４が、影響度算出手段１０５において算出した前記影響度に応じて、前記言語モデル１０７における前記キーワードの言語スコアに与えるボーナスおよび前記類似語の言語スコアに与えるペナルティの双方またはいずれかを調整する。さらに、言語スコア調整手段１０４は、調整後の前記ボーナスおよび前記ペナルティについての情報を前記言語スコア更新手段１０６に対し出力する。前記言語スコア調整手段１０４は、例えば、前記影響度が大きい類似語ほど、前記ボーナスおよび前記ペナルティを減少させるなど、類似語に応じて言語スコアを調整できる。 (1-4) Language Score Adjustment Processing The language score adjustment means 104 adjusts either or both of the bonus given to the language score of the keyword in the language model 107 and the penalty given to the language score of the similar word, according to the influence degree calculated by the influence degree calculation means 105. Further, the language score adjustment means 104 outputs information about the adjusted bonus and penalty to the language score update means 106. The language score adjustment means 104 can adjust the language score according to the similar word, for example by reducing the bonus and the penalty more for a similar word with a larger influence degree.

（１−５）言語スコア更新処理
前記ボーナスおよび前記ペナルティの双方またはいずれかについての情報を受けた前記言語スコア更新手段１０６は、前記情報に基づき、前記言語モデル１０７において、前記キーワードおよび前記類似語の双方またはいずれかの言語スコアに対し、前記言語スコア調整手段１０４において調整した前記ボーナスまたは前記ペナルティを付与する。すなわち、前記言語スコア更新手段１０６は、前記言語モデル１０７を更新する。 (1-5) Language Score Update Processing The language score update means 106 that has received information on both or either of the bonus and the penalty uses the keyword and the similar word in the language model 107 based on the information. The bonus or the penalty adjusted by the language score adjusting means 104 is given to both or any of the language scores. That is, the language score update unit 106 updates the language model 107.

本実施形態により、音声認識装置が使用する言語モデル１０７を、キーワードを類似語よりも優先的に認識できるよう更新できる。ただし、本実施形態では、前記影響度算出手段１０５により算出した影響度に応じて前記ボーナスおよび前記ペナルティの双方またはいずれかを調整してある。このため、例えば、類似語が重要な単語であった場合や、認識率が低い単語であった場合に、類似語の認識率が低下しすぎることがない。すなわち、本実施形態によれば、このような類似語についても適切に認識でき、音声認識装置の全体としての認識率を向上させることができる。 According to this embodiment, the language model 107 used by the speech recognition apparatus can be updated so that keywords can be recognized with priority over similar words. However, in the present embodiment, either or both of the bonus and the penalty are adjusted according to the influence calculated by the influence calculation means 105. For this reason, for example, when a similar word is an important word or a word with a low recognition rate, the recognition rate of the similar word does not decrease too much. That is, according to the present embodiment, such similar words can also be recognized appropriately, and the recognition rate as a whole of the speech recognition apparatus can be improved.

本実施形態において、前記影響度の増加に従い前記ボーナスまたは前記ペナルティの大きさを減少できるように前記言語スコア調整手段１０４を構成することができる。これにより、重要な類似語の認識率の過剰な低下を確実に防止できる。
本実施形態において、前記キーワードの認識率を高めるために、前記キーワードへの前記調整した前記ボーナスの付与と前記調整した前記ペナルティの付与の両方を行ってよい。これにより、いずれか一方の処理のみを行う場合よりも、言語スコアの変化が小さくなり、全体としての認識率の悪化を抑制できる。 In the present embodiment, the language score adjusting means 104 can be configured so that the bonus or penalty can be reduced as the influence increases. Thereby, it is possible to reliably prevent an excessive decrease in the recognition rate of important similar words.
In the present embodiment, in order to increase the recognition rate of the keyword, both the giving of the adjusted bonus to the keyword and the giving of the adjusted penalty may be performed. As a result, the change in the language score is smaller than when only one of the two is performed, and deterioration of the overall recognition rate can be suppressed.

本実施形態では、前記（１−５）の言語スコア更新処理で、前記言語スコア更新手段１０６による言語スコアの付与の方法は、特に制限されない。例えば、前記言語スコア更新手段１０６は、前記言語モデル１０７に記録されている言語スコアを変更することで、前記ボーナスまたは前記ペナルティを付与してもよいし、言語モデル１０７は変更せずに、例えば、音声認識装置が、前記ボーナスまたは前記ペナルティを付与した前記言語スコアを認識できる他の方法で、言語モデル１０７を更新できる。 In the present embodiment, the method by which the language score update unit 106 gives the language score in the language score update process (1-5) is not particularly limited. For example, the language score update unit 106 may give the bonus or the penalty by changing the language score recorded in the language model 107, or may leave the language model 107 unchanged and update it by another method that allows, for example, the speech recognition device to recognize the language score to which the bonus or the penalty has been given.

図２に、本実施形態の音声認識装置用言語モデル更新方法のフロー図を示す。図示のとおり、本実施形態の音声認識装置用言語モデル更新方法（以下、単に「言語モデル更新方法」という）は、下記（ａ）から（ｄ）のステップを含む。
（ａ）前記キーワード記憶手段１０１において、前記キーワードを入力し、記憶するステップ（ステップＳ１）、
（ｂ）前記影響度算出手段１０５において、音声認識装置において、前記キーワードの認識率を高めることによる類似語に対する影響度を算出するステップ（ステップＳ２）、
（ｃ）前記言語スコア調整手段１０４において、前記影響度に基づき、前記キーワードの言語スコアに与える前記ボーナスおよび前記類似語の言語スコアに与える前記ペナルティの双方またはいずれかを調整するステップ（ステップＳ３）、
（ｄ）前記言語スコア更新手段１０６において、前記キーワードの言語スコアに対する前記言語スコア調整手段１０４において調整した前記ボーナスの付与および前記類似語の言語スコアに対する前記言語スコア調整手段１０４において調整したペナルティの付与の双方またはいずれかを行うことにより、前記言語モデル１０７の言語スコアを更新する言語スコア更新ステップ（ステップＳ４）。 FIG. 2 shows a flow chart of the language model update method for the speech recognition apparatus of this embodiment. As shown in the figure, the language model update method for a speech recognition apparatus of the present embodiment (hereinafter simply referred to as “language model update method”) includes the following steps (a) to (d).
(a) a step in which the keyword is input to and stored in the keyword storage unit 101 (step S1);
(b) a step in which the influence degree calculation unit 105 calculates the influence degree on a similar word caused by increasing the recognition rate of the keyword in the speech recognition device (step S2);
(c) a step in which the language score adjustment unit 104 adjusts, based on the influence degree, either or both of the bonus given to the language score of the keyword and the penalty given to the language score of the similar word (step S3);
(d) a language score update step in which the language score update unit 106 updates the language score of the language model 107 by performing either or both of giving the bonus adjusted by the language score adjustment unit 104 to the language score of the keyword and giving the penalty adjusted by the language score adjustment unit 104 to the language score of the similar word (step S4).
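
The flow of steps S1 to S4 can be put together in one small sketch. Everything concrete here is an assumption for illustration: the function name `update_language_model`, deriving the influence degree from the recognition rate alone, scaling the keyword bonus by the largest influence degree among the similar words, and the default base values.

```python
def update_language_model(keyword, similar_word_rates, model_scores,
                          base_bonus=2.0, base_penalty=1.0):
    """Return an updated copy of model_scores following steps S1 to S4.

    similar_word_rates maps each similar word to its recognition rate in [0, 1].
    """
    # S1: store the keyword.
    stored_keyword = keyword

    # S2: influence degree per similar word (assumption: 1 - recognition rate).
    influences = {
        word: min(max(1.0 - rate, 0.0), 1.0)
        for word, rate in similar_word_rates.items()
    }

    # S3: shrink the bonus and each penalty as the influence degree grows.
    strongest = max(influences.values(), default=0.0)
    bonus = base_bonus * (1.0 - strongest)
    penalties = {
        word: base_penalty * (1.0 - influence)
        for word, influence in influences.items()
    }

    # S4: give the adjusted bonus and penalties to the language scores.
    updated = dict(model_scores)
    updated[stored_keyword] += bonus
    for word, penalty in penalties.items():
        updated[word] -= penalty
    return updated
```

A similar word that is already recognized poorly (low recognition rate) ends up with a small penalty, while one that is recognized reliably receives the full penalty.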

［実施形態２］
本実施形態は、実施形態１の言語モデル更新装置を用いた音声認識装置の例である。図３のブロック図に、本実施形態の音声認識装置の構成を示す。図示のように、本実施形態の音声認識装置は、音声入力手段２１１と、特徴量抽出手段２１０と、音声認識手段２０９と、認識辞書２０８と、言語モデル２０７と、実施形態１と同じ構成を有する言語モデル更新装置２００とを含む。言語モデル２０７と認識辞書２０８は、それぞれ、ハードディスク等の一般的な記憶手段に記憶されている。前記認識辞書２０８は、認識対象である単語の表記および読みの情報を含む。前記言語モデル２０７は、多量の音声データから学習した単語、形態素、音素等の出現頻度や接続確率等をモデル化したデータを含む。本実施形態の音声認識装置は、例えば、少なくともＣＰＵとＲＯＭとＲＡＭとから構成し、前記各手段による処理の制御を、前記ＣＰＵが行うようにすることで実現できる。すなわち、例えば、前記ＣＰＵに前記各手段の各機能を提供するコンピュータプログラムを組み込むことで、前記音声認識装置における前記各手段を構築でき、本発明の音声認識装置を実現できる。また、本実施形態の音声認識装置は、その動作を、前記各手段の機能を実現するコンピュータプログラムを組み込んだ、ＬＳＩ等のハードウェア部品からなる回路部品を実装して実現することもできる。なお、このようなコンピュータプログラムは、これを格納した記録媒体、例えば、ＨＤＤ、ＦＤ、ＣＤ−ＲＯＭ（ＣＤ−Ｒ、ＣＤ−ＲＷ）、ＭＯ、ＤＶＤ、メモリーカード等の形態で利用することもできる。図３を参照して、本実施形態の音声認識装置の動作を説明する。 [Embodiment 2]
The present embodiment is an example of a speech recognition device using the language model update device of the first embodiment. The block diagram of FIG. 3 shows the configuration of the speech recognition device of this embodiment. As shown in the figure, the speech recognition device of the present embodiment includes a speech input unit 211, a feature amount extraction unit 210, a speech recognition unit 209, a recognition dictionary 208, a language model 207, and a language model update device 200 having the same configuration as in the first embodiment. The language model 207 and the recognition dictionary 208 are each stored in general storage means such as a hard disk. The recognition dictionary 208 contains the notation and reading of each word to be recognized. The language model 207 contains data that models the appearance frequencies, connection probabilities, and the like of words, morphemes, phonemes, and so on, learned from a large amount of speech data. The speech recognition device of the present embodiment can be realized, for example, by a configuration including at least a CPU, a ROM, and a RAM, in which the CPU controls the processing performed by each unit. That is, for example, by installing on the CPU a computer program that provides the functions of each unit, the units of the speech recognition device can be constructed and the speech recognition device of the present invention can be realized. The operation of the speech recognition device can also be realized by mounting circuit components made up of hardware such as an LSI that incorporates a computer program realizing the functions of each unit. Such a computer program can also be used in the form of a recording medium storing it, for example an HDD, FD, CD-ROM (CD-R, CD-RW), MO, DVD, or memory card. The operation of the speech recognition device of this embodiment is described below with reference to FIG. 3.

In the present embodiment, when speech recognition is performed without specifying a keyword, the speech recognition device operates as follows. First, when the user inputs speech to the speech input unit 211, the feature extraction unit 210 performs the feature extraction process (2-1) below.
(2-1) Feature extraction process
The feature extraction unit 210 analyzes the input speech, extracts a feature amount, and outputs the extracted feature amount to the speech recognition unit 209. Next, the speech recognition unit 209 performs the speech recognition process (2-2) below.

(2-2) Speech recognition process
Based on the feature amount, the speech recognition unit 209 uses the language model 207 to find the word string judged most likely, and outputs it as the recognition result.

(2-3) Keyword recognition process
When the user wants the speech recognition device to recognize a specific word as a keyword, the keyword storage unit 201 stores the keyword through the keyword storage process (1-1) of the first embodiment.

(2-4) Language model update process
After the keyword storage process, the language model update device 200 updates the language model 207 through the processes (1-2) to (1-5) of the first embodiment.

(2-5) Keyword recognition process
After the language model update process (2-4), the user inputs speech to the speech input unit 211. The feature extraction unit 210 processes this input, and the speech recognition unit 209 then performs speech recognition using the updated language model 207 and outputs the recognition result.

In the present embodiment, the language model update device 200 has given the bonus to the language score of the keyword, the penalty to the language score of the similar word, or both, in the language model 207, relative to the scores before the keyword was stored. The speech recognition unit 209 can therefore recognize the keyword in preference to the similar word. At the same time, the bonus given to the keyword, the penalty given to the similar word, or both are adjusted according to the influence degree calculated by the influence degree calculation unit 205. As a result, when the similar word is an important word or a word with a low recognition rate, for example, its recognition rate is not lowered excessively. Such similar words can therefore still be recognized appropriately, and the recognition rate of the speech recognition device as a whole can be improved.

FIG. 4 shows a flowchart of the speech recognition method of this embodiment. As shown in the figure, the method includes steps (e) and (f) below in addition to steps (a) to (d) of the first embodiment.
(e) The feature extraction unit 210 analyzes the speech input from the speech input unit 211 and extracts a feature amount (step S5).
(f) Based on the feature amount, the speech recognition unit 209 uses the language model 207 to recognize the speech as a word string and outputs it (steps S6 and S7).

[Embodiment 3]
FIG. 5 shows the configuration of the language model update device 300 of this embodiment. As shown in the figure, the language model update device 300 has the same configuration as in the first embodiment except that it further includes a similarity measurement unit 303. The similarity measurement unit 303 measures the acoustic similarity between words, and in particular between the keyword and each word registered in the recognition dictionary 308. The operation of the language model update device 300 is described below with reference to FIG. 5.

When the user wants a speech recognition device that uses the language model 307 to recognize a specific word as a keyword, the language model 307 can be updated as follows.
(3-1) Keyword storage process
First, the user inputs to the keyword storage unit 301 an unregistered word to be recognized (the keyword) or a sentence containing that word. Based on this input, the keyword storage unit 301 stores the keyword through the same keyword storage process as in the first embodiment. Next, the similarity measurement unit 303 performs the similarity measurement process (3-2) below.

(3-2) Similarity measurement process
The similarity measurement unit 303 refers to the recognition dictionary 308 and measures the acoustic similarity between the keyword and each word in the recognition dictionary 308. After this process, the similar word extraction unit 302 performs the similar word extraction process (3-3) below.

(3-3) Similar word extraction process
The similar word extraction unit 302 extracts each word whose similarity to the keyword is at or above a certain level as a similar word, that is, a word acoustically similar to the keyword. Next, the influence degree calculation unit 305 performs the influence degree calculation process (3-4) below.

(3-4) Influence degree calculation process
The influence degree calculation unit 305 calculates the degree to which raising the recognition rate of the keyword in the speech recognition device influences the recognition rate of the similar word. Next, the language score adjustment unit 304 performs the language score adjustment process (3-5) below.

(3-5) Language score adjustment process
In this embodiment, the language score adjustment unit 304 adjusts the bonus given to the language score of the keyword, the penalty given to the language score of the similar word, or both. The larger the influence degree, the smaller the language score adjustment unit 304 makes the bonus or the penalty. It then outputs information indicating the bonus or the penalty to the language score update unit 306.

(3-6) Language model update process
On receiving this information, the language model update device 306 may, in order to raise the recognition rate of the keyword, both give the adjusted bonus to the keyword and give the adjusted penalty to the similar word. Only one of the two may be applied, but applying both keeps the change in each language score smaller and suppresses degradation of the overall recognition rate.

According to this embodiment, the language model used by the speech recognition device can be updated so that the keyword is recognized in preference to similar words, while an excessive drop in the recognition rate of important similar words is suppressed so that they are still recognized appropriately.

FIG. 6 shows a flowchart of the speech recognition method of this embodiment. As shown in the figure, the method is a language model update method that adds step (g) below to steps (a) to (d) of the first embodiment.
(g) The similarity measurement unit 303 refers to the recognition dictionary 308 and measures the acoustic similarity between the keyword and each word in the recognition dictionary 308 (step S2').

[Embodiment 4]
FIG. 7A shows the configuration of the language model update device 400 of this embodiment. The language model update device 400 has the same configuration as in the third embodiment except that it includes an appearance count storage unit 410. Specifically, as shown in the figure, the influence degree calculation unit 405 includes the appearance count storage unit 410. The appearance count storage unit 410 stores, as a recognition history, the recognition results produced while the speech recognition device was operating before the keyword storage process. The operation of the language model update device 400 is described below with reference to FIG. 7A.

(4-1) Keyword storage process
First, the user inputs to the keyword storage unit 401 an unregistered word to be recognized (the keyword) or a sentence containing that word. Based on this input, the keyword storage unit 401 stores the word corresponding to the keyword through the same keyword storage process as in the first to third embodiments. Next, the similarity measurement unit 403 performs the similarity measurement process (4-2) below.

(4-2) Similarity measurement process
The similarity measurement unit 403 refers to the recognition dictionary 408 and measures the acoustic similarity between the keyword and each word in the recognition dictionary 408. In this embodiment, the similarity measurement unit 403 measures the similarity using the phoneme edit distance. For example, suppose the user inputs 「和菓子」 (wagashi, Japanese confectionery) to the keyword storage unit 401. The phoneme notation of 「和菓子」 is "wagasi". The phoneme notation of 「私」 (watashi, "I"), one of the words in the recognition dictionary 408, is "watasi". The two notations differ by one phoneme substitution, so the similarity is 1. Likewise, the phoneme notation of 「お菓子」 (okashi, sweets) is "okasi", which differs from "wagasi" by one phoneme deletion and two phoneme substitutions, so the similarity is their total, 3. When the phoneme edit distance is used as the similarity, the more acoustically similar two words are, the smaller the similarity value. After this process, the similar word extraction unit 402 performs the similar word extraction process (4-3) below.
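
The phoneme edit distance used in (4-2) can be sketched as an ordinary Levenshtein distance over phoneme sequences. This is a minimal illustration, not the patent's implementation. The function name `phoneme_edit_distance` is hypothetical, and treating each letter of the romanized notation as one phoneme is an assumption that happens to hold for the three example words.

```python
def phoneme_edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences.

    Assumption: each element of a and b is one phoneme (here, one letter).
    """
    # prev[j] is the distance between the phonemes of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(phoneme_edit_distance("wagasi", "watasi"))  # one substitution
print(phoneme_edit_distance("wagasi", "okasi"))   # one deletion, two substitutions
```

The two calls reproduce the similarity values 1 and 3 given in the text for 「私」 and 「お菓子」.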

(4-3) Similar word extraction process
The similar word extraction unit 402 extracts each word whose similarity to the keyword is at or above a certain level as a similar word, that is, a word acoustically similar to the keyword. Because a smaller edit distance means greater similarity, this corresponds to an edit distance at or below a threshold. Next, the influence degree calculation unit 405 performs the influence degree calculation process (4-4) below.
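
The extraction step in (4-3) can be sketched as a threshold filter over distances that have already been measured. The function name `extract_similar_words`, the threshold of 3, and the extra dictionary word "sakana" are illustrative assumptions, not values from the source.

```python
def extract_similar_words(distances, max_distance):
    """Return the words whose edit distance to the keyword is at most max_distance.

    distances maps each dictionary word to its phoneme edit distance from the keyword.
    """
    return sorted(word for word, d in distances.items() if d <= max_distance)


# Distances from the keyword "wagasi"; "sakana" is a made-up extra entry.
distances = {"watasi": 1, "okasi": 3, "sakana": 4}
print(extract_similar_words(distances, max_distance=3))
```

With this threshold, "watasi" and "okasi" are kept as similar words and "sakana" is dropped.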

(4-4) Influence degree calculation process
The influence degree calculation unit 405 uses the appearance count storage unit 410 to calculate the degree to which raising the recognition rate of the keyword in the speech recognition device influences the recognition rate of each similar word. For example, suppose the recognition history in the appearance count storage unit 410 shows that 「私」 ("I") appeared 30 times and 「お菓子」 (sweets) appeared twice. The influence degree calculation unit 405 calculates the influence degree from these appearance counts and outputs the result to the language score adjustment unit 404.

(4-5) Language score adjustment process
When 「和菓子」 (wagashi) is added as a word registered by the user, for example, the language score adjustment unit 404 adjusts the penalty given to similar words so that the frequent word 「私」 ("I") receives only a small penalty. Specifically, in this embodiment the language score adjustment unit 404 uses the influence degree to calculate the penalty value P_i given to the language score of each similar word according to formula (I) below.

$$P_i = S(\mathrm{sim}(w, w_i)) - \alpha \log C(w_i) \tag{I}$$

In formula (I), w is the keyword, w_i is a similar word, sim(a, b) is the similarity between words a and b, and S(sim) is a function that determines the penalty from the similarity. C(w) is the number of appearances of word w in the recognition history and is the influence degree calculated in the influence degree calculation process. α is a constant coefficient. As described above, the phoneme edit distance is used as the similarity sim. FIG. 7B illustrates the relationship between S(sim) calculated by the language score adjustment unit 404 and the similarity of the similar word. As shown in the figure, S is a function whose value increases as the two words become more similar, that is, as the similarity value becomes smaller. For example, S(sim) can be 1.0 when the similarity is 1 and 0.25 when the similarity is 3. In this embodiment, subtracting the logarithm of the appearance count multiplied by a coefficient from S(sim) reduces the penalty more for similar words that appear more often. For example, with α = 0.1 and the natural logarithm, the penalty value of 「私」 is P = 1.0 − 0.1 × log 30 = 1.0 − 0.1 × 3.401 = 0.6599, and the penalty value of 「お菓子」 (sweets) is P = 0.25 − 0.1 × log 2 = 0.25 − 0.1 × 0.6931 = 0.18069. After calculating the penalty values, the language score adjustment unit 404 outputs information indicating them to the language model update device 406.
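
The worked example for formula (I) can be checked with a short sketch. The source only gives two points of S (1.0 at similarity 1 and 0.25 at similarity 3), so the closed form `0.5 ** (sim - 1)` below is an assumption chosen to pass through those two points; the actual curve of FIG. 7B may differ. The function names are hypothetical.

```python
import math


def base_penalty(sim):
    """Assumed S(sim): halves for each extra edit, so S(1) = 1.0 and S(3) = 0.25."""
    return 0.5 ** (sim - 1)


def penalty_from_count(sim, count, alpha=0.1):
    """Formula (I): P_i = S(sim) - alpha * ln(C(w_i)).

    count must be at least 1, since the logarithm of 0 is undefined.
    """
    return base_penalty(sim) - alpha * math.log(count)


print(round(penalty_from_count(sim=1, count=30), 4))  # "watasi": 30 appearances
print(round(penalty_from_count(sim=3, count=2), 5))   # "okasi": 2 appearances
```

Rounded as shown, the two results agree with the values 0.6599 and 0.18069 in the text.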

(4-6) Language model update process
After the language score adjustment process, the language model update device 406 uses this information to give the penalty to the language score of each similar word in the language model 407, relative to the score before the keyword storage process. In other words, the language model update device 406 updates the language model 407.

In this embodiment, a penalty has been given to the language score of each similar word in the language model 407 by means of the language score adjustment unit 404. A speech recognition device that uses the language model 407 can therefore recognize the keyword in preference to similar words. At the same time, the influence degree calculation unit 405 calculates the influence degree from the information stored in the appearance count storage unit 410, and the magnitude of the penalty is adjusted according to that influence degree. As a result, when a similar word is an important word or a word with a low recognition rate, for example, its recognition rate does not drop too far. Such similar words can still be recognized appropriately, and the overall recognition rate can be improved.

In this embodiment, the similarity measurement unit 403 may use an inter-phoneme distance calculated from phoneme GMMs instead of the phoneme edit distance. In that case, words whose phonemes are acoustically closer are treated as more similar and are given a smaller similarity value.

The language model update method of this embodiment can be described with the same flowchart as the first embodiment. As shown in FIG. 2, it includes steps (a) to (d) below.
(a) The keyword storage unit 401 stores the word corresponding to the keyword (step S1).
(b) The influence degree calculation unit 405 uses the appearance count storage unit 410 to calculate the degree to which raising the recognition rate of the keyword in the speech recognition device influences the similar words (step S2).
(c) The language score adjustment unit 404 calculates the penalty value P_i given to the language score of each similar word according to formula (I) (step S3).
(d) The language score update unit 406 gives the penalty value to the language score of each similar word in the language model 407 and updates the language score (step S4).

[Embodiment 5]
The configuration of this embodiment is the same as that of the fourth embodiment except that a bonus can be given to the language score of the keyword according to the number of appearances of the similar words in the recognition history. The language score adjustment unit 404 can calculate not only the penalty value P_i from the influence degree but also a bonus value B to be given to the language score of the keyword. Likewise, the language score update unit 406 can give not only the penalty value P_i to the language score of each similar word but also the bonus value to the language score of the keyword. The operation of the speech recognition device of this embodiment is described below with reference to FIG. 7A. The processing from the point where the user inputs to the keyword storage unit 401 an unregistered word to be recognized (the keyword) or a sentence containing that word, up to the calculation of the influence degree, is the same as (4-1) to (4-4) in the fourth embodiment. In this embodiment, the language score adjustment process (5-1) below is performed after the influence degree calculation process (4-4).

(5-1) Language score adjustment process
As in the language score adjustment process (4-5) of the fourth embodiment, the language score adjustment unit 404 uses the influence degree to calculate the penalty value P_i given to the language score of each similar word according to formula (I). In this embodiment, the language score adjustment unit 404 further uses the penalty values to calculate the bonus value B given to the language score of the keyword according to formula (IV) below. It then outputs information indicating the penalty values and the bonus value to the language model update device 406.

$$B = b - \beta \max_i P_i \tag{IV}$$

In formula (IV), b is a constant bonus, β is a coefficient, and max P_i is the maximum of the penalties P_i of the similar words. The average of the penalties of the similar words may be used instead of max P_i.
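
Formula (IV) and its averaging variant can be sketched as follows. The source gives no numeric values for b or β, so b = 1.0 and β = 0.5 below are purely illustrative, as are the function name `keyword_bonus` and the choice to return b unchanged when there are no similar words.

```python
def keyword_bonus(penalties, b, beta, use_mean=False):
    """Formula (IV): B = b - beta * max(P_i), or b - beta * mean(P_i) if use_mean is True."""
    if not penalties:
        return b  # assumption: no similar words means the bonus is not reduced
    if use_mean:
        reference = sum(penalties) / len(penalties)
    else:
        reference = max(penalties)
    return b - beta * reference


# Penalty values taken from the worked example for formula (I).
penalties = [0.6599, 0.18069]
print(keyword_bonus(penalties, b=1.0, beta=0.5))
print(keyword_bonus(penalties, b=1.0, beta=0.5, use_mean=True))
```

Because the maximum is never smaller than the average, the averaging variant yields a bonus at least as large as the max-based one for the same b and β.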

(5-2) Language model update process
After the language score adjustment process, the language model update device 406 uses this information to give the penalty values to the language scores of the similar words and the bonus value to the language score of the keyword in the language model 407.
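
The update in (5-2) can be sketched on a toy language model. Several things here are assumptions rather than details from the source: the language model is reduced to a dictionary of per-word scores, a higher score is taken to mean a more likely word, the bonus is added and the penalties are subtracted, and the keyword is assumed to have a score entry already. All names and score values are hypothetical.

```python
def update_language_scores(scores, keyword, bonus, penalties):
    """Return a new score table with the bonus added to the keyword
    and each penalty subtracted from the matching similar word."""
    updated = dict(scores)  # copy, so the scores before the update are preserved
    updated[keyword] += bonus
    for word, penalty in penalties.items():
        updated[word] -= penalty
    return updated


# Illustrative scores before the keyword "wagasi" is boosted.
scores = {"wagasi": -6.0, "watasi": -2.0, "okasi": -4.0}
updated = update_language_scores(
    scores,
    keyword="wagasi",
    bonus=0.67005,
    penalties={"watasi": 0.6599, "okasi": 0.18069},
)
print(updated)
```

Keeping the original table untouched mirrors the text's description of bonuses and penalties being applied relative to the scores before the keyword storage process.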

In this embodiment, the keyword is also adjusted, by means of the bonus value. Compared with adjusting by the penalty value alone, this makes the change in each adjusted language score smaller and suppresses adverse effects on the overall recognition rate.

The language model update method of this embodiment includes steps (a) to (d) of the fourth embodiment and further includes steps (c') and (d') below.
(c') The language score adjustment unit 404 calculates the bonus value using formula (IV) and outputs information indicating the bonus value to the language model update device 406.
(d') The language model update device 406 uses this information to give the bonus value to the language score of the keyword in the language model 407.

[Embodiment 6]
This embodiment shows an example in which the penalty given to the language score of a similar word, the bonus given to the language score of the keyword, or both can be adjusted according to the recognition rate of the similar word. The configuration of the language model update device is the same as that of the fifth embodiment except that it includes a similar word recognition rate storage unit 610 in place of the appearance count storage unit 410. The block diagram of FIG. 8 shows the configuration of the language model update device 600 of this embodiment. As shown in the figure, the influence degree calculation unit 605 of the language model update device 600 includes the similar word recognition rate storage unit 610. In this embodiment, the similar word recognition rate storage unit 610 stores the actual recognition rate of each similar word in the speech recognition device before the keyword is stored. The operation of the language model update device 600 is described below with reference to FIG. 8. The processing from the point where the user inputs to the keyword storage unit 601 an unregistered word to be recognized (the keyword) or a sentence containing that word, up to the extraction of similar words, is the same as (4-1) to (4-3) in the fourth and fifth embodiments. In this embodiment, the influence degree calculation process (6-1) below is performed after the similar word extraction process (4-3).

(6-1) Influence degree calculation process
The influence degree calculation unit 605 uses the similar word recognition rate storage unit 610 to calculate the degree to which raising the recognition rate of the keyword in the speech recognition device influences the recognition rate of each similar word, and outputs the result to the language score adjustment unit 604. Next, the language score adjustment unit 604 performs the language score adjustment process (6-2) below.

(6-2) Language score adjustment process
The language score adjustment unit 604 uses the influence degree to calculate the penalty value P_i given to the language score of each similar word according to formula (II) below, and outputs information indicating the penalty value to the language model update device 606.

$$P_i = S(\mathrm{sim}(w, w_i)) \times P(w_i) \tag{II}$$

In formula (II), P(w) is the recognition rate of word w, so P(w_i) is the recognition rate of the similar word. As P(w_i) decreases, the influence degree increases, and as the influence degree increases, the penalty value P_i decreases. In other words, P(w_i) and P_i are positively correlated. For example, if the recognition rate of 「私」 ("I") is 90%, P = 1.0 × 0.9 = 0.9, and if the recognition rate of 「お菓子」 (sweets) is 70%, P = 0.25 × 0.7 = 0.175. The lower the recognition rate of a word, the more its penalty is reduced.
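
The two worked values for formula (II) can be reproduced with a small sketch. The function name `penalty_from_recognition_rate` is hypothetical, the S values 1.0 and 0.25 are the ones the text gives for similarities 1 and 3, and the range check on the recognition rate is an added safeguard, not something the source specifies.

```python
def penalty_from_recognition_rate(base_penalty, recognition_rate):
    """Formula (II): P_i = S(sim(w, w_i)) * P(w_i).

    base_penalty is S(sim); recognition_rate is P(w_i) as a fraction in [0, 1].
    """
    if not 0.0 <= recognition_rate <= 1.0:
        raise ValueError("recognition_rate must be between 0 and 1")
    return base_penalty * recognition_rate


print(round(penalty_from_recognition_rate(1.0, 0.9), 3))   # "watasi", 90% recognized
print(round(penalty_from_recognition_rate(0.25, 0.7), 3))  # "okasi", 70% recognized
```

A word that is already recognized poorly thus receives a proportionally smaller penalty than a well-recognized word at the same similarity.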

(6-3) Language model update process
After the language score adjustment process, the language model update device 606 uses this information to give the penalty value to the language score that each similar word had in the language model 607 before the keyword storage process. In other words, the language model update device 606 updates the language score.

This embodiment provides the following effect. The recognition rate of a similar word whose recognition rate is already low drops sharply when a penalty is applied. By assigning such a word a larger influence degree and thereby a smaller penalty, for example, an excessive drop in its recognition rate can be prevented.

In this embodiment, a recognition rate obtained by recognition rate prediction may be used in place of the actual recognition rate. The predicted recognition rate can be calculated, for example, by creating pseudo speech data through speech synthesis or feature synthesis and running speech recognition on it.

Also in this embodiment, as in the fifth embodiment, the language score adjustment unit 604 may additionally calculate a bonus value B to be given to the language score of the keyword. The language score update unit 606 can then give the bonus value to the language score of the keyword. Compared with adjusting the similar words by the penalty value alone, this makes the change in each adjusted language score smaller and suppresses adverse effects on the overall recognition rate.

本実施形態の言語モデル更新方法は、前記実施形態４における前記（ａ）から（ｄ）のステップを含み、前記（ｃ）のステップが、下記（ｃ”）のステップを含む方法である。
（ｃ”）前記言語スコア調整手段６０４が、前記影響度を用いて、前記式（ＩＩ）に従い、類似語の言語スコアに与えるペナルティ値（Ｐｉ）を算出し、算出したペナルティ値を示す情報を前記言語モデル更新装置６０６に対して出力するステップ。 The language model update method of the present embodiment includes steps (a) to (d) of Embodiment 4, wherein step (c) includes the following step (c″).
(c″) A step in which the language score adjusting means 604 calculates, using the degree of influence and according to formula (II), the penalty value (Pi) to be given to the language score of the similar word, and outputs information indicating the calculated penalty value to the language model update device 606.
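As an illustration of step (c″), the following is a minimal Python sketch of formula (II), Pi = S(sim(w, wi)) × P(wi), as stated in the claims. The function name and arguments are hypothetical, and because the form of S is not given in this passage, the value of S(sim(w, wi)) is simply passed in as a number.

```python
def penalty_formula_ii(s_value: float, recognition_rate: float) -> float:
    """Formula (II): Pi = S(sim(w, wi)) * P(wi).

    s_value          -- S(sim(w, wi)), passed in directly (form of S not assumed)
    recognition_rate -- P(wi), recognition rate of the similar word, in [0, 1]
    """
    if not 0.0 <= recognition_rate <= 1.0:
        raise ValueError("recognition_rate must be in [0, 1]")
    return s_value * recognition_rate


# A similar word that is already poorly recognized receives a smaller penalty.
penalty_well_recognized = penalty_formula_ii(1.0, 0.5)
penalty_poorly_recognized = penalty_formula_ii(1.0, 0.25)
```

With the same S value, the poorly recognized word receives half the penalty of the well recognized one, which is the behavior the embodiment describes.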

［実施形態７］
本実施形態では、ユーザが類似語に対する影響度を判断して、類似語の言語スコアに与えるペナルティを調整できる例を示す。本実施形態の言語モデル更新装置の構成は、類似語提示手段７１０と影響度入力手段７１１を含むことを除き、実施形態４と同じである。図９Ａのブロック図に、本実施形態の言語モデル更新装置７００の構成を示す。図示のとおり、本実施形態では、前記言語モデル更新装置７００は、前記類似語抽出手段７０２において類似語提示手段７１０を含み、前記影響度算出手段７０５において影響度入力手段７１１を含む。本実施形態では、ユーザは、前記影響度入力手段７１１により、自らが判断した前記影響度を入力することができる。図９Ａを参照して、本実施形態の言語モデル更新装置７００の動作を説明する。まず、ユーザが、キーワード記憶手段７０１に対し、認識させたい未登録の単語（キーワード）または前記単語を含む文章を入力してから、類似語を抽出するまでの処理は、実施形態４における前記（４−１）から前記（４−３）と同じである。本実施形態では、前記類似語抽出手段７０２による前記（４−３）の類似語抽出処理の後、抽出した類似語を類似語提示手段７１０により、ユーザに提示する。ユーザは、希望により、下記（７−１）の影響度判断値入力処理を行うことができる。 [Embodiment 7]
In the present embodiment, an example is shown in which the user judges the degree of influence on a similar word and can thereby adjust the penalty given to the language score of that similar word. The configuration of the language model update device of this embodiment is the same as that of Embodiment 4 except that it includes similar word presentation means 710 and influence degree input means 711. The block diagram of FIG. 9A shows the configuration of the language model update device 700 of this embodiment. As shown in the figure, the language model update device 700 includes the similar word presentation means 710 in the similar word extraction means 702 and the influence degree input means 711 in the influence degree calculation means 705. In the present embodiment, the user can enter, through the influence degree input means 711, the degree of influence that the user has judged. The operation of the language model update device 700 is described with reference to FIG. 9A. First, the processing from when the user inputs, to the keyword storage means 701, an unregistered word to be recognized (keyword) or a sentence containing that word until similar words are extracted is the same as (4-1) to (4-3) in Embodiment 4. In the present embodiment, after the similar word extraction process (4-3) by the similar word extraction means 702, the extracted similar words are presented to the user by the similar word presentation means 710. The user can then, if desired, perform the influence degree input process (7-1) below.

（７−１）影響度入力処理
ユーザは、前記類似語提示手段７１０により提示された類似語に関し、前記影響度入力手段７１１において、適切と判断する影響度情報を入力することができる。例えば、前記影響度情報を、５段階で評価して、ペナルティを軽減したいものを５、ペナルティを大きくかけてもいいものを１として入力することができる。入力を受けた影響度算出手段７０５は、前記影響度情報に基づき影響度を算出し、算出した前記影響度を言語スコア調整手段７０４に対し出力する。 (7-1) Influence Degree Input Process For the similar words presented by the similar word presentation means 710, the user can enter, through the influence degree input means 711, the influence degree information that the user judges appropriate. For example, the influence degree information can be rated on a five-level scale, with 5 entered for a word whose penalty should be reduced and 1 for a word to which a large penalty may be applied. Upon receiving the input, the influence degree calculation means 705 calculates the degree of influence based on the influence degree information and outputs it to the language score adjusting means 704.

（７−２）言語スコア調整処理
前記影響度を受けた言語スコア調整手段７０４は、前記影響度を用いて、下記式（ＩＩＩ）に従い、類似語の言語スコアに与えるペナルティ値（Ｐｉ）を算出する。

Ｐｉ＝Ｓ（ｓｉｍ（ｗ，ｗｉ））×Ｔ（ｕｉ）（ＩＩＩ）

前記式（ＩＩＩ）中、ｕｉはユーザの入力した影響度であり、Ｔはペナルティの軽減度を表し、図９Ｂに示すように前記影響度が大きいほど減少する関数である。例えば、ユーザが、「私」の影響度を「５」と入力した時のペナルティはＰ＝１．０×０．１＝０．１、「お菓子」の影響度を「３」と入力した時のペナルティはＰ＝０．２５×０．５＝０．１２５とすることができる。前記ペナルティ値を算出後、前記言語スコア調整手段７０４は、前記ペナルティ値を示す情報を前記言語モデル更新装置に対して出力する。 (7-2) Language Score Adjustment Processing The language score adjusting means 704, having received the degree of influence, uses it to calculate the penalty value (Pi) to be given to the language score of the similar word according to the following formula (III).

Pi = S (sim (w, wi)) × T (ui) (III)

In formula (III), ui is the degree of influence entered by the user, and T represents the degree of penalty reduction, a function that decreases as the degree of influence increases, as shown in FIG. 9B. For example, when the user enters “5” as the degree of influence for “私” (I), the penalty can be P = 1.0 × 0.1 = 0.1, and when the user enters “3” as the degree of influence for “お菓子” (sweets), the penalty can be P = 0.25 × 0.5 = 0.125. After calculating the penalty value, the language score adjusting means 704 outputs information indicating the penalty value to the language model update device.
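The worked example for formula (III) can be reproduced with a short Python sketch. Only T(3) = 0.5 and T(5) = 0.1 are taken from the text; the remaining table entries are assumed values chosen solely so that T decreases as the degree of influence grows, and the exact curve of FIG. 9B is not reproduced here. All names are hypothetical.

```python
# Penalty reduction degree T(u) for the five-level user rating.
# T(3) = 0.5 and T(5) = 0.1 come from the worked example in the text;
# the other entries are ASSUMED values that merely keep T decreasing.
PENALTY_REDUCTION = {1: 0.9, 2: 0.7, 3: 0.5, 4: 0.3, 5: 0.1}


def penalty_formula_iii(s_value: float, user_influence: int) -> float:
    """Formula (III): Pi = S(sim(w, wi)) * T(ui)."""
    if user_influence not in PENALTY_REDUCTION:
        raise ValueError("user_influence must be an integer from 1 to 5")
    return s_value * PENALTY_REDUCTION[user_influence]


# The two cases from the text: S = 1.0 with rating 5, and S = 0.25 with rating 3.
penalty_rating_5 = penalty_formula_iii(1.0, 5)   # 1.0 * 0.1
penalty_rating_3 = penalty_formula_iii(0.25, 3)  # 0.25 * 0.5
```

A lookup table is used rather than a closed-form function because the text only fixes two points of T; any decreasing function passing through those points would serve equally well.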

（７−３）言語モデル更新処理
前記言語スコア調整処理後、前記言語モデル更新装置７０６が、前記情報に基づき言語モデル７０７において、類似語の言語スコアに対し前記ペナルティ値を付与する。すなわち、前記言語モデル更新装置７０６は、前記言語スコアを更新する。 (7-3) Language Model Update Processing After the language score adjustment processing, the language model update device 706 gives the penalty value to the language score of similar words in the language model 707 based on the information. That is, the language model update device 706 updates the language score.

本実施形態では、実施形態４が奏する効果に加え、ユーザの判断でペナルティ値を調整でき、ユーザの希望に最も沿った音声認識を実現できるという効果も得られる。 In the present embodiment, in addition to the effects of Embodiment 4, the penalty value can be adjusted at the user's discretion, so that speech recognition that best matches the user's wishes can be realized.

本実施形態では、ユーザは、前記影響度入力装置７１１において、前記キーワードの言語スコアにボーナス値を入力することもできる。この場合、前記言語スコア調整手段７０４は、前記影響度入力手段７１１においてユーザが入力した影響度に基づき、前記言語モデル７０７において、前記キーワードの言語スコアに付与するボーナス値を算出することができる。次いで、前記言語スコア更新手段７０６が、前記ボーナス値を、前記キーワードの言語スコアに付与し、更新できる。これにより、ユーザが自在にペナルティのみを調整するよりも言語スコアの変化を小さくでき、全体としての認識率への悪影響を抑制することができる。 In this embodiment, the user can input a bonus value to the language score of the keyword in the influence input device 711. In this case, the language score adjusting unit 704 can calculate a bonus value to be given to the language score of the keyword in the language model 707 based on the influence level input by the user in the influence level input unit 711. Then, the language score updating means 706 can add and update the bonus value to the language score of the keyword. Thereby, the change of a language score can be made smaller than a user adjusting only a penalty freely, and the bad influence on the recognition rate as a whole can be suppressed.

図１０に、本実施形態の言語モデル更新方法のフロー図を示す。図示のとおり、本実施形態の言語モデル更新方法は、前記実施形態４における前記（ａ）から（ｄ）のステップに、下記（ｈ）のステップを含み、かつ、前記（ｃ）のステップを下記（ｃ”’）のステップにより行う方法である。
（ｈ）ユーザにおいて、前記類似語提示手段７１０により提示された類似語に関し、前記影響度入力手段７１１において、適切と判断する影響度情報を入力するステップ（ステップＳ３’）。
（ｃ”’）前記影響度算出手段７０５からの出力を受けた言語スコア調整手段７０４が、前記影響度を用いて、前記式（ＩＩＩ）に従い、類似語の言語スコアに付与するペナルティ値（Ｐｉ）を算出するステップ。 FIG. 10 shows a flowchart of the language model update method of the present embodiment. As shown in the figure, the language model update method of the present embodiment adds the following step (h) to steps (a) to (d) of Embodiment 4 and performs step (c) as the following step (c″′).
(h) A step in which the user enters, through the influence degree input means 711, influence degree information that the user judges appropriate for the similar words presented by the similar word presentation means 710 (step S3′).
(c″′) A step in which the language score adjusting means 704, having received the output from the influence degree calculation means 705, uses the degree of influence to calculate, according to formula (III), the penalty value (Pi) to be assigned to the language score of the similar word.

［実施形態８］
本実施形態では、本発明の音声認識装置および音声認識方法等のさらに別の例を示す。図１１に、本実施形態の音声認識装置の構成を示す。図示のように、本実施形態の音声認識装置は、音声入力手段８１１と、特徴量抽出手段８１０と、音声認識手段８０９と、音響モデル８１３と、認識辞書８０８と、言語モデル８０７と、前記実施形態３の言語モデル更新装置８００とを含む。前記音響モデル８１３、前記言語モデル８０７および前記認識辞書８０８は、それぞれ、ハードディスク等の一般的な記憶手段に記憶されている。本実施形態では、前記音響モデル８１３として、隠れマルコフモデル（ＨＭＭ）を用いる。前記認識辞書８０８は、音声認識手段８０９の認識対象である単語の表記および読みの情報を含む。前記言語モデル８０７は、多量の音声データから学習した単語、形態素、音素等の出現頻度や接続確率等をモデル化したデータである。前記言語モデル８０７としては、例えば、Ｎグラム言語モデルを使用できる。本実施形態では、前記言語モデル８０７は、一般的なＮグラムを基本として、前記言語モデル更新装置８００によりキーワードおよび類似語の双方またはいずれかを識別できるよう拡張されている。図１２に、本実施形態で使用する前記言語モデルの内容を例示する。図示のとおり、前記言語モデル８０７は、文書を構成する単語と、前記単語に続く単語の接続のしやすさの確率や、文頭に位置し易い単語の確率を、文頭から文末にかけて記述している。本実施形態では、前記言語モデル８０７において、キーワードおよび類似語のそれぞれに対し例えば識別情報を新たに付与することができる。また、本実施形態では、言語モデル８０７は、前記認識辞書８０８に保存されている各単語に対し言語スコアを記録できる。本実施形態では、言語スコアとして接続確率値を新たに付与する例について説明する。すなわち、本実施形態では、前記言語モデル８０７には、図示のとおり、前記単語が接続される確率値に対して与えるべきボーナスまたはペナルティを記憶する領域（Ａ）が設けられている。具体的には、前記言語モデル８０７は、前記言語モデル更新手段８００が前記言語スコアにボーナスを付与する場合は、前記領域（Ａ）において、＋の数値（Ｂ）を記録し、前記言語スコアにペナルティを付与する場合は、前記領域（Ａ）において、−の数値（Ｐｉ）を記録できるように構成されている。図１１および図１２を参照して、本実施形態の音声認識装置の動作を説明する。 [Eighth embodiment]
In the present embodiment, yet another example of the speech recognition device and speech recognition method of the present invention is shown. FIG. 11 shows the configuration of the speech recognition device of this embodiment. As shown in the figure, the speech recognition device includes speech input means 811, feature quantity extraction means 810, speech recognition means 809, an acoustic model 813, a recognition dictionary 808, a language model 807, and the language model update device 800 of Embodiment 3. The acoustic model 813, the language model 807, and the recognition dictionary 808 are each stored in general storage means such as a hard disk. In the present embodiment, a hidden Markov model (HMM) is used as the acoustic model 813. The recognition dictionary 808 contains the notation and reading of each word that the speech recognition means 809 can recognize. The language model 807 is data that models the appearance frequency, connection probability, and the like of words, morphemes, phonemes, and so on learned from a large amount of speech data. As the language model 807, for example, an N-gram language model can be used. In the present embodiment, the language model 807 is based on a general N-gram and is extended so that keywords, similar words, or both can be identified by the language model update device 800. FIG. 12 illustrates the contents of the language model used in the present embodiment. As shown in the figure, the language model 807 describes, from the beginning to the end of a sentence, the probability that each word constituting a document connects to the word that follows it and the probability that a word appears at the beginning of a sentence.
In the present embodiment, identification information, for example, can be newly assigned to each keyword and similar word in the language model 807. The language model 807 can also record a language score for each word stored in the recognition dictionary 808. In the present embodiment, an example in which a connection probability value is newly assigned as the language score is described. That is, as shown in the figure, the language model 807 is provided with an area (A) that stores the bonus or penalty to be applied to the probability value with which the word is connected. Specifically, the language model 807 is configured so that, when the language model updating means 800 gives a bonus to the language score, a positive value (B) is recorded in the area (A), and when it gives a penalty to the language score, a negative value (Pi) is recorded in the area (A). The operation of the speech recognition device of this embodiment is described with reference to FIG. 11 and FIG. 12.

（８−１）通常音声認識処理
キーワードを特定することなく音声認識を行う場合、本実施形態の音声認識装置は、次のように処理を行う。まず、ユーザが、音声入力手段であるマイクロフォン８１１に音声を入力すると、前記特徴抽出手段８１０が、下記（８−２）の特徴量抽出処理を行う。
（８−２）特徴量抽出処理
前記特徴抽出手段８１０が、前記入力された音声を分析して特徴量を抽出し、抽出した前記特徴量を、前記音声認識手段８０９に対して出力する。次いで、前記音声認識手段８０９は、下記（８−３）の音声認識処理を行う。
（８−３）音声認識処理
前記音声認識手段８０９が、前記特徴量に基づき、前記音響モデル８１３においてモデル化されている音素との類似度を示す確率値を算出する。また、前記音声認識手段８０９は、前記言語モデル８０７から、単語列の候補を探索し、単語ごとに接続確率を加算し、かつ、前記音響モデル８１３を用いて算出した確率値を加算して、接続確率値が最大となる単語列を最も確からしい単語列として出力する。 (8-1) Normal Speech Recognition Processing When speech recognition is performed without specifying a keyword, the speech recognition device of the present embodiment operates as follows. First, when the user inputs speech to the microphone 811, which is the speech input means, the feature extraction means 810 performs the feature quantity extraction process (8-2) below.
(8-2) Feature Quantity Extraction Processing The feature extraction means 810 analyzes the input speech, extracts a feature quantity, and outputs the extracted feature quantity to the speech recognition means 809. Next, the speech recognition means 809 performs the speech recognition process (8-3) below.
(8-3) Speech Recognition Processing Based on the feature quantity, the speech recognition means 809 calculates a probability value indicating the similarity to the phonemes modeled in the acoustic model 813. The speech recognition means 809 also searches the language model 807 for candidate word strings, adds up the connection probability for each word, adds the probability value calculated using the acoustic model 813, and outputs the word string with the maximum connection probability value as the most likely word string.
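The selection step in (8-3) can be sketched as follows. This is a minimal illustration, not the device's actual search: it assumes the scores are log probabilities (so that adding them is meaningful), a bigram table keyed by word pairs, a hypothetical `<s>` sentence-start symbol, and a fixed floor score for unseen word pairs. All names and numbers are made up for the example.

```python
import math

UNSEEN_LOGPROB = math.log(1e-6)  # assumed floor for word pairs not in the table


def total_score(words, connection_logprob, acoustic_logprob):
    """Sum the per-word connection scores and add the acoustic score."""
    score = acoustic_logprob
    previous = "<s>"  # hypothetical sentence-start symbol
    for word in words:
        score += connection_logprob.get((previous, word), UNSEEN_LOGPROB)
        previous = word
    return score


def best_hypothesis(candidates, connection_logprob):
    """Return the candidate word string with the highest total score.

    candidates -- list of (word_list, acoustic_logprob) pairs
    """
    best_words, _ = max(
        candidates,
        key=lambda c: total_score(c[0], connection_logprob, c[1]),
    )
    return best_words


# Toy bigram table and two competing hypotheses (illustrative values only).
bigrams = {
    ("<s>", "I"): math.log(0.5),
    ("I", "see"): math.log(0.4),
    ("<s>", "eye"): math.log(0.1),
    ("eye", "sea"): math.log(0.05),
}
hypotheses = [(["I", "see"], -10.0), (["eye", "sea"], -9.5)]
result = best_hypothesis(hypotheses, bigrams)
```

Here the second hypothesis has the slightly better acoustic score, but the language model scores outweigh it, so the first word string is selected.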

（８−４）キーワード認識処理
ユーザが、特定の言語をキーワードとして音声認識装置に認識させたい場合、前記キーワード記憶手段８０１に対し、認識させたい未登録の単語（キーワード）または前記単語を含む文章を入力する。続いて、実施形態３と同じキーワード記憶処理により、前記キーワード記憶手段８０１が、キーワードに対応する単語を記憶する。次いで、類似度測定手段８０３が、下記（８−５）の類似度測定処理を行う。 (8-4) Keyword Recognition Processing When the user wants the speech recognition device to recognize a specific word as a keyword, the user inputs, to the keyword storage means 801, the unregistered word to be recognized (keyword) or a sentence containing that word. Subsequently, the keyword storage means 801 stores the word corresponding to the keyword by the same keyword storage process as in Embodiment 3. Next, the similarity measurement means 803 performs the similarity measurement process (8-5) below.

（８−５）類似度測定処理
前記類似度測定手段８０３が、前記認識辞書８０８を参照して、前記キーワードと認識辞書８０８に含まれる各単語との音響的な類似度を測定する。この類似度測定処理後、前記類似語抽出手段８０２が、下記（８−６）の類似語抽出処理を行う。 (8-5) Similarity Measurement Processing The similarity measurement means 803 refers to the recognition dictionary 808 and measures the acoustic similarity between the keyword and each word included in the recognition dictionary 808. After the similarity measurement process, the similar word extraction unit 802 performs the similar word extraction process (8-6) below.

（８−６）類似語抽出処理
前記類似語抽出手段８０２が、前記キーワードと一定以上の類似度を持った単語を前記キーワードと音響的に類似した類似語として抽出する。次いで、前記影響度算出手段８０５が、下記（８−７）の影響度算出処理を行う。 (8-6) Similar Word Extraction Processing The similar word extraction means 802 extracts words whose similarity to the keyword is at or above a certain level as similar words that are acoustically similar to the keyword. Next, the influence degree calculation means 805 performs the influence degree calculation process (8-7) below.
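Steps (8-5) and (8-6) can be sketched as a threshold filter over the recognition dictionary. The acoustic similarity measure is not specified in this passage, so the sketch below substitutes a plain string similarity over romanized readings (Python's `difflib.SequenceMatcher`) purely as a stand-in; the threshold, the dictionary layout, and the example words are likewise assumptions.

```python
from difflib import SequenceMatcher


def reading_similarity(reading_a: str, reading_b: str) -> float:
    """Stand-in for the acoustic similarity sim(a, b); returns a value in [0, 1]."""
    return SequenceMatcher(None, reading_a, reading_b).ratio()


def extract_similar_words(keyword_reading, dictionary, threshold):
    """Return (word, similarity) pairs at or above the threshold, most similar first.

    dictionary -- maps each registered word to its reading (hypothetical layout)
    """
    scored = [
        (word, reading_similarity(keyword_reading, reading))
        for word, reading in dictionary.items()
    ]
    similar = [(word, sim) for word, sim in scored if sim >= threshold]
    return sorted(similar, key=lambda pair: pair[1], reverse=True)


# Illustrative dictionary of romanized readings (not taken from the source).
recognition_dictionary = {
    "watashi": "watashi",
    "okashii": "okashii",
    "neko": "neko",
}
similar_words = extract_similar_words("okashi", recognition_dictionary, threshold=0.5)
```

In a real system the similarity would be computed on phoneme sequences or acoustic models rather than on spelling, but the thresholding step would have the same shape.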

（８−７）影響度算出処理
前記影響度算出手段８０５が、前記言語モデル８０７において前記キーワードの音声認識装置における認識率を高めることに伴う類似語の認識率に対する影響度を算出する。次いで、言語スコア調整手段８０４が、下記（８−８）の言語スコア調整処理を行う。 (8-7) Influence Degree Calculation Processing The influence degree calculation means 805 calculates the degree of influence that raising the recognition rate of the keyword in the speech recognition device, by way of the language model 807, has on the recognition rate of the similar words. Next, the language score adjusting means 804 performs the language score adjustment process (8-8) below.

（８−８）言語スコア調整処理
本実施形態では、言語スコア調整手段８０４は、前記影響度が大きいほど前記キーワードの言語スコアに与えるボーナスおよび前記類似語の言語スコアに与えるペナルティの大きさが少なくするよう調整する。具体的には、前記影響度を用いて、類似語の言語スコアに付与するペナルティ値（Ｐｉ）およびキーワードの言語スコアに付与するボーナス値（Ｂ）の双方またはいずれかを算出する。次いで、前記ペナルティ値（Ｐｉ）および前記ボーナス値（Ｂ）を示す情報を前記言語スコア更新手段８０６に対し出力する。 (8-8) Language Score Adjustment Processing In this embodiment, the language score adjusting means 804 adjusts the bonus given to the language score of the keyword and the penalty given to the language score of the similar word so that both become smaller as the degree of influence becomes larger. Specifically, it uses the degree of influence to calculate the penalty value (Pi) to be assigned to the language score of the similar word, the bonus value (B) to be assigned to the language score of the keyword, or both. It then outputs information indicating the penalty value (Pi) and the bonus value (B) to the language score updating means 806.
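One way to picture the calculation in (8-8) is with formulas (I) and (IV) as they appear in the claims: Pi = S(sim(w, wi)) − α log C(wi) and B = b − β[Pi]. The Python sketch below is illustrative only. It assumes the natural logarithm (the base is not stated), requires an appearance count of at least 1, passes S(sim(w, wi)) in as a number, and does not clamp the results; all function and parameter names are hypothetical.

```python
import math


def penalty_formula_i(s_value: float, appearance_count: int, alpha: float) -> float:
    """Formula (I): Pi = S(sim(w, wi)) - alpha * log C(wi).

    The natural logarithm is an ASSUMPTION; the source does not state the base.
    """
    if appearance_count < 1:
        raise ValueError("appearance_count must be at least 1")
    return s_value - alpha * math.log(appearance_count)


def bonus_formula_iv(base_bonus, beta, penalties, use_max=True):
    """Formula (IV): B = b - beta * [Pi].

    [Pi] is the maximum or the average of the similar-word penalties.
    """
    if not penalties:
        return base_bonus  # no similar words, so nothing to subtract (assumed)
    aggregate = max(penalties) if use_max else sum(penalties) / len(penalties)
    return base_bonus - beta * aggregate


# A frequently used similar word gets a smaller penalty than a rarely used one.
rare = penalty_formula_i(1.0, appearance_count=1, alpha=0.1)
frequent = penalty_formula_i(1.0, appearance_count=100, alpha=0.1)

# Bonus derived from two penalties, using the maximum and then the average.
bonus_max = bonus_formula_iv(1.0, 0.5, [0.5, 0.25], use_max=True)
bonus_avg = bonus_formula_iv(1.0, 0.5, [0.5, 0.25], use_max=False)
```

Whether negative penalties or bonuses should be clipped to zero is left open here, since the source does not say.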

（８−９）言語モデル更新処理
前記情報を受けた前記言語モデル更新装置８０６は、前記言語モデル８０７において、前記キーワードおよび前記類似語の双方またはいずれかの単語に対し、前記単語を特定するための識別情報を付与する。さらに、前記キーワードについては、前記領域（Ａ）に、前記ボーナス値（Ｂ）を記録する。また、前記類似語については、前記領域（Ａ）に、前記ペナルティ値（Ｐｉ）を記録する。さらに、前記言語モデル更新装置８０６は、前記音声認識手段８０９が、前記言語モデル８０７を探索する際に、下記（１）および（２）の双方またはいずれかの処理を行う。これにより、前記キーワードの接続確率が高くなり、前記キーワードが認識される精度を高めることができる。
（１）前記類似語の言語スコアに対し、前記領域（Ａ）に記録したペナルティ値を付与する。
（２）前記キーワードの言語スコア８０７に対し、前記領域（Ａ）に記録したボーナス値を付与する。 (8-9) Language Model Update Processing Upon receiving the information, the language model update device 806 assigns, in the language model 807, identification information for specifying the word to the keyword, the similar word, or both. For the keyword, it records the bonus value (B) in the area (A). For the similar word, it records the penalty value (Pi) in the area (A). Furthermore, when the speech recognition means 809 searches the language model 807, the language model update device 806 performs either or both of the following processes (1) and (2). This raises the connection probability of the keyword and improves the accuracy with which the keyword is recognized.
(1) The penalty value recorded in the area (A) is applied to the language score of the similar word.
(2) The bonus value recorded in the area (A) is applied to the language score of the keyword in the language model 807.
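Processes (1) and (2) can be sketched with a small data structure that mirrors the area (A) described for the language model 807. This is a hypothetical layout, not the actual format of FIG. 12: it assumes one entry per word, a tag field as the identification information, and an additive adjustment in which a bonus is stored as a positive value and a penalty as a negative value.

```python
from dataclasses import dataclass


@dataclass
class LanguageModelEntry:
    base_score: float   # original language score (e.g. a log connection probability)
    tag: str = ""       # identification information: "keyword", "similar", or ""
    area_a: float = 0.0  # area (A): positive for a bonus, negative for a penalty


def record_bonus(entry: LanguageModelEntry, bonus: float) -> None:
    """Mark the entry as a keyword and store the bonus B as a positive value."""
    entry.tag = "keyword"
    entry.area_a = abs(bonus)


def record_penalty(entry: LanguageModelEntry, penalty: float) -> None:
    """Mark the entry as a similar word and store the penalty Pi as a negative value."""
    entry.tag = "similar"
    entry.area_a = -abs(penalty)


def effective_score(entry: LanguageModelEntry) -> float:
    """Score used during search: area (A) is applied only to tagged entries."""
    if entry.tag:
        return entry.base_score + entry.area_a
    return entry.base_score


# Illustrative entries (words and scores are made up for the example).
model = {
    "new_keyword": LanguageModelEntry(base_score=-5.0),
    "similar_word": LanguageModelEntry(base_score=-2.0),
    "other_word": LanguageModelEntry(base_score=-3.0),
}
record_bonus(model["new_keyword"], 2.0)
record_penalty(model["similar_word"], 0.5)
scores = {word: effective_score(entry) for word, entry in model.items()}
```

Keeping the adjustment separate from the base score means the original language model values stay intact and the adjustment can be applied either by the update device or by the recognizer at search time.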

本実施形態によれば、言語モデル８０７に対し、前記言語スコア調整手段８０４により、類似語に対する影響度に応じて調整した言語スコアを付与できる。そのため、キーワードの認識率を高めることができ、かつ、例えば、類似語が重要な単語であった場合や、認識率が低い単語であった場合に、類似語の認識率が低下しすぎることがない。本実施形態によれば、このような類似語についても適切に認識でき、全体としての認識率を向上させることができる。 According to this embodiment, the language score adjusting means 804 can give the language model 807 a language score adjusted according to the degree of influence on similar words. The recognition rate of the keyword can therefore be increased while, for example, the recognition rate of a similar word that is an important word or a word with a low recognition rate does not drop too far. According to this embodiment, such similar words can also be recognized appropriately, and the overall recognition rate can be improved.

なお、本実施形態において、前記音声認識手段８０９が、前記（１）および前記（２）を行う手段を含めてもよい。この場合は、前記音声認識手段８０９において、音声認識処理を行う際に、前記識別情報によりキーワードおよび類似語を識別して、前記（１）および前記（２）の双方またはいずれかの処理を行うことができる。そのため、この場合は、前記言語モデル更新装置８０６は、前記（１）および前記（２）の処理のための機能を持たなくてよい。 In this embodiment, the speech recognition means 809 may include means for performing (1) and (2). In this case, when performing speech recognition processing, the speech recognition means 809 can identify keywords and similar words from the identification information and perform either or both of (1) and (2). The language model update device 806 therefore need not have a function for the processing of (1) and (2) in this case.

本実施形態の音声認識方法は、前記実施形態４における前記（ａ）から（ｄ）のステップに、下記（ｅ）から（ｆ）のステップをさらに含む音声認識方法である。
（ｅ）前記特徴量抽出手段８１０が、前記音声入力手段８１１において入力された音声を分析して音響特徴量を抽出するステップ。
（ｆ）前記音声認識手段８０９が、前記音響モデル８１２および前記言語モデル８０７を参照して、音響スコアと言語スコアを総合評価して、最も確からしいと判断できる単語列を、認識結果として出力するステップ。 The speech recognition method of the present embodiment is a speech recognition method that adds the following steps (e) and (f) to steps (a) to (d) of Embodiment 4.
(e) A step in which the feature quantity extraction means 810 analyzes the speech input through the speech input means 811 and extracts an acoustic feature quantity.
(f) A step in which the speech recognition means 809 refers to the acoustic model 812 and the language model 807, evaluates the acoustic score and the language score together, and outputs the word string judged most likely as the recognition result.

１００、２００、３００、４００、６００、７００、８００言語モデル更新装置
１０１、２０１、３０１、４０１、６０１、７０１、８０１キーワード記憶手段
１０２、２０２、３０２、４０２、６０２、７０２、８０２類似語抽出手段
１０４、２０４、３０４、４０４、６０４、７０４、８０４言語スコア調整手段
１０５、２０５、３０５、４０５、６０５、７０５、８０５類似度算出手段
１０６、２０６、３０６、４０６、６０６、７０６、８０６言語スコア更新手段
１０７、２０７、３０７、４０７、６０７、７０７、８０７言語モデル
１０８、２０８、３０８、４０８、６０８、７０８、８０８認識辞書
１０９、２０９、３０９、４０９、６０９、７０９、８０９音声認識手段
２１０、８１０特徴量抽出手段
２１１、８１１音声入力手段
３０３、８０３類似度測定手段
４１０出現回数記憶手段
６１０認識率記憶手段
７１０類似語提示手段
７１１影響度入力手段
８１３音響モデル

100, 200, 300, 400, 600, 700, 800 Language model update device
101, 201, 301, 401, 601, 701, 801 Keyword storage means
102, 202, 302, 402, 602, 702, 802 Similar word extraction means
104, 204, 304, 404, 604, 704, 804 Language score adjustment means
105, 205, 305, 405, 605, 705, 805 Influence degree calculation means
106, 206, 306, 406, 606, 706, 806 Language score update means
107, 207, 307, 407, 607, 707, 807 Language model
108, 208, 308, 408, 608, 708, 808 Recognition dictionary
109, 209, 309, 409, 609, 709, 809 Speech recognition means
210, 810 Feature quantity extraction means
211, 811 Speech input means
303, 803 Similarity measurement means
410 Appearance count storage means
610 Recognition rate storage means
710 Similar word presentation means
711 Influence degree input means
813 Acoustic model

Claims

1. A language model update device for a speech recognition device, comprising keyword storage means, influence degree calculation means, language score adjustment means, and language score update means, wherein
the keyword storage means stores a keyword,
the influence degree calculation means calculates a degree of influence on a similar word caused by increasing a recognition rate of the keyword,
the language score adjustment means adjusts, based on the degree of influence, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word, and
the language score update means updates the language score of a language model by performing at least one of giving the bonus adjusted by the language score adjustment means to the language score of the keyword and giving the penalty adjusted by the language score adjustment means to the language score of the similar word.

2. The language model update device for a speech recognition device according to claim 1, wherein the language score adjusting means decreases the size of the bonus or the penalty in accordance with the increase in the degree of influence.

3. The language model update device for a speech recognition device according to claim 1 or 2, wherein the influence degree calculation means calculates the degree of influence based on at least one selected from the group consisting of an appearance frequency of the similar word, a recognition rate of the similar word, and judgment information provided by a user.

4. The language model update device for a speech recognition device according to any one of claims 1 to 3, wherein the influence degree calculation means increases the degree of influence as the appearance frequency of the similar word increases.

5. The language model update device for a speech recognition device according to any one of claims 1 to 4, wherein the language score adjustment means calculates the penalty (Pi) according to the following formula (I).

Pi = S (sim (w, wi)) − αlogC (wi) (I)

In the formula (I), w is a keyword, wi is a similar word, sim (a, b) is a similarity between the words a and b, and S (sim) is a function for determining a penalty Pi from the similarity. C (w) is the number of appearances of the word w in the recognition history, and α is a constant coefficient.

6. The language model update device for a speech recognition device according to any one of claims 1 to 5, wherein the influence degree calculation means increases the degree of influence as the recognition rate of the similar word decreases.

7. The language model update device for a speech recognition device according to any one of claims 1 to 6, wherein the language score adjustment means calculates the penalty (Pi) according to the following formula (II).

Pi = S (sim (w, wi)) × P (wi) (II)

In the formula (II), P (w) is a recognition rate of the word w.

8. The language model update device for a speech recognition device according to any one of claims 1 to 7, wherein the language score adjustment means calculates the penalty (Pi) according to the following formula (III).

Pi = S (sim (w, wi)) × T (ui) (III)

In the above formula (III), ui is the influence degree input by the user, and T is the penalty reduction degree.

9. The language model update device for a speech recognition device according to any one of claims 1 to 8, wherein the language score adjustment means calculates the bonus (B) according to the following formula (IV).

B = b−β [Pi] (IV)

In the formula (IV), b is a constant bonus, β is a coefficient, and [Pi] is a maximum value of the penalty Pi of the similar word or an average value of the penalty of the similar word.

10. A speech recognition device comprising:
the language model update device for a speech recognition device according to any one of claims 1 to 9;
speech input means for inputting speech;
feature quantity extraction means for extracting a feature quantity of the input speech;
a language model; and
speech recognition means, wherein
the language model is updated by the language model update device for a speech recognition device, and
the speech recognition means recognizes the speech from the feature quantity extracted by the feature quantity extraction means, using the updated language model.

11. A language model update method for a speech recognition device, using keyword storage means, influence degree calculation means, language score adjustment means, and language score update means, the method comprising:
a keyword storage step in which the keyword storage means stores a keyword;
an influence degree calculation step in which the influence degree calculation means calculates a degree of influence on a similar word caused by increasing a recognition rate of the keyword;
a language score adjustment step in which the language score adjustment means adjusts, based on the degree of influence, at least one of a bonus given to the language score of the keyword and a penalty given to the language score of the similar word; and
a language score update step in which the language score update means updates the language score of a language model by performing at least one of giving the bonus adjusted by the language score adjustment means to the language score of the keyword and giving the penalty adjusted by the language score adjustment means to the language score of the similar word.

12. The language model update method for a speech recognition device according to claim 11, wherein, in the language score adjustment step, the language score adjustment means decreases the bonus or the penalty as the degree of influence increases.

13. The language model update method for a speech recognition device according to claim 11 or 12, wherein, in the influence degree calculation step, the influence degree calculation means calculates the degree of influence based on at least one selected from the group consisting of an appearance frequency of the similar word, a recognition rate of the similar word, and judgment information provided by a user.

14. The language model update method for a speech recognition device according to any one of claims 11 to 13, wherein, in the influence degree calculation step, the influence degree calculation means increases the degree of influence as the appearance frequency of the similar word increases.

15. The language model update method for a speech recognition device according to any one of claims 11 to 14, wherein, in the language score adjustment step, the language score adjustment means calculates the penalty (Pi) according to the following formula (I).

Pi = S (sim (w, wi)) − αlogC (wi) (I)

In the formula (I), w is a keyword, wi is a similar word, sim (a, b) is a similarity between the words a and b, and S (sim) is a function for determining a penalty Pi from the similarity. C (w) is the number of appearances of the word w in the recognition history, and α is a constant coefficient.

16. The language model update method for a speech recognition device according to any one of claims 11 to 15, wherein, in the influence degree calculation step, the influence degree calculation means increases the degree of influence as the recognition rate of the similar word decreases.

17. The language model update method for a speech recognition device according to claim 11, wherein, in the language score adjustment step, the language score adjustment means calculates the penalty (Pi) according to the following formula (II).

Pi = S (sim (w, wi)) × P (wi) (II)

In the formula (II), P (w) is a recognition rate of the word w.

18. The language model update method for a speech recognition device according to any one of claims 11 to 17, wherein, in the language score adjustment step, the language score adjustment means calculates the penalty (Pi) according to the following formula (III).

Pi = S (sim (w, wi)) × T (ui) (III)

In the above formula (III), ui is the influence degree input by the user, and T is the penalty reduction degree.

19. The language model update method for a speech recognition device according to any one of claims 11 to 18, wherein, in the language score adjustment step, the language score adjustment means calculates the bonus (B) according to the following formula (IV).

B = b−β [Pi] (IV)

In the formula (IV), b is a constant bonus, β is a coefficient, and [Pi] is a maximum value of the penalty Pi of the similar word or an average value of the penalty of the similar word.

20. A speech recognition method using the speech recognition device according to claim 10, comprising:
a step of updating the language model by the language model update method for a speech recognition device according to any one of claims 11 to 19;
a feature quantity extraction step in which the feature quantity extraction means extracts the feature quantity of the speech input by the speech input means; and
a speech recognition step in which the speech recognition means recognizes the speech from the feature quantity extracted by the feature quantity extraction means, using the language model.

21. A computer program capable of causing a computer to execute the method according to any one of claims 11 to 20.

22. A recording medium storing the computer program according to claim 21.