JPH08123471A

JPH08123471A - Speech recognition device

Info

Publication number: JPH08123471A
Application number: JP6265175A
Authority: JP
Inventors: Keisuke Watanabe; 圭輔渡邉; Akito Nagai; 明人永井; Yasushi Ishikawa; 泰石川
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1994-10-28
Filing date: 1994-10-28
Publication date: 1996-05-17
Anticipated expiration: 2017-03-18
Also published as: JP3265864B2

Abstract

PURPOSE: To extract the semantic features contained in an utterance at the same time as recognition, and to output the recognition result as a sequence of semantic features through a backward search. CONSTITUTION: A syntax network to which semantic features are attached is generated from syntax/semantic knowledge, in which semantic features are associated with syntax rules, and is output to a syntax network storage part 3. A forward search part 5 searches for syntax hypotheses for the input speech in accordance with the syntax network, using an acoustic dictionary part 4, and outputs search histories to a search history storage part 6. A search history rewriting part 7 refers to the semantic features attached to the syntax network and rewrites the search histories held in the search history storage part 6. A backward search part 8 traces back the search histories held in the search history storage part 6 to generate a recognition result as a sequence of semantic features.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech recognition device used for a natural-language man-machine interface.

[0002]

[Prior Art] FIG. 25 shows a conventional continuous speech recognition device described, for example, on pages 701-704 of the Proceedings of the 1991 International Conference on Acoustics, Speech & Signal Processing. Reference numeral 3 denotes a syntax network storage unit that holds a syntax network, and 4 denotes an acoustic dictionary unit that holds the standard patterns of acoustic models. Reference numeral 5 denotes a forward search unit that uses the syntax network and the acoustic models to search for syntax hypotheses for the input speech in accordance with the syntax network, and outputs a search history containing a syntax node of the syntax network, the time at which that syntax node was reached, the search score at that syntax node at that time, the syntax node reached immediately before that syntax node, the time at which that preceding syntax node was reached, and the word between the syntax node and the preceding syntax node. Reference numeral 6 denotes a search history storage unit that holds the search histories output from the forward search unit, and 8 denotes a backward search unit that reads out the search histories held in the search history storage unit, traces the syntax network in accordance with them, and generates a recognition result.

[0003] FIG. 26 shows the input speech supplied to the forward search unit 5. The input speech is extracted frame by frame at fixed time intervals t₀, t₁, t₂, .... Each extracted frame undergoes frequency analysis, from which feature parameters v₀, v₁, v₂, ... of, for example, 16 dimensions are extracted. These feature parameters v₀, v₁, v₂, ... are input to the forward search unit 5.
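
The framing and feature extraction described in this paragraph can be sketched roughly as follows. The text only says "frequency analysis", so the naive DFT and the 16 equal-width log band energies used here are assumptions made for illustration, as are the function names, the frame length of 64 samples and the frame interval of 32 samples.

```python
import cmath
import math

def split_frames(samples, frame_len, hop):
    """Cut the signal into frames taken at a fixed interval of `hop` samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def band_log_energies(frame, n_bands=16):
    """Naive DFT of one frame, reduced to n_bands log band energies.

    Assumes len(frame) >= 2 * n_bands so that every band holds at least one bin.
    """
    n = len(frame)
    half = n // 2
    # Power spectrum of the lower half of the DFT bins.
    power = [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                     for t, x in enumerate(frame))) ** 2
             for k in range(half)]
    width = half // n_bands
    # Sum the power in each band; the small constant avoids log(0).
    return [math.log(sum(power[b * width:(b + 1) * width]) + 1e-10)
            for b in range(n_bands)]

# Toy input: a sine wave that falls exactly on DFT bin 8 of a 64-sample frame.
signal = [math.sin(2 * math.pi * 8 * t / 64) for t in range(200)]
# One 16-dimensional vector per frame, playing the role of v0, v1, v2, ...
features = [band_log_energies(f) for f in split_frames(signal, 64, 32)]
```

Each element of `features` corresponds to one frame time; a real front end would use a windowed FFT or a filter bank rather than this quadratic-time DFT.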

[0004] FIG. 27 shows the standard patterns of the acoustic models stored in the acoustic dictionary unit 4. For example, pattern A, based on an HMM (hidden Markov model), is registered as the acoustic model of the phoneme /a/. The forward search unit 5 compares the sequence of input feature parameters v₀, v₁, v₂, ... with pattern A, pattern B, pattern C, ... shown in FIG. 27, and thereby computes the likelihood of each phoneme /a/, /i/, /u/, ... for the input feature parameter sequence.

[0005] FIG. 28 shows an example of a syntax used to generate a syntax network. The syntax rules are divided into a rule part and a dictionary part. The items written on the right-hand side in the dictionary part of FIG. 28 are called terminal symbols. A terminal symbol is never expanded further; that is, a terminal symbol never appears on the left-hand side in either the rule part or the dictionary part. In contrast, the symbols enclosed in < > in the rule part and the dictionary part are non-terminal symbols. In the dictionary part, non-terminal symbols appear on the left-hand side. In the rule part, both the left-hand side and the right-hand side are written with non-terminal symbols.
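
As a rough illustration of the rule part / dictionary part split described above, the following sketch stores a toy grammar and checks the stated constraints. The concrete rules and symbol names (`<sentence>`, `<amount>` and so on) are invented for illustration and are not the contents of FIG. 28.

```python
# Hypothetical toy grammar; the actual rules of FIG. 28 are not reproduced here.
RULE_PART = {
    "<sentence>": [["<amount>", "<particle>", "<ending>"]],
    "<amount>": [["<digit>", "<hundred>", "<unit>"]],
}
DICTIONARY_PART = {
    "<digit>": ["go"],
    "<hundred>": ["hjaku"],
    "<unit>": ["en"],
    "<particle>": ["de"],
    "<ending>": ["onegaisimasu"],
}

def is_nonterminal(symbol: str) -> bool:
    """Non-terminal symbols are the ones enclosed in angle brackets."""
    return symbol.startswith("<") and symbol.endswith(">")

def check_grammar(rule_part, dictionary_part) -> bool:
    # Terminal symbols never appear on a left-hand side.
    lhs_ok = all(is_nonterminal(s)
                 for s in list(rule_part) + list(dictionary_part))
    # Rule part: the right-hand sides contain non-terminals only.
    rules_ok = all(is_nonterminal(s)
                   for alternatives in rule_part.values()
                   for rhs in alternatives
                   for s in rhs)
    # Dictionary part: the right-hand sides are terminal symbols.
    dict_ok = all(not is_nonterminal(w)
                  for words in dictionary_part.values()
                  for w in words)
    return lhs_ok and rules_ok and dict_ok
```

Calling `check_grammar(RULE_PART, DICTIONARY_PART)` returns `True` for this toy grammar, while a rule whose left-hand side is a terminal such as `go` is rejected.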

[0006] FIG. 29 shows the syntax network created from the syntax shown in FIG. 28. The syntax network of FIG. 29 is the network stored in the syntax network storage unit 3. In FIG. 29, N1, N2, N3, ... are syntax nodes. An arrow from one syntax node to another is called a syntax arc.

[0007] When, for example, the acoustically analyzed feature parameter v₂₃ of a time t₂₃ is input, the forward search unit 5 refers to the syntax network held in the syntax network storage unit 3 and the standard patterns held in the acoustic dictionary unit 4 and, for every syntax node of the syntax network, computes a likelihood from the feature parameters and the standard patterns of the words connected to that syntax node. This likelihood is the search score. The forward search unit 5 then outputs a search history such as the one shown in FIG. 30. The search history of FIG. 30 is the one for the word hjaku at syntax node N3 of the syntax network of FIG. 29 at time t₂₃: gn holds the syntax node N3; frm holds the time t₂₃ at which syntax node N3 was reached; prob holds the search score at syntax node N3 at time t₂₃; pgn holds N2, the syntax node reached immediately before N3; sfrm holds the time t₁₅ at which the preceding syntax node N2 was reached; and word holds hjaku, the word between syntax node N3 and the preceding syntax node N2.
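
The six fields of a search history can be pictured as a small record type. The sketch below is one possible representation, not the device's actual storage format; the field names follow the text, times are written as frame indices, and the score value is a placeholder because the figure's value is not given in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchHistory:
    gn: str      # syntax node that was reached
    frm: int     # time (frame index) at which gn was reached
    prob: float  # search score at gn at time frm
    pgn: str     # syntax node reached immediately before gn
    sfrm: int    # time at which pgn was reached
    word: str    # word on the arc between pgn and gn

# The record described for FIG. 30; prob is a placeholder value.
example = SearchHistory(gn="N3", frm=23, prob=-42.0, pgn="N2", sfrm=15, word="hjaku")
```

The pair (`pgn`, `sfrm`) acts as a back-pointer: it identifies the records with matching (`gn`, `frm`) that the backward search visits next.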

[0008] Next, the operation of the backward search unit 8 is described with reference to FIGS. 31 and 32. As shown in FIG. 32, among the search histories generated at the final time t₇₃, the one with the highest search score is selected; then, in the same way, the search history with the highest search score is selected at time t₃₅, and so on. This yields the search history sequence (j)-(h)-(f)-(c)-(a), and hence the word sequence "go hjaku en de onegaisimasu" as one of the answer candidates. By further combining words with the second-highest or third-highest search score and so on, the top N answer candidates are obtained and taken as the speech recognition result. A process that extracts semantic features is then applied to these output word sequences.

[0009] Thus, in the conventional speech recognition device, all the search histories output from the forward search unit 5 are held in the search history storage unit 6, and the backward search unit 8 reads out the search histories held in the search history storage unit 6 and traces the syntax network in accordance with them, thereby generating a recognition result.

[0010]

[Problems to Be Solved by the Invention] When a speech recognition device is used for a natural-language man-machine interface, for example in a telephone hotel reservation system, the meaning contained in the recognition result must be extracted in order to drive the system. In the conventional speech recognition device described above, however, the recognition result output from the backward search unit 8 is a sequence of words, so a separate process that extracts, for example, semantic features from the words had to be performed after recognition.

[0011] Moreover, because the search history storage unit 6 holds all the search histories output from the forward search unit 5, it retains every syntax hypothesis, including those that are semantically the same but differ slightly in particles, word endings and the like. As a result, the top N recognition results output from the backward search unit 8 are occupied by semantically identical candidates, the correct answer is not included in the top N, and a correct recognition result cannot be obtained.

[0012] The present invention has been made to solve the problems described above. Its first object is to obtain a speech recognition device that extracts the semantic features contained in an utterance at the same time as recognition in the backward search, and outputs the recognition result as a sequence of semantic features.

[0013] Its second object is to obtain a speech recognition device in which the top N recognition results are not occupied by semantically identical candidates, and which outputs many semantically different answer candidates.

[0014]

[Means for Solving the Problems] A speech recognition device according to the present invention comprises: a syntax network storage unit that holds a syntax network to which semantic information has been attached; an acoustic dictionary unit that holds the standard patterns of acoustic models; a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for the input speech in accordance with the syntax network and outputs search histories; a search history storage unit that holds the search histories output from the forward search unit; a search history rewriting unit that rewrites the search histories held in the search history storage unit by referring to the semantic information attached to the syntax network; and a backward search unit that reads out the search histories held in the search history storage unit and generates a recognition result by tracing the syntax network in accordance with the search histories.

[0015] The speech recognition device further comprises: a syntax/semantic knowledge storage unit that holds syntax/semantic knowledge in which semantic information is associated with the syntactic knowledge defining the grammar of the input speech; and a syntax network generation unit that generates, from the syntax/semantic knowledge, the syntax network to which semantic information has been attached.

[0016] A speech recognition device according to the present invention comprises: a syntax network storage unit that holds a syntax network to which semantic information and operation rules have been attached; an acoustic dictionary unit that holds the standard patterns of acoustic models; a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for the input speech in accordance with the syntax network and outputs search histories; a search history storage unit that holds the search histories output from the forward search unit; a search history rewriting unit that rewrites the search histories held in the search history storage unit by referring to the semantic information attached to the syntax network; and a backward search unit that reads out the search histories held in the search history storage unit, traces the syntax network in accordance with them, performs operations on the semantic information using the semantic information and operation rules attached to the syntax network, and outputs a recognition result.

[0017] The speech recognition device further comprises: a syntax/semantic knowledge storage unit that holds syntax/semantic knowledge in which, within the syntactic knowledge defining the grammar of the input speech, semantic information is associated with words in the dictionary part that defines the words, and operation rules for the semantic information are associated in the rule part; and a syntax network generation unit that generates, from the syntax/semantic knowledge, the syntax network to which semantic information and operation rules have been attached.

[0018] In the speech recognition device according to the present invention, the forward search unit outputs, as a search history, a syntax node of the syntax network, the time at which that syntax node was reached, the search score at that syntax node at that time, the syntax node reached immediately before that syntax node, the time at which that preceding syntax node was reached, and the word between the syntax node and the preceding syntax node. When the search history storage unit holds a plurality of search histories for the same time and the same syntax node whose words have identical semantic information, the search history rewriting unit leaves some of those search histories in the search history storage unit and deletes the other search histories with identical semantic information from the search history storage unit.

[0019] The search history rewriting unit rewrites a search history held in the search history storage unit whose word has specific semantic information with the search history that corresponds to the preceding syntax node, and to the time at which that preceding syntax node was reached, recorded in that search history.

[0020]

[Operation] In the speech recognition device configured as described above, the search history rewriting unit rewrites the search histories held in the search history storage unit by referring to the semantic information attached to the syntax network, so the backward search unit can extract the meaning contained in the utterance at the same time as recognition and output the recognition result as a sequence of meanings.

[0021] Also, because the syntax/semantic knowledge storage unit holds semantic information in association with the syntactic knowledge, the syntax network generation unit automatically generates a syntax network to which semantic information has been attached.

[0022] Furthermore, because the backward search unit performs operations on the semantic information using the semantic information and the operation rules for semantic information attached to the syntax network, the result of those operations can be output as the recognition result.

[0023] Also, because the syntax/semantic knowledge storage unit holds semantic information and operation rules, the syntax network generation unit automatically generates a syntax network to which semantic information and operation rules have been attached.

[0024] Also, among the search histories held in the search history storage unit for the same time and the same syntax node whose words have identical semantic information, the search history rewriting unit leaves some in the search history storage unit and deletes the others from it. This reduces the number of search histories in the search history storage unit that carry semantically identical syntax hypotheses at the same time and the same syntax node, so the top N recognition results output by the backward search unit contain many semantically different answer candidates.

[0025] Furthermore, the search history rewriting unit rewrites a search history held in the search history storage unit whose word has a specific semantic feature with the search history that corresponds to the preceding syntax node, and to the time at which that preceding syntax node was reached, recorded in that search history. Words with the specific semantic feature are thereby removed from the syntax hypotheses, so the top N recognition results output by the backward search unit contain many semantically different answer candidates.

[0026]

[Embodiments]

Embodiment 1. FIG. 1 shows a speech recognition device according to one embodiment of the present invention. Reference numeral 1 denotes a syntax/semantic knowledge storage unit that holds syntax/semantic knowledge in which semantic information is associated with the syntactic knowledge defining the grammar of the input speech; 2 denotes a syntax network generation unit that generates, from the syntax/semantic knowledge, a syntax network to which semantic information has been attached; and 3 denotes a syntax network storage unit that holds the syntax network generated by the syntax network generation unit. Reference numeral 4 denotes an acoustic dictionary unit that holds the standard patterns of acoustic models. Reference numeral 5 denotes a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for the input speech in accordance with the syntax network, and outputs a search history containing a syntax node of the syntax network, the time at which that syntax node was reached, the search score at that syntax node at that time, the syntax node reached immediately before that syntax node, the time at which that preceding syntax node was reached, and the word between the syntax node and the preceding syntax node. Reference numeral 6 denotes a search history storage unit that holds the search histories output from the forward search unit, and 7 denotes a search history rewriting unit that rewrites the search histories held in the search history storage unit by referring to the semantic information attached to the syntax network. Reference numeral 8 denotes a backward search unit that reads out the search histories held in the search history storage unit and generates a recognition result by tracing the syntax network in accordance with the search histories.

[0027] FIG. 2 shows an example of the syntax/semantic knowledge held in the syntax/semantic knowledge storage unit 1. For example, in the rule <hundred> := hjaku, whose right-hand side is the terminal symbol hjaku, the semantic feature 100 is defined as the semantic information for the terminal symbol hjaku.

[0028] FIG. 3 shows an example of the syntax network that the syntax network generation unit 2 generates from the syntax/semantic knowledge held in the syntax/semantic knowledge storage unit 1 and that is held in the syntax network storage unit 3. Each syntax arc is given a terminal symbol of the syntactic knowledge together with the semantic feature for that terminal symbol. For example, the syntax arc for the terminal symbol hjaku is given the semantic feature 100 as the semantic information for hjaku.

[0029] In FIGS. 2 and 3, the semantic feature NULL is attached as the semantic information for case particles and word endings. NULL is attached to them because they are considered to carry no important meaning in the speech recognition process. For example, when a hotel reservation system asks about a price, it suffices for the speech recognition device to recognize the amount in the reply. Thus, whether the reply is "500 yen, please", "500 yen would be fine", or "It is 500 yen, but ...", the speech recognition device only needs to recognize "500 yen", and the other words need not be given any meaning. For this reason, as shown in FIGS. 2 and 3, case particles and word endings are given the semantic feature NULL.

[0030] In this embodiment, as in the conventional example, the forward search unit 5 outputs search histories such as the one shown in FIG. 4. The search history rewriting unit 7 refers to the syntax network held in the syntax network storage unit 3 and rewrites the word recorded in the word field of a search history to the semantic feature of that word; that is, hjaku is rewritten to 100. FIG. 5 shows the search history of FIG. 4 after rewriting.
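
A minimal sketch of this rewriting step is given below, assuming a record type with the six fields named in the text and a lookup table from terminal symbols to the semantic features attached to their arcs. The table contents are inferred from the "go hjaku en de onegaisimasu" example in the text; the names `SearchHistory`, `ARC_FEATURES` and `rewrite_word`, and the score value, are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SearchHistory:
    gn: str      # syntax node that was reached
    frm: int     # time at which gn was reached
    prob: float  # search score
    pgn: str     # preceding syntax node
    sfrm: int    # time at which pgn was reached
    word: str    # word, or its semantic feature after rewriting

# Semantic features attached to the syntax arcs (assumed table).
ARC_FEATURES = {
    "go": "5",
    "hjaku": "100",
    "en": "MONEY",
    "de": "NULL",
    "onegaisimasu": "NULL",
}

def rewrite_word(history, arc_features):
    """Return a copy whose word field holds the word's semantic feature."""
    return replace(history, word=arc_features[history.word])

before = SearchHistory(gn="N3", frm=23, prob=-42.0, pgn="N2", sfrm=15, word="hjaku")
after = rewrite_word(before, ARC_FEATURES)  # the word field becomes "100"
```

Only the word field changes; the node, time and score fields that the backward search relies on are left untouched.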

[0031] The operation of the backward search unit 8 is described for the case where the search histories shown in FIG. 6 are held in the search history storage unit 6. First, from among the search histories corresponding to the final time t₇₃ of the input speech and the final syntax node N6 of the syntax network, the search history (j), which has the highest search score, is selected from the search history storage unit 6. Next, from among the search histories (g), (h) and (i), whose gn equals N5, the pgn value of search history (j), and whose frm equals t₃₅, the sfrm value of search history (j), the search history (h) with the highest search score is selected from the search history storage unit 6. Tracing the search histories in the same way yields the search history sequence (j)-(h)-(f)-(c)-(a). Because this sequence is in reverse time order, the word fields are read in order starting from the last search history in the sequence, (a), and the semantic feature sequence "5 100 MONEY NULL NULL" is output as the recognition result.

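The backward trace described for Embodiment 1 can be sketched as follows. This is a best-path-only illustration under stated assumptions: the record layout follows the field names in the text, the word fields have already been rewritten to semantic features, and all scores as well as the arrival time at node N4 are placeholder values not taken from FIG. 6. The function name `backward_search` is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchHistory:
    gn: str      # syntax node that was reached
    frm: int     # time at which gn was reached
    prob: float  # search score
    pgn: str     # preceding syntax node
    sfrm: int    # time at which pgn was reached
    word: str    # semantic feature (after rewriting)

def backward_search(histories, final_node, final_time):
    """Follow the best-scoring records back from the final node and time."""
    candidates = [h for h in histories
                  if h.gn == final_node and h.frm == final_time]
    path = []
    while candidates:
        best = max(candidates, key=lambda h: h.prob)
        path.append(best)
        # Records that reached the preceding node at the recorded time.
        candidates = [h for h in histories
                      if h.gn == best.pgn and h.frm == best.sfrm]
    # The path was collected in reverse time order.
    return [h.word for h in reversed(path)]

stored = [
    SearchHistory("N2", 15, -10.0, "N1", 0, "5"),
    SearchHistory("N3", 23, -20.0, "N2", 15, "100"),
    SearchHistory("N3", 23, -25.0, "N2", 15, "100"),
    SearchHistory("N4", 28, -30.0, "N3", 23, "MONEY"),
    SearchHistory("N5", 35, -45.0, "N4", 28, "NULL"),
    SearchHistory("N5", 35, -40.0, "N4", 28, "NULL"),
    SearchHistory("N6", 73, -60.0, "N5", 35, "NULL"),
    SearchHistory("N6", 73, -70.0, "N5", 35, "NULL"),
]
result = backward_search(stored, final_node="N6", final_time=73)
```

The loop stops when no record reaches the preceding node at the recorded time, which happens at the start node N1. Producing the top N candidates would additionally require expanding the lower-scoring records at each step, which this sketch leaves out.
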
[0032] As described above, the speech recognition device according to this embodiment comprises: a syntax/semantic knowledge storage unit that holds syntax/semantic knowledge in which semantic information is associated with the syntactic knowledge defining the grammar of the input speech; a syntax network generation unit that generates, from the syntax/semantic knowledge, a syntax network to which semantic information has been attached; a syntax network storage unit that holds the syntax network generated by the syntax network generation unit; an acoustic dictionary unit that holds the standard patterns of acoustic models; a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for the input speech in accordance with the syntax network, and outputs a search history containing a syntax node of the syntax network, the time at which that syntax node was reached, the search score at that syntax node at that time, the syntax node reached immediately before that syntax node, the time at which that preceding syntax node was reached, and the word between the syntax node and the preceding syntax node; a search history storage unit that holds the search histories output from the forward search unit; a search history rewriting unit that rewrites the search histories held in the search history storage unit by referring to the semantic information attached to the syntax network; and a backward search unit that reads out the search histories held in the search history storage unit and generates a recognition result by tracing the syntax network in accordance with the search histories.

[0033] Embodiment 2. The operation of the search history rewriting unit 7 of FIG. 1 in another embodiment of the present invention is described. In Embodiment 2, the operations of components 1 to 6 and 8 of FIG. 1 are the same as in Embodiment 1, and their description is omitted. This embodiment describes an example that prevents the generation of many semantically similar answer candidates. For example, whether the recognition result is "go hjaku", "go bjaku" or "go pjaku", it means "5 100"; the example described here merges these three recognition results into a single recognition result.

【００３４】FIG. 7 shows an example of three search-history entries, among those held in the search history storage unit 6, whose gn, frm, pgn, and sfrm values are all equal. The search history rewriting unit 7 refers to the syntax network held in the network storage unit 3 to obtain the semantic feature of the word recorded in the word field of each entry. Because the words in the three entries of FIG. 7 all have the same semantic feature 100, the search history rewriting unit 7 deletes from the search history storage unit 6 the entries (b) and (c), whose search score prob is not the maximum. This amounts to rejecting two semantically identical syntax hypotheses that reach syntax node N3 at time t₂₃. Consequently, the backward search unit 8 never generates a recognition result based on these two syntax hypotheses.
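The deletion described for FIG. 7 can be sketched as follows. This is a minimal illustration and not the patented implementation: the field names follow the gn, frm, prob, pgn, sfrm, and word fields named in the text, while the `Entry` class, the function name `prune_same_meaning`, the scores, and the pairing of words with entries are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    gn: str      # syntax node reached
    frm: int     # time at which gn was reached
    prob: float  # search score at that time
    pgn: str     # previously reached syntax node
    sfrm: int    # time at which pgn was reached
    word: str    # word between pgn and gn


def prune_same_meaning(entries, sem):
    """Keep only the best-scoring entry per (gn, frm, pgn, sfrm, semantic feature)."""
    def key(e):
        return (e.gn, e.frm, e.pgn, e.sfrm, sem[e.word])

    best = {}
    for e in entries:
        k = key(e)
        if k not in best or e.prob > best[k].prob:
            best[k] = e
    # Preserve the original order of the surviving entries.
    return [e for e in entries if best[key(e)] is e]


# Hypothetical data: the scores below are invented for illustration.
SEM = {"hjaku": 100, "bjaku": 100, "pjaku": 100}
history = [
    Entry("N3", 23, 182.4, "N2", 15, "hjaku"),
    Entry("N3", 23, 175.0, "N2", 15, "bjaku"),
    Entry("N3", 23, 171.3, "N2", 15, "pjaku"),
]
kept = prune_same_meaning(history, SEM)
print([e.word for e in kept])
```

Entries that differ in any of the four position fields, or whose words have different semantic features, fall under different keys and are all kept.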

【００３５】FIG. 8 is a timing chart showing the operation of the search history rewriting unit 7. The search-history entries (a) to (l) shown in FIG. 6 are generated by the forward search unit 5 at the times shown in FIG. 8 and stored in the search history storage unit 6. For example, at time t₂₃, entries (c), (d), and (e) are first stored in the search history storage unit 6. The search history rewriting unit 7 inspects the three entries generated at time t₂₃ and, as described above, if their gn, frm, pgn, and sfrm values are all equal, keeps the one with the highest search score and deletes the others. In the example of FIG. 8, entry (c) is kept and entries (d) and (e) are deleted. The same is done at time t₃₅, where only entry (h) is kept and the other entries (g) and (i) are deleted. Likewise, at time t₇₃, entry (g) is kept and the other entries (k) and (l) are deleted.

【００３６】FIG. 9 shows the state before and after search-history entries are deleted. FIG. 9(a) shows the recognition result when the search history rewriting unit has the semantic-feature assignment function of Embodiment 1. FIG. 9(b) shows the recognition result when the search history rewriting unit has the search-history deletion function of Embodiment 2. In the case of FIG. 9(a), 2 × 3 × 3 × 3 = 54 combinations are possible, and the top N candidates are chosen from among these 54. However, because many of the 54 combinations have the same semantic features, the top N end up containing many identical results. In contrast, when semantically identical entries are deleted as shown in FIG. 9(b), only two combinations remain, so the top N can contain many candidates with different meanings.

【００３７】With reference to FIG. 8, the search history rewriting unit 7 was described as deleting entries at each time at which new entries are generated. Alternatively, no deletion need be performed at the time entries are stored in the search history storage unit 6; instead, the search history rewriting unit 7 may delete the unnecessary entries held in the search history storage unit immediately before the backward search unit 8 runs its backward search. FIG. 9(c) is described in Embodiment 3 below.

【００３８】As described above, this embodiment includes a search history rewriting unit that, when the search history storage unit holds multiple entries for the same time and the same syntax node whose words have the same semantic feature, keeps in the search history storage unit only the entry with the highest score among them and deletes the other entries with that semantic feature.

【００３９】Embodiment 3. The operation of the search history rewriting unit 7 of FIG. 1, in one embodiment of the invention, is described for the case where the search history storage unit 6 holds the search history shown in FIG. 10. In Embodiment 3, the operations of units 1 to 6 and 8 in FIG. 1 are the same as in Embodiment 1, so their description is omitted. This embodiment describes removing in advance words whose semantic features have no effect on the recognition result. For example, a recognition result "500円で" ("for 500 yen") is reduced to "500円" ("500 yen"). When the semantic feature of the particle "で" (de) is NULL, the particle is considered dispensable and is deleted, as described below.

【００４０】Among the entries held in the search history storage unit 6, the search history rewriting unit 7 rewrites an entry whose word has a particular semantic feature, for example entry (d), whose word has the semantic feature NULL that is not needed in the recognition result, as follows. First, it selects from the search history storage unit 6 the entries (a), (b), and (c), whose gn equals the pgn value N4 of entry (d) and whose frm equals the sfrm value t₃₀ of entry (d). Next, it computes the difference delta = 23.433 between the search score of entry (d), 182.395, and the maximum search score among entries (a), (b), and (c), 158.962. It then deletes entry (d) from the search history storage unit 6 and, in its place, creates the entries (e), (f), and (g) shown in FIG. 11 and writes them to the search history storage unit 6. In these new entries, gn is the gn value N5 of entry (d); frm is the frm value t₃₅ of entry (d); prob is the prob value of entry (a), (b), or (c), respectively, plus delta; sfrm is the sfrm value of entry (a), (b), or (c), respectively; and word is the word value of entry (a), (b), or (c), respectively.
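The rewriting of a NULL entry can be sketched as follows. This is an illustrative reading of the steps above, not the patented implementation. The values N4, N5, t₃₀, t₃₅, 182.395, and 158.962 come from the text; the `Entry` class, the function name `splice_out_null`, the placeholder words, the remaining scores, and the pgn/sfrm values of entries (a) to (c) are assumptions. The text does not state the pgn of the new entries; this sketch assumes it is copied from the predecessor.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    gn: str      # syntax node reached
    frm: int     # time at which gn was reached
    prob: float  # search score at that time
    pgn: str     # previously reached syntax node
    sfrm: int    # time at which pgn was reached
    word: str    # word between pgn and gn


def splice_out_null(entries, null_entry):
    """Replace a NULL-word entry by copies of its predecessors that skip the NULL word."""
    preds = [e for e in entries
             if e.gn == null_entry.pgn and e.frm == null_entry.sfrm]
    if not preds:
        return list(entries)  # nothing to splice in; leave the history unchanged
    delta = null_entry.prob - max(p.prob for p in preds)
    # Assumption: pgn, sfrm, and word are carried over from each predecessor.
    rewritten = [replace(p, gn=null_entry.gn, frm=null_entry.frm, prob=p.prob + delta)
                 for p in preds]
    return [e for e in entries if e is not null_entry] + rewritten


# (a)-(c) reach N4 at t30; (d) is the NULL word "de" from N4 (t30) to N5 (t35).
a = Entry("N4", 30, 158.962, "N3", 23, "w_a")  # placeholder word
b = Entry("N4", 30, 151.210, "N3", 23, "w_b")  # invented score
c = Entry("N4", 30, 149.875, "N3", 23, "w_c")  # invented score
d = Entry("N5", 35, 182.395, "N4", 30, "de")

result = splice_out_null([a, b, c, d], d)
new_entries = result[-3:]
for e in new_entries:
    print(e)
```

Adding the same delta to every predecessor keeps their relative ranking while making the best new entry inherit the score that the NULL entry had at N5.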

【００４１】FIG. 12 shows the state of the search history storage unit 6. FIG. 12(a) shows the search history storage unit for the search history of FIG. 10, and FIG. 12(b) shows the state of the search history storage unit for FIG. 11. FIG. 13 shows recognition results: FIG. 13(a) shows the recognition result for FIG. 10, and FIG. 13(b) shows the recognition result for FIG. 11.

【００４２】FIG. 9(c) shows the state after the results recognized in Embodiments 1 and 2 have additionally had the search-history entries of words with the semantic feature NULL, which is not needed in the recognition result, removed according to Embodiment 3. As FIG. 9(c) shows, the "MONEY NULL NULL" between syntax nodes N3 and N6 loses its two "NULL" items and is recognized as "MONEY".

【００４３】FIG. 14 shows another example, in which the search-history deletions of Embodiments 2 and 3 are combined. When the search history of FIG. 14(a) is stored, rewriting the unnecessary entries (those whose semantic feature is NULL) as in Embodiment 3 yields the state of FIG. 14(b). Deleting the entries with identical semantic features from the state of FIG. 14(b), as in Embodiment 2, yields the state of FIG. 14(c). Combining Embodiments 2 and 3 in this way reduces the number of entries that are unnecessary for the recognition result or that have the same meaning. The example of FIG. 9(c) applies Embodiment 2 first and then Embodiment 3, whereas FIG. 14 applies Embodiment 3 first and then Embodiment 2. Applying either embodiment first effectively reduces the number of entries. It is therefore desirable to cover both orders, for example by applying Embodiment 2, then Embodiment 3, and then Embodiment 2 again.

【００４４】As described above, this embodiment includes a search history rewriting unit that rewrites a search-history entry held in the search history storage unit whose word has the particular semantic feature NULL, using the entries that correspond to the previously reached syntax node recorded in that entry and to the time at which that previous syntax node was reached.

【００４５】Embodiment 4. The syntax/semantic knowledge storage unit 1, the syntax network generation unit 2, the syntax network storage unit 3, and the backward search unit 8 of FIG. 1, in one embodiment of the invention, are described below. In Embodiment 4, the operations of units 4 to 7 in FIG. 1 are the same as in Embodiment 1, so their description is omitted. The embodiments described above can produce a recognition result such as "5 100", but what is actually meant is 5 × 100 = 500, and it is desirable to obtain that value as the recognition result. This embodiment describes how to perform an operation such as 5 × 100 and output its result as the recognition result.

【００４６】FIG. 15 shows an example of the syntax/semantic knowledge held in the syntax/semantic knowledge storage unit 1. The dictionary part, which defines words, associates a semantic feature with each word, and the rule part associates operation rules on semantic features with the syntax rules. For example, sem(〈number1〉) denotes the semantic feature obtained from the non-terminal symbol 〈number1〉. The operation rule sem(〈charge2〉) = sem(〈number1〉) × sem(〈number2〉), defined for syntax rule (2) of FIG. 15, states that the semantic feature of the non-terminal symbol 〈charge2〉 is the product of sem(〈number1〉) and sem(〈number2〉).

【００４７】FIG. 16 shows an example of the syntax network that the syntax network generation unit 2 generates from the syntax/semantic knowledge held in the syntax/semantic knowledge storage unit 1 and that is held in the syntax network storage unit 3. Each syntax arc is labeled with a terminal symbol of the syntactic knowledge together with the semantic feature of that terminal symbol. For example, the syntax arc for the terminal symbol hjaku carries the semantic feature 100 of hjaku. Likewise, syntax node N1, for example, carries the operation rules sem(〈charge2〉) = sem(〈number1〉) × sem(〈number2〉) and sem(〈number1〉) = sem(go).

【００４８】The operation of the backward search unit 8 is described for the case where the search history storage unit 6 holds the search history shown in FIG. 17. Among the entries corresponding to the final time t₂₃ of the input speech and the final syntax node N3 of the syntax network, the entry (c) with the highest search score is selected from the search history storage unit 6. The semantic features are then computed as follows. First, the semantic feature sem(hjaku) = 100 of the word hjaku, which is the value of word, is obtained by referring to the syntax network. Next, referring to the operation rules attached to syntax node N2, which is the value of pgn, the operations sem(〈hundred〉) = sem(hjaku) and sem(〈number2〉) = sem(〈hundred〉) are performed, giving sem(〈number2〉) = 100. Once this computation is finished, the entry (a), whose gn equals the pgn value N2 of entry (c) and whose frm equals the sfrm value t₁₅ of entry (c), is selected from the search history storage unit 6, and the semantic features are computed as follows. First, the semantic feature sem(go) = 5 of the word go, which is the value of word, is obtained by referring to the syntax network. Next, referring to the operation rules attached to syntax node N1, which is the value of pgn, the operation is performed, giving sem(〈number1〉) = 5. Finally, since sem(〈number2〉) = 100 has already been obtained from the computation at syntax node N2, sem(〈charge2〉) = sem(〈number1〉) × sem(〈number2〉) = 5 × 100 = 500 is obtained.

【００４９】From the operation rule sem(sentence) = sem(〈charge2〉) attached to syntax node N1 of FIG. 15, the backward search unit 8 outputs the semantic feature 500 as the recognition result.
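The traceback and the computation 5 × 100 = 500 can be sketched as follows. This is a simplified illustration, not the patented implementation: the node names, times, words, and semantic features follow the text, while the scores, the start time 0, the dictionary-based history format, and the `NODE_SLOT` table (a stand-in for the operation rules attached to syntax nodes) are assumptions. The sketch also assumes that sfrm is always earlier than frm, so the loop terminates.

```python
# Semantic features attached to the syntax arcs (from the text).
WORD_SEM = {"go": 5, "hjaku": 100}

# Simplified stand-in for the rules attached to syntax nodes: the word on the
# arc leaving each node fills one operand of the rule for charge2.
NODE_SLOT = {"N1": "number1", "N2": "number2"}


def trace_back(history, final_node, final_time):
    """Follow pgn/sfrm links from the final node, taking the best entry at each step."""
    path = []
    node, time = final_node, final_time
    while True:
        candidates = [h for h in history if h["gn"] == node and h["frm"] == time]
        if not candidates:
            break  # start node reached: no entry leads into it
        best = max(candidates, key=lambda h: h["prob"])
        path.append(best)
        node, time = best["pgn"], best["sfrm"]
    return path  # last word first


def evaluate_charge2(path):
    """Apply sem(charge2) = sem(number1) * sem(number2) to the traced path."""
    slots = {NODE_SLOT[h["pgn"]]: WORD_SEM[h["word"]] for h in path}
    return slots["number1"] * slots["number2"]


# Hypothetical history: scores and the start time 0 are invented.
history = [
    {"gn": "N2", "frm": 15, "prob": 90.0, "pgn": "N1", "sfrm": 0, "word": "go"},
    {"gn": "N3", "frm": 23, "prob": 182.4, "pgn": "N2", "sfrm": 15, "word": "hjaku"},
]
path = trace_back(history, "N3", 23)
result = evaluate_charge2(path)
print(result)  # 500
```

A fuller version would evaluate arbitrary rule expressions per node; the fixed two-operand product here covers only the 〈charge2〉 rule discussed in the text.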

【００５０】As described above, this embodiment includes: a syntax/semantic knowledge storage unit that holds syntax/semantic knowledge in which, within the syntactic knowledge defining the grammar of the input speech, the dictionary part defining words associates semantic features with words and the rule part associates operation rules on semantic features; and a backward search unit that reads the search history held in the search history storage unit, traces the syntax network according to that history, computes semantic features using the semantic features and operation rules attached to the syntax network, and outputs a sequence of semantic features as the recognition result.

【００５１】Embodiment 5. This embodiment addresses frame-synchronous continuous speech recognition based on the N-Best paradigm and, in particular, compares the technique used in Embodiments 1 to 4 with the conventional lattice N-Best method. Efficient search algorithms for continuous speech recognition based on the N-Best paradigm have been proposed in R. Schwartz and Y.-L. Chow, "The N-Best algorithm: An efficient and exact procedure for finding the N most likely sentence hypotheses", Proc. ICASSP, pp. 81-84 (1990), and in R. Schwartz and S. Austin, "A comparison of several approximate algorithms for finding multiple (N-BEST) sentence hypotheses", Proc. ICASSP, pp. 701-704 (1991). With this approach, the forward search generates many syntax hypotheses that are semantically the same and differ only slightly, for instance in particles or word endings, so the top N candidates obtained contain many semantically identical candidates. As a result, the correct candidate may fall outside the top N, and the correct recognition result cannot be obtained. Embodiment 5 therefore proposes, as in Embodiments 1 to 4, a technique that prunes semantically identical hypotheses during the forward search and extracts the semantic features contained in the utterance at the same time as recognition, so as to obtain many semantically different candidates for a small N.

【００５２】The purpose of speech recognition in a system is to obtain, from the user's utterance, a semantic feature sequence that serves as the input parameters to the system. It is therefore important to obtain as many semantically different candidates as possible within the limited N candidates, so that the correct semantic feature sequence is not lost from the recognition result. As a technique that uses meaning during recognition, Minami, Yamada, Yoshioka, and Shikano ("自由発声音声認識における意味を考慮した２段ＬＲパーザ" [A two-stage LR parser that takes meaning into account in free-utterance speech recognition], 音構論, pp. 69-70 (1993)) propose a beam search that uses meaning within the HMM-LR framework. The present technique, in contrast, uses the semantic information attached to the syntax network to prune semantically identical hypotheses during the forward search of a frame-synchronous N-Best search.

【００５３】In the syntax used by the recognition unit of a dialogue system, as shown in FIG. 18, the top-level rule (the rule with the sentence start symbol on its left-hand side) is defined as a sequence of items that specify the operation of the system. Each item is defined by an intra-item grammar. The dictionary part of the grammar defines, for each non-terminal symbol, a phoneme sequence and the semantic feature of that word. The rule part of the grammar defines operation rules on semantic features; these rules are used during traceback to generate a single semantic feature from several. In the example of FIG. 18, the semantic feature of 〈charge3〉 is obtained as the product of $1, the semantic feature of the non-terminal symbol 〈number〉, and $2, the semantic feature of 〈thousand〉. One of the semantic features that can be defined for a word is "NULL", which means that the word carries no meaning. For example, the two utterances "８月２６日" ("August 26") and "８月の２６日" ("the 26th of August") express the same semantic features, and the case particle "の" (no) carries none. Such words are given "NULL". The semantic information described above is embedded in the syntax arcs and syntax nodes when the syntax is expanded into a syntax network.

【００５４】The lattice N-Best method of R. Schwartz and S. Austin, "A comparison of several approximate algorithms for finding multiple (N-BEST) sentence hypotheses", Proc. ICASSP, pp. 701-704 (1991), keeps a traceback pointer for every word entering a syntax node. The technique proposed in this embodiment instead prunes hypotheses in the following two ways.
(1) Among traceback pointers with the same start frame time, the same source syntax node, and the same semantic feature, only the one with the highest score is kept (see FIG. 19).
(2) The traceback pointer of a word whose semantic feature is "NULL" is rewritten using the traceback pointers held at the source syntax node for the start time of that word (see FIG. 20).

【００５５】By (1), multiple hypotheses with the same semantic feature that leave node S′ at time t′ and enter node S at time t are reduced to one. By (2), words whose semantic feature is "NULL" are removed from the hypotheses (see FIG. 21). Furthermore, as shown in FIG. 22, (1) is applied after the "NULL" removal of (2), so that among hypotheses with identical semantic features, even those whose times at intermediate syntax nodes differ, only the one with the highest score is kept and all others are removed.

【００５６】The technique of this embodiment was evaluated in a speaker-independent continuous speech recognition experiment on a hotel reservation task. The syntax rules used allow a relatively high degree of freedom, including filler words, and comprise 286 word syntax rules and a vocabulary of 146 words. The input consisted of 125 sentences in total: 25 different sentences, each uttered by five adult male speakers. The number N of candidates output as the recognition result was varied over 5, 10, 20, ..., 100, and the number of candidates with distinct semantic feature sequences was counted. FIG. 23 shows the average over the 125 sentences, and FIG. 24 shows the recognition rate in terms of semantic feature sequences. With the conventional lattice N-Best method, only 4.9 of the top 100 candidates have distinct semantic feature sequences, so most candidates are semantically the same. With the technique proposed in this embodiment, 5.1 candidates with distinct semantic feature sequences appear within the top 10, so more semantically different candidates are obtained with a smaller N than with the lattice N-Best method. In recognition rate as well, the top-10 result of the present technique is better than the top-100 result of the lattice N-Best method. Even after the technique is applied, however, the N candidates do not all have mutually different meanings, because the forward search cannot prune hypotheses that have different start times and semantic features other than "NULL". This can be addressed by pruning candidates with the same semantic feature sequence during the backward traceback, which yields still more semantically different candidates.
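The measurement used in this evaluation, the number of candidates with distinct semantic feature sequences among the top N, can be sketched as follows. This is an illustration only: the candidate list, the words "kyuu" and "en", the feature "YEN", the representation of NULL as `None`, and the choice to drop NULL features before comparing sequences are all assumptions, not data or procedure from the experiment.

```python
def semantic_sequence(words, sem):
    """Semantic features of one candidate, with NULL (None) features dropped."""
    return tuple(sem[w] for w in words if sem[w] is not None)


def count_distinct_meanings(nbest, sem, n):
    """Number of distinct semantic feature sequences among the top n candidates."""
    return len({semantic_sequence(words, sem) for words in nbest[:n]})


# Hypothetical dictionary and N-best list, best candidate first.
SEM = {"go": 5, "kyuu": 9, "hjaku": 100, "bjaku": 100, "en": "YEN", "de": None}
nbest = [
    ["go", "hjaku", "en"],
    ["go", "bjaku", "en"],
    ["go", "hjaku", "en", "de"],
    ["kyuu", "hjaku", "en"],
]

for n in (3, 4):
    print(n, count_distinct_meanings(nbest, SEM, n))
```

In this toy list the first three candidates collapse to one meaning, which mirrors the situation reported for the lattice N-Best method, where most of the top candidates share a semantic feature sequence.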

【００５７】As described above, this embodiment proposes a technique that prunes semantically identical hypotheses in the forward search, searches efficiently, and obtains many candidates with different meanings for a small N. In the recognition experiment, candidate semantic feature sequences that the lattice N-Best method could not obtain even when the top 100 were considered were obtained by traceback of the top 10, which confirms the effectiveness of the technique.

【００５８】[0058]

[Effects of the Invention] Because the invention is configured as described above, it provides the following effects.

【００５９】Providing a search history rewriting unit that rewrites the search history held in the search history storage unit by referring to the semantic information attached to the syntax network allows the backward search unit to extract the meaning contained in the utterance at the same time as recognition and to output the recognition result as a sequence of meanings.

【００６０】In addition, a syntax network annotated with semantic information can be generated automatically.

【００６１】In addition, providing a backward search that computes semantic information using the semantic information and operation rules attached to the syntax network, and that outputs a sequence of semantic information as the recognition result, allows the semantic information contained in the utterance to be extracted at the same time as recognition.

【００６２】In addition, a syntax network annotated with semantic information and operation rules can be generated automatically.

【００６３】In addition, providing a search history rewriting unit that deletes from the search history storage unit the unnecessary entries among those held for the same time and the same syntax node whose words have the same semantic information reduces the number of held entries that represent semantically identical syntax hypotheses at the same time and syntax node. As a result, the top N recognition results output by the backward search unit contain many semantically different candidates.

【００６４】Furthermore, providing a search history rewriting unit that rewrites an entry held in the search history storage unit whose word has particular semantic information, using the entries that correspond to the previously reached syntax node recorded in that entry and to the time at which that previous syntax node was reached, removes words with that particular semantic information from the syntax hypotheses. As a result, the top N recognition results output by the backward search unit contain many semantically different candidates.

[Brief description of drawings]

【図１】本発明の一実施例を示す音声認識装置の機能
ブロック構成図。FIG. 1 is a functional block configuration diagram of a voice recognition device showing an embodiment of the present invention.

【図２】本発明の実施例１での構文・意味知識記憶部
に保持される構文・意味知識の一例を示す図。FIG. 2 is a diagram showing an example of syntax / semantic knowledge held in a syntax / semantic knowledge storage unit according to the first embodiment of the present invention.

【図３】本発明の構文・意味知識記憶部に保持される
構文・意味知識から、構文ネットワーク生成部によって
生成され、構文ネットワーク記憶部に保持される構文ネ
ットワークの一例を示す図。FIG. 3 is a diagram showing an example of a syntax network generated by a syntax network generation unit from the syntax / semantic knowledge held in the syntax / semantic knowledge storage unit of the present invention and held in the syntax network storage unit.

【図４】本発明の実施例１における探索履歴の一例を
示す図。FIG. 4 is a diagram showing an example of a search history according to the first embodiment of the present invention.

【図５】本発明の実施例１における探索履歴書き換え
部によって、書き換えられた探索履歴の一例を示す図。FIG. 5 is a diagram showing an example of a search history rewritten by a search history rewriting unit according to the first embodiment of the present invention.

【図６】本発明の実施例１における探索履歴記憶部に
保持された探索履歴の一例を示す図。FIG. 6 is a diagram showing an example of a search history stored in a search history storage unit according to the first embodiment of the present invention.

【図７】本発明の実施例２における、ｇｎ，ｆｒｍ，
ｐｇｎ，ｓｆｒｍがすべて等しく、且つ、ｗｏｒｄの意
味素性が等しい探索履歴の一例を示す図。FIG. 7 is a diagram showing an example of search histories in Embodiment 2 of the present invention in which gn, frm, pgn, and sfrm are all equal and the semantic features of word are also equal.

【図８】本発明の実施例における探索履歴書き換え部
の動作を示す図。FIG. 8 is a diagram showing an operation of a search history rewriting unit in the embodiment of the present invention.

【図９】本発明の実施例における探索履歴書き換え部
の動作を示す図。FIG. 9 is a diagram showing an operation of a search history rewriting unit in the embodiment of the present invention.

【図１０】本発明の実施例３における探索履歴の一例
を示す図。FIG. 10 is a diagram showing an example of a search history according to the third embodiment of the present invention.

【図１１】本発明の実施例３における探索履歴書き換
え部によって新たに作成される探索履歴の一例を示す
図。FIG. 11 is a diagram showing an example of a search history newly created by a search history rewriting unit according to the third embodiment of the present invention.

【図１２】本発明の実施例３における探索履歴書き換
え部の動作を示す図。FIG. 12 is a diagram showing an operation of a search history rewriting unit in Embodiment 3 of the present invention.

【図１３】本発明の実施例３における探索履歴書き換
え部の動作を示す図。FIG. 13 is a diagram showing an operation of a search history rewriting unit in Embodiment 3 of the present invention.

【図１４】本発明の実施例２及び実施例３における探
索履歴書き換え部の動作を示す図。FIG. 14 is a diagram showing an operation of a search history rewriting unit in Embodiments 2 and 3 of the present invention.

【図１５】本発明の実施例４での構文・意味知識記憶
部に保持される構文・意味知識の一例を示す図。FIG. 15 is a diagram showing an example of syntax / semantic knowledge held in a syntax / semantic knowledge storage unit according to the fourth embodiment of the present invention.

【図１６】本発明の実施例４における構文・意味知識
記憶部に保持される構文・意味知識から、構文ネットワ
ーク生成部によって生成され、構文ネットワーク記憶部
に保持される構文ネットワークの一例を示す図。FIG. 16 is a diagram showing an example of a syntax network generated by a syntax network generation unit from the syntax / semantic knowledge stored in the syntax / semantic knowledge storage unit according to the fourth embodiment of the present invention and stored in the syntax network storage unit.

【図１７】本発明の実施例４における探索履歴記憶部
に保持された探索履歴の一例を示す図。FIG. 17 is a diagram showing an example of a search history stored in a search history storage unit according to the fourth embodiment of the present invention.

【図１８】本発明の実施例５における構文の例を示す
図。FIG. 18 is a diagram showing an example of syntax in Embodiment 5 of the present invention.

【図１９】本発明の実施例５における前向き探索にお
ける意味素性単位での枝刈りを示す図。FIG. 19 is a diagram showing pruning in units of semantic features during the forward search in Embodiment 5 of the present invention.

【図２０】本発明の実施例４における意味素性ＮＵＬ
Ｌに対するトレースバックポインタの書き換えを示す
図。FIG. 20 is a diagram showing the rewriting of the traceback pointer for the semantic feature NULL in Embodiment 4 of the present invention.

【図２１】本発明の実施例５における意味素性がＮＵ
ＬＬである単語の削除を示す図。FIG. 21 is a diagram showing the deletion of words whose semantic feature is NULL in Embodiment 5 of the present invention.

【図２２】本発明の実施例５における途中のノードで
の時刻が異なる仮説の枝刈りを示す図。FIG. 22 is a diagram showing pruning of a hypothesis with a different time at a node on the way in the fifth embodiment of the present invention.

【図２３】本発明の実験結果を示す図。FIG. 23 is a diagram showing an experimental result of the present invention.

【図２４】本発明の意味素性系列での認識率を示す
図。FIG. 24 is a diagram showing a recognition rate in the semantic feature series of the present invention.

【図２５】従来の音声認識装置を示す図。FIG. 25 is a diagram showing a conventional speech recognition device.

【図２６】入力音声を示す図。FIG. 26 is a diagram showing input speech.

【図２７】音響辞書部を示す図。FIG. 27 is a diagram showing an acoustic dictionary unit.

【図２８】従来の構文を示す図。FIG. 28 is a diagram showing a conventional syntax.

【図２９】従来の構文ネットワークを示す図。FIG. 29 is a diagram showing a conventional syntax network.

【図３０】従来の探索履歴を示す図。FIG. 30 is a diagram showing a conventional search history.

【図３１】従来の音声認識装置の動作を説明する図。FIG. 31 is a diagram explaining the operation of a conventional speech recognition device.

【図３２】従来の音声認識装置の動作を説明する図。FIG. 32 is a diagram explaining the operation of a conventional speech recognition device.

[Explanation of symbols]

１構文・意味知識記憶部、２構文ネットワーク生成
部、３構文ネットワーク記憶部、４音響辞書部、５
前向き探索部、６探索履歴記憶部、７探索履歴書
き換え部、８後向き探索部。1 syntax / semantic knowledge storage unit, 2 syntax network generation unit, 3 syntax network storage unit, 4 acoustic dictionary unit, 5 forward search unit, 6 search history storage unit, 7 search history rewriting unit, 8 backward search unit.

Claims

[Claims]

1. A speech recognition device comprising: a syntax network storage unit that holds a syntax network to which semantic information is attached; an acoustic dictionary unit that holds standard patterns of acoustic models; a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for input speech in accordance with the syntax network and outputs a search history; a search history storage unit that holds the search history output from the forward search unit; a search history rewriting unit that rewrites the search history held in the search history storage unit with reference to the semantic information attached to the syntax network; and a backward search unit that reads the search history held in the search history storage unit and generates a recognition result by tracing the syntax network in accordance with the search history.

2. The speech recognition device according to claim 1, further comprising: a syntax / semantic knowledge storage unit that holds syntax / semantic knowledge in which semantic information is associated with syntactic knowledge defining the grammar of the input speech; and a syntax network generation unit that generates, from the syntax / semantic knowledge, the syntax network to which the semantic information is attached.

3. A speech recognition device comprising: a syntax network storage unit that holds a syntax network to which semantic information and operation rules are attached; an acoustic dictionary unit that holds standard patterns of acoustic models; a forward search unit that, using the syntax network and the acoustic models, searches for syntax hypotheses for input speech in accordance with the syntax network and outputs a search history; a search history storage unit that holds the search history output from the forward search unit; a search history rewriting unit that rewrites the search history held in the search history storage unit with reference to the semantic information attached to the syntax network; and a backward search unit that reads the search history held in the search history storage unit, traces the syntax network in accordance with the search history, computes semantic information in accordance with the semantic information and the operation rules attached to the syntax network, and outputs a recognition result.

4. The speech recognition device according to claim 3, further comprising: a syntax / semantic knowledge storage unit that holds syntax / semantic knowledge in which, within the syntactic knowledge defining the grammar of the input speech, semantic information is associated with the words in a dictionary part that defines the words, and operation rules for the semantic information are associated with a rule part; and a syntax network generation unit that generates, from the syntax / semantic knowledge, the syntax network to which the semantic information and the operation rules are attached.

5. The speech recognition device according to claim 1, wherein the forward search unit outputs, as the search history, a syntax node of the syntax network, the time at which the syntax node was reached, the search score at that time in the syntax node, the syntax node reached immediately before the syntax node, the time at which that immediately preceding syntax node was reached, and the word between the syntax node and the immediately preceding syntax node; and wherein the search history rewriting unit, when the search history storage unit holds a plurality of search histories for the same time and the same syntax node whose words have the same semantic information, leaves some of the search histories having the same semantic information in the search history storage unit and deletes the other search histories having the same semantic information from the search history storage unit.

6. The speech recognition device according to claim 1, wherein the search history rewriting unit rewrites a search history held in the search history storage unit whose word has specific semantic information with the search history corresponding to the immediately preceding syntax node held by that search history and the time at which that immediately preceding syntax node was reached.