JP2005257910A

JP2005257910A - Symbol string transduction method and voice recognition method using the symbol string transduction method, and symbol string transduction device and voice recognition device using the symbol string transduction device

Info

Publication number: JP2005257910A
Application number: JP2004067621A
Authority: JP
Inventors: Takaaki Hori; 貴明堀
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2004-03-10
Filing date: 2004-03-10
Publication date: 2005-09-22
Anticipated expiration: 2024-03-10
Also published as: JP4430964B2

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of computation required for the minimum-cost search in a symbol string transduction method.

SOLUTION: A symbol string transduction method is disclosed that outputs the output symbol string corresponding to the state transition process of the succeeding-stage WFST for which the cumulative value of the weights of the state transitions applied in the preceding-stage and succeeding-stage WFSTs of a symbol string transducing part is minimal. A hypothesis represents one state transition process of the preceding-stage WFST together with its cumulative weight. For each hypothesis, the output symbol string of its state transition process is used as the input symbol string of the succeeding-stage WFST, and the state transition process with the minimum cumulative weight among the possible state transition processes of the succeeding-stage WFST is calculated while symbols are read one at a time. The cumulative weight of the hypothesis is corrected by adding this succeeding-stage cumulative weight to it. When the entire input symbol string has been read, the hypothesis with the minimum cumulative weight is selected, its output symbol string is used as the input symbol string of the succeeding-stage WFST, and the output symbol string of the state transition process with the minimum cumulative weight among the possible state transition processes of the succeeding-stage WFST is taken as the transduction result.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a symbol string conversion method that, given symbol string conversion rules expressed as a weighted finite-state transducer, efficiently finds, among the many output symbol strings that can be generated for an input symbol string, the output symbol string for which the cumulative weight of the applied conversion rules is minimal. It also relates to a speech recognition method using that method, to a symbol string conversion device, and to a speech recognition device using the symbol string conversion device.

(WFST: weighted finite-state transducer)
A weighted finite-state transducer (WFST) is a formalism that expresses the rules for converting one symbol string into another as a diagram of states and state transitions.
Weighted finite-state transducers are disclosed in, for example, Non-Patent Document 1. Hereinafter, the weighted finite-state transducer is referred to as a WFST.
A WFST is defined by a set of states, state transitions indicating that a move from one state to another is possible, the input symbol accepted at each state transition, the output symbol emitted at that time, and the weight of the state transition. Given an input symbol string, a WFST is a model that starts from the initial state, repeatedly follows state transitions that accept the symbols of the input symbol string in order while emitting output symbols, and terminates when it reaches an end state. Formally, a WFST is defined by the following 8-tuple (Q, Σ, Δ, i, F, E, λ, ρ).
1. Q is a finite set of states.
2. Σ is a finite set of input symbols.
3. Δ is a finite set of output symbols.
4. i ∈ Q is the initial state.
5. F ⊆ Q is the set of end states.
6. E ⊆ Q × Σ × Δ × Q is the set of state transitions, each of which moves from the current state to the next state on an input symbol while emitting an output symbol.
7. λ is the initial weight.
8. ρ(q) is the end weight of end state q, where q ∈ F.
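The 8-tuple above can be held in a small data structure. The following is a minimal sketch, not the representation used by the invention: the names `Arc`, `WFST`, `arcs_from`, and `FIG1` are hypothetical, the weight is stored on each transition, and the sample transitions are only those that the worked example later in this description mentions for FIG. 1 (the figure itself may contain more).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One state transition in E: current state, input, output, weight, next state."""
    src: int
    ilabel: str
    olabel: str
    weight: float
    dst: int


@dataclass
class WFST:
    """Illustrative container for (Q, Sigma, Delta, i, F, E, lambda, rho)."""
    arcs: list                    # E, with the weight stored on each arc
    finals: dict                  # F and rho: end state -> end weight
    start: int = 0                # i (state 0 by the document's convention)
    initial_weight: float = 0.0   # lambda, omitted in the document

    @property
    def states(self):
        """Q, derived from the start state, the end states, and the arcs."""
        found = {self.start, *self.finals}
        for arc in self.arcs:
            found.update((arc.src, arc.dst))
        return found

    def arcs_from(self, state, symbol):
        """Transitions leaving `state` whose input symbol equals `symbol`."""
        return [a for a in self.arcs if a.src == state and a.ilabel == symbol]


# Assumption: only the FIG. 1 transitions named in the text are listed here.
FIG1 = WFST(
    arcs=[
        Arc(0, "a", "d", 0.5, 0),
        Arc(0, "b", "c", 0.3, 1),
        Arc(0, "b", "b", 1.0, 2),
        Arc(1, "c", "b", 1.0, 3),
        Arc(2, "c", "a", 0.6, 3),
    ],
    finals={3: 0.5},
)

print(sorted(FIG1.states))
print([a.dst for a in FIG1.arcs_from(0, "b")])
```

Storing the end weights in a dictionary keyed by state covers both F and ρ at once, which mirrors the table format in which an end state is a row with only a weight.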

(Example of a WFST)
An example of a WFST is shown in FIG. 1.
In FIG. 1, reference numeral 10 denotes a state, drawn as a circle ("○"); the number inside the circle is the state number. Reference numeral 11 denotes an end state, drawn as a double circle ("◎"); the label inside the double circle gives the number of the end state and the end weight that is added last, when the state transitions finish, in the form "(state number)/(end weight)". Hereinafter, a state is referred to by its number, as in "state 0" or "state 3".
Reference numeral 12 denotes a state transition, drawn as an arrow ("→") connecting two states. The symbols and numbers attached to each state transition give its input symbol, output symbol, and weight in the form "(input symbol):(output symbol)/(weight)".
The WFST of FIG. 1 can also be defined by a table. In FIG. 2, each row represents one state transition and lists its source state number, destination state number, input symbol, output symbol, and weight. For an end state (state 3 in FIG. 1), the destination, input symbol, and output symbol are left empty, and the weight accumulated when the state transitions finish (the end weight) is entered. In general, the initial state of a WFST is state 0, and the initial weight λ is often omitted. In the present invention as well, the initial state is state 0, and the initial weight is omitted and not stated.

(Process of converting input symbol string a, a, b, c into output symbol string d, d, c, b)
The WFST of FIG. 1 can, for example, convert the input symbol string a, a, b, c into the output symbol string d, d, c, b. Expressed as a sequence of state numbers, the state transition process is 0, 0, 1, 3, and the cumulative value of the weights (hereinafter "cumulative weight") is 0.5 + 0.5 + 0.3 + 1 + 0.5 = 2.8. In the WFST of FIG. 1, however, two state transition processes are possible for the input symbol string a, a, b, c: 0, 0, 1, 3 and 0, 0, 2, 3. In general, when several state transitions are possible for an input symbol string (this is called non-determinism), the state transition process with the minimum cumulative weight is selected, and the output symbol string corresponding to that process is chosen. In the example of FIG. 1, the state transition process 0, 0, 1, 3, which has the smallest cumulative weight for the input symbol string a, a, b, c, is selected, and the conversion result is d, d, c, b.
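The comparison of the two candidate state transition processes can be checked with a short calculation. This is a sketch under stated assumptions: `ARCS`, `END_WEIGHTS`, and `score_path` are hypothetical names, the transition weights are the ones quoted in the worked example later in this description, and a path is given as the states reached after each input symbol, starting from state 0, which matches the way the text writes "0, 0, 1, 3".

```python
# (current state, input symbol, next state) -> (output symbol, weight)
# Assumption: only the FIG. 1 transitions named in the text are listed.
ARCS = {
    (0, "a", 0): ("d", 0.5),
    (0, "b", 1): ("c", 0.3),
    (0, "b", 2): ("b", 1.0),
    (1, "c", 3): ("b", 1.0),
    (2, "c", 3): ("a", 0.6),
}
END_WEIGHTS = {3: 0.5}  # end state -> end weight


def score_path(symbols, reached, start=0):
    """Return (output symbols, cumulative weight) for one state transition process."""
    state, total, output = start, 0.0, []
    for symbol, nxt in zip(symbols, reached):
        out, weight = ARCS[(state, symbol, nxt)]
        output.append(out)
        total += weight
        state = nxt
    total += END_WEIGHTS[state]  # the end weight is added last
    return output, total


for reached in ([0, 0, 1, 3], [0, 0, 2, 3]):
    output, total = score_path("aabc", reached)
    print(reached, "".join(output), round(total, 2))
```

With these weights the first path totals 2.8, as stated in the text, and the second totals 0.5 + 0.5 + 1 + 0.6 + 0.5 = 3.1, so the first is selected.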

Given a WFST A and a symbol string X supplied to A as its input symbol string, finding the output symbol string with the minimum cumulative weight (that is, the symbol string conversion result) requires computing the following minimum cumulative weight W(X):

$$W(X) = \min_{Y} W(X \rightarrow Y; A)$$

Here, W(X→Y; A) denotes the cumulative weight of the state transition process by which WFST A converts the symbol string X into the symbol string Y. The minimum value W(X) of this cumulative weight is found, and the Y that gives the minimum is the symbol string conversion result.
To obtain W(X) efficiently, breadth-first search, one of the techniques for minimum-cost search on graphs, is generally used. The procedure of breadth-first search on a graph is disclosed in, for example, Non-Patent Document 2.
Symbol string conversion by a WFST is performed by finding the state transition process of minimum cost (cumulative weight) that leads from the initial state to an end state under the input symbol string.

(Symbol string conversion using one WFST)
An embodiment of symbol string conversion using one WFST is shown in FIG. 3.
In this specification, a "hypothesis" represents one possible state transition process obtained when symbols are input in order and, for the input symbol string read so far, state transitions are repeated in the WFST from the initial state according to that input symbol string.
The symbol string input unit 103 reads symbols in order and sends them to the hypothesis expansion unit 104. Using the symbol received from the symbol string input unit 103 and the WFST read from the WFST storage unit 101, the hypothesis expansion unit 104 generates new hypotheses by updating the state transition process of each hypothesis in the set of hypotheses for the symbol string read so far, and sends them to the hypothesis narrowing unit 105. The hypothesis narrowing unit 105 narrows down the set of hypotheses received from the hypothesis expansion unit 104 by deleting, among the hypotheses that have reached the same state, all hypotheses other than the one with the minimum cumulative weight. If the input symbol string has been read to the end, the output symbol string corresponding to the hypothesis with the minimum cumulative weight is sent to the symbol string output unit 106. Otherwise, the hypotheses are sent back to the hypothesis expansion unit 104. The symbol string output unit 106 outputs the output symbol string received from the hypothesis narrowing unit 105.

(Symbol string conversion procedure)
Next, an example of a procedure for converting a symbol string based on this embodiment is described.
First, for a state transition e of the WFST, n[e] denotes the destination state (next state), i[e] the input symbol, o[e] the output symbol, and w[e] the weight. For a hypothesis h, s[h] denotes the state it has reached, W[h] the cumulative weight of its state transition process, and O[h] the symbol string output in that state transition process.
In this procedure, hypotheses are managed in a list of hypotheses (hereinafter the "hypothesis list"). Hypotheses can be inserted into and taken out of the hypothesis list. When a hypothesis is inserted, however, if the hypothesis list already contains a hypothesis that has reached the same state, only the one with the smaller cumulative weight is kept in the list, which narrows down the hypotheses.
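The insertion rule for the hypothesis list can be sketched as a dictionary keyed by the reached state, so that at most one hypothesis per state survives. The names `Hyp` and `insert_hypothesis` are hypothetical, and the dictionary is an implementation choice made here, not something the description prescribes.

```python
from typing import NamedTuple


class Hyp(NamedTuple):
    """A hypothesis h: reached state s[h], output string O[h], cumulative weight W[h]."""
    state: int
    output: str
    weight: float


def insert_hypothesis(hyp_list: dict, f: Hyp) -> None:
    """Insert f, keeping only the lighter hypothesis when the state is already present."""
    kept = hyp_list.get(f.state)
    if kept is None or f.weight < kept.weight:
        hyp_list[f.state] = f


h_prime = {}
insert_hypothesis(h_prime, Hyp(3, "ddcb", 2.3))
insert_hypothesis(h_prime, Hyp(3, "ddba", 2.6))  # same state, heavier: discarded
print(h_prime)
```

Keying on the state makes the same-state check a constant-time lookup; a plain list with a linear scan would behave identically.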

FIG. 4 shows the symbol string conversion procedure using a WFST.
The procedure of FIG. 4 is described below with reference to the embodiment of symbol string conversion using a WFST (FIG. 3).
The procedure starts at step S101. As initialization, empty hypothesis lists H and H' are created in step S102. In step S103, an initial hypothesis h is created with s[h] = 0 (the initial state of the WFST), W[h] = 0, and O[h] = φ (the empty symbol string), and is inserted into the hypothesis list H.
In step S104, the symbol string input unit 103 reads one symbol and assigns it to x. The following steps S105 to S108 are executed in the hypothesis expansion unit 104.
In step S105, one hypothesis is taken out of the hypothesis list H and assigned to h, and a list E of the state transitions leaving state s[h] whose input symbol equals x is prepared.
In step S106, if E = φ (the empty list), the procedure proceeds to S110. Otherwise, it proceeds to S107, where one state transition is taken out of E and assigned to e.
In step S108, a new hypothesis f is created with s[f] = n[e], W[f] = W[h] + w[e], and O[f] = O[h]・o[e]. Here, "・" denotes the operation that concatenates two symbols or symbol strings into one symbol string.

Step S109 is executed in the hypothesis narrowing unit 105, which narrows down the hypotheses by inserting hypothesis f into the hypothesis list H'.
From step S109 the procedure returns to S106 and expands a hypothesis for the next state transition.
In step S110, if H = φ (all hypotheses have been expanded), the procedure proceeds to S111. Otherwise, it returns to S105 and expands the next hypothesis.
In step S111, all elements of the newly generated hypothesis list H' are moved to H, which is now empty, and the procedure proceeds to S112.
In step S112, if the symbol string input unit 103 has a next input symbol, the procedure returns to S104. Otherwise, the whole input symbol string is judged to have been read, and the procedure proceeds to S113.
In step S113, for each hypothesis in the hypothesis list H that has reached an end state, the end weight of that end state is added to its cumulative weight. Among the hypotheses that have reached an end state, the hypothesis h with the minimum cumulative weight W[h] is then selected, and its output symbol string O[h] is output by the symbol string output unit 106 as the symbol string conversion result.
In step S114, the symbol string conversion procedure using the WFST ends.
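Steps S101 to S114 can be sketched compactly as follows. This is an illustrative reading of the flowchart, not the patented implementation: `transduce`, `FIG1_ARCS`, and `FIG1_FINALS` are hypothetical names, the hypothesis lists H and H' are dictionaries keyed by reached state, and the transitions are only those of FIG. 1 that the text mentions.

```python
# Each transition: (current state, input symbol, output symbol, weight, next state).
# Assumption: only the FIG. 1 transitions named in the text are listed.
FIG1_ARCS = [
    (0, "a", "d", 0.5, 0),
    (0, "b", "c", 0.3, 1),
    (0, "b", "b", 1.0, 2),
    (1, "c", "b", 1.0, 3),
    (2, "c", "a", 0.6, 3),
]
FIG1_FINALS = {3: 0.5}  # end state -> end weight


def transduce(arcs, finals, symbols, start=0):
    """Return (output string, cumulative weight) of the minimum-weight path, or None."""
    # S102-S103: H holds one hypothesis (W[h], O[h]) per reached state s[h].
    H = {start: (0.0, "")}
    for x in symbols:                       # S104, repeated via S112
        H_next = {}                         # the list H'
        for s, (W, O) in H.items():         # S105, repeated via S110
            for src, ilabel, olabel, w, dst in arcs:
                if src != s or ilabel != x:
                    continue                # S106-S107: only transitions in E
                f = (W + w, O + olabel)     # S108: new hypothesis f
                # S109: keep only the lightest hypothesis per reached state.
                if dst not in H_next or f[0] < H_next[dst][0]:
                    H_next[dst] = f
        H = H_next                          # S111: move H' into H
    # S113: add end weights and select the minimum among end-state hypotheses.
    finished = [(W + finals[s], O) for s, (W, O) in H.items() if s in finals]
    if not finished:
        return None                         # no hypothesis reached an end state
    W, O = min(finished, key=lambda item: item[0])
    return O, W


print(transduce(FIG1_ARCS, FIG1_FINALS, "aabc"))
```

Because at most one hypothesis per state is kept after each symbol, the number of live hypotheses never exceeds the number of states, which is what keeps the search cheap.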

(Process of obtaining the output symbol string when input symbol string a, a, b, c is given to the WFST)
Following this symbol string conversion procedure, the process of obtaining the output symbol string when the input symbol string a, a, b, c is given to the WFST of FIG. 1 is described step by step. A hypothesis with current state number s, output symbol string O, and cumulative weight W is written (s, O, W). A state transition of the WFST with current state number s, next state number n, input symbol x, output symbol y, and weight w is written <s→n, x:y/w>.
The procedure starts at S101, and empty lists H and H' are created in S102.
In S103, the hypothesis (0, φ, 0) is inserted into the hypothesis list H.

Reading symbol "a"
In S104 the symbol a is read, and in S105 the hypothesis (0, φ, 0) is taken out of the hypothesis list H. A state transition list E is created that contains the state transition <0→0, a:d/0.5>, which leaves the current state 0 of this hypothesis with input symbol a.
In S106, E is not empty, so the procedure proceeds to S107 and takes out the state transition <0→0, a:d/0.5>. A new hypothesis (0, d, 0.5) is created in S108 and inserted into H' in S109.
Back in S106, E = φ, so the procedure proceeds to S110, and since H = φ it proceeds to S111. The element (0, d, 0.5) of H' is moved to H, and since a next input symbol exists in S112, the procedure returns to S104.
Next, the symbol a is read in S104, and in S105 the hypothesis (0, d, 0.5) is taken out of the hypothesis list H. A state transition list E is created that contains the state transition <0→0, a:d/0.5>, which leaves the current state 0 of this hypothesis with input symbol a.
In S106, E is not empty, so the procedure proceeds to S107 and takes the state transition <0→0, a:d/0.5> out of E. A new hypothesis (0, dd, 1) is created in S108 and inserted into H' in S109.
Back in S106, E = φ, so the procedure proceeds to S110, and since H = φ it proceeds to S111. The element (0, dd, 1) of H' is moved to H, and since a next input symbol exists in S112, the procedure returns to S104.

Reading symbol "b"
Next, the symbol b is read in S104, and in S105 the hypothesis (0, dd, 1) is taken out of the hypothesis list H. A state transition list E is created that contains the state transitions <0→1, b:c/0.3> and <0→2, b:b/1>, which leave the current state 0 of this hypothesis with input symbol b.
In S106, E is not empty, so the procedure proceeds to S107 and takes the state transition <0→1, b:c/0.3> out of E. A new hypothesis (1, ddc, 1.3) is created in S108 and inserted into H' in S109.
Back in S106, E is still not empty, so the procedure proceeds to S107 and takes the state transition <0→2, b:b/1> out of E. A new hypothesis (2, ddb, 2) is created in S108 and inserted into H' in S109.
Back in S106, E = φ, so the procedure proceeds to S110, and since H = φ it proceeds to S111. The elements (1, ddc, 1.3) and (2, ddb, 2) of H' are moved to H, and since a next input symbol exists in S112, the procedure returns to S104.

Reading symbol "c"
Next, the symbol c is read in S104, and in S105 the hypothesis (1, ddc, 1.3) is taken out of the hypothesis list H. A state transition list E is created that contains the state transition <1→3, c:b/1>, which leaves the current state 1 of this hypothesis with input symbol c.
In S106, E is not empty, so the procedure proceeds to S107 and takes the state transition <1→3, c:b/1> out of E. A new hypothesis (3, ddcb, 2.3) is created in S108 and inserted into H' in S109.
Back in S106, E = φ, so the procedure proceeds to S110. Since H ≠ φ, it returns to S105 and takes the hypothesis (2, ddb, 2) out of the hypothesis list H. A state transition list E is created that contains the state transition <2→3, c:a/0.6>, which leaves the current state 2 of this hypothesis with input symbol c.
In S106, E is not empty, so the procedure proceeds to S107 and takes the state transition <2→3, c:a/0.6> out of E. A new hypothesis (3, ddba, 2.6) is created in S108 and inserted into H' in S109. At this point H' already contains the hypothesis (3, ddcb, 2.3), and the hypothesis (3, ddba, 2.6) has reached the same state 3, so the hypothesis with the smaller cumulative weight, (3, ddcb, 2.3), is kept and the hypothesis (3, ddba, 2.6) is deleted from the list.
Back in S106, E = φ, so the procedure proceeds to S110, and since H = φ it proceeds to S111. The element (3, ddcb, 2.3) of H' is moved to H, and since no next input symbol exists in S112, the procedure proceeds to S113.
In S113, the state 3 reached by the hypothesis (3, ddcb, 2.3) in H is an end state, so the end weight is added to give (3, ddcb, 2.8). This is the only hypothesis that has reached an end state and therefore has the minimum cumulative weight, so its output symbol string ddcb is output as the conversion result, and the symbol string conversion process ends in S114.

(Conversion of a symbol string by two WFSTs)
Next, consider the problem of converting a symbol string with two WFSTs in sequence. That is, given two WFSTs A and B and an input symbol string X, the symbol string X is first converted into a symbol string Y using A, and Y is then used as the input symbol string of B and converted further to obtain the output symbol string Z.
In WFST theory, this problem is treated as follows: every Y that can be an output symbol string when A converts X is considered as an input symbol string of B, and from the set of output symbol strings Z that B can produce for those input symbol strings, the conversion result is sought that minimizes the sum of the cumulative weight of the state transition process in A and the cumulative weight of the state transition process in B. Accordingly,

$$W(X) = \min_{Y,Z} \{ W(X \rightarrow Y; A) + W(Y \rightarrow Z; B) \}$$

is computed, and of the pair Y and Z that gives this W(X), Z is taken as the symbol string conversion result.

(Example of a WFST that converts a kana symbol string into a kanji symbol string, and a WFST that assigns weights to kanji sequences)
As an example, consider the two WFSTs shown in FIG. 5 and FIG. 6. FIG. 5 is an example of a WFST that converts a kana symbol string into a kanji symbol string. The symbol "ε" in FIG. 5 means, when it appears to the left of ":", that the state transition is taken without an input symbol and, when it appears to the right, that nothing is output in that state transition.
This WFST converts, for example, the symbol string "あ, め" into "雨" (rain) through the state transition process 0, 1, 5 (cumulative weight 1), and converts "あ, め, だ, ま" into "雨, 玉" through the state transition process 0, 1, 5, 0, 4, 5 (cumulative weight 3). In Japanese, however, the usual kanji for "あめだま" (amedama, a candy drop) is "飴玉" rather than "雨玉". In the WFST of FIG. 5, the state transition process 0, 2, 5, 0, 4, 5 that outputs the symbol string "飴, 玉" has cumulative weight 4, whereas the conversion to "雨, 玉" has cumulative weight 3, so the conversion result is "雨, 玉", the one with the smaller cumulative weight.

FIG. 6, on the other hand, is a WFST that assigns weights to sequences of kanji. In this WFST, the input symbol and the output symbol are the same in every state transition, so the symbol string is output unchanged. However, the weight is small when the sequence of symbols in the input symbol string (kanji string) is plausible as Japanese and large when it is not. For example, the state transition process 0, 1, 3 that accepts the input symbol string "雨, 降り" (rain, falling) has cumulative weight 0, while the state transition process 0, 2, 3 that accepts the input symbol string "飴, 降り" (candy, falling) has cumulative weight 3, which indicates that "雨, 降り" is the more plausible sequence in Japanese.

When a symbol string is converted using the WFSTs of FIG. 5 and FIG. 6 and, for example, the kana symbol string "あ, め, だ, ま" is input, the output symbol strings obtained from the WFST of FIG. 5 are "雨, 玉" with cumulative weight 3 and "飴, 玉" with cumulative weight 4. When each of these output symbol strings is input to the WFST of FIG. 6, the cumulative weight for "雨, 玉" is 5 and the cumulative weight for "飴, 玉" is 1. The sums of the cumulative weights of the two WFSTs of FIG. 5 and FIG. 6 are therefore as follows.
"あ, め, だ, ま" → "雨, 玉": cumulative weight 3 + 5 = 8
"あ, め, だ, ま" → "飴, 玉": cumulative weight 4 + 1 = 5
Comparing these cumulative weights, the output symbol string with the smaller cumulative weight is "飴, 玉". Obtaining this conversion result requires the information in the WFST of FIG. 6, which carries the weights for kanji sequences.
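The selection across two stages amounts to minimizing the sum of the two cumulative weights. The sketch below uses only the four weights quoted in the text for the input "あ, め, だ, ま"; the dictionaries `first_stage` and `second_stage` are hypothetical stand-ins for running the WFSTs of FIG. 5 and FIG. 6, not those WFSTs themselves.

```python
# Cumulative weights quoted in the text for the input "あ, め, だ, ま".
first_stage = {("雨", "玉"): 3, ("飴", "玉"): 4}    # weight in the WFST of FIG. 5
second_stage = {("雨", "玉"): 5, ("飴", "玉"): 1}   # weight in the WFST of FIG. 6

# Sum the two stages for every intermediate string Y, then take the minimum.
totals = {y: first_stage[y] + second_stage[y] for y in first_stage}
best = min(totals, key=totals.get)

print(totals)
print(best)
```

Taking the minimum of the first stage alone would pick "雨, 玉"; only the summed weights select "飴, 玉".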

(Conventional symbol string conversion using two WFSTs)
An embodiment of conventional symbol string conversion using two WFSTs is shown in FIG. 7.
When two WFSTs are used to perform a two-stage symbol string conversion, the WFST used in the first stage is called the preceding-stage WFST and the WFST used in the second stage is called the succeeding-stage WFST. When two WFSTs are used, a "hypothesis" represents a pair: one possible state transition process obtained when symbols are input in order and, for the input symbol string read so far, state transitions are repeated in the preceding-stage WFST from the initial state according to that input symbol string; and one state transition process of the succeeding-stage WFST obtained when the output symbol string corresponding to that preceding-stage state transition process is used as its input symbol string.

The symbol string input unit 103 in FIG. 7 reads symbols in order and sends them to the hypothesis expansion unit 104. Using the symbol received from the symbol string input unit 103, the preceding-stage WFST read from the preceding-stage WFST storage unit 201, and the succeeding-stage WFST read from the succeeding-stage WFST storage unit 202, the hypothesis expansion unit 104 generates new hypotheses by updating the state transition process of each hypothesis in the set of hypotheses for the symbol string read so far, and sends them to the hypothesis narrowing unit 105. The hypothesis narrowing unit 105 narrows down the set of hypotheses received from the hypothesis expansion unit 104 by deleting, among the hypotheses that have reached the same state, all hypotheses other than the one with the minimum cumulative weight. If the input symbol string has been read to the end, the output symbol string corresponding to the hypothesis with the minimum cumulative weight is sent to the symbol string output unit 106. Otherwise, the hypotheses are sent back to the hypothesis expansion unit 104. The symbol string output unit 106 outputs the output symbol string received from the hypothesis narrowing unit 105.

Even when two WFSTs are used, the computation can follow the same approach as the minimum cost search for a single WFST described above. As shown in FIG. 7, the hypothesis developing unit 104 reads the preceding WFST (hereinafter WFST A) from the preceding WFST storage unit 201 and the succeeding WFST (hereinafter WFST B) from the succeeding WFST storage unit 202, updates the hypotheses according to these two WFSTs, and generates new hypotheses. The procedure for the minimum cost search using WFST A and WFST B is described below. Unlike the single-WFST case, a hypothesis h holds a pair consisting of the state s_A[h] reached in WFST A and the state s_B[h] reached in WFST B. When a hypothesis is inserted into a hypothesis list that already contains a hypothesis with the same reached state in WFST A and the same reached state in WFST B, only the one with the smaller cumulative weight is kept in the list, which narrows down the hypotheses.
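The narrowing rule for a hypothesis list can be sketched in a few lines of Python. This is an illustrative sketch, not the patent's implementation: the dictionary representation and the name `insert_hypothesis` are assumptions made here, and the two sample hypotheses are borrowed from the worked example given later in this description.

```python
def insert_hypothesis(hyp_list, s_a, s_b, weight, output):
    """Insert a hypothesis, keeping only the cheapest one per (s_A, s_B) pair.

    hyp_list is a dict (an assumed representation) that maps the pair of
    reached states (s_A, s_B) to (cumulative weight W, output string O).
    """
    key = (s_a, s_b)
    kept = hyp_list.get(key)
    if kept is None or weight < kept[0]:
        hyp_list[key] = (weight, output)


H_prime = {}
insert_hypothesis(H_prime, 4, 4, 7, "雨玉")
insert_hypothesis(H_prime, 4, 4, 4, "飴玉")  # same state pair, smaller weight
print(H_prime)  # {(4, 4): (4, '飴玉')}
```

Keying the list by the state pair makes the comparison a single lookup; a plain list with a linear scan would behave the same way.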

(Conventional symbol string conversion procedure using two WFSTs)
FIG. 8 shows the conventional symbol string conversion procedure using these two WFSTs.
The procedure of FIG. 8 is described below with reference to the conventional two-WFST symbol string conversion example of FIG. 7.
The procedure starts at step S201. As initialization, step S202 creates empty hypothesis lists H and H'. Step S203 creates an initial hypothesis h with s_A[h] = 0 (the initial state of WFST A), s_B[h] = 0 (the initial state of WFST B), W[h] = 0, and O[h] = φ (the empty symbol string), and inserts it into hypothesis list H.
In step S204, the symbol string input unit 103 reads one symbol and assigns it to x.

The following steps S205 to S208 are executed in the hypothesis developing unit 104.
In step S205, one hypothesis is taken from hypothesis list H and assigned to h, and a list P is prepared of pairs (e_A, e_B), where e_A is a state transition from state s_A[h] whose input symbol equals x and e_B is a state transition from state s_B[h] whose input symbol equals o[e_A]. If o[e_A] = ε, the pair (e_A, φ), with e_B = φ (the empty state transition), is inserted into list P instead.
In step S206, if P = φ (the empty list), the process proceeds to S210. Otherwise, it proceeds to S207, where one pair of state transitions is taken from P and assigned to (e_A, e_B).
In step S208, a new hypothesis f is generated with s_A[f] = n[e_A]. If e_B = φ, then s_B[f] = s_B[h], W[f] = W[h] + w[e_A], and O[f] = O[h]; otherwise, s_B[f] = n[e_B], W[f] = W[h] + w[e_A] + w[e_B], and O[f] = O[h]·o[e_B]. (When e_B = φ, the output symbol of state transition e_A is ε, so WFST B makes no state transition; the reached state in WFST B is therefore unchanged, s_B[f] = s_B[h], and since WFST B outputs no symbol, the output symbol string is also unchanged, O[f] = O[h].)
In step S209, the hypothesis narrowing unit 105 narrows down the hypotheses by inserting hypothesis f into hypothesis list H'.
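Steps S205 and S208 can be sketched as two small Python functions. This is a sketch under stated assumptions: the tuple layout of a transition, the names `build_pairs` and `expand`, and the use of `None` for both ε and φ are choices made here for illustration, and the transitions listed are only the few that are quoted in the worked example later in this description, not the complete WFSTs of FIG. 5 and FIG. 6.

```python
EPS = None  # stands for the empty symbol ε (assumed encoding)


def build_pairs(arcs_a, arcs_b, s_a, s_b, x):
    """Step S205: list the transition pairs (e_A, e_B) for states (s_a, s_b).

    Each transition is a tuple (source, input, output, weight, next state).
    e_B is None (the empty transition φ) when e_A outputs ε.
    """
    P = []
    for e_a in arcs_a:
        src, inp, out, _w, _dst = e_a
        if src != s_a or inp != x:
            continue
        if out is EPS:
            P.append((e_a, None))
        else:
            P.extend((e_a, e_b) for e_b in arcs_b
                     if e_b[0] == s_b and e_b[1] == out)
    return P


def expand(h, pair):
    """Step S208: build the new hypothesis f = (s_A, s_B, O, W) from h."""
    s_a, s_b, O, W = h
    e_a, e_b = pair
    if e_b is None:  # WFST B does not move and outputs nothing
        return (e_a[4], s_b, O, W + e_a[3])
    return (e_a[4], e_b[4], O + e_b[2], W + e_a[3] + e_b[3])


# A few transitions quoted in the worked example (not the full WFSTs).
ARCS_A = [(0, "あ", "雨", 0, 1), (0, "あ", "飴", 0, 2), (1, "め", EPS, 1, 5)]
ARCS_B = [(0, "雨", "雨", 0, 1), (0, "雨", "雨", 0, 3),
          (0, "飴", "飴", 0, 2), (0, "飴", "飴", 0, 4)]

h = (0, 0, "", 0)
new_hyps = [expand(h, p) for p in build_pairs(ARCS_A, ARCS_B, 0, 0, "あ")]
print(new_hyps)
```

With these transitions, reading "あ" from the initial hypothesis yields one new hypothesis per pair in P, which is the source of the growth in hypothesis count that the conventional procedure suffers from.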

The process then returns to step S206 and expands a hypothesis for the next pair of state transitions.
In step S210, if H = φ (all hypotheses have been expanded), the process proceeds to S211. Otherwise, it returns to S205 and expands the next hypothesis.
In step S211, all elements of the newly generated hypothesis list H' are moved into H, which is now empty, and the process proceeds to S212.
In step S212, if the symbol string input unit 103 has a next input symbol, the process returns to S204. Otherwise, the whole input symbol string is judged to have been read, and the process proceeds to S213.
In step S213, for each hypothesis h in hypothesis list H that has reached an end state (s_A[h] ∈ F_A and s_B[h] ∈ F_B, where F_A is the set of end states of WFST A and F_B is the set of end states of WFST B), the end weights ρ(s_A[h]) + ρ(s_B[h]) are added to the cumulative weight W[h]. The hypothesis h' with the smallest resulting cumulative weight is selected, and its output symbol string O[h'] is output by the symbol string output unit 106 as the conversion result.
In step S214, the conventional symbol string conversion procedure using two WFSTs ends.

(Process of obtaining the output symbol string for the input symbol string "あ, め, だ, ま")
Following this symbol string conversion procedure, with the WFST of FIG. 5 as WFST A and the WFST of FIG. 6 as WFST B, the process of obtaining the output symbol string for the input symbol string "あ, め, だ, ま" (a, me, da, ma) is traced step by step. A hypothesis with current state number s_A in WFST A, current state number s_B in WFST B, output symbol string O, and cumulative weight W is written (s_A, s_B, O, W).
Starting from S201, empty hypothesis lists H and H' are created in S202.
In S203, the hypothesis (0, 0, φ, 0) is inserted into hypothesis list H.

Reading "あ"
In S204 the symbol "あ" is read, and in S205 the hypothesis (0, 0, φ, 0) is taken from hypothesis list H. A list P is created containing the pairs of a state transition from the current state 0 of WFST A whose input symbol is "あ" and a state transition from the current state 0 of WFST B whose input symbol equals the output symbol of that WFST A transition (the output symbols 雨, "rain", and 飴, "candy", are both pronounced "ame"):
(<0→1, あ:雨/0>, <0→1, 雨:雨/0>),
(<0→1, あ:雨/0>, <0→3, 雨:雨/0>),
(<0→2, あ:飴/0>, <0→2, 飴:飴/0>),
(<0→2, あ:飴/0>, <0→4, 飴:飴/0>).
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<0→1, あ:雨/0>, <0→1, 雨:雨/0>) from P; S208 generates the new hypothesis (1, 1, 雨, 0), which S209 inserts into H'.
Back in S206, P ≠ φ, so the process proceeds to S207 and takes the pair (<0→1, あ:雨/0>, <0→3, 雨:雨/0>) from P; S208 generates the new hypothesis (1, 3, 雨, 0), which S209 inserts into H'.
Back in S206, P ≠ φ, so the process proceeds to S207 and takes the pair (<0→2, あ:飴/0>, <0→2, 飴:飴/0>) from P; S208 generates the new hypothesis (2, 2, 飴, 0), which S209 inserts into H'.
Back in S206, P ≠ φ, so the process proceeds to S207 and takes the pair (<0→2, あ:飴/0>, <0→4, 飴:飴/0>) from P; S208 generates the new hypothesis (2, 4, 飴, 0), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H = φ, it proceeds to S211, where the hypotheses in H',
(1, 1, 雨, 0), (1, 3, 雨, 0), (2, 2, 飴, 0), (2, 4, 飴, 0),
are moved to H. In S212 a next input symbol exists, so the process returns to S204.

Reading "め"
Next, in S204 the symbol "め" is read, and in S205 the hypothesis (1, 1, 雨, 0) is taken from hypothesis list H. A list P is created containing the pair of a state transition from the current state 1 of WFST A whose input symbol is "め" and a state transition from the current state 1 of WFST B whose input symbol equals the output symbol of that WFST A transition:
(<1→5, め:ε/1>, φ).
Here, because the output symbol of the WFST A state transition <1→5, め:ε/1> is ε, the WFST B state transition is φ.
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<1→5, め:ε/1>, φ) from P; S208 generates the new hypothesis (5, 1, 雨, 1), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H ≠ φ, it returns to S205 and takes the hypothesis (1, 3, 雨, 0) from hypothesis list H. For the current state 1 of WFST A with input symbol "め" and the current state 3 of WFST B, the list P contains the pair
(<1→5, め:ε/1>, φ).
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<1→5, め:ε/1>, φ) from P; S208 generates the new hypothesis (5, 3, 雨, 1), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H ≠ φ, it returns to S205 and takes the hypothesis (2, 2, 飴, 0) from hypothesis list H. For the current state 2 of WFST A with input symbol "め" and the current state 2 of WFST B, the list P contains the pair
(<2→5, め:ε/2>, φ).
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<2→5, め:ε/2>, φ) from P; S208 generates the new hypothesis (5, 2, 飴, 2), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H ≠ φ in S210, it returns to S205 and takes the hypothesis (2, 4, 飴, 0) from hypothesis list H. For the current state 2 of WFST A with input symbol "め" and the current state 4 of WFST B, the list P contains the pair
(<2→5, め:ε/2>, φ).
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<2→5, め:ε/2>, φ) from P; S208 generates the new hypothesis (5, 4, 飴, 2), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H = φ, it proceeds to S211, where the hypotheses in H', (5, 1, 雨, 1), (5, 3, 雨, 1), (5, 2, 飴, 2), and (5, 4, 飴, 2), are moved to H. In S212 a next input symbol exists, so the process returns to S204.

Reading "だ"
Next, in S204 the symbol "だ" is read, and in S205 the hypothesis (5, 1, 雨, 1) is taken from hypothesis list H. A list P is created containing the pair of a state transition from the current state 5 of WFST A whose input symbol is "だ" and a state transition from the current state 1 of WFST B whose input symbol equals the output symbol of that WFST A transition:
(<0→4, だ:玉/1>, <1→4, 玉:玉/5>).
Here, because WFST A can move from its current state 5 to state 0 without an input symbol, the WFST A state transition contained in P is <0→4, だ:玉/1>.
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<0→4, だ:玉/1>, <1→4, 玉:玉/5>) from P; S208 generates the new hypothesis (4, 4, 雨玉, 7), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H ≠ φ, it returns to S205 and takes the hypothesis (5, 3, 雨, 1) from hypothesis list H. No pair exists of a state transition from the current state 5 of WFST A with input symbol "だ" and a state transition from the current state 3 of WFST B whose input symbol equals the output symbol of the WFST A transition, so P = φ.
Since P = φ in S206, the process proceeds to S210, and since H ≠ φ, it returns to S205.
In S205 the hypothesis (5, 2, 飴, 2) is taken from hypothesis list H. A list P is created containing the pair of a state transition from the current state 5 of WFST A whose input symbol is "だ" and a state transition from the current state 2 of WFST B whose input symbol equals the output symbol of that WFST A transition:
(<0→4, だ:玉/1>, <2→4, 玉:玉/1>).
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<0→4, だ:玉/1>, <2→4, 玉:玉/1>) from P; S208 generates the new hypothesis (4, 4, 飴玉, 4), which S209 inserts into H'. However, H' already contains the hypothesis (4, 4, 雨玉, 7), whose pair of reached states in WFST A and WFST B is identical, so the hypothesis with the smaller cumulative weight, (4, 4, 飴玉, 4), is kept and (4, 4, 雨玉, 7) is deleted.
Back in S206, P = φ, so the process proceeds to S210. Since H ≠ φ, it returns to S205 and takes the hypothesis (5, 4, 飴, 2) from hypothesis list H. No pair exists of a state transition from the current state 5 of WFST A with input symbol "だ" and a state transition from the current state 4 of WFST B whose input symbol equals the output symbol of the WFST A transition, so P = φ.
Since P = φ in S206, the process proceeds to S210, and since H = φ, it proceeds to S211.
The hypothesis (4, 4, 飴玉, 4) in H' is moved to H, and since a next input symbol exists in S212, the process returns to S204.
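The narrowing decision in this step comes down to comparing two cumulative weights. Applying the update W[f] = W[h] + w[e_A] + w[e_B] of step S208 to the two hypotheses that reach the state pair (4, 4) gives the following check, written out here only as an arithmetic illustration of the values already stated above:

```latex
\begin{align*}
W[(4,4,\text{雨玉})] &= \underbrace{1}_{W[h]} + \underbrace{1}_{w[e_A]} + \underbrace{5}_{w[e_B]} = 7,\\
W[(4,4,\text{飴玉})] &= \underbrace{2}_{W[h]} + \underbrace{1}_{w[e_A]} + \underbrace{1}_{w[e_B]} = 4 < 7.
\end{align*}
```

The hypothesis with output 飴玉 therefore survives even though its predecessor (5, 2, 飴, 2) was more expensive than (5, 1, 雨, 1) after reading "め".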

Reading "ま"
Next, in S204 the symbol "ま" is read, and in S205 the hypothesis (4, 4, 飴玉, 4) is taken from hypothesis list H. A list P is created containing the pair of a state transition from the current state 4 of WFST A whose input symbol is "ま" and a state transition from the current state 4 of WFST B whose input symbol equals the output symbol of that WFST A transition:
(<4→5, ま:ε/1>, φ).
Here, because the output symbol of the WFST A state transition <4→5, ま:ε/1> is ε, the WFST B state transition is φ.
Since P ≠ φ in S206, the process proceeds to S207 and takes the pair (<4→5, ま:ε/1>, φ) from P; S208 generates the new hypothesis (5, 4, 飴玉, 5), which S209 inserts into H'.
Back in S206, P = φ, so the process proceeds to S210. Since H = φ, it proceeds to S211, where the hypothesis (5, 4, 飴玉, 5) in H' is moved to H. In S212 no next input symbol exists, so the process proceeds to S213.
In S213, the reached state 5 in WFST A and the reached state 4 in WFST B of the hypothesis (5, 4, 飴玉, 5) in H are both end states, and adding their end weights gives (5, 4, 飴玉, 5). Since this is the only hypothesis that has reached an end state, its output symbol string "飴玉" ("candy ball", amedama) is output as the conversion result, and the symbol string conversion process ends in S214.
Another way to convert a symbol string using two WFSTs is to compose the two WFSTs into a single WFST in advance and apply the symbol string conversion procedure for one WFST. A method for composing WFSTs is disclosed, for example, in Non-Patent Document 1.
However, because composition creates states and state transitions for combinations of two state transitions, the number of states and state transitions of the composed WFST can become very large. When the WFST is handled on a computer, constraints such as memory size can therefore make it difficult to execute the symbol string conversion.
E. Roche and Y. Schabes, "Finite-State Language Processing," MIT Press, 1997, Chapter 15, "Speech Recognition by Composition of Weighted Finite Automata"
Makoto Nagao, "岩波講座ソフトウェア14：知識と推論" (Iwanami Lecture Series on Software 14: Knowledge and Reasoning), Iwanami Shoten, pp. 109-113

In the conventional method of performing a minimum cost search using two WFSTs, each time a hypothesis is updated to generate new hypotheses, a new hypothesis is generated for every combination of a state transition possible in the preceding WFST and a state transition possible in the succeeding WFST. The number of hypotheses therefore becomes large, and the amount of computation increases.

The present invention provides a symbol string conversion method that uses
a symbol string input unit that reads a symbol string in order,
a symbol string conversion unit that converts the symbol string in two stages using two weighted finite state transducers, each of which converts symbol strings by state transitions, namely a preceding weighted finite state transducer used in the first stage and a succeeding weighted finite state transducer used in the second stage, and
a symbol string output unit that outputs the conversion result of the succeeding weighted finite state transducer,
and that reads symbols in order from the symbol string input unit and, once the input symbol string has been read, outputs from the symbol string output unit the output symbol string corresponding to the state transition process of the succeeding weighted finite state transducer that minimizes the cumulative value (cumulative weight) of the weights of the state transitions applied in the preceding and succeeding weighted finite state transducers of the symbol string conversion unit,
the method being characterized in that,
while the symbols are read in order, the cumulative weight of each hypothesis representing one state transition process of the preceding weighted finite state transducer is corrected by finding, among the state transition processes possible in the succeeding weighted finite state transducer when the output symbol string of the hypothesis's state transition process is used as its input symbol string, the state transition process with the smallest cumulative weight, and adding that cumulative weight to the cumulative weight of the hypothesis, and
when the entire input symbol string has been read, the symbol string conversion result is the output symbol string of the state transition process with the smallest cumulative weight among those possible in the succeeding weighted finite state transducer when the output symbol string corresponding to the state transition process of the hypothesis with the smallest cumulative weight is used as its input symbol string.
This symbol string conversion method converts symbol strings efficiently.

In the procedure of the present invention for converting a symbol string using two WFSTs, the computation is, apart from the hypothesis correction unit, almost the same as the search procedure that uses only the preceding WFST (FIG. 4), so the amount of computation is lower than in the conventional procedure (FIG. 8). The hypothesis correction unit adds some computational load, but the correction is computed only when the output symbol of a state transition in the preceding WFST is not ε. The larger the proportion of ε output symbols in the preceding WFST, the greater the reduction in processing.
Table 1 shows the processing time required for speech recognition using the speech recognition method of the present invention when the input was speech of a subject reading aloud 100 sentences from newspaper articles. The processing time is given as the value obtained by dividing the actual utterance duration by the processing time required for speech recognition (real-time ratio). The vocabulary size of the word pronunciation dictionary is 20,000. The experiment used an IBM-compatible computer with a Pentium (registered trademark) III CPU running at a clock frequency equivalent to 800 MHz.

Compared with a speech recognition apparatus using the conventional symbol conversion method, the processing time was reduced to about two thirds, and to about one third when the acoustic score computation by the acoustic model, which is unaffected by the difference in symbol conversion procedure, is excluded. Since both methods produced identical speech recognition results, using the present invention causes no loss of speech recognition accuracy.

Embodiments of the present invention are described below with reference to the drawings.
(Symbol string conversion device)
FIG. 9 shows one embodiment of the present invention.
The symbol string input unit 103 reads the symbol string in order and sends each symbol to the hypothesis developing unit 104. Here a hypothesis denotes one state transition process that is possible for the input symbol string read so far. According to the symbol received from the symbol string input unit 103 and the preceding WFST read from the preceding WFST storage unit 201, the hypothesis developing unit 104 generates new hypotheses from the current set of hypotheses by updating the state transition process of each hypothesis with the newly received symbol, and sends them to the hypothesis correction unit 301. For each hypothesis received from the hypothesis developing unit 104, the hypothesis correction unit 301, following the succeeding WFST read from the succeeding WFST storage unit 202, finds the state transition process with the smallest cumulative weight among those possible in the succeeding WFST when the output symbol string of the hypothesis's state transition process is used as the input symbol string of the succeeding WFST, corrects the hypothesis by adding that cumulative weight to the cumulative weight of the hypothesis, and sends it to the hypothesis narrowing unit 105. The hypothesis narrowing unit 105 narrows down the hypotheses received from the hypothesis correction unit 301 by keeping, among hypotheses whose state transition processes overlap, only the one with the smallest cumulative weight and deleting the others.
If the input symbol string has been read to the end, the output symbol string corresponding to the state transition process of the hypothesis with the smallest cumulative weight is used as the input symbol string of the succeeding WFST, and the output symbol string of the state transition process with the smallest cumulative weight among those possible in the succeeding WFST is sent to the symbol string output unit 106. If the input symbol string has not been read to the end, the remaining hypotheses are sent to the hypothesis developing unit 104. The symbol string output unit 106 outputs the output symbol string received from the hypothesis narrowing unit 105.

(Symbol string conversion procedure)
FIG. 10 shows a specific procedure for converting a symbol string in one embodiment of the present invention.
First, take a "hypothesis" in the preceding WFST and use the symbol string output by its state transition process as the input symbol string of the succeeding WFST. In the present invention, one possible state transition process of the succeeding WFST under that input is called an "auxiliary hypothesis" to distinguish it from a "hypothesis". Since at least one auxiliary hypothesis always exists for each hypothesis, in this embodiment the set of auxiliary hypotheses for a hypothesis h is represented by the auxiliary hypothesis list L[h].
For a hypothesis h, s_A[h] denotes the state reached last in the state transition process of h, and W[h] denotes the (corrected) cumulative weight of that state transition process. For an auxiliary hypothesis g of h (g ∈ L[h]), s_B[g] denotes the state reached last in the state transition process of g, W[g] denotes the sum of the cumulative weight of the state transition process of h and the cumulative weight of the state transition process of g, and O[g] denotes the symbol string output in the state transition process of g.
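The quantities defined above can be pictured as two small records. The following is a minimal sketch in Python; the class and field names (`Hypothesis`, `AuxHypothesis`, `state`, `weight`, `output`, `aux`) are illustrative assumptions and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AuxHypothesis:
    """One path through the succeeding WFST (an auxiliary hypothesis g)."""
    state: int                    # s_B[g]: last state reached in the succeeding WFST
    weight: float                 # W[g]: hypothesis weight plus auxiliary path weight
    output: Tuple[str, ...] = ()  # O[g]: output symbols emitted so far


@dataclass
class Hypothesis:
    """One path through the preceding WFST (a hypothesis h)."""
    state: int                    # s_A[h]: last state reached in the preceding WFST
    weight: float                 # W[h]: (corrected) cumulative weight
    aux: List[AuxHypothesis] = field(default_factory=list)  # L[h]


def initial_hypothesis() -> Hypothesis:
    # Assumed starting point: both WFSTs in state 0, zero weight, no output yet.
    return Hypothesis(state=0, weight=0.0,
                      aux=[AuxHypothesis(state=0, weight=0.0)])
```

Keeping the auxiliary list inside the hypothesis record reflects the statement that every hypothesis carries at least one auxiliary hypothesis.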

The procedure of FIG. 10 is described below in comparison with the embodiment of FIG. 9.
The procedure starts at step S301. As initial settings, empty hypothesis lists H and H' are generated in step S302. In step S303, an initial hypothesis h is generated with s_A[h] = 0, W[h] = 0, and L[h] = φ (empty list), and is inserted into the hypothesis list H.
An auxiliary hypothesis g is also generated with s_B[g] = 0, W[g] = 0, and O[g] = φ (empty symbol string), and is inserted into the auxiliary hypothesis list L[h] of h.
In step S304, the symbol string input unit 103 reads one symbol and assigns it to x.
The following steps S305 to S308 are executed in the hypothesis expansion unit 104.
In step S305, one hypothesis is taken from the hypothesis list H and assigned to h, and the list E_A of state transitions possible from state s_A[h] with input symbol x is generated.
In step S306, if E_A = φ (empty list), the process proceeds to S311. Otherwise, it proceeds to S307.
In step S307, one state transition is taken from E_A and assigned to e_A.
In step S308, a new hypothesis f is generated with s_A[f] = n[e_A], W[f] = W[h] + w[e_A], and L[f] = L[h].
In step S309, if o[e_A] ≠ ε, the hypothesis correction unit 301 executes the procedure for correcting hypothesis f (described separately after this procedure).
In step S310, the hypothesis narrowing unit 105 narrows down the hypotheses by inserting hypothesis f into the hypothesis list H'. The process then returns to S306 to handle the remaining state transitions in E_A.

In step S311, if H = φ (all hypotheses have been expanded), the process proceeds to S312. Otherwise, it returns to S305 and expands the next hypothesis.
In step S312, all elements of the newly generated hypothesis list H' are moved to the now-empty H, and the process proceeds to S313.
In step S313, if the symbol string input unit 103 has a next input symbol, the process returns to S304. Otherwise, the input symbol string is judged to have been read completely, and the process proceeds to S314.
In step S314, the hypothesis h that has reached an end state (s_A[h] ∈ F_A) and has the smallest cumulative weight W[h] is selected from the hypothesis list H. From the auxiliary hypothesis list L[h] of that hypothesis, the auxiliary hypothesis g' that has reached an end state (s_B[g] ∈ F_B) and has the smallest cumulative weight W[g] is then selected, and its output symbol string O[g'] is output by the symbol string output unit 106 as the conversion result.
In step S315, the symbol string conversion procedure in one embodiment of the present invention ends.
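The loop structure of steps S303 to S314 can be sketched as follows. This is a simplified illustration rather than the procedure of FIG. 10 itself: hypotheses are reduced to a map from s_A[h] to W[h], auxiliary hypothesis lists are omitted, and the correction of S309 is a caller-supplied hook (`correct`, a hypothetical name) that defaults to doing nothing. The arc tuple layout and the use of the empty string for ε are also assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Arc of the preceding WFST: (source, input, output, weight, destination).
# The empty string stands for the epsilon output symbol.
Arc = Tuple[int, str, str, float, int]
Correct = Callable[[int, float, str], float]


def expand_symbol(hyps: Dict[int, float], x: str, arcs_a: List[Arc],
                  correct: Correct = lambda state, w, y: w) -> Dict[int, float]:
    """One pass of S305-S312 for input symbol x; hyps maps s_A[h] to W[h]."""
    new_hyps: Dict[int, float] = {}                 # H'
    for state, weight in hyps.items():              # S305 / S311
        for src, inp, out, w, dst in arcs_a:        # S306 / S307
            if src != state or inp != x:
                continue
            f_weight = weight + w                   # S308
            if out != "":                           # S309: non-epsilon output only
                f_weight = correct(dst, f_weight, out)
            # S310: keep the lighter hypothesis for each reached state
            if dst not in new_hyps or f_weight < new_hyps[dst]:
                new_hyps[dst] = f_weight
    return new_hyps                                 # S312: H' becomes H


def search(symbols: List[str], arcs_a: List[Arc], finals_a: set,
           correct: Correct = lambda state, w, y: w) -> Optional[Tuple[int, float]]:
    hyps = {0: 0.0}                                 # S303
    for x in symbols:                               # S304 / S313
        hyps = expand_symbol(hyps, x, arcs_a, correct)
    ends = {s: w for s, w in hyps.items() if s in finals_a}   # S314
    return min(ends.items(), key=lambda kv: kv[1]) if ends else None
```

With the default hook the function performs a plain minimum-weight search over the preceding WFST alone, which makes visible that the only extra work in the invention's procedure is the call at S309.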

(Hypothesis correction procedure)
Next, the hypothesis correction procedure in the hypothesis correction unit 301 is described.
Correction of a hypothesis in the present invention means the following. The output symbol string of the hypothesis's state transition process in the preceding WFST is used as the input symbol string of the succeeding WFST, the state transition process with the smallest cumulative weight among those possible in the succeeding WFST is found, and its cumulative weight is added to the cumulative weight of the hypothesis. Once the input symbol string X has been read, the corrected cumulative weight of a hypothesis that accepts X and outputs the symbol string Y is therefore the cumulative weight of the hypothesis in the preceding WFST plus the smallest cumulative weight with which the succeeding WFST accepts Y.

This agrees with the minimum of the sum of the cumulative weights of the two WFSTs in equation (2) described above.
Therefore, the symbol string conversion result is obtained by using the corrected cumulative weights to find the output symbol string Y of the preceding WFST that minimizes them, and then taking the output symbol string of the state transition process with the smallest cumulative weight among those possible when Y is the input symbol string of the succeeding WFST.

In the present invention, hypotheses are corrected efficiently by using the auxiliary hypothesis list associated with each hypothesis.
Let f be the hypothesis to be corrected, L[f] its auxiliary hypothesis list, and y the output symbol of the last state transition of f.
From the state s_B[g] of the succeeding WFST (g is one auxiliary hypothesis in L[f]), an auxiliary hypothesis j is generated that reaches the next state n[e_B] through the state transition e_B with input symbol y, and its arrival state is set to s_B[j] = n[e_B]. In addition, let t be the state, in the state transition process of f, that had been reached when the correction was last computed before the last state transition (e_A) of f.
The cumulative weight W[j] of auxiliary hypothesis j can then be computed by adding the following three terms.
(A1) The cumulative weight of the auxiliary hypothesis g generated at state t in the state transition process of f:
This value equals W[g]. Along state transitions whose output symbol is ε, the auxiliary hypothesis list is passed unchanged to the hypothesis of the following state transition, so the auxiliary hypothesis list generated at state t equals L[f], and the cumulative weight W[g] of each auxiliary hypothesis in L[f] is used as it is.
(A2) The weight accumulated from state t to state s_A[f]:
Because the auxiliary hypothesis list is passed unchanged to newly generated hypotheses along state transitions whose output symbol is ε, the corrected cumulative weight of the hypothesis on reaching state t is the smallest cumulative weight in the current auxiliary hypothesis list L[f]. This term is therefore

AW[f] = W[f] − min_{g ∈ L[f]} W[g]

(A3) The weight of the state transition e_B from state s_B[g] to state s_B[j] in the succeeding WFST: w[e_B]
The cumulative weight of auxiliary hypothesis j is therefore computed as W[j] = W[g] + AW[f] + w[e_B].
The hypothesis f is then corrected by replacing W[f] with the minimum of the cumulative weights W[j] of the newly generated auxiliary hypotheses j:

W[f] = min_j W[j]
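As a quick arithmetic check of the three-term sum, the sketch below evaluates W[j] = W[g] + AW[f] + w[e_B] with AW[f] taken as W[f] minus the smallest weight in L[f], as in step S403. The function name `aux_weight` and its parameter names are illustrative assumptions; the numbers in the usage lines are those of the "だ" step in the worked example later in the text.

```python
from typing import List


def aux_weight(w_g: float, w_f: float, aux_weights: List[float], w_eb: float) -> float:
    """Cumulative weight W[j] of a newly generated auxiliary hypothesis j."""
    aw_f = w_f - min(aux_weights)   # (A2): weight accumulated since the last correction
    return w_g + aw_f + w_eb        # (A1) + (A2) + (A3)


# W[f] = 2 and L[f] holds weights 1, 1, 2, 2, so AW[f] = 1.
print(aux_weight(1, 2, [1, 1, 2, 2], 5))  # g = (1, 雨, 1), w[e_B] = 5
print(aux_weight(2, 2, [1, 1, 2, 2], 1))  # g = (2, 飴, 2), w[e_B] = 1
```

The two calls reproduce the sums 1 + 1 + 5 = 7 and 2 + 1 + 1 = 4 stated in the worked example.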

(Procedure for correcting a hypothesis)
The hypothesis correction procedure is described below with reference to FIG. 11.
This procedure is executed in step S309 shown in FIG. 10. Let f be the hypothesis corrected in this procedure and y (= o[e_A]) the output symbol from the preceding WFST.
In step S401, the procedure for correcting hypothesis f starts.
In step S402, an empty auxiliary hypothesis list L' is generated.
In step S403, the weight correction term AW[f] is obtained as AW[f] = W[f] − W[g'], where g' is the auxiliary hypothesis with the smallest cumulative weight in the auxiliary hypothesis list L[f] of hypothesis f.
In step S404, one auxiliary hypothesis is taken from L[f] and assigned to g. The list E_B of state transitions possible from state s_B[g] with input symbol y is generated, and the process proceeds to step S405.
In step S405, if E_B is empty (auxiliary hypotheses have been expanded for all state transitions from s_B[g]), the process proceeds to S409. Otherwise, it proceeds to S406 and examines the next state transition.
In step S406, one state transition is taken from the list E_B and assigned to e_B.
In step S407, an auxiliary hypothesis j is generated with s_B[j] = n[e_B], W[j] = W[g] + AW[f] + w[e_B], and O[j] = O[g]·o[e_B].
In step S408, the auxiliary hypothesis j is inserted into the auxiliary hypothesis list L', and the process returns to S405.
In step S409, if L[f] is empty (all auxiliary hypotheses have been expanded), the process proceeds to S410. Otherwise, it returns to S404 and expands the next auxiliary hypothesis.
In step S410, all elements of the newly generated auxiliary hypothesis list L' are moved to the now-empty auxiliary hypothesis list L[f], and the process proceeds to S411.
In step S411, W[f] = W[g"] is set, where g" is the auxiliary hypothesis with the smallest cumulative weight in the auxiliary hypothesis list L[f].
In step S412, the hypothesis correction procedure in one embodiment of the present invention ends.
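Steps S402 to S411 can be sketched as a single function. The tuple layouts, the function name `correct_hypothesis`, and the use of string concatenation for O[g]·o[e_B] are assumptions made for illustration. Two further choices are not stated in the step list: L' keeps only the lightest auxiliary hypothesis per state of the succeeding WFST (as the worked example later does), and a hypothesis whose auxiliary hypotheses all fail to advance is given infinite weight.

```python
from typing import Dict, List, Tuple

Aux = Tuple[int, str, float]                # (s_B, O, W)
ArcB = Tuple[int, str, str, float, int]     # (source, input, output, weight, destination)


def correct_hypothesis(w_f: float, aux_list: List[Aux], y: str,
                       arcs_b: List[ArcB]) -> Tuple[float, List[Aux]]:
    """Return the corrected W[f] and the new auxiliary hypothesis list L[f]."""
    if not aux_list:                                   # assumption: nothing to extend
        return float("inf"), []
    aw_f = w_f - min(w for _, _, w in aux_list)        # S403
    best: Dict[int, Aux] = {}                          # S402: L', keyed by s_B
    for s_g, o_g, w_g in aux_list:                     # S404 / S409
        for src, inp, out, w, dst in arcs_b:           # S405 / S406
            if src != s_g or inp != y:
                continue
            j = (dst, o_g + out, w_g + aw_f + w)       # S407
            if dst not in best or j[2] < best[dst][2]: # S408, pruned per state
                best[dst] = j
    new_list = list(best.values())                     # S410
    if not new_list:                                   # assumption: dead end
        return float("inf"), []
    return min(w for _, _, w in new_list), new_list    # S411
```

Called with W[f] = 2, the list {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}, and y = 玉 against the succeeding-WFST transitions quoted in the worked example, the sketch returns weight 4 and the single auxiliary hypothesis (4, 飴玉, 4).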

In the present invention, when a hypothesis is inserted into the hypothesis list, as in the conventional method, if the list already contains a hypothesis that has reached the same state of the preceding WFST, the one with the smaller cumulative weight is kept in the list, and the hypotheses are thereby narrowed down. Because auxiliary hypotheses can be lost through this narrowing, the auxiliary hypothesis list of the deleted hypothesis is concatenated to the auxiliary hypothesis list of the surviving hypothesis. The cumulative weights of the auxiliary hypotheses in the lists must be corrected when doing so. When hypothesis h and hypothesis f have reached the same state of the preceding WFST, the cumulative weights of the auxiliary hypotheses are corrected as

W[g] = W[g] + AW[h] for g ∈ L[h], W[g] = W[g] + AW[f] for g ∈ L[f]

This correction equals the sum of terms (A1) and (A2) in the hypothesis correction procedure described above. The auxiliary hypothesis lists are then concatenated as follows:
If W[h] < W[f], then L[h] = L[h] + L[f] and f is deleted.
If W[h] > W[f], then L[f] = L[h] + L[f] and h is deleted.
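The narrowing with list concatenation can be sketched as below. The names `rebase` and `insert_hypothesis` are hypothetical, hypotheses are stored in a dictionary keyed by s_A, and AW is computed as W minus the smallest weight in the hypothesis's own auxiliary list, as in step S403. The text does not say what happens when W[h] = W[f]; this sketch merges the lists in the same way for that case, which is an assumption.

```python
from typing import Dict, List, Tuple

Aux = Tuple[int, str, float]        # (s_B, O, W)
Hyp = Tuple[float, List[Aux]]       # (W, L); s_A is the dictionary key


def rebase(w_h: float, aux_list: List[Aux]) -> List[Aux]:
    """Add AW[h] = W[h] - min W[g] to every auxiliary weight in L[h]."""
    if not aux_list:
        return []
    aw = w_h - min(w for _, _, w in aux_list)
    return [(s, o, w + aw) for s, o, w in aux_list]


def insert_hypothesis(new_hyps: Dict[int, Hyp], state: int,
                      w_f: float, aux_f: List[Aux]) -> None:
    """Insert f into H', concatenating auxiliary lists on a state collision."""
    if state not in new_hyps:
        new_hyps[state] = (w_f, aux_f)
        return
    w_h, aux_h = new_hyps[state]
    merged = rebase(w_h, aux_h) + rebase(w_f, aux_f)   # L[h] + L[f]
    new_hyps[state] = (min(w_h, w_f), merged)          # the lighter hypothesis survives
```

Inserting (5, 1, {(1, 雨, 0), (3, 雨, 0)}) and then (5, 2, {(2, 飴, 0), (4, 飴, 0)}) leaves a single hypothesis of weight 1 whose list carries the weights 1, 1, 2, 2, which matches the merged hypothesis in the "め" step of the worked example.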

Apart from the hypothesis correction procedure (step S309), the symbol string conversion procedure of the present invention (FIG. 10) is almost the same computation as the procedure that searches using only the preceding WFST (FIG. 4), so it requires less computation than the conventional symbol string conversion procedure (FIG. 8). The hypothesis correction procedure does add computational load, but it is executed only when the output symbol of a state transition in the preceding WFST is not ε. The larger the proportion of ε output symbols in the preceding WFST, the greater the reduction in processing.

(Process of obtaining the output symbol string for the input symbol string “あ, め, だ, ま”)
Next, the process of obtaining the output symbol string for the input symbol string “あ, め, だ, ま” according to the symbol string conversion procedure of the present invention is described step by step, with FIG. 5 as WFST A and FIG. 6 as WFST B. A hypothesis (current state number s_A of WFST A, cumulative weight W, auxiliary hypothesis list L) is written (s_A, W, L), and an auxiliary hypothesis (current state number s_B of WFST B, output symbol string O, cumulative weight W) is written (s_B, O, W). An auxiliary hypothesis list L is written {(s_B, O, W), (s'_B, O', W'), ...}. The WFST symbols 雨 (rain), 飴 (candy), and 玉 (ball) are kept in the original Japanese.
Starting from S301, empty hypothesis lists H and H' are created in S302.
In S303, the hypothesis (0, 0, {(0, φ, 0)}) is inserted into the hypothesis list H.

Reading “あ”
In S304 the symbol “あ” is read, and in S305 the hypothesis (0, 0, {(0, φ, 0)}) is taken from the hypothesis list H.
From the current state 0 of WFST A for this hypothesis, the list E_A containing the state transitions with input symbol “あ”,
<0→1, あ:雨/0>,
<0→2, あ:飴/0>
is created.
In S306, E_A is not empty, so the process proceeds to S307, where the state transition <0→1, あ:雨/0> is taken from E_A, and in S308 a new hypothesis (1, 0, {(0, φ, 0)}) is generated. In S309, the output symbol of <0→1, あ:雨/0> is not ε, so the process proceeds to S401 to correct the hypothesis (1, 0, {(0, φ, 0)}).
In S402, an empty auxiliary hypothesis list L' is generated.
In S403, the weight AW = 0.
In S404, the auxiliary hypothesis (0, φ, 0) is taken from the auxiliary hypothesis list L = {(0, φ, 0)}, and the list E_B containing the state transitions possible from state 0 of WFST B with input symbol “雨”,
<0→1, 雨:雨/0>,
<0→3, 雨:雨/0>
is created.
In S405, E_B is not empty, so the process proceeds to S406. The state transition <0→1, 雨:雨/0> is taken from E_B, the auxiliary hypothesis (1, 雨, 0) is generated in S407, and it is inserted into L' in S408.
Returning to S405, E_B is not empty, so the process proceeds to S406. The state transition <0→3, 雨:雨/0> is taken from E_B, the auxiliary hypothesis (3, 雨, 0) is generated in S407, and it is inserted into L' in S408.
Returning to S405, E_B is empty, so the process proceeds to S409. The auxiliary hypothesis list L is empty, so the process proceeds to S410, and the elements (1, 雨, 0) and (3, 雨, 0) of L' are moved to L.
In S411 the cumulative weight of the hypothesis becomes 0, and the correction of the hypothesis ends in S412. As a result, the hypothesis is (1, 0, {(1, 雨, 0), (3, 雨, 0)}).

Returning to S310, the hypothesis (1, 0, {(1, 雨, 0), (3, 雨, 0)}) is inserted into H'.
Returning to S306, E_A is not empty, so the process proceeds to S307, where the state transition <0→2, あ:飴/0> is taken from E_A, and in S308 a new hypothesis (2, 0, {(0, φ, 0)}) is generated. In S309, the output symbol of <0→2, あ:飴/0> is not ε, so the process proceeds to S401 to correct the hypothesis (2, 0, {(0, φ, 0)}).
In S402, an empty auxiliary hypothesis list L' is generated.
In S403, the weight AW = 0.
In S404, the auxiliary hypothesis (0, φ, 0) is taken from the auxiliary hypothesis list L = {(0, φ, 0)}, and the list E_B containing the state transitions possible from state 0 of WFST B with input symbol “飴”,
<0→2, 飴:飴/0>,
<0→4, 飴:飴/0>
is created.
In S405, E_B is not empty, so the process proceeds to S406. The state transition <0→2, 飴:飴/0> is taken from E_B, the auxiliary hypothesis (2, 飴, 0) is generated in S407, and it is inserted into L' in S408.
Returning to S405, E_B is not empty, so the process proceeds to S406. The state transition <0→4, 飴:飴/0> is taken from E_B, the auxiliary hypothesis (4, 飴, 0) is generated in S407, and it is inserted into L' in S408.
Returning to S405, E_B is empty, so the process proceeds to S409. The auxiliary hypothesis list L is empty, so the process proceeds to S410, and the elements (2, 飴, 0) and (4, 飴, 0) of L' are moved to L.
In S411 the cumulative weight of the hypothesis becomes 0, and the correction of the hypothesis ends in S412. As a result, the hypothesis is (2, 0, {(2, 飴, 0), (4, 飴, 0)}).
Returning to S310, the hypothesis (2, 0, {(2, 飴, 0), (4, 飴, 0)}) is inserted into H'.
Returning to S306, E_A is empty, so the process proceeds to S311. Since H is empty, the process proceeds to S312, where the hypotheses (1, 0, {(1, 雨, 0), (3, 雨, 0)}) and (2, 0, {(2, 飴, 0), (4, 飴, 0)}) in H' are moved to H. In S313 a next input symbol exists, so the process returns to S304.

Reading “め”
Next, the symbol “め” is read in S304, and in S305 the hypothesis (1, 0, {(1, 雨, 0), (3, 雨, 0)}) is taken from the hypothesis list H. From the current state 1 of WFST A for this hypothesis, the list E_A containing the state transition with input symbol “め”,
<1→5, め:ε/1>
is created.
In S306, E_A is not empty, so the process proceeds to S307, where the state transition <1→5, め:ε/1> is taken from E_A. In S308 a new hypothesis (5, 1, {(1, 雨, 0), (3, 雨, 0)}) is generated, and the process proceeds to S309.
Since the output symbol of <1→5, め:ε/1> is ε, the process proceeds to S310, where the hypothesis (5, 1, {(1, 雨, 0), (3, 雨, 0)}) is inserted into H'.
Returning to S306, E_A is empty, so the process proceeds to S311. Since H is not empty, the process returns to S305, and the hypothesis (2, 0, {(2, 飴, 0), (4, 飴, 0)}) is taken from the hypothesis list H. From the current state 2 of this hypothesis, the list E_A containing the state transition with input symbol “め”,
<2→5, め:ε/2>
is created.

In S306, E_A is not empty, so the process proceeds to S307, where the state transition <2→5, め:ε/2> is taken from E_A. In S308 a new hypothesis (5, 2, {(2, 飴, 0), (4, 飴, 0)}) is generated, and the process proceeds to S309.
Since the output symbol of <2→5, め:ε/2> is ε, the process proceeds to S310 to insert the hypothesis (5, 2, {(2, 飴, 0), (4, 飴, 0)}) into H'.
However, H' already contains the hypothesis (5, 1, {(1, 雨, 0), (3, 雨, 0)}), which has also reached state 5 of WFST A. The hypothesis with the smaller cumulative weight, (5, 1, {(1, 雨, 0), (3, 雨, 0)}), is kept and the hypothesis (5, 2, {(2, 飴, 0), (4, 飴, 0)}) is deleted, but the auxiliary hypothesis lists are concatenated, so that the hypothesis becomes (5, 1, {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}).
Returning to S306, E_A is empty, so the process proceeds to S311. Since H is empty, the process proceeds to S312, where the hypothesis (5, 1, {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}) in H' is moved to H. In S313 a next input symbol exists, so the process returns to S304.

Reading “だ”
Next, the symbol “だ” is read in S304, and in S305 the hypothesis (5, 1, {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}) is taken from the hypothesis list H. From the current state 5 of WFST A for this hypothesis, the list E_A containing the state transition with input symbol “だ”,
<0→4, だ:玉/1>
is created. Because WFST A can move from the current state 5 to state 0 without an input symbol, the state transition of WFST A contained in E_A is <0→4, だ:玉/1>.
In S306, E_A is not empty, so the process proceeds to S307, where the state transition <0→4, だ:玉/1> is taken from E_A. In S308 a new hypothesis (4, 2, {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}) is generated. In S309, the output symbol of <0→4, だ:玉/1> is not ε, so the process proceeds to S401 to correct the hypothesis (4, 2, {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}).
In S402, an empty auxiliary hypothesis list L' is generated.
In S403, the weight AW[f] = 2 − min(1, 1, 2, 2) = 1.
In S404, the auxiliary hypothesis (1, 雨, 1) is taken from the auxiliary hypothesis list L = {(1, 雨, 1), (3, 雨, 1), (2, 飴, 2), (4, 飴, 2)}, and the list E_B containing the state transition possible from state 1 of WFST B with input symbol “玉”,
<1→4, 玉:玉/5>
is created.

In S405, E_B is not empty, so the process proceeds to S406. The state transition <1→4, 玉:玉/5> is taken from E_B, and in S407 the auxiliary hypothesis (4, 雨玉, 7) is generated. Here the cumulative weight of the auxiliary hypothesis is computed as
W[g] + AW[f] + w[e_B] = 1 + 1 + 5 = 7
In S408, the auxiliary hypothesis (4, 雨玉, 7) is inserted into L'.
Returning to S405, E_B is empty, so the process proceeds to S409.
Since L is not empty, the process proceeds to S404 and the auxiliary hypothesis (3, 雨, 1) is taken. No state transition is possible from state 3 of WFST B with input symbol “玉”, so E_B = φ.
In S405, E_B is empty, so the process proceeds to S409. Since the auxiliary hypothesis list L is not empty, the process proceeds to S404, the auxiliary hypothesis (2, 飴, 2) is taken, and the list E_B containing the state transition possible from state 2 of WFST B with input symbol “玉”,
<2→4, 玉:玉/1>
is created.
In S405, E_B is not empty, so the process proceeds to S406. The state transition <2→4, 玉:玉/1> is taken from E_B, and in S407 the auxiliary hypothesis (4, 飴玉, 4) is generated. Here the cumulative weight of the auxiliary hypothesis is computed as
W[g] + AW[f] + w[e_B] = 2 + 1 + 1 = 4
In S408, the auxiliary hypothesis (4, 飴玉, 4) is inserted into L'. At this point L' already contains the auxiliary hypothesis (4, 雨玉, 7), which has reached the same state, so the auxiliary hypothesis with the smaller cumulative weight, (4, 飴玉, 4), is kept and (4, 雨玉, 7) is deleted.

Returning to S405, E_B is empty, so the process proceeds to S409.
Since L is not empty, the process proceeds to S404 and the auxiliary hypothesis (4, 飴, 2) is taken. No state transition is possible from state 4 of WFST B with input symbol “玉”, so E_B = φ.
In S405, E_B is empty, so the process proceeds to S409. The auxiliary hypothesis list L is empty, so the process proceeds to S410, and the element (4, 飴玉, 4) of L' is moved to L.
In S411 the cumulative weight of the hypothesis becomes 4, and the correction of the hypothesis ends in S412. As a result, the hypothesis is (4, 4, {(4, 飴玉, 4)}).
Returning to S310, the hypothesis (4, 4, {(4, 飴玉, 4)}) is inserted into H'.
Returning to S306, E_A is empty, so the process proceeds to S311. Since H is empty, the process proceeds to S312, where the hypothesis (4, 4, {(4, 飴玉, 4)}) in H' is moved to H. In S313 a next input symbol exists, so the process returns to S304.

Reading “ま”
Next, the symbol “ま” is read in S304, and in S305 the hypothesis (4, 4, {(4, 飴玉, 4)}) is taken from the hypothesis list H. From the current state 4 of WFST A for this hypothesis, the list E_A containing the state transition with input symbol “ま”,
<4→5, ま:ε/1>
is created.
In S306, E_A is not empty, so the process proceeds to S307, where the state transition <4→5, ま:ε/1> is taken from E_A. In S308 a new hypothesis (5, 5, {(4, 飴玉, 4)}) is generated, and the process proceeds to S309.
Since the output symbol of <4→5, ま:ε/1> is ε, the process proceeds to S310, where the hypothesis (5, 5, {(4, 飴玉, 4)}) is inserted into H'.
Returning to S306, E_A is empty, so the process proceeds to S311. Since H is empty, the process proceeds to S312, where the hypothesis (5, 5, {(4, 飴玉, 4)}) in H' is moved to H. In S313 no next input symbol exists, so the process proceeds to S314.
In S314, the hypothesis (5, 5, {(4, 飴玉, 4)}) in H is the only hypothesis that has reached an end state, and the output symbol string “飴玉” of its auxiliary hypothesis is output as the conversion result. The symbol string conversion process ends in S315.

(Speech recognition)
The present invention can also be applied to speech recognition so that speech recognition is performed efficiently.
FIG. 12 shows an embodiment of the present invention.
The speech signal sent from the speech signal input unit 401, which inputs speech, is converted into an acoustic feature symbol string by the acoustic feature symbol string extraction unit 405, which extracts the time series of short-time acoustic patterns of the speech as a symbol string. The acoustic feature symbol string is sent as input to the symbol string conversion unit 102, which performs symbol string conversion according to the present invention. The symbol string conversion unit 102 reads three models: from the acoustic model storage unit 401, an acoustic model that gives the standard features of fixed speech units (for example, phonemes) by matching against the sequence of acoustic patterns obtained by analyzing the speech signal at short intervals (for example, every 10 milliseconds); from the word pronunciation dictionary storage unit 402, a word pronunciation dictionary that gives the pronunciation of various words as sequences of the fixed speech units; and from the speech recognition language model storage unit 403, a language model that gives how readily uttered words follow one another. It then reads the acoustic feature symbol string sent from the acoustic feature symbol string extraction unit 405, finds the output symbol string with the smallest cumulative weight, and sends it to the symbol string output unit 106. The symbol string output unit 106 outputs the received output symbol string as the speech recognition result.

A method of describing a word pronunciation dictionary and a language model for speech recognition as WFSTs is disclosed, for example, in M. Mohri, F. Pereira, M. Riley, "Weighted finite-state transducers in speech recognition", Proceedings of ASR2000, pp. 97-106, 2000, presented at the international conference ASR2000.
As an acoustic model representing the set of standard acoustic pattern sequences of the various fixed speech units (for example, phonemes), the mainstream approach is the hidden Markov model (HMM) method, which models the set of acoustic pattern sequences on the basis of probability and statistical theory. Details of the HMM method are disclosed, for example, in Seiichi Nakagawa, "Speech Recognition by Probabilistic Models" (確率モデルによる音声認識), Institute of Electronics, Information and Communication Engineers.
In speech recognition, the score of each acoustic feature symbol (acoustic pattern) computed by the acoustic model is used as the weight of the preceding-stage WFST. A larger score means that the input acoustic pattern is closer to the fixed speech unit represented by the acoustic model, so the negated acoustic score is used as the weight. Acoustic scores computed with a hidden Markov model use, for example, probability values based on Gaussian distributions.
Acoustic patterns used for speech recognition include the mel-cepstrum (mel-frequency cepstral coefficients, MFCC), delta MFCC, the LPC cepstrum, and log power, each obtained by analyzing the speech signal at short intervals (for example, every 10 milliseconds).
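The use of a negated acoustic score as a weight can be illustrated with a small sketch. It assumes the score is the log-likelihood of one feature vector under a single Gaussian with diagonal covariance, which is only one possible choice; the text states only that probability values based on Gaussian distributions are used. The function name neg_log_gaussian and all numbers are hypothetical.

```python
import math


def neg_log_gaussian(x, mean, var):
    """Return -log N(x; mean, diag(var)), so a closer match gives a smaller weight."""
    if not (len(x) == len(mean) == len(var)):
        raise ValueError("x, mean and var must have the same length")
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        # Per-dimension negative log density of a univariate Gaussian.
        total += 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


# Hypothetical two-dimensional feature vector for one analysis frame.
frame = [1.0, -0.5]
close = neg_log_gaussian(frame, mean=[1.0, -0.5], var=[1.0, 1.0])
far = neg_log_gaussian(frame, mean=[3.0, 2.0], var=[1.0, 1.0])
print(close, far)
```

Because the sign is flipped, the model whose mean is nearer to the frame receives the smaller weight, which is what a minimum-cumulative-weight search needs.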

FIG. 1 is a diagram showing an example of a weighted finite state transducer.
FIG. 2 is a diagram defining the weighted finite state transducer of FIG. 1 as a table.
FIG. 3 is a diagram showing an embodiment of a symbol conversion method using a weighted finite state transducer.
FIG. 4 is a diagram explaining the procedure of the symbol conversion method using a weighted finite state transducer.
FIG. 5 is a diagram showing an example of a weighted finite state transducer that converts a symbol string representing a sequence of kana into the sequence of kanji corresponding to that kana.
FIG. 6 is a diagram showing an example of a weighted finite state transducer that gives, as weights, the possibility of connections between kanji sequences in Japanese.
FIG. 7 is a diagram showing an embodiment of the conventional symbol string conversion method using two weighted finite state transducers.
FIG. 8 is a diagram explaining the procedure of the symbol string conversion method using two weighted finite state transducers.
FIG. 9 is a diagram showing an embodiment of the symbol string conversion method using two weighted finite state transducers according to the present invention.
FIG. 10 is a diagram explaining the procedure of the symbol string conversion method using two weighted finite state transducers according to the present invention.
FIG. 11 is a diagram explaining the hypothesis correction method in the symbol string conversion method using two weighted finite state transducers according to the present invention.
FIG. 12 is a diagram showing an embodiment in which the symbol string conversion method using two weighted finite state transducers according to the present invention is used as a speech recognition method.

Explanation of symbols

10 State
11 End state
12 State transition
101 WFST storage unit
102 Symbol string conversion unit
103 Symbol string input unit
104 Hypothesis expansion unit
105 Hypothesis narrowing unit
106 Symbol string output unit
201 Preceding-stage WFST storage unit
202 Subsequent-stage WFST storage unit
301 Hypothesis correction unit
401 Acoustic model storage unit
402 Word dictionary WFST storage unit
403 Language model WFST storage unit
404 Speech signal input unit
405 Speech feature symbol string extraction unit

Claims

A symbol string conversion method using:
a symbol string input unit that reads a symbol string sequentially;
a symbol string conversion unit that converts the symbol string in two stages using two weighted finite state transducers, each of which converts a symbol string by state transitions, namely a preceding-stage weighted finite state transducer used in the first stage and a subsequent-stage weighted finite state transducer used in the second stage; and
a symbol string output unit that outputs the conversion result obtained by the subsequent-stage weighted finite state transducer;
the method reading symbols sequentially from the symbol string input unit and, once the input symbol string has been read, outputting from the symbol string output unit the output symbol string corresponding to the state transition process of the subsequent-stage weighted finite state transducer for which the accumulated value of the weights on the state transitions applied in the preceding-stage and subsequent-stage weighted finite state transducers of the symbol string conversion unit (the cumulative weight) is minimum, characterized in that:
while the symbols are read one at a time, for each hypothesis representing one state transition process of the preceding-stage weighted finite state transducer, the state transition process with the minimum cumulative weight is computed among the state transition processes possible in the subsequent-stage weighted finite state transducer when the output symbol string of the hypothesis's state transition process is given as the input symbol string of the subsequent-stage weighted finite state transducer, and the cumulative weight of the hypothesis is corrected by adding that minimum cumulative weight to it; and
when the entire input symbol string has been read, the output symbol string corresponding to the state transition process of the hypothesis with the minimum cumulative weight is given as the input symbol string of the subsequent-stage weighted finite state transducer, and the output symbol string for the state transition process with the minimum cumulative weight among the state transition processes then possible in the subsequent-stage weighted finite state transducer is taken as the symbol string conversion result.

A speech recognition method in which an input speech signal is converted into a symbol string representing its acoustic features, and that symbol string is converted into an output symbol string corresponding to the input speech signal using the symbol string conversion method according to claim 1.

A symbol string conversion device comprising:
a symbol string input unit that reads a symbol string sequentially;
a symbol string conversion unit that converts the symbol string in two stages using two weighted finite state transducers, each of which converts a symbol string by state transitions, namely a preceding-stage weighted finite state transducer used in the first stage and a subsequent-stage weighted finite state transducer used in the second stage; and
a symbol string output unit that outputs the conversion result obtained by the subsequent-stage weighted finite state transducer;
the device reading symbols sequentially from the symbol string input unit and, once the input symbol string has been read, outputting from the symbol string output unit the output symbol string corresponding to the state transition process of the subsequent-stage weighted finite state transducer for which the accumulated value of the weights on the state transitions applied in the preceding-stage and subsequent-stage weighted finite state transducers of the symbol string conversion unit (the cumulative weight) is minimum,
wherein the symbol string conversion unit comprises:
means for computing, while the symbols are read one at a time and for each hypothesis representing one state transition process of the preceding-stage weighted finite state transducer, the state transition process with the minimum cumulative weight among the state transition processes possible in the subsequent-stage weighted finite state transducer when the output symbol string of the hypothesis's state transition process is given as the input symbol string of the subsequent-stage weighted finite state transducer, and for correcting the cumulative weight of the hypothesis by adding that minimum cumulative weight to it; and
means for giving, when the entire input symbol string has been read, the output symbol string corresponding to the state transition process of the hypothesis with the minimum cumulative weight as the input symbol string of the subsequent-stage weighted finite state transducer, and for taking as the symbol string conversion result the output symbol string for the state transition process with the minimum cumulative weight among the state transition processes then possible in the subsequent-stage weighted finite state transducer.

A speech recognition device in which an input speech signal is converted into a symbol string representing its acoustic features, and that symbol string is converted into an output symbol string corresponding to the input speech signal using the symbol string conversion device according to claim 2.