JP4341390B2

JP4341390B2 - Error correction method and apparatus for label sequence matching and program, and computer-readable storage medium storing label sequence matching error correction program

Info

Publication number: JP4341390B2
Application number: JP2003401670A
Authority: JP
Inventors: 幸紀南田; 正志森本
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-12-01
Filing date: 2003-12-01
Publication date: 2009-10-07
Anticipated expiration: 2023-12-01
Also published as: JP2005165538A

Description

The present invention relates to an error correction method, apparatus, and program for label sequence matching, and to a computer-readable storage medium storing a label sequence matching error correction program. More particularly, it relates to such a method, apparatus, program, and storage medium for interactively correcting the matching errors that arise when label sequences are matched, for example when comparing the amino acid sequences of proteins.

Matching (or alignment) of label sequences is used in many situations. It is the technique used, for example, to align similar sentences word by word so that their mismatch is minimized, or to compare the amino acid sequences of proteins.

In recent years, advances in multimedia technology have made it possible to handle video and audio on computers. Accordingly, attempts have been made to find matchings across different media, for example associating the textual information of a manuscript or script with the audio signal spoken by a speaker in a video program (a video program normally includes audio) (see, for example, Non-Patent Documents 1 and 2).

Mathematically, the matching problem can be regarded as a minimization problem under some cost function. Let A and B be sets of labels, with a an element of A and b an element of B (a ∈ A, b ∈ B). A sequence of elements of A, S_A = {a_1, a_2, a_3, …, a_n} (a_i ∈ A, 1 ≤ i ≤ n), is called a label sequence of A, and a sequence of elements of B, S_B = {b_1, b_2, b_3, …, b_m} (b_i ∈ B, 1 ≤ i ≤ m), is called a label sequence of B.

Let w denote the result of matching label sequence S_A against label sequence S_B. w is a mapping from the integers [1, n] to the integers [1, m], meaning that the i-th label a_i of S_A is associated with the w(i)-th label b_{w(i)} of S_B. Some constraints are generally imposed on w. For example, one may impose the constraint that the order of the labels is not reversed (if i < j then w(i) ≤ w(j)), or a slope constraint under which the value of w does not change too abruptly (w(i+1) ≤ w(i) + α, where α is a constant).

Let c(a, b) be the cost of associating label a with label b. The cost c(a, b) is a function of a and b that takes a smaller value the more similar a and b are, and takes its minimum value 0 when they match exactly.

The cost C of a matching result w between label sequence S_A and label sequence S_B is defined by Equation 1:

C(w) = \sum_{i=1}^{n} c(a_i, b_{w(i)})   (Equation 1)

The w that gives the minimum cost C is defined as the optimal matching result w_0 between S_A and S_B. C(w) is a function that maps a matching result to a scalar and is called the cost function. The problem of matching two sequences thus reduces to a minimization problem under a cost function.
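The definition of the cost function can be illustrated with a minimal Python sketch. The function names `total_cost` and `mismatch_cost`, the 0-based indexing, and the 0/1 cost are assumptions made for illustration only; the specification requires only that c(a, b) be smaller for more similar labels and 0 for identical ones.

```python
from typing import Callable, Sequence


def total_cost(seq_a: Sequence[str], seq_b: Sequence[str],
               w: Sequence[int],
               cost: Callable[[str, str], float]) -> float:
    """Sum c(a_i, b_{w(i)}) over all labels of S_A (indices are 0-based here)."""
    return sum(cost(a, seq_b[w[i]]) for i, a in enumerate(seq_a))


def mismatch_cost(a: str, b: str) -> float:
    """Hypothetical cost: 0 for identical labels, 1 otherwise."""
    return 0.0 if a == b else 1.0


seq_a = ["x", "y", "z"]
seq_b = ["x", "q", "y", "z"]
print(total_cost(seq_a, seq_b, [0, 2, 3], mismatch_cost))  # 0.0: every pair agrees
print(total_cost(seq_a, seq_b, [0, 1, 3], mismatch_cost))  # 1.0: "y" paired with "q"
```

The optimal matching w_0 is whichever admissible w makes this sum smallest.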

A general-purpose minimization algorithm can be used to find the matching. For example, exhaustive search or simulated annealing can be used (see, for example, Non-Patent Document 3). In particular, when the order of the labels is never reversed, a fast method called DP matching, which is based on dynamic programming (DP), is widely used.
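As an illustration of DP matching under the order-preserving constraint (i < j implies w(i) ≤ w(j)), the following is a minimal sketch and not the implementation used in the invention. It assumes the costs are given as an n × m table `cost[i][j]` = c(a_i, b_j) with 0-based indices, and it does not enforce the slope constraint; the name `dp_match` is hypothetical.

```python
def dp_match(cost):
    """Return (w, total) minimizing sum_i cost[i][w[i]] with w non-decreasing."""
    n, m = len(cost), len(cost[0])
    # best[i][j]: minimum cost of matching a_0..a_i when a_i is paired with b_j
    best = [list(cost[0])] + [[0.0] * m for _ in range(n - 1)]
    back = [[0] * m for _ in range(n)]
    for i in range(1, n):
        arg = 0  # position of the smallest best[i-1][k] for k <= j
        for j in range(m):
            if best[i - 1][j] < best[i - 1][arg]:
                arg = j
            best[i][j] = cost[i][j] + best[i - 1][arg]
            back[i][j] = arg
    # Pick the cheapest end point and trace the path backwards.
    j = min(range(m), key=lambda k: best[n - 1][k])
    total, w = best[n - 1][j], [0] * n
    for i in range(n - 1, -1, -1):
        w[i], j = j, back[i][j]
    return w, total


cost = [[0, 1, 1],
        [1, 1, 0],
        [1, 1, 0]]
print(dp_match(cost))  # ([0, 2, 2], 0)
```

Because the running minimum is carried along each row, the table is filled in O(nm) time, which is what makes this approach much faster than exhaustive search.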

These minimization algorithms find the best solution under a given cost function.

The minimization problem is also called a cost minimization problem, an energy minimization problem, an optimization problem, and so on. Since reversing the sign of the cost turns it into a maximization problem, the minimization problem and the maximization problem are essentially the same.

Some terms used in this specification are defined here.

A correspondence being "correct" or "erroneous" refers to the result of a human examining and judging that correspondence. A matching result computed by a cost minimization method is always correct in the sense that it minimizes the cost, but the person using the result judges correctness by a separate criterion of their own and therefore cannot necessarily accept it. Moreover, when an audio signal is associated with text, the speech recognition result may itself contain errors. In such a case even a cost-minimizing matching is considered to contain errors; whether it is correct cannot be judged by evaluating the cost and must be left to human judgment.

Performing matching by a mechanical procedure such as a cost minimization method is called "automatic matching". The term implies that the procedure can be executed by an automatic machine, but it does not mean that it must be executed by a machine.

Non-Patent Document 1: Final Report of the FY2000 Research and Development Project on Broadcast Software Production Technology for the Visually and Hearing Impaired, Telecommunications Advancement Organization of Japan
Non-Patent Document 2: Masatake Tanimura, Hiroshi Nakagawa, "Time Synchronization Method of Drama Video Audio Track and Scenario Dialogue", Information Processing Society of Japan, 118th Research Report on Intelligence and Complex Systems, SIG-ICS, no. 188-4, pp. 25-31
Non-Patent Document 3: Richard O. Duda, Peter E. Hart and David G. Stork, translation supervised by Morio Onoe, Pattern Classification, New Technology Communications

However, obtaining a matching between sequences of different media by the above approach, that is, as a cost function minimization problem, does not work well.

Consider, as an example, associating the text of a script with the sequence of speech recognition results for the spoken audio. In such a case the utterances are not necessarily made exactly as written. In the course of program production, lines that differ from the script may be inserted or deleted, and the order of lines may be changed. Furthermore, at the current state of the art it is difficult to eliminate speech recognition errors, so in practice one must assume that the speech recognition results contain errors. The script text and the speech recognition results therefore do not necessarily agree in the first place.

Even in such a case, the best matching result in the sense of minimizing the cost function is obtained, but it is not necessarily acceptable to the user. If the user's criteria for the desired matching result were fixed and could be formulated, reflecting them in the cost function would bring the matching result closer to what the user wants. However, it can be difficult to formulate the user's wishes. The user's wishes may also differ each time matching is performed, and the work of formulating them every time may be unacceptable. In practice, therefore, a step is needed in which each user corrects the matching result as desired. In this correction step, the user compares the program with the script and corrects the matching result while judging whether it is right or wrong.

In addition, the correction sometimes has to be made globally, which creates a further problem. For example, if the matching result is shifted by one label from the one the user wants, every shifted label must be corrected, and the amount of work becomes large.

Moreover, where and how many labels need correction cannot be known without visual inspection. If the correction is done naively, the procedure is to inspect the matching result from beginning to end, judge every correspondence as correct or erroneous, and reassign every erroneous correspondence to its correct counterpart.

On the other hand, if matching is done by hand from the start without automatic matching, the procedure is to inspect label sequence S_A and label sequence S_B from beginning to end and assign every label to its correct counterpart.

How much difference is there, then, between the time T_1 required to correct the errors after running automatic matching and the time T_2 required to perform the matching by hand without automatic matching? T_1 can be formulated as Equation 2.

T_1 = t_{auto} + N t_{check} + n_{miss} t_{bind}   (Equation 2)

Here t_{auto} is the time required for automatic matching, N is the number of correspondences, t_{check} is the time required to judge one correspondence as correct or erroneous, n_{miss} is the number of erroneous correspondences, and t_{bind} is the time required to find the correct counterpart for one correspondence and assign it. The latter time T_2 can be formulated as Equation 3.

T_2 = N t_{check} + N t_{bind}   (Equation 3)

Let us roughly estimate the difference between T_1 and T_2. Since automatic matching is executed by a machine, t_{auto} is regarded as very small and approximated by 0. Visually checking correctness is also simpler and quicker than assigning the correct counterpart, so we assume t_{check} << t_{bind} and approximate t_{check} by 0. The ratio T_1 / T_2 is then at most on the order of n_{miss} / N. However, as noted above, even a small number of erroneous correspondences can require many correction steps. Taking this into account, the manual workload of correcting an automatic matching by a naive procedure cannot be said to be much smaller than that of matching entirely by hand. At the same time, a method of manually correcting matching results that takes less time, or fewer steps, would be all the more useful.
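The estimate can be checked with a small numerical sketch. The function names and all numbers below are hypothetical values chosen for illustration (100 correspondences, 20 of them wrong, 30 seconds to rebind one correspondence); they do not come from the specification.

```python
def correction_time(n_pairs, n_miss, t_auto, t_check, t_bind):
    """T_1: automatic matching, then check every pair and rebind the wrong ones."""
    return t_auto + n_pairs * t_check + n_miss * t_bind


def manual_time(n_pairs, t_check, t_bind):
    """T_2: inspect and bind every pair by hand."""
    return n_pairs * t_check + n_pairs * t_bind


# With t_auto and t_check approximated by 0, the ratio reduces to n_miss / N.
t1 = correction_time(n_pairs=100, n_miss=20, t_auto=0.0, t_check=0.0, t_bind=30.0)
t2 = manual_time(n_pairs=100, t_check=0.0, t_bind=30.0)
print(t1, t2, t1 / t2)  # 600.0 3000.0 0.2
```

If a single shifted label forces, say, 80 of the 100 correspondences to be rebound, n_miss / N rises to 0.8 and most of the advantage of automatic matching is lost, which is the situation the text describes.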

As the above discussion shows, even if automatic matching, which a machine can execute at high speed, is used, its speed advantage cannot be fully enjoyed in the end if matching errors are unavoidable.

The present invention has been made in view of the above points. Its object is to provide a label sequence matching error correction method, apparatus, and program, and a computer-readable storage medium storing a label sequence matching error correction program, that reduce the number of steps required in the work of correcting a matching.

FIG. 1 is a diagram for explaining the principle of the present invention.

The present invention (claim 1) is a label sequence matching error correction method that uses an information processing apparatus to correct a matching result w of label sequences, in which each element of label sequence S_A = {a_1, a_2, a_3, …, a_n} is associated with one of the elements of label sequence S_B = {b_1, b_2, b_3, …, b_m}, the method performing:
a matching process (step 1) of obtaining a matching result w and a minimum cost C by a cost minimization method that uses a cost c(a, b), which takes a smaller value the more similar label a and label b are, and minimizes the sum C of the costs c(a, b) as seen from label sequence S_A, and storing them in storage means;
a correction candidate selection process (step 2) of selecting one or more candidates a_i such that, when a large cost c_i is given to the correspondence between label a_i and its corresponding label b_{w(i)} in the matching result w, and a matching result w_i and a minimum cost C_i are obtained by a cost minimization method that minimizes the sum C_i of the costs c(a, b) reflecting the cost c_i, the minimum cost C_i is small and the distance d between the matching results w and w_i is large;
outputting the matching result w to display means with the candidates a_i highlighted, and
acquiring from an operator a correctness judgment indicating whether the correspondence between candidate a_i and b_{w(i)} is correct or erroneous (step 3);
a cost assignment process (step 4) of giving a large cost c to the correspondence between labels a_i and b_{w(i)} when the correspondence of candidate a_i is erroneous, giving a small cost c to the correspondence between labels a_i and b_{w(i)} when the correspondence of candidate a_i is correct, and storing the cost c in the storage means; and
a repetition process (step 5) of repeating the processing from the matching process (step 1) onward on the basis of the cost c changed in the cost assignment process.

Further, in the present invention (claim 2), in the cost assignment process,
when the operator has judged the correspondence between labels a_i and b_{w(i)}, the correspondence is erroneous, and label b_j has been selected as the correct counterpart of label a_i,
a large cost c is given to the correspondence between labels a_i and b_{w(i)}, a small cost c is given to the correspondence between labels a_i and b_j, and the two costs c are stored in the storage means.

FIG. 2 is a diagram showing the principle configuration of the present invention.

The present invention (claim 3) is a label sequence matching error correction apparatus that corrects a matching result w of label sequences, in which each element of label sequence S_A = {a_1, a_2, a_3, …, a_n} is associated with one of the elements of label sequence S_B = {b_1, b_2, b_3, …, b_m}, the apparatus comprising:
matching processing means 111 for obtaining a matching result w and a minimum cost C by a cost minimization method that uses a cost c(a, b), which takes a smaller value the more similar label a and label b are, and minimizes the sum C of the costs c(a, b) as seen from label sequence S_A, and storing them in storage means;
correction candidate selection means 112 for selecting one or more candidates a_i such that, when a large cost c_i is given to the correspondence between label a_i and its corresponding label b_{w(i)} in the matching result w, and a matching result w_i and a minimum cost C_i are obtained by a cost minimization method that minimizes the sum C_i of the costs c(a, b) reflecting the cost c_i, the minimum cost C_i is small and the distance d between the matching results w and w_i is large;
interface means 113 for outputting the matching result w to display means with the candidates a_i highlighted, and acquiring from an operator a correctness judgment indicating whether the correspondence between candidate a_i and b_{w(i)} is correct or erroneous;
cost assignment means 114 for giving a large cost c to the correspondence between labels a_i and b_{w(i)} when the correspondence of candidate a_i is erroneous, giving a small cost c to the correspondence between labels a_i and b_{w(i)} when the correspondence of candidate a_i is correct, and storing the cost c in the storage means; and
repetition means 117 for repeating the processing of the matching processing means 111, the correction candidate selection means 112, the interface means 113, and the cost assignment means 115 on the basis of the cost c changed by the cost assignment means 114.

Further, in the present invention (claim 4), in the cost assignment means 115,
when the operator has judged the correspondence between labels a_i and b_{w(i)}, the correspondence is erroneous, and label b_j has been selected as the correct counterpart of label a_i,
a large cost c is given to the correspondence between labels a_i and b_{w(i)}, a small cost c is given to the correspondence between labels a_i and b_j, and the two costs c are stored in the storage means.

The present invention (claim 5) is a label sequence matching error correction program for causing a computer to function as each of the means constituting the label sequence matching error correction apparatus according to claim 3 or 4.

The present invention (claim 6) is a computer-readable storage medium storing the label sequence matching error correction program according to claim 5.

According to the present invention, when a matching result contains errors, the errors can be corrected in descending order of the impact of their correction. An error-free matching, or a matching result with few errors, can therefore be obtained in few steps without correcting the errors exhaustively.

Embodiments of the present invention are described below with reference to the drawings.

The present invention is based on the idea that, when correcting errors in a matching result, an error-free state can be reached quickly by correcting the errors in descending order of the impact of their correction rather than correcting all of them exhaustively.

FIG. 3 is a flowchart outlining the operation of one embodiment of the present invention.

First, automatic matching is performed and the result is temporarily stored in the storage means (step 101).

Next, the result of the automatic matching is analyzed to search for correspondences whose correction would greatly change the automatic matching result and would also greatly reduce the matching cost (step 102).

A person visually judges whether each correspondence found is correct (step 103). If it is erroneous, the correct counterpart is found and a high degree of agreement (that is, a low cost) is given to the correct correspondence; alternatively, a high cost (that is, a low degree of agreement) is given to the erroneous correspondence (step 104).

These steps are repeated until it is judged that no errors remain, or that the errors have decreased and the result is acceptably close to the correct answer (step 105).

With this procedure, the operator does not have to search visually for where the errors to be corrected are, so the human workload is reduced. In addition, because matching is redone after an error with a large impact has been corrected, other errors can be expected to be corrected as well through the ripple effect, reducing the number of correction steps and the human workload.

If the user judges that the errors have decreased to within an acceptable range, the correction may be stopped partway even though the fully correct answer has not been reached.

In step 104 above, the magnitude of the change in matching can be evaluated by defining a distance d(w_1, w_2) between matching results w_1 and w_2, for example as in Equation 4, and evaluating the magnitude of d. δ_{ij} is the Kronecker delta, which takes the value 1 when i = j and 0 otherwise.
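Equation 4 itself does not appear in this text. One distance that is consistent with the use of the Kronecker delta, offered here purely as an assumption and not as the definition in the specification, counts the labels of S_A whose counterparts differ between the two matchings: d(w_1, w_2) = Σ_i (1 − δ_{w_1(i), w_2(i)}). A minimal sketch, with the hypothetical name `matching_distance`:

```python
def matching_distance(w1, w2):
    """Count positions i where w1[i] != w2[i] (assumed form of the distance d)."""
    if len(w1) != len(w2):
        raise ValueError("both matchings must cover the same label sequence S_A")
    return sum(1 for x, y in zip(w1, w2) if x != y)


print(matching_distance([0, 2, 3], [0, 2, 3]))  # 0: identical matchings
print(matching_distance([0, 2, 3], [0, 1, 2]))  # 2: two labels changed counterpart
```

Under this choice, a correction that shifts many downstream correspondences yields a large d, which fits the intent of preferring corrections with a wide effect.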

This specification describes a matching method that obtains a matching result w: [1, …, n] → [1, …, m] between two label sequences S_A = {a_1, a_2, a_3, …, a_n} and S_B = {b_1, b_2, b_3, …, b_m}, in which:
(1) a matching result w and a minimum cost C are obtained by a cost minimization method;
(2) for i ∈ [1, …, n], one or more i are selected such that, when the matching w_i of label sequences S_A and S_B and the minimum cost C_i are obtained by the cost minimization method on the assumption that a high cost has been given to the correspondence between labels a_i and b_{w(i)}, C_i is small and the matching result distance d(w, w_i) is large;
(3) for each selected i, the correctness of the correspondence between labels a_i and b_{w(i)} is judged visually;
(4) if the correspondence between labels a_i and b_{w(i)} is erroneous, a high cost is given to that correspondence; and
(5) the matching result w and the minimum cost C are obtained again by the cost minimization method, taking into account the cost changed in (4), and steps (2) to (5) are repeated.
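Steps (1) to (5) can be sketched end to end as follows. This is a minimal illustration under several assumptions that are not in the specification: costs are held in an n × m table, the matcher is an order-preserving DP, the distance is the number of changed correspondences, candidates are ranked by large distance first and small C_i second, a confirmed correspondence is given cost 0, and the operator is simulated by a callback `is_correct`. All names are hypothetical.

```python
BIG = 1e6  # assumed stand-in for a "high cost"


def dp_match(cost):
    """Order-preserving DP matching; returns (w, total cost), 0-based."""
    n, m = len(cost), len(cost[0])
    best = [list(cost[0])] + [[0.0] * m for _ in range(n - 1)]
    back = [[0] * m for _ in range(n)]
    for i in range(1, n):
        arg = 0
        for j in range(m):
            if best[i - 1][j] < best[i - 1][arg]:
                arg = j
            best[i][j] = cost[i][j] + best[i - 1][arg]
            back[i][j] = arg
    j = min(range(m), key=lambda k: best[n - 1][k])
    total, w = best[n - 1][j], [0] * n
    for i in range(n - 1, -1, -1):
        w[i], j = j, back[i][j]
    return w, total


def rank_candidates(cost):
    """Step (2): penalize each pair (a_i, b_w(i)) in turn and score the effect."""
    w, _ = dp_match(cost)                       # step (1)
    scored = []
    for i in range(len(cost)):
        trial = [row[:] for row in cost]
        trial[i][w[i]] = BIG
        w_i, c_i = dp_match(trial)
        dist = sum(1 for x, y in zip(w, w_i) if x != y)
        scored.append((dist, -c_i, i))          # large distance, then small C_i
    return w, [i for _, _, i in sorted(scored, reverse=True)]


def correct_matching(cost, is_correct, max_rounds=100):
    """Steps (3) to (5): ask about the top candidate, update costs, rematch."""
    cost = [row[:] for row in cost]
    confirmed = set()
    for _ in range(max_rounds):
        w, order = rank_candidates(cost)
        pending = [i for i in order if i not in confirmed]
        if not pending:
            break
        i = pending[0]
        if is_correct(i, w[i]):                 # step (3): operator judgment
            cost[i][w[i]] = 0.0
            confirmed.add(i)
        else:                                   # step (4): penalize the error
            cost[i][w[i]] = BIG
    return dp_match(cost)[0]


truth = [0, 1, 2]
cost = [[0, 1, 1, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 0]]
print(dp_match(cost)[0])                                        # [0, 2, 3]
print(correct_matching(cost, lambda i, j: truth[i] == j))       # [0, 1, 2]
```

In the example the automatic matching is shifted by one label from the second element onward, and the simulated operator steers it back to the intended result. How C_i and d should be weighed against each other is a design choice that the text leaves open; the lexicographic ranking used here is only one possibility.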

The present invention also describes a variant of the above matching method in which, in step 103, the correctness of the correspondence between labels a_i and b_{w(i)} is judged and, if the correspondence is erroneous, the correct counterpart b_j of a_i is selected, and in step 104 a high cost is given to the correspondence between labels a_i and b_{w(i)} while a low cost is given to the correspondence between labels a_i and b_j.
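The cost update of this variant can be sketched as a small function. The names `apply_feedback`, `HIGH_COST`, and `LOW_COST`, and the concrete cost values, are assumptions for illustration; the specification says only that the costs are "high" and "low".

```python
HIGH_COST = 1e6  # assumed value for a "high cost"
LOW_COST = 0.0   # assumed value for a "low cost"


def apply_feedback(cost, i, matched_j, correct, correct_j=None):
    """Return a copy of the cost table updated with the operator's judgment.

    cost[i][j] is c(a_i, b_j); matched_j is w(i); correct_j is the counterpart
    b_j chosen by the operator when the current correspondence is wrong.
    """
    updated = [row[:] for row in cost]
    if correct:
        updated[i][matched_j] = LOW_COST
    else:
        updated[i][matched_j] = HIGH_COST
        if correct_j is not None:
            updated[i][correct_j] = LOW_COST
    return updated


cost = [[0.2, 0.9],
        [0.1, 0.8]]
print(apply_feedback(cost, 1, 0, False, correct_j=1)[1])  # [1000000.0, 0.0]
```

Supplying the correct counterpart in the same interaction lets one operator action both rule out the wrong pair and pin the right one before matching is rerun.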

The following description takes as an example the situation in which the utterance sections of a video program are associated with the lines of dialogue in a script.

FIG. 4 shows the system configuration of one embodiment of the present invention.

The system shown in the figure comprises an information processing apparatus 101, a display device 102, and an input device 103.

The operator corrects the matching result while interacting with the system. As the interface for this interaction, the display device 102 is used to present information to the operator, and the input device 103 is used to enter the operator's operations.

FIG. 5 shows an example of the screen displayed on the display device in one embodiment of the present invention. The screen shown in the figure is a so-called graphical user interface on which buttons and the like are virtually arranged.

Area 201 in FIG. 5 represents the display area of the screen. Area 202 is the area in which the utterance sections of the program are displayed. Utterance sections 206 to 209 each represent one utterance section and display a representative image of the program video during the utterance. The representative image is created, for example, by extracting the video frame at the utterance start time as a still image. In this embodiment the utterance section is described as being represented by a representative image, but it may be represented in other ways; for example, the start time and end time of the utterance section may be displayed. Utterance sections 206 to 209 are buttons, and operating one of them plays back the video (including audio) of that utterance section. Scroll bar 203 is used to bring a desired utterance section of the program into view. Area 204 is the area in which the lines of the script are displayed.

Lines 210 to 215 each represent one delimited line of dialogue, and the text of the line is displayed. Scroll bar 205 is used to bring a desired line of the program into view. Arrow 216 represents the correspondence between an utterance section and a line. For example, when utterance section 206 and line 210 are connected by an arrow, utterance section 206 and line 210 correspond to each other. Area 217 is the area in which messages that the system presents to the operator are output. Buttons 218 and 219 are used by the operator to respond when the system asks whether a certain correspondence is correct. Button 218 is used to answer that the correspondence is correct, and button 219 to answer that it is erroneous. Button 220 ends the matching result correction process.

マッチングの例の説明の前に、入力データの準備について説明する。 Prior to the description of the matching example, preparation of input data will be described.

図６は、本発明の一実施の形態における番組と台本の対応付けの作業の全体の流れを表した図である。 FIG. 6 is a diagram showing an overall flow of work for associating a program and a script in one embodiment of the present invention.

同図において、映像ファイル３０１は、対応付けたい番組をディジタル化し、ＭＰＥＧ形式などに変換したファイルである。台本３０３は、対応付けたい台本をディジタル化し、テキストファイルとしたものである。発話区間抽出処理３０６は、映像ファイル３０１を入力とし、発話区間を抽出し、発話区間の列３０２を出力する。ラベルａ_ｉ（１≦ｉ≦ｎ）は、ｉ番目の発話区間を表すラベルとする。ｎは番組中に抽出された発話区間の総数である。発話区間を抽出する方法、例えば、「南憲一、阿久津明人、浜田洋、外村佳伸“音情報を用いた映像インデクシングとその応用”信学論Ｄ−ＩＩ，ｖｏｌ．Ｊ８１−Ｄ−ＩＩ，ｎｏ．３，ｐｐ．５２９−５３７，１９９８」に示す方法を利用することができる。セリフ抽出処理３０７は、台本３０３を入力とし、セリフの列３０４を出力する。セリフとは、つまり、台本上に表された１発話区間ということであるが、例えば、発話文を表すカギ括弧（「」）で囲まれた１発話と解釈する。セリフの区切り方は、これによらず、他の基準によって区切ってもよい。ラベルｂ_ｉ（１≦ｉ≦ｍ）は、ｉ番目のセリフを表すラベルとする。ｍは台本中のセリフの総数である。マッチング処理３０８は、映像ファイル３０１と、音声区間列３０２と、セリフ列３０４とを入力とし、操作者と対話をしつつ、発話区間列３０２とセリフ列３０４のマッチング結果３０５を求め、出力する。マッチング結果を記号ｗで表す。マッチングを行うのに必須なのは発話区間列３０２とセリフ列３０４だけであるが、操作者が対応の正誤を判定する際、映像を参照できると都合がよいため、映像ファイル３０１も入力するように構成している。 In the figure, the video file 301 is a file obtained by digitizing the program to be matched and converting it into a format such as MPEG. The script 303 is the script to be matched, digitized as a text file. The utterance-section extraction process 306 takes the video file 301 as input, extracts the utterance sections, and outputs the utterance-section sequence 302. The label a_i (1 ≦ i ≦ n) denotes the i-th utterance section, where n is the total number of utterance sections extracted from the program. Utterance sections can be extracted, for example, by the method described in Kenichi Minami, Akito Akutsu, Hiroshi Hamada, and Yoshinobu Tonomura, "Video indexing using sound information and its application," IEICE Transactions D-II, vol. J81-D-II, no. 3, pp. 529-537, 1998. The line extraction process 307 takes the script 303 as input and outputs the line sequence 304. A line here means one utterance section as written in the script; for example, it is interpreted as one utterance enclosed in the corner brackets (「」) that mark spoken text. Lines may also be delimited by other criteria. The label b_i (1 ≦ i ≦ m) denotes the i-th line, where m is the total number of lines in the script. The matching process 308 takes the video file 301, the utterance-section sequence 302, and the line sequence 304 as input and, while interacting with the operator, obtains and outputs the matching result 305 between the utterance-section sequence 302 and the line sequence 304. The matching result is denoted by the symbol w. Only the utterance-section sequence 302 and the line sequence 304 are essential for matching, but the video file 301 is also input because it is convenient for the operator to be able to refer to the video when judging whether a correspondence is correct.

マッチングの実施に先立ち、コストの評価方法と、自動マッチングの方法を定めておく必要がある。 Prior to performing matching, it is necessary to determine a cost evaluation method and an automatic matching method.

ラベルとラベルのコストｃ（ａ_ｉ，ｂ_ｊ）の構成方法は種々のものが可能であるが、例えば、次のように構成する。発話区間ａ_ｉの発話の継続時間をｔ_ｉ、セリフｂ_ｊのモーラ数をｒ_ｊとする。文献「谷村正剛、中川裕志、“ドラマのビデオ音声トラックとシナリオのセリフの時刻同期法”、情報処理学会第118回知能と複雑系研究会研究報告、SIG-ICS，no．188-4，pp．25-31」によれば、発話区間の継続時間はセリフのモーラ数に近似的に比例することが知られている。比例定数（「継続時間」／「モーラ数」）をαとする。そして、数式５のようにラベル間のコストを定義する。 The label-to-label cost c(a_i, b_j) can be constructed in various ways; one example follows. Let t_i be the duration of the utterance in utterance section a_i, and let r_j be the number of morae in line b_j. According to Masatake Tanimura and Hiroshi Nakagawa, "Time Synchronization Method for Drama Video and Audio Tracks and Scenarios," Information Processing Society of Japan, 118th Research Report on Intelligence and Complex Systems, SIG-ICS, no. 188-4, pp. 25-31, the duration of an utterance section is approximately proportional to the number of morae in the line. Let α be the proportionality constant (duration / number of morae). The cost between labels is then defined as in Equation 5.

ラベル間のコストの定義は、この方法に限らず、例えば、音声認識技術を利用して発話の音声信号とセリフとの類似度を用いてもよいし他の方法を用いてもよい。

The definition of the cost between labels is not limited to this method. For example, speech recognition may be used to compute a similarity between the speech signal of an utterance and a line, or some other method may be used.
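As a rough illustration of the duration and mora relationship described above, the following Python sketch computes a label cost. The exact form of Equation 5 is not reproduced in this text, so the absolute deviation |t_i - α·r_j| used here is an assumption, as are the function name `label_cost` and the sample value of α.

```python
def label_cost(duration_sec: float, mora_count: int, alpha: float) -> float:
    """Assumed cost: deviation of the observed duration from alpha * mora count.

    This is an illustrative stand-in, not the patent's Equation 5.
    """
    return abs(duration_sec - alpha * mora_count)


# Hypothetical value: 0.15 seconds per mora.
ALPHA = 0.15

# A 1.5 s utterance against a 10-mora line (duration matches the prediction)
# and against a 20-mora line (the line is twice as long as predicted).
cost_good = label_cost(1.5, 10, ALPHA)
cost_bad = label_cost(1.5, 20, ALPHA)
```

Under this assumed form, a pair whose duration is close to α times the mora count receives a cost near zero, and the cost grows as the two drift apart.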

本発明は、コストの値を動的に変更させるという特徴を有するので、コスト関数は修正可能である必要がある。これを実現するために、ヒントリストを用いる。ヒントリストは、組（発話区間インデックス、セリフインデックス、コスト）の組のリストである。最初は、ヒントリストは空であるとする。ラベルａ_ｉとｂ_ｊのコストを評価するとき、ヒントリストを探索し、もし、ヒントリスト内に、発話区間インデックスがｉであり、セリフインデックスがｊであるような組が存在したら、この組のコストの値をａ_ｉとｂ_ｊのコストとする。もし、ヒントリスト内に存在しない場合は、数式５によってコストを算出する。このようなコスト値を上書きできるようにしたコスト関数を、 Because the present invention is characterized by changing cost values dynamically, the cost function must be modifiable. A hint list is used to achieve this. The hint list is a list of tuples (utterance-section index, line index, cost), and it is initially empty. When the cost of labels a_i and b_j is evaluated, the hint list is searched; if it contains a tuple whose utterance-section index is i and whose line index is j, the cost value of that tuple is used as the cost of a_i and b_j. If no such tuple exists, the cost is calculated by Equation 5.

と表す。

A cost function whose values can be overwritten in this way is written as Equation 6.
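The hint-list lookup described above can be sketched in Python as follows. The text describes a list of (utterance-section index, line index, cost) tuples; this sketch stores the same information in a dictionary keyed by the index pair, which is a design choice of the example rather than something stated in the source. The names `make_cost_with_hints` and `HintList` and the toy base cost are hypothetical.

```python
from typing import Callable, Dict, Tuple

# (utterance-section index, line index) -> overriding cost
HintList = Dict[Tuple[int, int], float]


def make_cost_with_hints(base_cost: Callable[[int, int], float],
                         hints: HintList) -> Callable[[int, int], float]:
    """Return a cost function that prefers an entry in `hints` over base_cost."""
    def cost(i: int, j: int) -> float:
        if (i, j) in hints:        # an operator judgment overrides the formula
            return hints[(i, j)]
        return base_cost(i, j)     # otherwise fall back to the base cost
    return cost


hints: HintList = {}                                          # initially empty
cost = make_cost_with_hints(lambda i, j: abs(i - j), hints)   # toy base cost
before = cost(1, 3)        # no hint yet, so the base cost is used
hints[(1, 3)] = 0.0        # a hint for the pair (1, 3) is recorded
after = cost(1, 3)         # the hint now overrides the base cost
```

Because the returned function reads `hints` at call time, later insertions and deletions take effect without rebuilding the cost function.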

自動マッチングの方法は、一般の非線形最小化問題の解法を用いる。例えば、焼きなまし法（アニーリング法）を用いることができる（文献：Richard O. Duda, Peter E. Hart and David G. Stork, 尾上守夫監訳、パターン識別、新技術コミュニケーションズ）。 Automatic matching uses a general solver for nonlinear minimization problems. For example, simulated annealing can be used (reference: Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, Japanese translation supervised by Morio Onoe, New Technology Communications).
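To make the idea of annealing-based matching concrete, here is a minimal Python sketch of simulated annealing over assignments w. It assumes the objective is simply the sum of c(a_i, b_w(i)) over all utterance sections with no further constraints; the move set, cooling schedule, parameter defaults, and the name `anneal_matching` are all assumptions of this example, not details taken from the source.

```python
import math
import random


def anneal_matching(n, m, cost, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Minimise sum_i cost(i, w[i]) over assignments w by simulated annealing.

    Returns the best assignment seen and its total cost.
    """
    rng = random.Random(seed)
    w = [0] * n                                    # initial assignment
    current = sum(cost(i, w[i]) for i in range(n))
    best_w, best = list(w), current
    for step in range(steps):
        # geometric cooling schedule from t_start down to t_end
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n)                       # pick one utterance section
        j = rng.randrange(m)                       # propose a new line for it
        delta = cost(i, j) - cost(i, w[i])
        # Metropolis rule: always accept improvements, sometimes accept worse
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            w[i] = j
            current += delta
            if current < best:
                best_w, best = list(w), current
    return best_w, best


# Toy cost: utterance section i is best matched to line i.
best_w, best = anneal_matching(4, 4, lambda i, j: abs(i - j))
```

Keeping the best assignment seen so far means the returned cost never exceeds that of the starting assignment, whatever random moves are accepted along the way.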

次に、本発明のラベルの系列マッチング結果の誤り修正方法の説明を行う前に、図4における情報処理装置１０１の詳細構成について説明する。 Next, the detailed configuration of the information processing apparatus 101 in FIG. 4 will be described before explaining the error correction method for label sequence matching results of the present invention.

図７は、本発明の一実施の形態におけるマッチング誤り修正装置の構成を示す。 FIG. 7 shows the configuration of a matching error correction apparatus according to an embodiment of the present invention.

同図に示す装置は、図４に示す情報処理装置１０１に対応する。 The apparatus shown in the figure corresponds to the information processing apparatus 101 shown in FIG.

同図に示す装置１０１は、マッチング処理部１１１、修正候補選択部１１２、インタフェース部１１３、入力判定部１１４、コスト付与部１１５、ヒントリスト１１６、及び終了判定部１１７から構成される。 The apparatus 101 shown in the figure includes a matching processing unit 111, a correction candidate selection unit 112, an interface unit 113, an input determination unit 114, a cost assignment unit 115, a hint list 116, and an end determination unit 117.

マッチング処理部１１１は、図６におけるマッチング処理３０８を行い、マッチング結果をメモリ等の記憶手段に格納する。 The matching processing unit 111 performs the matching processing 308 in FIG. 6 and stores the matching result in a storage unit such as a memory.

修正候補選択部１１２は、操作者に提示するための修正候補を選択する。修正候補とは、修正を行うことでマッチングの結果がより大きく変化し、かつ、マッチングのコストがより小さくなると期待される対応の候補である。 The correction candidate selection unit 112 selects a correction candidate to be presented to the operator. The correction candidate is a candidate for correspondence that is expected to change the matching result more greatly by performing the correction, and to reduce the matching cost.

インタフェース部１１３は、表示装置１０２及び入力装置１０３との情報のやり取りを行う。 The interface unit 113 exchanges information with the display device 102 and the input device 103.

入力判定部１１４は、操作者からどのボタンが押下されたか等を判定する。 The input determination unit 114 determines which button has been pressed by the operator.

コスト付与部１１５は、入力判定部１１４において、操作者からの判断（対応が誤り、対応が正しい）に基づいて、コスト関数にコストを付与する。 The cost assigning unit 115 assigns a cost to the cost function based on the operator's judgment (correspondence incorrect or correspondence correct) obtained by the input determination unit 114.

ヒントリスト１１６は、メモリ等の記憶手段であり、発話区間インデックス、セリフインデックス、コストの組からなるリストである。 The hint list 116 is held in storage means such as a memory, and is a list of tuples each consisting of an utterance-section index, a line index, and a cost.

終了判定部１１７は、全ての修正候補に対する処理が終了したか否か、または、操作者が終了ボタンを押下したかを判定する。 The end determination unit 117 determines whether or not the processing for all the correction candidates has ended, or whether or not the operator has pressed the end button.

図８は、本発明の一実施の形態におけるマッチング誤り修正処理のフローチャートである。 FIG. 8 is a flowchart of matching error correction processing according to the embodiment of the present invention.

ステップ４０１）まず、マッチング処理部１１１における自動マッチングにより、発話区間とセリフのマッチング結果ｗを求める。ｉ番目の発話区間とｊ番目のセリフの対応のコストを Step 401) First, a matching result w between the speech section and the speech is obtained by automatic matching in the matching processing unit 111. The cost of correspondence between the i-th speech segment and the j-th line

とし、マッチングのコストＣを最小化するようなマッチング結果ｗを求め、このときの最小コストをＣ_ｍｉｎとする。最小化には焼きなまし法などを用いることができる。

Taking this as the cost, a matching result w that minimizes the matching cost C is obtained, and the minimum cost at this point is denoted C_min. Simulated annealing or a similar method can be used for the minimization.

そして、得られたマッチング結果の状態をインタフェース部１１３を介して表示装置１０２に表示する。表示の方法は、領域２０２中に、発話区間ごとに代表画像を表示し、領域２０４の中にセリフ毎にセリフの文面を表示する。対応している発話区間ａ_ｉとセリフｂ_ｗ（ｉ）の間には、領域２１６に矢印を表示する。なお、発話区間ａ_ｉとセリフｂ_ｗ（ｉ）が領域２０２、２０４内にない時は、矢印を表示せずともよい。代表画像は、映像ファイル３０１から、発話区間ａ_ｉの発話開始時刻の画像を静止画として抽出し、これを代表画像とする。

The state of the obtained matching result is displayed on the display device 102 via the interface unit 113. In the display method, a representative image is displayed for each utterance section in the area 202, and a sentence text is displayed for each line in the area 204. An arrow is displayed in the area 216 between the corresponding utterance section a _i and the dialogue b _{w (i)} . When the speech section a _i and the speech b _{w (i)} are not in the areas 202 and 204, the arrow may not be displayed. As the representative image, an image at the utterance start time of the utterance section a _i is extracted from the video file 301 as a still image, and this is used as the representative image.

ステップ４０２）修正候補選択部１１２において、Ｐ個の修正候補の対応を選択する。修正の候補となる対応は、その対応を修正することにより、マッチングの結果がより大きく変化し、なおかつ、マッチングのコストがより小さくなると期待される対応である。その選択の方法の詳細は後述する。 Step 402) In the correction candidate selection unit 112, the correspondence of P correction candidates is selected. The correspondence that is a candidate for the correction is a correspondence that is expected to change the matching result more greatly and reduce the matching cost by correcting the correspondence. Details of the selection method will be described later.

当該処理の結果、Ｐ個の発話区間のインデックス値が配列ＸのＸ［１］〜Ｘ［Ｐ］に格納されるものとする。Ｐは定数である必要はなく、当該処理を実行する都度に異なる値でもよい。また、Ｐは１でもよい。 As a result of the processing, it is assumed that the index values of the P speech sections are stored in X [1] to X [P] of the array X. P need not be a constant, and may be different each time the process is executed. P may be 1.

ステップ４０３）変数ｐに１を代入する。 Step 403) Assign 1 to the variable p.

ステップ４０４）ｐ番目の発話区間の修正候補のインデックスＸ［ｐ］を、変数ｉに代入する。 Step 404) The index X [p] of the correction candidate of the p-th utterance interval is substituted into the variable i.

ステップ４０５）発話区間ａ_ｉとセリフｂ_ｗ（ｉ）を結ぶ矢印を表示装置１０２に強調表示する。強調表示の方法は、色を変えてもよいし、点滅させてもよいし、他の方法でもよい。発話区間ａ_ｉとセリフｂ_ｗ（ｉ）が領域２０２、２０４内にないときは、スクロールさせ、ともに表示されるような状態にする。しかる後、メッセージ出力領域２１７に、対応の正誤の入力要求を表示する。入力要求の表示は、例えば、「この対応は正しいですか？」と表示する。 Step 405) The arrow connecting the speech segment a _i and the speech b _{w (i)} is highlighted on the display device 102. The highlighting method may be changed in color, blinked, or other methods. When the speech section a _i and the speech b _{w (i)} are not in the areas 202 and 204, they are scrolled so that they are displayed together. Thereafter, a corresponding correct / incorrect input request is displayed in the message output area 217. The input request is displayed, for example, “Is this correspondence correct?”.

ステップ４０６）情報処理装置１０１は、画面上のボタンのいずれかが押されるまで待機する。 Step 406) The information processing apparatus 101 stands by until any button on the screen is pressed.

情報処理装置１０１が待機している間、操作者はステップ４０５での入力要求に促され、強調表示されている矢印の指している発話区間ａ_ｉとセリフｂ_ｗ（ｉ）を吟味する。必要であれば、代表表示画像を押下し、当該発話区間の映像（音声も含む）を再生し、発話内容を聴き取り、セリフｂ_ｗ（ｉ）との同一性を判定する。必要であればスクロールバー２０３，２０５を操作して、当該発話区間の前後や、当該セリフの前後を確認してもよい。操作者がこの対応が正しいと判定した場合には、「正」ボタン２１８を押下し、誤りと判定した場合には「誤」ボタン２１９を押下する。「全体を確認」ボタン２２０を押下するケースについては後述する。 While the information processing apparatus 101 is waiting, the operator, prompted by the input request of step 405, examines the utterance section a_i and the line b_{w(i)} indicated by the highlighted arrow. If necessary, the operator presses the representative image, plays back the video (including audio) of that utterance section, listens to what is said, and judges whether it is the same as the line b_{w(i)}. If necessary, the operator may also use the scroll bars 203 and 205 to check what comes before and after the utterance section and the line. If the operator judges the correspondence to be correct, the operator presses the "correct" button 218; if the operator judges it to be incorrect, the operator presses the "incorrect" button 219. The case where the "confirm all" button 220 is pressed is described later.

操作者がいずれかのボタンを押下すると、処理はステップ４０７へ移行する。 When the operator presses any button, the process proceeds to step 407.

ステップ４０７）入力判定部１１４は、押下されたボタンが「正」ボタン２１８であるか判定し、そうであれば、ステップ４１１に移行し、そうでなければステップ４０８へ移行する。 Step 407) The input determination unit 114 determines whether the pressed button is the “correct” button 218. If so, the process proceeds to step 411. Otherwise, the process proceeds to step 408.

ステップ４０８）押下されたボタンが「誤」ボタン２１９であるか判定し、そうであれば、ステップ４１２に移行する。そうでなければステップ４０９に移行する。 Step 408) It is determined whether or not the pressed button is the “wrong” button 219. If so, the process proceeds to Step 412. Otherwise, the process proceeds to step 409.

ステップ４０９）押下されたボタンが「終了」ボタン２２０であるか判定し、そうであれば、ステップ４１０へ移行し、そうでなければステップ４０６に戻り、また別のボタンが押下されるまで待機する。 Step 409) It is determined whether the pressed button is the "End" button 220. If so, the process proceeds to step 410; otherwise it returns to step 406 and waits until another button is pressed.

ステップ４１０）現状のマッチング結果ｗを出力し、処理を終了する。 Step 410) The current matching result w is output and the process is terminated.

ステップ４１１）当該ステップに処理が移行した場合は、発話区間ａ_ｉとセリフｂ_ｗ（ｉ）の対応が正しいと操作者が判定したのであるから、コスト付与部１１５は、

に有利なコストを付与する。そのために、ヒントリスト１１６に、組（ｉ，ｗ（ｉ），ε）を挿入する。εは、定数であり、小さな数である。例えば、ε＝０とする。その後、ステップ４１７に移行する。

Step 411) When processing reaches this step, the operator has judged the correspondence between utterance section a_i and line b_{w(i)} to be correct, so the cost assigning unit 115 gives that correspondence a favorable cost. To do so, it inserts the tuple (i, w(i), ε) into the hint list 116. ε is a small constant, for example ε = 0. Processing then proceeds to step 417.

ステップ４１２）当該ステップに処理が移行した場合は、発話区間ａ_ｉとセリフｂ_ｗ（ｉ）の対応が誤りと操作者が判定したのであるから、

に不利なコストを付与する。そのため、ヒントリスト１１６に、組（ｉ，ｗ（ｉ），Ω）を挿入する。Ωは、定数であり、充分大きな数である。例えば、その後、ステップ４１３に移行する。

Step 412) When processing reaches this step, the operator has judged the correspondence between utterance section a_i and line b_{w(i)} to be incorrect, so that correspondence is given an unfavorable cost. To do so, the tuple (i, w(i), Ω) is inserted into the hint list 116. Ω is a sufficiently large constant. Processing then proceeds to step 413.

ステップ４１３）コスト付与部１１５において、発話区間ａ_ｉに対応する正しいセリフはどれか、操作者に選択を促すよう選択を要求する。例えば、メッセージ出力領域２１７に、「この発話区間に対応する正しいセリフを選択して下さい」と表示する。 Step 413) The cost assigning unit 115 asks the operator to select the line that correctly corresponds to the utterance section a_i. For example, the message output area 217 displays "Please select the correct line corresponding to this utterance section".

ステップ４１４）情報処理装置１０１は、セリフの一つが選択されるまで待機する。 Step 414) The information processing apparatus 101 waits until one of the lines is selected.

当該ステップにおいて待機している間、操作者は、ステップ４１３での入力要求に促され、強調表示されている矢印の指している発話区間ａ_ｉと、セリフの系列とを吟味する。必要であれば、代表画像を押下し、当該発話区間の映像(音声も含む)を再生し、発話内容を聴き取る。スクロールバー２０５を操作して、対応するセリフを探し、選択する。選択の操作は、例えば、表示領域２０４の対応するセリフの部分を押下することでなされるものとする。 While waiting in this step, the operator is prompted by the input request in step 413 and examines the speech segment a _i pointed to by the highlighted arrow and the line of speech. If necessary, the representative image is pressed, the video (including audio) of the utterance section is reproduced, and the utterance content is listened to. The scroll bar 205 is operated to search for and select a corresponding line. For example, the selection operation is performed by pressing a corresponding line portion of the display area 204.

操作者がいずれかのセリフを押下すると、処理は、ステップ４１５へ移行する。 When the operator presses any line, the process proceeds to step 415.

ステップ４１５）コスト付与部１１５は、セリフのインデックスｊを求める。操作者が選択したセリフをｂ_ｊと表すことにする。 Step 415) The cost assigning unit 115 obtains the serif index j. The line selected by the operator is represented as b _j .

ステップ４１６）発話区間ａ_ｉとセリフｂ_ｊの対応が正しいと操作者が判定したのであるから、コスト付与部１１５は、

に有利なコストを付与する。そのために、ヒントリスト１１６に、組(ｉ，ｊ，ε)を挿入する。その後、ステップ４１７へ移行する。

Step 416) Since the operator has judged the correspondence between utterance section a_i and line b_j to be correct, the cost assigning unit 115 gives that correspondence a favorable cost. To do so, it inserts the tuple (i, j, ε) into the hint list 116. Processing then proceeds to step 417.

ステップ４１７）変数ｐに値ｐ+１を代入する。 Step 417) The value p + 1 is substituted into the variable p.

ステップ４１８）ｐとＰを比較し、ｐ≦Ｐであれば、修正候補をまだ尽くしていないので、ステップ４０４へ移行する。そうでなければ、修正候補を尽くしたので、ステップ４１９に移行する。 Step 418) p is compared with P. If p ≦ P, the correction candidates have not yet been exhausted, so the process returns to step 404. Otherwise the correction candidates have been exhausted, so the process proceeds to step 419.

ステップ４１９）再度マッチング処理部１１１において、自動マッチングを実行する。コスト関数は、前回と同じく数式６を用いるが、ヒントリスト１１６の内容が前回と異なっているため、前回のマッチングとは異なるマッチング結果ｗが得られる。その後ステップ４０２へ戻り、再度修正候補を選択する。 Step 419) In the matching processing unit 111 again, automatic matching is executed. The cost function uses Equation 6 as in the previous case, but since the contents of the hint list 116 are different from the previous time, a matching result w different from the previous matching is obtained. Thereafter, the process returns to step 402 to select a correction candidate again.
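The way operator judgments feed into the hint list in steps 411, 412, and 413 to 416 can be sketched as below. This is a minimal illustration, not the patented implementation: the hint list is modeled as a dictionary, ε = 0 follows the example in the text, and the concrete value of Ω as well as the names `apply_feedback` and `corrected_j` are assumptions.

```python
EPSILON = 0.0     # favorable cost (the text gives epsilon = 0 as an example)
OMEGA = 1.0e9     # unfavorable cost; the concrete value is an assumption


def apply_feedback(hints, i, w_i, is_correct, corrected_j=None):
    """Update the hint list (here a dict) from one operator judgment.

    hints       : dict mapping (utterance index, line index) -> cost
    i, w_i      : the presented pair a_i and b_w(i)
    is_correct  : True if the operator pressed the "correct" button
    corrected_j : line index chosen by the operator when the pair was wrong
    """
    if is_correct:
        hints[(i, w_i)] = EPSILON               # step 411
    else:
        hints[(i, w_i)] = OMEGA                 # step 412
        if corrected_j is not None:
            hints[(i, corrected_j)] = EPSILON   # steps 413 to 416
    return hints


confirmed = apply_feedback({}, 2, 5, True)
rejected = apply_feedback({}, 2, 5, False, corrected_j=4)
```

After all P candidates have been judged, rerunning the automatic matching with the updated hints corresponds to step 419.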

次に、上記のステップ４０２の処理の詳細について説明する。 Next, details of the processing in step 402 will be described.

図９は、本発明の一実施の形態における修正候補選択処理のフローチャートである。 FIG. 9 is a flowchart of correction candidate selection processing according to an embodiment of the present invention.

ステップ５０１）リストＬを空にする。 Step 501) The list L is emptied.

ステップ５０２）変数ｊに1を代入する。 Step 502) Assign 1 to the variable j.

ステップ５０３）ヒントリスト１１６に、発話区間インデックスがｊであるような組が存在するか検査する。もし存在しているならば、発話区間ｊは、既に操作者が何らかの正誤を判定したことを意味しているので、修正候補に入れる必要はないから、以下の処理をスキップし、ステップ５０９へ移行する。存在していなければステップ５０４に移行する。 Step 503) The hint list 116 is checked for a tuple whose utterance-section index is j. If one exists, the operator has already made some judgment about utterance section j, so it need not be included among the correction candidates; the following processing is skipped and the process proceeds to step 509. If none exists, the process proceeds to step 504.

ステップ５０４）ヒントリスト１１６に組（ｊ，ｗ（ｊ），Ω）を挿入する。このようにして、発話区間ｊとセリフｗ（ｊ）の対応が誤りと見做される状況を模擬的に設定する。 Step 504) The tuple (j, w(j), Ω) is inserted into the hint list 116. This simulates a situation in which the correspondence between utterance section j and line w(j) is regarded as incorrect.

ステップ５０５）マッチング処理部１１１において、自動マッチングを実行する。自動マッチングの方法は、ステップ４０１及びステップ４１９と同様である。但し、マッチング結果をｗ_ｊ、最小コストをＣ_ｊとする。 Step 505) The matching processing unit 111 executes automatic matching. The automatic matching method is the same as in steps 401 and 419. Here, the matching result is w _j and the minimum cost is C _j .

ステップ５０６）修正候補適合度Ｈ_ｊを求める。Ｈ_ｊは、発話区間ｊが修正候補として適しているかを評価するための量である。本実施の形態では、
Ｈ_ｊ＝Ｃ_ｊ−ｋ・ｄ（ｗ，ｗ_ｊ）
により評価する。ここで、ｋは予め定めた定数とする。マッチングのコストが小さいほどＨ_ｊの値も小さくなり、マッチング結果ｗとｗ_ｊとの距離が大きくなるほどＨ_ｊの値は小さくなる。なお、Ｈ_ｊの評価方法はこれに限らず、他の方法を用いてもよい。 Step 506) The correction-candidate fitness H_j is computed. H_j is a quantity for evaluating whether utterance section j is suitable as a correction candidate. In this embodiment it is evaluated as
H_j = C_j − k · d(w, w_j)
where k is a predetermined constant. The smaller the matching cost C_j, the smaller the value of H_j, and the larger the distance between the matching results w and w_j, the smaller the value of H_j. The method of evaluating H_j is not limited to this, and other methods may be used.

ステップ５０７）リストＬにペア(ｊ、Ｈ_ｊ)を挿入する。 Step 507) Insert the pair (j, H _j ) into the list L.

ステップ５０８）ヒントリスト１１６から組(ｊ，ｗ（ｊ），Ω)を削除し、元の状態に戻す。 Step 508) The pair (j, w (j), Ω) is deleted from the hint list 116, and the original state is restored.

ステップ５０９）変数ｊに値ｊ+１を代入する。 Step 509) The value j + 1 is substituted into the variable j.

ステップ５１０）ｊと、発話区間の数ｎを比較すると、ｊ≦ｎであれば、ステップ５０３に戻り、そうでなければステップ５１１に移行する。 Step 510) When j is compared with the number n of utterance sections, if j ≦ n, the process returns to Step 503, and if not, the process proceeds to Step 511.

ステップ５１１）リストＬを、修正候補適合度Ｈ_ｊの昇順でソートする。その結果、リストの先頭ほど、Ｈ_ｊの値が小さいペアが配置される。 Step 511) The list L is sorted in ascending order of the correction-candidate fitness H_j. As a result, pairs with smaller values of H_j are placed nearer the head of the list.

ステップ５１２）リストＬの先頭からＰ個のペアを取り出し、発話区間インデックスの値ｊを、配列Ｘに順に格納する。 Step 512) P pairs are extracted from the head of the list L, and the value j of the speech segment index is stored in the array X in order.

以上が、ステップ４０２の処理の詳細である。 The details of the processing in step 402 have been described above.
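Steps 501 to 512 can be sketched in Python as follows. Several things here are assumptions of the example and not taken from the source: a dynamic-programming solver that requires w to be non-decreasing stands in for the annealing-based matcher, the distance d(w, w_j) is taken to be the number of positions where the two matchings differ, indices are 0-based, and the names `solve_matching`, `select_candidates`, and `OMEGA` are hypothetical.

```python
OMEGA = 1.0e9  # assumed "sufficiently large" penalty


def solve_matching(n, m, cost):
    """Order-preserving matching by dynamic programming (stand-in solver).

    Returns (w, C): w[i] is the line matched to utterance section i and C is
    the total cost, minimised under the assumption that w is non-decreasing.
    """
    dp = [[0.0] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    for j in range(m):
        dp[0][j] = cost(0, j)
    for i in range(1, n):
        best_j, best = 0, dp[i - 1][0]
        for j in range(m):
            if dp[i - 1][j] < best:        # running minimum over lines <= j
                best_j, best = j, dp[i - 1][j]
            dp[i][j] = best + cost(i, j)
            back[i][j] = best_j
    j = min(range(m), key=lambda x: dp[n - 1][x])
    total = dp[n - 1][j]
    w = [0] * n
    for i in range(n - 1, -1, -1):         # trace the optimal path backwards
        w[i] = j
        j = back[i][j]
    return w, total


def select_candidates(n, m, base_cost, hints, P, k=1.0):
    """Return up to P utterance indices ordered by ascending fitness H_j."""
    def cost(i, j):
        return hints.get((i, j), base_cost(i, j))

    w, _ = solve_matching(n, m, cost)                  # current matching w
    scored = []                                        # list L (step 501)
    for j in range(n):                                 # steps 502, 509, 510
        if any(idx == j for idx, _ in hints):          # step 503: already judged
            continue
        hints[(j, w[j])] = OMEGA                       # step 504
        w_j, c_j = solve_matching(n, m, cost)          # step 505
        d = sum(1 for a, b in zip(w, w_j) if a != b)   # assumed distance
        scored.append((c_j - k * d, j))                # steps 506, 507
        del hints[(j, w[j])]                           # step 508
    scored.sort()                                      # step 511
    return [j for _, j in scored[:P]]                  # step 512


toy_cost = lambda i, j: abs(i - j)
first_two = select_candidates(3, 3, toy_cost, {}, 2)
```

Each candidate costs one full rematching, so this exhaustive search runs the solver n times per call; the text notes that another search algorithm may be substituted if it yields equivalent results.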

なお、上記の実施の形態では、全探索で修正候補を選び出すような形態を示したが、同等の結果を得られるならば、他の探索アルゴリズムを用いてもかまわない。 In the above embodiment, the correction candidate is selected by full search. However, other search algorithms may be used as long as an equivalent result can be obtained.

本実施の形態では、ステップ４１１及びステップ４１６で、正しい対応に有利なコストを付与する例を示したが、代わりに、正しい対応以外の対応に不利なコストを付与するようにしてもよい。 In the present embodiment, an example of giving a cost advantageous for correct handling in steps 411 and 416 has been shown, but instead, a cost disadvantageous for handling other than correct handling may be given.

また、ステップ４１２で、特定のラベルの対応が誤りであることを反映させるために、当該ラベルの対応のコストを高くするような方法を示したが、その代わりに、当該ラベルが対応しないような制約条件を自動マッチングに盛り込むようにしてもよい。そのためには、Ωを無限大に設定するという方法や、両ラベル列を当該ラベルの前後に分け、ラベルより前のラベル列同士をマッチングし、ラベルより後のラベル列同士をマッチングし、その結果を結合するという方法などで行ってもよい。 Also, step 412 reflects the fact that a particular label correspondence is incorrect by raising the cost of that correspondence, but a constraint that prevents the labels from corresponding may instead be built into the automatic matching. This may be done, for example, by setting Ω to infinity, or by splitting both label sequences at the label in question, matching the label sequences before the label with each other, matching the label sequences after the label with each other, and combining the results.

上記の図８のフローチャートは、請求項１（請求項３）及び請求項２（請求項４）を含めた動作となっているが、請求項１（請求項３）に関する処理のみを実行する場合には、ステップ４１３からステップ４１６を省略したものを実行する。

The flowchart of FIG. 8 covers the operation of both claim 1 (claim 3) and claim 2 (claim 4). When only the processing of claim 1 (claim 3) is to be executed, steps 413 to 416 are omitted.

上述したように、本発明の処理を用いることにより、マッチングに誤りが存在する場合に、修正の影響の大きい順に誤りを修正できるので、誤りを総当りで修正しなくとも、誤りの無いマッチング、もしくは、誤りの少ないマッチング結果が少ない手数で得ることができる。 As described above, when a matching contains errors, the processing of the present invention allows them to be corrected in descending order of the impact of the correction. An error-free matching, or a matching result with few errors, can therefore be obtained with little effort, without having to check and correct every correspondence exhaustively.

なお、上記の図８、図９に示す動作をプログラムとして構築し、情報処理装置１０１のようなコンピュータにインストールして実行する、または、ネットワークを介して流通させることも可能である。 The operations shown in FIGS. 8 and 9 can be constructed as a program, installed in a computer such as the information processing apparatus 101 and executed, or distributed via a network.

また、構築されたプログラムを、コンピュータに接続されるハードディスクや、フレキシブルディスク、ＣＤ−ＲＯＭ等の可搬記憶媒体に格納しておき、コンピュータにインストールすることも可能である。 Further, the constructed program can be stored in a portable storage medium such as a hard disk, a flexible disk, or a CD-ROM connected to the computer, and installed in the computer.

なお、本発明は、上記の実施の形態に限定されることなく、特許請求の範囲内において、種々変更・応用が可能である。 The present invention is not limited to the above-described embodiment, and various modifications and applications can be made within the scope of the claims.

本発明の処理が適用されるのは、発話区間とセリフのマッチングに限られるものではなく、文章の原稿と発生された音声信号のマッチングや、映像番組のショットと台本のカットとのマッチングなど、広範囲に適用可能なものである。 The processing of the present invention is not limited to the matching of the speech section and the speech, but the matching of the sentence manuscript and the generated audio signal, the matching of the shot of the video program and the script cut, etc. It is applicable to a wide range.

本発明の原理を説明するための図である。 It is a figure for explaining the principle of the present invention.
本発明の原理構成図である。 It is a principle configuration diagram of the present invention.
本発明の一実施の形態における動作の概要を示すフローチャートである。 It is a flowchart showing an outline of the operation in one embodiment of the present invention.
本発明の一実施の形態におけるシステム構成図である。 It is a system configuration diagram in one embodiment of the present invention.
本発明の一実施の形態における表示処理装置に表示される画面の例である。 It is an example of the screen displayed on the display processing apparatus in one embodiment of the present invention.
本発明の一実施の形態における番組と台本の対応付けの作業の全体の流れを示す図である。 It is a figure showing the overall flow of the work of associating a program with a script in one embodiment of the present invention.
本発明の一実施の形態におけるマッチング誤り修正装置の構成図である。 It is a configuration diagram of the matching error correction apparatus in one embodiment of the present invention.
本発明の一実施の形態におけるマッチング誤り修正処理のフローチャートである。 It is a flowchart of the matching error correction process in one embodiment of the present invention.
本発明の一実施の形態における修正候補選択処理のフローチャートである。 It is a flowchart of the correction candidate selection process in one embodiment of the present invention.

Explanation of symbols

１０１情報処理装置
１０２表示装置
１０３入力装置
１１１マッチング処理手段、マッチング処理部
１１２修正候補選択手段、修正候補選択部
１１３インタフェース手段、インタフェース部
１１４入力判定部
１１５コスト付与手段、コスト付与部
１１７繰り返し手段、終了判定部
２０１表示領域
２０２番組の発話区間を表示する領域
２０３スクロールバー
２０４台本のセリフを表示する領域
２０５スクロールバー
２０６〜２０９発話区間
２１０〜２１５一区切りのセリフ
２１６矢印（発話区間とセリフの対応関係）
２１７システムが操作者へ提示するメッセージを出力する領域
２１８，２１９操作者がシステムへ応答するためのボタン
２２０マッチングの修正の処理を終了するためのボタン
３０１映像ファイル
３０２音声区間列
３０３台本
３０４セリフ列
３０５マッチング
３０６音声認識処理
３０７セリフ抽出処理
３０８マッチング処理

101 Information processing device
102 Display device
103 Input device
111 Matching processing means, matching processing unit
112 Correction candidate selection means, correction candidate selection unit
113 Interface means, interface unit
114 Input determination unit
115 Cost giving means, cost assigning unit
117 Repeating means, end determination unit
201 Display area
202 Area for displaying the utterance sections of the program
203 Scroll bar
204 Area for displaying the lines of the script
205 Scroll bar
206 to 209 Utterance sections
210 to 215 Individual lines
216 Arrow (correspondence between an utterance section and a line)
217 Area for outputting messages that the system presents to the operator
218, 219 Buttons for the operator to respond to the system
220 Button for ending the matching correction process
301 Video file
302 Audio section sequence
303 Script
304 Line sequence
305 Matching
306 Speech recognition processing
307 Line extraction processing
308 Matching processing

Claims

A method for correcting matching errors in label sequences, using an information processing apparatus that corrects a matching result w of label sequences in which each element of a label sequence S_A = {a_1, a_2, a_3, ..., a_n} is associated with one of the elements of a label sequence S_B = {b_1, b_2, b_3, ..., b_m}, the method comprising:
a matching processing step of using a cost c(a, b) that takes a smaller value the more similar the label a and the label b are, obtaining the matching result w and the minimum cost C by a cost minimization method that minimizes the sum C of the costs c(a, b) as seen from the label sequence S_A, and storing them in storage means;
a correction candidate selection step of giving a large cost c_i to the correspondence between a label a_i and the label b_{w(i)} that corresponds to the label a_i in the matching result w, obtaining a matching result w_i and a minimum cost C_i by the cost minimization method that minimizes the sum C_i of the costs c(a, b) reflecting the cost c_i, and selecting one or more candidates a_i for which the minimum cost C_i is small and the distance d between the matching results w and w_i is large;
outputting the matching result w to display means and highlighting the candidate a_i;
acquiring from an operator a correctness determination result indicating whether the correspondence between the candidate a_i and b_{w(i)} is correct or incorrect;
a cost giving step of giving a large cost c to the correspondence between the labels a_i and b_{w(i)} if the correspondence of the candidate a_i is incorrect, giving a small cost c to the correspondence between the labels a_i and b_{w(i)} if the correspondence of the candidate a_i is correct, and storing the cost c in the storage means; and
repeating the processing from the matching processing step onward, based on the cost c changed in the cost giving step.

The method for correcting matching errors in label sequences according to claim 1, wherein, in the cost giving step,
when the operator judges the correspondence between the labels a_i and b_{w(i)}, and the correspondence is incorrect and a label b_j is selected as the correct counterpart of the label a_i,
a large cost c is given to the correspondence between the labels a_i and b_{w(i)}, a small cost c is given to the correspondence between the label a_i and the label b_j, and the two costs c are stored in the storage means.

A label sequence matching error correction device that corrects a matching result w of label sequences in which each element of a label sequence S_A = {a_1, a_2, a_3, ..., a_n} is associated with one of the elements of a label sequence S_B = {b_1, b_2, b_3, ..., b_m}, the device comprising:
matching processing means for using a cost c(a, b) that takes a smaller value the more similar the label a and the label b are, obtaining the matching result w and the minimum cost C by a cost minimization method that minimizes the sum C of the costs c(a, b) as seen from the label sequence S_A, and storing them in storage means;
correction candidate selection means for giving a large cost c_i to the correspondence between a label a_i and the label b_{w(i)} that corresponds to the label a_i in the matching result w, obtaining a matching result w_i and a minimum cost C_i by the cost minimization method that minimizes the sum C_i of the costs c(a, b) reflecting the cost c_i, and selecting one or more candidates a_i for which the minimum cost C_i is small and the distance d between the matching results w and w_i is large;
interface means for outputting the matching result w to display means, highlighting the candidate a_i, and acquiring from an operator a correctness determination result indicating whether the correspondence between the candidate a_i and b_{w(i)} is correct or incorrect;
cost giving means for giving a large cost c to the correspondence between the labels a_i and b_{w(i)} if the correspondence of the candidate a_i is incorrect, giving a small cost c to the correspondence between the labels a_i and b_{w(i)} if the correspondence of the candidate a_i is correct, and storing the cost c in the storage means; and
repeating means for causing the matching processing means, the correction candidate selection means, the interface means, and the cost giving means to operate again, based on the cost c changed by the cost giving means.

The label sequence matching error correction device according to claim 3, wherein the cost giving means,
when the operator judges the correspondence between the labels a_i and b_{w(i)}, and the correspondence is incorrect and a label b_j is selected as the correct counterpart of the label a_i,
gives a large cost c to the correspondence between the labels a_i and b_{w(i)}, gives a small cost c to the correspondence between the label a_i and the label b_j, and stores the two costs c in the storage means.

5. A label sequence matching error correction program for causing a computer to function as each means constituting the label sequence matching error correction device according to claim 3.

A computer-readable storage medium storing the label series matching error correction program according to claim 5.