JP2020160388A

JP2020160388A - Scoring support device, method thereof, and program

Info

Publication number: JP2020160388A
Application number: JP2019062727A
Authority: JP
Inventors: Satoru Kobashigawa; Hisanori Dobashi; Takao Nakamura; Akira Masumura; Hosona Kamiyama; Yuji Aono; Takayoshi Endo
Original assignee: NTT Advanced Technology Corp; Nippon Telegraph and Telephone Corp
Current assignee: NTT Advanced Technology Corp; Nippon Telegraph and Telephone Corp
Priority date: 2019-03-28
Filing date: 2019-03-28
Publication date: 2020-10-01
Anticipated expiration: 2039-03-28
Also published as: JP7258627B2

Abstract

PROBLEM TO BE SOLVED: To provide a scoring support device, a method thereof, and a program that can also be applied to speech other than read-aloud speech, even when the correct answer to a question is not uniquely known, unlike the prior art, which can be applied only to read-aloud speech for which the correct answer to the question is uniquely known. SOLUTION: The scoring support device has a scoring support unit that obtains an evaluation of a spoken answer using a speech recognition result of the spoken answer to a question sentence and correct answer information corresponding to a correct sentence list that includes at least one correct sentence for the question sentence. SELECTED DRAWING: Figure 1

Description

The present invention relates to a scoring support device, a method thereof, and a program that support the scoring of spoken answers to question sentences.

In Non-Patent Document 1, speech recognition is performed on the speech recognition result of a non-native speaker model using a grammar in which phonemes are replaced by means of a native speaker model, and pronunciation error candidates are output.

Non-Patent Document 1: Zhang Haoyu, Daisuke Saito, Nobuaki Minematsu, Satoshi Kobashigawa, "Modeling of Japanese English Pronunciation Diversity and Application to Automatic Phoneme Error Detection", Proceedings of the Acoustical Society of Japan, 2-Q-4, 2018

A scoring support device is conceivable that uses the prior art to score whether a non-native speaker's spoken answer (in the learning language) contains pronunciation errors.

However, the prior art requires that the correct answer to a question (the correct sentence) be uniquely known, so it can be applied only to read-aloud speech corresponding to that correct sentence.

An object of the present invention is to provide a scoring support device, a method thereof, and a program that can also be applied to speech other than read-aloud speech that presupposes a correct sentence.

In order to solve the above problem, according to one aspect of the present invention, a scoring support device includes a scoring support unit that obtains an evaluation of a spoken answer using a speech recognition result of the spoken answer to a question sentence and correct answer information corresponding to a correct sentence list that includes at least one correct sentence for the question sentence.

According to the present invention, application is not limited to read-aloud speech that presupposes a correct sentence, so the range of usable learning materials is broadened.

FIG. 1 is a functional block diagram of the scoring support device according to the first embodiment.
FIG. 2 is a diagram showing an example of the processing flow of the scoring support device according to the first and third to sixth embodiments.
FIG. 3 is a diagram showing an example of data.
FIG. 4 is a functional block diagram of the scoring support device according to the second embodiment.
FIG. 5 is a diagram showing an example of the processing flow of the scoring support device according to the second embodiment.
FIG. 6 is a functional block diagram of the scoring support device according to the third embodiment.
FIG. 7 is a diagram showing an example of the processing flow of the scoring support device according to the fourth to sixth embodiments.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted. Unless otherwise specified, processing performed on each element of a vector or matrix is applied to all elements of that vector or matrix.

<Points of each embodiment>
In the first embodiment, attention is restricted to the keywords contained in the correct sentence and the keywords contained in the speech recognition result, and their match rate is used as the score.

With a keyword-based match rate, a small error in which only some characters of a keyword in the speech recognition result differ greatly lowers the overall evaluation. To address this, the second embodiment uses the character-level match rate as the score.

In the first embodiment, it is difficult to control how keywords are selected. In the third embodiment, therefore, in order to evaluate in line with the question setter's intent, the correct sentence is expressed as a regular expression, the speech recognition result is compared with the regular expression, and the comparison result is used as the score.

In the fourth embodiment, to reduce the cost of preparing keywords, regular expressions, and the like for each correct sentence, the speech recognition result is compared with the correct sentence and the resulting accuracy is used as the score.

When the device is used in a classroom or similar setting, ambient speech noise can have an adverse effect even if the speaker utters the answer correctly. In the fifth embodiment, therefore, a recognition rate that disregards insertion errors is used as the score. Because an insertion error in the middle of a sentence may be a genuine mistake, only insertion errors at the beginning and end of a sentence are ignored. In addition, a long sentence may contain pauses between phrases, so insertion errors between the end of one phrase and the beginning of the next may also be ignored.

If the score itself is presented to the learner, the learner may form a bad impression of the scoring support device when an unfairly low score results from noise or the like. In the seventh embodiment, therefore, scores are classified according to the range in which they fall, and the classification result is presented as the evaluation result.

<First Embodiment>
In the first embodiment, attention is restricted to the keywords contained in a correct sentence list that includes at least one correct sentence for the question sentence, the match rate with the speech recognition result is obtained, and the obtained match rate is used as the score.

FIG. 1 is a functional block diagram of the scoring support device according to the first embodiment, and FIG. 2 is a diagram showing an example of its processing flow.

The scoring support device according to the first embodiment includes a speech recognition unit 110, a keyword creation unit 120, and a scoring support unit 130.

The scoring support device according to the first embodiment takes as input a correct sentence list and a speech signal containing the answer speech, evaluates the answer speech, and outputs a scoring result.

The scoring support device is, for example, a special device configured by loading a special program into a known or dedicated computer having a central processing unit (CPU), a main storage device (RAM: Random Access Memory), and the like. The scoring support device executes each process under the control of the central processing unit, for example. Data input to the scoring support device and data obtained by each process are stored, for example, in the main storage device, and the data stored there is read out to the central processing unit as needed and used for other processing. At least a part of each processing unit of the scoring support device may be implemented by hardware such as an integrated circuit. Each storage unit of the scoring support device can be implemented by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store. However, the storage units need not be provided inside the scoring support device; they may be implemented by an auxiliary storage device such as a hard disk, an optical disk, or a semiconductor memory element such as flash memory, and provided outside the scoring support device.

<Scoring support method>
In this embodiment, scoring support is performed as follows.

(i) A question sentence is presented in some way to a speaker who is not a native speaker of a certain language (hereinafter also called the learning language); such a speaker is hereinafter also called a non-native speaker or a learner. For example, the question sentence is presented on paper or displayed on an output device such as a display.

(ii) The learner answers the question sentence by speaking in the learning language, and the answer speech is picked up.

(iii) The answer is scored based on the match rate between the correct sentence for the question sentence and the speech recognition result obtained by recognizing the speech signal containing the answer speech, and the scoring result is presented to the learner in some way.

For example, let the question sentence be "Give the English translation of 『私は貴方が好きです。』" and let the correct sentence be "I love you". A list containing at least one correct sentence is called a correct sentence list. Because a question sentence does not necessarily have only one correct sentence, the correct sentence list contains one or more correct sentences (see FIG. 3). For example, "I like you" may be added to the correct sentence list as a further correct sentence. When there are multiple question sentences, a correct sentence list is prepared for each question sentence.

The processing of each unit for realizing the scoring support described above is explained below.

<Speech recognition unit 110>
Input: Speech signal containing the answer speech
Output: Speech recognition result (a sentence or a speech recognition processing unit)
Processing content:
The speech recognition unit 110 performs speech recognition on the speech signal (S110) and outputs the speech recognition result as text.

Various methods are conceivable for speech recognition. For example, a method that is robust to non-native speech, such as the method of Reference 1, may be used.
(Reference 1) Ryo Masumura, Yu Kabashima, Takashi Moriya, Satoshi Kobashikawa, Yoshikazu Yamaguchi, Yuji Aono, "Study of neural acoustic model for Japanese English using voice data of native Japanese and native English", Proceedings of the Acoustical Society of Japan, 1-2-2, 2018

The speech recognition result output here is a sentence, or a part of a sentence corresponding to a processing unit of speech recognition. In this example, the speech recognition result is "I love you" (see FIG. 3).

<Keyword creation unit 120>
Input: Correct sentence list
Output: Keyword lists for the correct sentence list
Processing content:
The keyword creation unit 120 extracts keywords from the correct sentences contained in the correct sentence list and creates keyword lists. One keyword list is created for each correct sentence, and each keyword list contains one or more keywords. In this example, the keyword list is a list containing the keyword "love" (see FIG. 3).

Various methods are conceivable for the keyword extraction process. For example, keywords may be extracted based on rules. Two rules are given below as examples.
(Rule 1) A rule is set in advance that words of predetermined parts of speech, such as nouns, prepositions, and verbs, are extracted as keywords, and keywords are extracted from the correct sentence according to this rule. Instead of a part of speech, the rule may require that a word have at least one other predetermined linguistic attribute.
(Rule 2) The keywords to be extracted are set in advance, and the words in the correct sentence that match the set keywords are extracted.

For example, in the case of Rule 1, morphological analysis is performed on the correct sentence, and only words of the predetermined parts of speech are extracted as keywords based on the analysis result. Rules 1 and 2 may also be combined to extract keywords.
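The two rules above can be sketched as follows. This is a minimal illustration, not the implementation of the keyword creation unit 120: the part-of-speech table `TOY_POS` and the function names `extract_keywords` and `build_keyword_lists` are hypothetical, and the table stands in for the morphological analysis that an actual system would perform.

```python
# Hypothetical toy part-of-speech table (an assumption for illustration only);
# a real system would obtain parts of speech from a morphological analyzer.
TOY_POS = {
    "i": "pronoun",
    "you": "pronoun",
    "love": "verb",
    "like": "verb",
    "go": "verb",
    "to": "preposition",
    "school": "noun",
}


def extract_keywords(sentence, target_pos=("noun", "verb", "preposition"),
                     preset_keywords=()):
    """Return the keywords of one correct sentence, in order of first appearance."""
    preset = {k.lower() for k in preset_keywords}
    keywords = []
    for word in sentence.lower().split():
        by_rule1 = TOY_POS.get(word) in target_pos  # Rule 1: predetermined part of speech
        by_rule2 = word in preset                   # Rule 2: keyword set in advance
        if (by_rule1 or by_rule2) and word not in keywords:
            keywords.append(word)
    return keywords


def build_keyword_lists(correct_sentences, **kwargs):
    """Create one keyword list per correct sentence."""
    return [extract_keywords(s, **kwargs) for s in correct_sentences]


print(build_keyword_lists(["I love you", "I like you"]))  # [['love'], ['like']]
```

Passing `preset_keywords=["you"]` combines both rules, so "I love you" would then yield the keywords "love" and "you".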

The keyword creation unit 120 lists the extracted keywords and creates a keyword list. When the total number of keywords in the keyword list is small, a thesaurus or a synonym dictionary may be used to add synonyms of the extracted keywords to the keyword list as new keywords. From the learner's point of view, homophones that are easily confused in speech recognition may also be treated as new keywords. For example, homophones of the extracted keywords or of their synonyms may be added to the keyword list as new keywords. In this case, it is assumed that phonetic symbols or the like are attached to the correct sentences, the extracted keywords, their synonyms, and so on, and that homophones can be obtained from these phonetic symbols.

When the correct sentence list corresponding to a question sentence is given in advance, keywords may be extracted from the correct sentences and the keyword list created before the speech recognition process is performed. Creating the keyword list in advance in this way reduces the processing time for scoring support.

<Scoring support unit 130>
Input: Speech recognition result, and the keyword list for each correct sentence contained in the correct sentence list
Output: Scoring result
Processing content:
The scoring support unit 130 obtains a scoring result using the speech recognition result and the keyword list for each correct sentence (S130). In this embodiment, the scoring support unit 130 calculates the match rate between the speech recognition result and the keyword list for each correct sentence, and obtains a score based on the match rate. For example, in the example of FIG. 3, the speech recognition result contains all the keywords in keyword list (1), so the calculated match rate is 100% and the score is 100 out of 100 points.

In this embodiment, to prevent deductions from lowering the learner's motivation, the score is made independent of the order in which the keywords appear in the speech recognition result.

For example, the scoring support unit 130 may calculate the match rate for each of a plurality of keyword lists and output the best of the calculated match rates as the scoring result.
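The order-independent keyword match rate and the choice of the best rate over several keyword lists might look like the sketch below. The function names `keyword_match_rate` and `score`, the whitespace tokenization, and the rounding to a 100-point scale are assumptions made for illustration.

```python
def keyword_match_rate(recognized, keywords):
    """Fraction of the keywords that occur in the recognized text, ignoring order."""
    if not keywords:
        return 0.0
    words = set(recognized.lower().split())
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits / len(keywords)


def score(recognized, keyword_lists):
    """Best match rate over all keyword lists, expressed on a 100-point scale."""
    best = max((keyword_match_rate(recognized, kl) for kl in keyword_lists),
               default=0.0)
    return round(100 * best)


# Keyword lists for the correct sentences "I love you" and "I like you".
print(score("I love you", [["love"], ["like"]]))  # 100
print(score("I have you", [["love"], ["like"]]))  # 0
```

The second call shows the weakness that motivates the second embodiment: a single misrecognized keyword drops the score from 100 to 0.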

<Effect>
In this embodiment, it suffices that the speech recognition result contains the keywords corresponding to a correct sentence, so the device can also be applied to speech other than read-aloud speech that presupposes a correct sentence, and the range of usable learning materials is broadened.

Further, the prior art requires pronunciation information corresponding to the correct sentence, which incurs a cost, whereas the configuration of this embodiment does not require it. In addition, the prior art is close to a deduction-based method in which an answer is not credited unless it is pronounced correctly, so the learner's motivation tends to decline; with the configuration of this embodiment, an answer is given a high score even if it contains minor errors, so the learner's motivation can be expected to be maintained. Furthermore, evaluation is possible even when the correct sentence list contains a plurality of correct sentences.

<Modification example>
In this embodiment, the keyword creation unit 120 extracts keywords from the correct sentences contained in the correct sentence list and creates the keyword lists. Alternatively, a keyword list may be prepared separately in advance for each correct sentence, without using the keyword creation unit 120, and given to the scoring support unit 130.

In this embodiment, the score is independent of the order in which the keywords appear in the speech recognition result, but the score may instead be obtained taking that order into account. In this case, the keyword list contains order-of-appearance information. For example, the keywords in the keyword list are listed in order of appearance (the order of the keywords in the keyword list corresponds to the order in which they appear in the correct sentence), and the scoring support unit 130 obtains an evaluation based on at least one of a prefix match and a suffix match between one or more keywords in the keyword list and the speech recognition result. Further, the scoring support unit 130 may obtain an evaluation based on either a partial match or an exact match that takes into account the order of appearance of the keywords in the speech recognition result.
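One possible reading of an order-aware partial match is sketched below: the keywords are counted only insofar as they appear in the speech recognition result in the same relative order as in the keyword list, which is computed here as a longest common subsequence over words. This interpretation, and the names `lcs_length` and `ordered_match_rate`, are assumptions; the text does not prescribe a specific algorithm, and prefix and suffix matching are not covered by this sketch.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def ordered_match_rate(recognized, ordered_keywords):
    """Fraction of keywords found in the recognized text in the listed order."""
    if not ordered_keywords:
        return 0.0
    words = recognized.lower().split()
    keywords = [k.lower() for k in ordered_keywords]
    return lcs_length(words, keywords) / len(keywords)


print(ordered_match_rate("i love you", ["i", "love", "you"]))  # 1.0
```

With the same keyword list, the recognized text "you love i" contains every keyword but only one of them in the right relative order, so its rate drops to one third.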

In this embodiment, keyword extraction is not performed on the speech recognition result. Alternatively, keyword extraction may be applied to the speech recognition result, and the keyword extraction result for the speech recognition result may be compared with the keyword extraction result for the correct sentence. For example, keyword extraction may be performed using the part-of-speech information contained in the word information of the speech recognition result. In this case, the scoring support device includes an answer sentence keyword creation unit 140 (indicated by a broken line in FIG. 1). For example, the answer sentence keyword creation unit 140 extracts keywords from the speech recognition result and creates an answer sentence keyword list. Keywords may be extracted based on preset rules, such as Rule 1 and Rule 2 described for the keyword creation unit 120 of this embodiment. The scoring support unit 130 obtains an evaluation based on the answer sentence keyword list and the keyword list corresponding to each correct sentence. For example, it calculates the match rate between the keywords in the answer sentence keyword list and the keywords in the keyword list for each correct sentence, and obtains a score based on the match rate. The score may be made independent of the order of appearance or may take it into account.

Because speech recognition results are sometimes not capitalized, all text, including the correct sentences, may be converted to lowercase before matching. Similarly, because speech recognition results sometimes lack symbols such as punctuation marks, commas, periods, exclamation marks, question marks, and apostrophes, these symbols may be removed from the correct sentences. In addition, to improve the learner's motivation, the keywords of the answer sentence and the correct sentence may be treated as identical to their synonyms and homophones; for example, when a synonym or homophone is found, it may be replaced with the original keyword before the match rate is calculated. For example, the scoring support device includes a character processing unit 150 (indicated by a broken line in FIG. 1). The character processing unit 150 takes the speech recognition result and the correct sentence list as input and processes them based on preset character processing rules, such as the rules described above. In other words, the character processing rules include at least one of (i) converting uppercase letters to lowercase and (ii) deleting predesignated symbols.
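Character processing rules (i) and (ii) can be illustrated with a short normalization function applied to both the speech recognition result and each correct sentence. The function name `normalize` and the exact symbol set in `SYMBOLS` are assumptions; the symbols to delete would be designated in advance for the actual device.

```python
# Symbols to delete (an assumed set; it would be designated in advance in practice).
SYMBOLS = "、。,.!?'’"


def normalize(text, symbols=SYMBOLS):
    """Apply rule (i) lowercase conversion and rule (ii) symbol deletion."""
    table = str.maketrans("", "", symbols)
    cleaned = text.lower().translate(table)
    return " ".join(cleaned.split())  # also collapse repeated whitespace


print(normalize("I love you!"))  # i love you
print(normalize("Don't stop."))  # dont stop
```

Applying the same function to both sides keeps the comparison symmetric, so that "I love you!" in a correct sentence and "i love you" from the recognizer compare as equal.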

For speech recognition, a speech recognition model adapted to the pronunciation of non-native speakers may be used to raise the learner's motivation to speak. For example, a database storing a large number of pairs of speech data from non-native speakers and the corresponding transcribed text (correct text) is prepared (hereinafter also called a non-native speaker database), and a speech recognition model adapted to non-native pronunciation is trained on the non-native speaker data in that database. The speech recognition model consists of a non-native speaker acoustic model trained on speech-text pairs and a language model trained on text in the corresponding language. The speech recognition model adapted to non-native pronunciation may be trained from the non-native speaker database alone, or it may be created by taking a native speaker speech recognition model as input and tuning it.

To increase the learner's motivation, the scoring support unit 130 may be configured to tolerate a difference of up to a predesignated number of characters when evaluating a keyword against the speech recognition result. For example, a difference of two to three characters between the speech recognition result and the correct sentence, corresponding to a pronunciation error, may be tolerated and the answer regarded as correct.
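One way to tolerate a difference of up to a designated number of characters is to compare each keyword with each recognized word by edit distance. Using the Levenshtein distance for this purpose is an assumption of this sketch, as are the names `edit_distance` and `tolerant_match_rate` and the parameter `max_diff`.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def tolerant_match_rate(recognized, keywords, max_diff=2):
    """Keyword match rate where a keyword counts as matched if some recognized
    word differs from it by at most max_diff characters."""
    if not keywords:
        return 0.0
    words = recognized.lower().split()
    hits = sum(
        1 for k in keywords
        if any(edit_distance(k.lower(), w) <= max_diff for w in words)
    )
    return hits / len(keywords)


print(tolerant_match_rate("I went to the liblary", ["library"], max_diff=1))  # 1.0
```

A fixed tolerance is generous for short keywords: with `max_diff=2`, "have" is accepted for "love". Scaling the tolerance by keyword length would be one option for limiting this.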

In this embodiment, the spoken answer of a non-native speaker is scored and the evaluation result is output. However, the invention is not limited to scoring the spoken answers of non-native speakers. It may be applied to any kind of scoring in which a question sentence is set, a correct sentence list containing the correct sentences for the question sentence is given, the answer speech for the question sentence is taken as input, the answer speech is evaluated, and a scoring result is output.

<Second Embodiment>
The description focuses on the parts that differ from the first embodiment.

In the first embodiment, the match rate is calculated per keyword, so when the speech recognition result contains a misrecognition, a subtle error in which only some characters of a keyword differ greatly lowers the evaluation. In this embodiment, the character-level match rate is used as the score.

FIG. 4 is a functional block diagram of the scoring support device according to the second embodiment, and FIG. 5 is a diagram showing an example of its processing flow.

The scoring support device according to the second embodiment includes a speech recognition unit 110, a keyword creation unit 120, a scoring support unit 230, and a character unit division unit 240.

<Character unit division unit 240>
Input: Speech recognition result, and the keywords contained in the keyword list
Output: Character list of the speech recognition result, and a character list for each keyword contained in the keyword list
Processing content:
The character unit division unit 240 divides the speech recognition result and the keywords contained in the keyword list into individual characters (S240) and creates character-level lists (character lists).

<Scoring support unit 230>
Input: Character list of the speech recognition result, and the character list for each keyword
Output: Scoring result
Processing content:
The scoring support unit 230 obtains the scoring result using the character list of the speech recognition result and the character list for each keyword (S230). In this embodiment, the scoring support unit 230 collates the character list of the speech recognition result with the character list for each keyword and calculates the match rate character by character. Every portion of the character list of the speech recognition result that matches even a part of the keyword's character list is collated. For example, the match rate may be calculated by a method such as dynamic programming (DP matching), prefix matching, or suffix matching (Reference 2), and the best match rate may be used as the scoring result.
(Reference 2) Seiichi Nakagawa, Ritsuji Ito, "Evaluation by Continuous Numeric Speech Recognition of Extended Continuous DP Method", IEEJ Journal C, 1988, Vol. 108, No. 10, pp. 834-841

The scoring support unit 230 uses the calculated match rate as the scoring result. A match rate is calculated for each correct answer sentence in the correct answer sentence list and, further, for each keyword in the keyword list for that correct answer sentence; the best of the calculated match rates may be used as the scoring result.

For example, if the voice recognition result is "have", so that its character list is "h", "a", "v", "e", and the keyword is "love", so that its character list is "l", "o", "v", "e", then 2 of the 4 characters match, the match rate is 50%, and the score is 50 out of 100.
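The character-unit collation described above can be illustrated with a minimal sketch. It slides the keyword's character list over the character list of the recognition result and counts position-wise matches; this is a simplification standing in for DP matching, not the method of Reference 2, and the names `window_match_rate` and `score` are hypothetical.

```python
def window_match_rate(recognized: str, keyword: str) -> float:
    """Best fraction of keyword characters matched at any alignment offset."""
    rec = list(recognized)  # character list of the voice recognition result
    key = list(keyword)     # character list of the keyword
    if not key:
        return 0.0
    best = 0
    # Try every offset, including partial overlaps at both ends,
    # so that a keyword matching only part of the result is still collated.
    for offset in range(-(len(key) - 1), len(rec)):
        hits = 0
        for k, ch in enumerate(key):
            pos = offset + k
            if 0 <= pos < len(rec) and rec[pos] == ch:
                hits += 1
        best = max(best, hits)
    return best / len(key)


def score(recognized: str, keywords: list[str]) -> int:
    """Best match rate over all keywords, expressed as points out of 100."""
    rates = (window_match_rate(recognized, k) for k in keywords)
    return round(100 * max(rates, default=0.0))


print(score("have", ["love"]))  # 2 of 4 characters match
```

Taking the maximum over all keywords corresponds to using the best match rate as the scoring result.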

<Effect>
With this configuration, the same effect as in the first embodiment can be obtained. In addition, finer-grained scoring that reflects subtle errors becomes possible.

<Modification>
In the present embodiment, the input to the character unit division unit 240 is the keywords included in the keyword list; instead, the correct answer sentences included in the correct answer sentence list may be used as the input (indicated by a broken line in FIG. 4). In that case, the character unit division unit 240 divides each correct answer sentence in the correct answer sentence list into individual characters, creates a character list, and outputs it. The scoring support unit 230 then takes the character list of the voice recognition result and the character list of each correct answer sentence as input, collates the former against the latter, calculates the match rate in character units, and uses the best match rate as the scoring result. With this configuration, the scoring support device need not include the keyword creation unit 120.

This modification may also be combined with the second embodiment: the match rate for each correct answer sentence and the match rate for each keyword are all calculated, and the best match rate is used as the scoring result.

<Third Embodiment>
The description focuses on the differences from the first embodiment.

With the method of extracting keywords from correct answer sentences, it can be difficult to evaluate answers in line with the question setter's intent. In the present embodiment, each correct answer sentence is expressed as a regular expression, the voice recognition result is compared with that regular expression, and the score assigned on a match is used as the score.

As the regular expression, any regular expression may be defined and used.

FIG. 6 is a functional block diagram of the scoring support device according to the third embodiment, and FIG. 2 shows an example of its processing flow.

The scoring support device according to the third embodiment includes a voice recognition unit 110 and a scoring support unit 330.

The scoring support device according to the third embodiment takes as input a correct answer sentence list written as regular expressions and a voice signal containing the answer speech, evaluates the answer speech, and outputs a scoring result.

<Scoring support unit 330>
Input: voice recognition result; correct answer sentence list including regular expressions of the correct answer sentences
Output: scoring result
Processing:
The scoring support unit 330 obtains the scoring result using the voice recognition result and the correct answer sentence list, which includes regular expressions of the correct answer sentences (S330). In the present embodiment, the scoring support unit 330 determines the scoring result according to whether the voice recognition result matches a regular expression of a correct answer sentence in the list (correct or incorrect). For example, if the regular expression of the correct answer sentence is "I love *" and the voice recognition result is "I love you, too", the voice recognition result matches the regular expression (correct), so the score is 100 out of 100.
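A minimal sketch of this correct/incorrect decision follows, using Python's standard `re` module. It assumes that the "*" in the example is meant as a wildcard for any trailing text, which is written `.*` in Python's syntax; that reading, and the function name `regex_score`, are assumptions rather than something the text specifies.

```python
import re


def regex_score(recognized: str, patterns: list[str]) -> int:
    """Return 100 if the recognition result matches any pattern, else 0."""
    for pattern in patterns:
        # fullmatch requires the whole recognition result to fit the pattern.
        if re.fullmatch(pattern, recognized):
            return 100
    return 0


# "I love *" from the example, rewritten in Python regex syntax (assumption).
correct_patterns = [r"I love .*"]

print(regex_score("I love you, too", correct_patterns))
print(regex_score("I have you", correct_patterns))
```

Because the list may hold several patterns, one question can accept several differently worded answers without enumerating every sentence.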

<Fourth Embodiment>
The description focuses on the differences from the first embodiment.

In the present embodiment, to reduce the cost of preparing the keywords of the first and second embodiments or the regular expressions of the third embodiment, the correct answer sentences in the correct answer sentence list are compared directly with the voice recognition result, and the resulting correct-answer accuracy is used as the score.

FIG. 7 is a functional block diagram of the scoring support device according to the fourth embodiment, and FIG. 2 shows an example of its processing flow.

The scoring support device according to the fourth embodiment includes a voice recognition unit 110 and a scoring support unit 430.

<Scoring support unit 430>
Input: voice recognition result; correct answer sentence list
Output: scoring result
Processing:
The scoring support unit 430 obtains the scoring result using the voice recognition result and the correct answer sentence list (S430). It matches the voice recognition result against each correct answer sentence in the list, obtains the correct-answer accuracy, and uses it as the score. Matching is performed by a method such as dynamic programming (DP matching) so that the result aligns with the correct answer sentence as closely as possible (see Reference 2).

The correct-answer accuracy covers measures such as the correct rate and the recognition accuracy, and may be computed in word units or in character units. Because accuracy tends to be higher in character units, character units are preferable from the standpoint of learner motivation. For example, if the correct answer sentence is "I love you" and the voice recognition result is "I have you", then in word units 2 of 3 words match, so the score is 66 out of 100; in character units 6 of 8 characters match, so the score can be 75 out of 100.
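The word-unit and character-unit figures above can be reproduced with a small DP sketch. It uses a standard edit distance as the matching step and takes accuracy as 1 minus the distance divided by the reference length, truncated to whole points; that formula, the removal of spaces in character units, and the names `edit_distance` and `accuracy_score` are assumptions, since the text does not give an exact formula.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Minimum number of substitutions, deletions and insertions (DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def accuracy_score(correct: str, recognized: str, unit: str = "word") -> int:
    """Score out of 100 in word units or character units."""
    if unit == "word":
        ref, hyp = correct.split(), recognized.split()
    else:
        # Character units: spaces are dropped before splitting (assumption).
        ref = list(correct.replace(" ", ""))
        hyp = list(recognized.replace(" ", ""))
    if not ref:
        return 0
    accuracy = max(0.0, 1 - edit_distance(ref, hyp) / len(ref))
    return int(100 * accuracy)  # truncate, so 2/3 becomes 66


print(accuracy_score("I love you", "I have you", unit="word"))
print(accuracy_score("I love you", "I have you", unit="char"))
```

With several correct answer sentences, the same function would be called once per sentence and the best score kept.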

In addition, a table of similarly pronounced character sequences (for example, sh → s) may be prepared, and differences listed in that table may be tolerated.
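One way to tolerate such differences is to normalize both strings with the table before matching. The sketch below assumes a one-entry table holding only the sh → s pair given as the example; the table layout and the names `SIMILAR`, `normalize`, and `same_after_normalizing` are hypothetical.

```python
# Table of similarly pronounced sequences; only "sh" -> "s" comes from the text.
SIMILAR = {"sh": "s"}


def normalize(text: str, table: dict[str, str] = SIMILAR) -> str:
    """Rewrite each listed sequence to its tolerated counterpart."""
    for src, dst in table.items():
        text = text.replace(src, dst)
    return text


def same_after_normalizing(correct: str, recognized: str) -> bool:
    """True if the two strings differ only by tolerated sequences."""
    return normalize(correct) == normalize(recognized)


print(normalize("she sells"))
print(same_after_normalizing("she", "se"))
```

The normalized strings can then be passed to whichever word-unit or character-unit matching is in use.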

<Fifth Embodiment>
The description focuses on the differences from the fourth embodiment.

In a classroom or similar setting, surrounding speech noise can degrade recognition even when the speaker utters the answer correctly. The recognition rate that disregards insertion errors is therefore used as the score.

FIG. 7 is a functional block diagram of the scoring support device according to the fifth embodiment, and FIG. 2 shows an example of its processing flow.

The scoring support device according to the fifth embodiment includes a voice recognition unit 110 and a scoring support unit 530.

<Scoring support unit 530>
Input: voice recognition result; correct answer sentence list
Output: scoring result
Processing:
The scoring support unit 530 obtains the scoring result using the voice recognition result and the correct answer sentence list (S530). It matches the voice recognition result against each correct answer sentence in the list without taking insertion errors into account, obtains the correct-answer accuracy, and uses it as the score. As in the fourth embodiment, matching is performed by a method such as dynamic programming (DP matching) so that the result aligns with the correct answer sentence as closely as possible, and the score may be computed in word units or in character units. For example, if the correct answer sentence is "I love you" and the voice recognition result is "ah I la have you", "ah" and "la" are ignored as insertion errors. The correct-answer accuracy is 100% in both word units and character units, and the score is 100 out of 100.

<Modification>
Insertion errors tend to occur at the beginning and end of a sentence, whereas extra speech inside the utterance may be a genuine mistake rather than an insertion error. In this modification, therefore, insertion errors are ignored only at the beginning and end of the sentence. For a long sentence (for example, five words or more), pauses may occur between phrases, so insertion errors between the end of one phrase and the beginning of the next may also be ignored.

<Scoring support unit 530>
Input: voice recognition result; correct answer sentence list
Output: scoring result
Processing:
The scoring support unit 530 obtains the scoring result using the voice recognition result and the correct answer sentence list (S530). It matches the voice recognition result against each correct answer sentence in the list without taking insertion errors at the beginning and end of the sentence into account, obtains the correct-answer accuracy, and uses it as the score. The beginning and end of the sentence may be taken as the positions before the first word and after the last word of the correct answer sentence. For example, if the correct answer sentence is "I love you" and the voice recognition result is "ah I la have you", the "ah" at the beginning of the sentence is ignored as an insertion error, while "la", which lies inside the sentence rather than at its beginning or end, is judged incorrect. In word units, for example, "ah" is ignored and the score is 50 out of 100.

If insertion errors occur at only one of the beginning and the end of the sentence, matching may be configured to disregard insertion errors at that one location only.
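The edge-only variant can be sketched as follows, in word units. Words before the first word of the correct answer sentence and after its last word are dropped, and the remaining words are matched with `difflib.SequenceMatcher` from the standard library, which stands in here for DP matching. The text does not state the scoring formula; dividing the matched words by the longer of the two word lists is an assumption chosen because it reproduces the 50 out of 100 in the example. The names `trim_edges` and `edge_trimmed_score` are hypothetical.

```python
from difflib import SequenceMatcher


def trim_edges(ref: list[str], hyp: list[str],
               head: bool = True, tail: bool = True) -> list[str]:
    """Drop words before ref's first word and after ref's last word."""
    start, end = 0, len(hyp)
    if head and ref and ref[0] in hyp:
        start = hyp.index(ref[0])
    if tail and ref and ref[-1] in hyp:
        end = len(hyp) - hyp[::-1].index(ref[-1])
    if start >= end:  # degenerate case: leave the result untrimmed
        return hyp
    return hyp[start:end]


def edge_trimmed_score(correct: str, recognized: str,
                       head: bool = True, tail: bool = True) -> int:
    """Word-unit score out of 100, ignoring insertions at the edges only."""
    ref = correct.split()
    hyp = trim_edges(ref, recognized.split(), head, tail)
    if not ref:
        return 0
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    # Denominator is an assumption: the longer of the two word lists.
    return int(100 * matched / max(len(ref), len(hyp)))


print(edge_trimmed_score("I love you", "ah I la have you"))
```

Setting `head=False` or `tail=False` restricts the trimming to one end of the sentence, corresponding to the one-sided configuration mentioned above.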

<Sixth Embodiment>
The description focuses on the differences from the fourth embodiment.

If the score itself is shown to the learner, an unfairly low score caused by noise or the like can leave the learner with a bad impression of the scoring support device. In the present embodiment, therefore, the internally obtained score is classified according to predetermined rules, and the classification result, rather than the score itself, is presented as the evaluation result.

FIG. 7 is a functional block diagram of the scoring support device according to the sixth embodiment, and FIG. 2 shows an example of its processing flow.

The scoring support device according to the sixth embodiment includes a voice recognition unit 110 and a scoring support unit 630.

<Scoring support unit 630>
Input: voice recognition result; correct answer sentence list; classification table
Output: scoring result
Processing:
The scoring support unit 630 obtains the scoring result using the voice recognition result and the correct answer sentence list (S630). It matches the voice recognition result against each correct answer sentence in the list, classifies the internally obtained matching result according to the classification table, and outputs the classification result as the evaluation result.

For example, classification is performed according to the range of the correct-answer accuracy, based on the following classification table.

Also, for example, when regular expressions are used, classification is performed according to which regular expression matched, based on the following classification table.
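For the range-based case, classification amounts to looking up which interval an internal score falls into. The sketch below uses the standard `bisect` module; the thresholds and labels are purely hypothetical placeholders and are not taken from the classification tables of this document.

```python
from bisect import bisect_right

# Hypothetical classification table: lower bounds of the second and third
# classes, and one label per class. Not values from the source.
THRESHOLDS = [50, 80]
LABELS = ["Try again", "Good", "Excellent"]


def classify(score: int) -> str:
    """Map an internal score (0 to 100) to the label shown to the learner."""
    return LABELS[bisect_right(THRESHOLDS, score)]


for s in (49, 50, 80, 100):
    print(s, classify(s))
```

Only the label is presented, so a score lowered slightly by noise does not change what the learner sees unless it crosses a class boundary.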

<Other modifications>
The present invention is not limited to the above embodiments and modifications. For example, the various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capacity of the device that executes them or as otherwise needed. Other changes may be made as appropriate without departing from the spirit of the present invention.

<Program and recording medium>
The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing content of the functions that each device should have is described by a program, and executing this program on a computer realizes those processing functions on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. Any such medium may be used, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to another computer over a network.

A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or transferred from the server computer in its own storage unit. When executing processing, the computer reads the program from its own storage unit and executes processing according to it. As another mode of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or it may execute processing according to the received program each time a program is transferred to it from the server computer. The processing described above may also be executed by a so-called ASP (Application Service Provider) type service, in which no program is transferred from the server computer to the computer and the processing functions are realized solely through execution instructions and retrieval of results. The program here includes information that is provided for processing by an electronic computer and is equivalent to a program (for example, data that is not a direct command to the computer but has the property of defining the computer's processing).

Although each device is configured by executing a predetermined program on a computer, at least part of the processing content may instead be realized in hardware.

Claims

Includes a scoring support unit that obtains an evaluation of a voice answer to a question sentence by using a voice recognition result of the voice answer and correct answer information corresponding to a correct answer sentence list including at least one correct answer sentence corresponding to the question sentence.
Scoring support device.

The scoring support device according to claim 1.
The voice recognition result is a result of recognizing the voice answer with a voice recognition model adapted to the pronunciation of non-native speakers.
Scoring support device.

The scoring support device according to claim 1 or 2.
Includes a character processing unit that processes the voice recognition result and the correct answer information based on preset character processing rules.
The character processing rule is
(i) Convert uppercase letters to lowercase letters,
(ii) Delete the pre-specified symbol,
Including at least one of
Scoring support device.

A scoring support device according to any one of claims 1 to 3.
The scoring support unit obtains the evaluation of the voice answer without considering insertion errors included in the voice recognition result.
Scoring support device.

The scoring support device according to claim 4.
The insertion error is an insertion error at one or more of: the beginning of the sentence, the end of the sentence, and between the end of one phrase and the beginning of the next phrase.
Scoring support device.

A scoring support device according to any one of claims 1 to 5.
The correct answer information includes, for one correct answer sentence in the correct answer sentence list, a keyword list consisting of at least one word, and the keyword list contains as keywords at least one of: a word extracted from the one correct answer sentence, and a synonym or homophone corresponding to the extracted word.
For each keyword list, whether set in advance for a correct answer sentence included in the correct answer sentence list or created from such a correct answer sentence, the scoring support unit obtains an evaluation based on one or more of partial match, exact match, prefix match, and suffix match between a keyword and the voice recognition result.
Scoring support device.

The scoring support device according to claim 6.
The keyword list of the correct answer information is a list that includes information on the order in which the keywords appear in the one correct answer sentence.
For each keyword list of the correct answer information, the scoring support unit evaluates the match rate between the voice recognition result and the keywords with respect to the order of appearance of the keywords.
Scoring support device.

The scoring support device according to claim 6 or 7.
The keyword has at least one predetermined linguistic attribute.
Scoring support device.

A scoring support device according to any one of claims 6 to 8.
Includes an answer sentence keyword creation unit that extracts keywords from the voice recognition result and creates an answer sentence keyword list.
The scoring support unit obtains an evaluation based on the answer sentence keyword list and the keyword list of the correct answer information.
Scoring support device.

A scoring support device according to any one of claims 1 to 5.
The scoring support unit obtains an evaluation by collating the correct answer information with the voice recognition result in character units, the correct answer information being a correct answer sentence list or a keyword list.
Scoring support device.

A scoring support device according to any one of claims 1 to 5.
The correct answer information includes one or more regular expressions corresponding to the one correct answer sentence, and the scoring support unit obtains, as the evaluation, the result of matching the correct answer information against the voice recognition result.
Scoring support device.

A scoring support device according to any one of claims 1 to 5.
The correct answer information is the correct answer sentence list.
Scoring support device.

A scoring support device according to any one of claims 1 to 12.
In evaluating the correct answer information against the voice recognition result, the scoring support unit tolerates a difference of no more than a number of characters specified in advance.
Scoring support device.

A scoring support device according to any one of claims 1 to 13.
The scoring support unit outputs the best evaluation among the obtained evaluations as an evaluation result.
Scoring support device.

A scoring support device according to any one of claims 1 to 14.
The scoring support unit classifies the obtained evaluation according to a predetermined classification table, and outputs the classification result as the evaluation result.
Scoring support device.

Includes a scoring support step of obtaining an evaluation of a voice answer to a question sentence by using a voice recognition result of the voice answer and correct answer information corresponding to a correct answer sentence list including at least one correct answer sentence corresponding to the question sentence.
Scoring support method.

A program for operating a computer as a scoring support device according to any one of claims 1 to 15.