JP2014186061A

JP2014186061A - Information processing device and program

Info

Publication number: JP2014186061A
Application number: JP2013059093A
Authority: JP
Inventors: Masatsugu Sotoike; 昌嗣外池; Hiroshi Masuichi; 博増市
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2013-03-21
Filing date: 2013-03-21
Publication date: 2014-10-02
Anticipated expiration: 2033-03-21
Also published as: JP6040819B2

Abstract

PROBLEM TO BE SOLVED: To determine whether an important word spoken in a telephone conversation is recorded in the appropriate place in a text document recording the contents of that conversation. SOLUTION: An association execution unit (4d) associates each of a plurality of sentences included in telephone conversation summary data representing a text document with one of a plurality of utterances represented by telephone conversation voice data. An important word presence/absence determination unit (4h) then determines whether a sentence including the character string of the important word exists between the sentence associated with the utterance preceding an important utterance (an utterance, among the plurality of utterances, that includes the voice of the important word) and the sentence associated with an utterance following the important utterance.

Description

The present invention relates to an information processing apparatus and a program.

Patent Literature 1 below describes identifying, from keywords included in an operator's call voice, the input item among a plurality of input items into which the operator should enter data, and determining whether data has been entered into the identified input item.

JP 2012-32562 A

An object of the present invention is to determine whether an important word spoken in a call is recorded in the appropriate place in a text document recording the contents of the call.

To solve the above problem, the information processing apparatus according to claim 1 includes: association means for associating each unit text with one of the utterances, based on the result of morphological analysis processing on each of a plurality of unit texts included in a text document recording the contents of a call and the result of speech recognition processing on each of a plurality of utterances made in the call and represented by voice data; and determination means for determining whether a unit text including the character string of an important word exists between a first text and a second text. Here, an important utterance is an utterance, among the plurality of utterances, that includes the voice of a predetermined important word; the first text is the unit text associated with the utterance preceding the important utterance; and the second text is the unit text associated with the utterance following the important utterance.

The information processing apparatus according to claim 2 is the information processing apparatus according to claim 1, wherein, when it is determined that no unit text including the character string of the important word exists between the first text and the second text, that fact is output.

The information processing apparatus according to claim 3 is the information processing apparatus according to claim 2, further including display control means for causing display means to display the text document, wherein, when no unit text including the character string of the important word exists between the first text and the second text, the display control means causes the display means to display information indicating the important word together with the text document.

The information processing apparatus according to claim 4 is the information processing apparatus according to claim 3, further including means for determining whether a unit text including an alternative character string, which is a character string having a predetermined relationship with the character string of the important word, exists between the first text and the second text, wherein, when such a unit text exists, the display control means causes the display means to display information indicating the important word and the alternative character string together with the text document.

The information processing apparatus according to claim 5 is the information processing apparatus according to claim 4, further including means for updating the text document, when a unit text including the alternative character string exists between the first text and the second text, by correcting the alternative character string included in that unit text to the character string of the important word.

The information processing apparatus according to claim 6 is the information processing apparatus according to any one of claims 1 to 5, further including: independent word extraction means for extracting, by morphological analysis processing, the independent words included in each unit text; and independent word identification means for identifying, by speech recognition processing, which of the independent words extracted by the independent word extraction means are spoken in each utterance, wherein the association means associates each unit text with an utterance in which an independent word identical to one included in that unit text is spoken.

The information processing apparatus according to claim 7 is the information processing apparatus according to any one of claims 1 to 6, further including voice acquisition means for acquiring the voice data representing the plurality of utterances.

To solve the above problem, the program according to claim 8 causes a computer to function as: association means for associating each unit text with one of the utterances, based on the result of morphological analysis processing on each of a plurality of unit texts included in a text document recording the contents of a call and the result of speech recognition processing on each of a plurality of utterances made in the call and represented by voice data; and determination means for determining whether a unit text including the character string of an important word exists between a first text, which is the unit text associated with the utterance preceding an important utterance (an utterance, among the plurality of utterances, that includes the voice of a predetermined important word), and a second text, which is the unit text associated with the utterance following the important utterance.

According to the inventions of claims 1, 7, and 8, it is possible to determine whether an important word spoken in a call is recorded in the appropriate place in the text document recording the contents of the call.

According to the invention of claim 2, it is possible to report that an important word is not recorded in the appropriate place in the text document.

According to the invention of claim 3, it is possible to report in more detail, compared with a case without this configuration, that an important word is not recorded in the appropriate place in the text document.

According to the invention of claim 4, it is possible to report, for example, that an important word is recorded in the appropriate place in the text document but is not recorded correctly.

According to the invention of claim 5, it is possible, for example, to correct an error in an important word recorded in the appropriate place.

According to the invention of claim 6, the place in the text document where an important word should be recorded can be identified more accurately than in a case without this configuration.

A diagram showing an example of the hardware configuration of the information processing apparatus.
A functional block diagram showing the functions realized by the information processing apparatus.
A diagram showing an example of voice data.
A diagram showing examples of some of the utterances.
A diagram showing an example of the contents of a text document.
A diagram showing an example of the important word list.
A diagram showing the contents of the keyword storage unit.
A diagram showing an example of a word lattice.
A diagram showing the contents of the word spotting result storage unit.
A flowchart showing processing executed by the information processing apparatus.
A conceptual diagram showing how target texts are associated with utterances.
A flowchart showing processing executed by the information processing apparatus.
A diagram showing an image displayed on the display unit.
A flowchart showing processing executed by the information processing apparatus.
A diagram showing an image displayed on the display unit.

Examples of embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a diagram showing the hardware configuration of an information processing apparatus 2 according to an embodiment of the present invention. The information processing apparatus 2 is implemented as a computer including a control unit 4, a main memory 6, a hard disk 8, a display unit 10, an operation input unit 12, and the like. In this embodiment, the information processing apparatus 2 is used by a manufacturer.

The control unit 4 is a microprocessor and executes various kinds of information processing in accordance with a program stored in the main memory 6. The main memory 6 is implemented by a ROM and a RAM and stores, in addition to the program, information needed for the various kinds of information processing. The program is read from a computer-readable information storage medium (for example, a DVD (registered trademark)-ROM) and stored in the main memory 6. The program may instead be downloaded over a network and stored in the main memory 6.

The hard disk 8 stores various kinds of information. The information stored on the hard disk 8 is described later. The display unit 10 is a display such as a liquid crystal display and displays information in accordance with instructions from the control unit 4.

The operation input unit 12 is a mouse, a keyboard, or the like, and passes to the control unit 4 a signal indicating the operation performed by the administrator of the information processing apparatus 2.

FIG. 2 is a functional block diagram showing the functions realized by the information processing apparatus 2. The information processing apparatus 2 realizes a call voice data storage unit 8a, a call summary storage unit 8b, and an important word storage unit 8c. These are realized by the hard disk 8.

The information processing apparatus 2 further realizes a keyword storage unit 6a, a word spotting result storage unit 6b, and a pair storage unit 6c. These are realized by the main memory 6.

The information processing apparatus 2 further realizes a keyword extraction unit 4a, a speech recognition unit 4b, a word spotting unit 4c, an association execution unit 4d, an important utterance specifying unit 4e, a first target text specifying unit 4f, a second target text specifying unit 4g, an important word presence/absence determination unit 4h, and a call summary display unit 4f. These are realized by the control unit 4 executing information processing in accordance with the above program when the administrator performs a call summary display operation.

[Call voice data storage unit]
The call voice data storage unit 8a stores call voice data 14, which is voice data representing the series of utterances made by an operator working in the manufacturer's call center during a call with a customer. The call voice data 14 is a recording of the voice that the operator spoke into the telephone handset during the call.

FIG. 3 is a diagram showing an example of the call voice data 14. The arrow indicates the passage of time from the start of the call. The call voice data 14 includes the voice portions for the series of utterances made by the operator. Portions in which the operator is not speaking are hatched. The call voice data 14 records the start timing and end timing of each utterance. In this embodiment, each utterance is also assigned a serial number counted from the beginning (hereinafter, the utterance number), and the call voice data 14 records the utterance number of each utterance. FIG. 4 shows examples of some of the utterances, with the utterance number shown to the left of each utterance.

[Call summary storage unit]
The call summary storage unit 8b stores call summary data. The call summary data is document data representing a text document that the operator wrote after the call with the customer ended, while recalling what was said. The text document records the contents of the call and contains the text of each of a plurality of sentences (hereinafter, text sentences). That is, the text document contains a plurality of text sentences for the operator's utterances (corresponding to the plurality of unit texts) and a plurality of text sentences for the customer's utterances. In this embodiment, the document data includes information indicating, for each text sentence, whether it relates to the operator's or the customer's utterances. FIG. 5 shows an example of the contents of a text document. In this embodiment, the character string on one line constitutes one text sentence. In FIG. 5, for convenience, a serial number is shown to the left of each text sentence, and the text sentences for the operator's utterances are shown in bold.

Hereinafter, a text sentence for the operator's utterances is referred to as a target text.

[Important word storage unit]
The important word storage unit 8c stores an important word list. The important word list is data indicating a plurality of important words registered in advance by the administrator. The important word list also indicates the reading of each important word. FIG. 6 is a diagram showing an example of the important word list. As shown in the figure, the important word list associates, for each important word, the character string of that important word with phonemes indicating its reading.

The character string and phonemes of each important word in the important word list are registered in advance in the word dictionary used for speech recognition.

The call voice data, the call summary data, and the important word list are read by the control unit 4 (voice acquisition means) when the call summary display operation described above is performed.

Next, the keyword extraction unit 4a, the speech recognition unit 4b, the word spotting unit 4c, the association execution unit 4d, and the important utterance specifying unit 4e are described.

[Keyword extraction unit]
The keyword extraction unit 4a performs morphological analysis processing on each target text and extracts the independent words included in each target text as keywords.

In this embodiment, the keyword extraction unit 4a first assigns a serial number (hereinafter, the sentence number) to each text sentence in the text document, starting from the first text sentence. Then, for each target text in the text document, the keyword extraction unit 4a performs morphological analysis processing to extract the independent words included in that target text as keywords, and stores the extracted keywords in the keyword storage unit 6a in association with the sentence number of that target text. A morphological analyzer such as MeCab, ChaSen, or JUMAN is used for the morphological analysis processing. FIG. 7 shows the contents of the keyword storage unit 6a. The number in parentheses is a sentence number, and the keywords extracted from the target text with that sentence number are shown to its right.
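The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the morphological analyzer has already turned each text sentence into `(word, part of speech)` tokens, and the part-of-speech labels, the `CONTENT_POS` set, and the function name `extract_keywords` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical part-of-speech labels treated as independent (content) words.
CONTENT_POS = {"noun", "verb", "adjective", "adverb"}

Token = Tuple[str, str]  # (word, part of speech), as an analyzer might return


def extract_keywords(
    sentences: List[Tuple[bool, List[Token]]]
) -> Dict[int, Set[str]]:
    """Map sentence number -> keywords, for operator sentences only."""
    keyword_store: Dict[int, Set[str]] = {}
    # Sentence numbers are assigned to every text sentence, starting from 1.
    for sentence_no, (is_operator, tokens) in enumerate(sentences, start=1):
        if not is_operator:
            continue  # customer sentences are not target texts
        keyword_store[sentence_no] = {
            word for word, pos in tokens if pos in CONTENT_POS
        }
    return keyword_store


summary = [
    (False, [("printer", "noun"), ("is", "auxiliary"), ("broken", "adjective")]),
    (True, [("check", "verb"), ("the", "determiner"), ("toner", "noun")]),
]
keywords = extract_keywords(summary)
print(keywords)
```

Note that the customer sentence still consumes sentence number 1, so the operator sentence is stored under sentence number 2, matching the numbering rule in the text.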

[Speech recognition unit]
The speech recognition unit 4b executes speech recognition processing on the call voice data. In this embodiment, the speech recognition unit 4b performs speech recognition on the call voice data in accordance with the speech recognition algorithm used by the open-source speech recognition engine "Julius". The speech recognition unit 4b thereby obtains, for each utterance, a so-called word lattice as the result of the speech recognition processing. The speech recognition unit 4b also stores the word lattice data obtained from each utterance in the main memory 6 in association with the utterance number of that utterance.

FIG. 8 is a diagram showing an example of a word lattice. As shown in the figure, a word lattice represents a graph containing individual words and links connecting words that can follow one another. Each word corresponds to a node. In the figure, rectangles represent nodes (words), and lines connecting the rectangles represent links.
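As a rough illustration of the graph just described, a word lattice can be held as a mapping from node to word plus a mapping from node to the nodes that may follow it. The class below is only a sketch under that assumption; the name `WordLattice`, its methods, and the example words are hypothetical and do not reflect the output format of any particular recognizer. It also assumes the lattice has no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class WordLattice:
    """Graph of recognition hypotheses for one utterance (assumed acyclic)."""

    words: Dict[int, str] = field(default_factory=dict)        # node id -> word
    links: Dict[int, List[int]] = field(default_factory=dict)  # node id -> next node ids

    def add_word(self, node_id: int, word: str) -> None:
        self.words[node_id] = word
        self.links.setdefault(node_id, [])

    def add_link(self, src: int, dst: int) -> None:
        # A link means the word at dst can directly follow the word at src.
        self.links[src].append(dst)

    def vocabulary(self) -> Set[str]:
        """Every word that appears on at least one node."""
        return set(self.words.values())

    def paths(self, start: int, end: int) -> List[List[str]]:
        """All word sequences obtained by following links from start to end."""
        if start == end:
            return [[self.words[start]]]
        found = []
        for nxt in self.links[start]:
            for tail in self.paths(nxt, end):
                found.append([self.words[start]] + tail)
        return found


lattice = WordLattice()
for node_id, word in enumerate(["the", "toner", "tuner", "empty"]):
    lattice.add_word(node_id, word)
for src, dst in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    lattice.add_link(src, dst)

print(lattice.paths(0, 3))
```

Here "toner" and "tuner" are competing hypotheses for the same stretch of audio, so both appear in the vocabulary even though only one was actually spoken.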

[Word spotting unit]
For each utterance, the word spotting unit 4c identifies the important words and keywords spoken in that utterance, based on the word lattice data associated with the utterance number of that utterance.

In this embodiment, the word spotting unit 4c first generates, based on the contents of the keyword storage unit 6a, data indicating the set of keywords that the keyword extraction unit 4a extracted from the target texts (hereinafter, the keyword set). Then, for each utterance, the word spotting unit 4c identifies, based on the important word list and the data indicating the keyword set, the important words and keywords contained in the word lattice associated with the utterance number of that utterance (that is, the important words and keywords spoken in that utterance), and stores the identified important words and keywords in the word spotting result storage unit 6b in association with the utterance number of that utterance.

FIG. 9 shows the contents of the word spotting result storage unit 6b. The number in parentheses is an utterance number, and the important words and keywords spoken in the utterance with that utterance number are shown to its right. Important words and keywords are stored separately. In FIG. 9, important words are double-underlined and keywords are single-underlined.
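The word spotting step can be sketched as two set intersections per utterance. This is an illustrative simplification: it assumes each word lattice has already been reduced to the set of words on its nodes, it matches words by their character strings only (the phonemes in the important word list are not modeled), and the function name `spot_words` and the dictionary layout are hypothetical.

```python
from typing import Dict, Set


def spot_words(
    lattice_words: Dict[int, Set[str]],
    important_words: Set[str],
    text_keywords: Dict[int, Set[str]],
) -> Dict[int, Dict[str, Set[str]]]:
    """Map utterance number -> spotted important words and keywords."""
    # Keyword set: union of the keywords extracted from every target text.
    keyword_set: Set[str] = set().union(*text_keywords.values())
    result: Dict[int, Dict[str, Set[str]]] = {}
    for utterance_no, words in lattice_words.items():
        # Important words and keywords are kept apart, as in the storage unit.
        result[utterance_no] = {
            "important": words & important_words,
            "keywords": words & keyword_set,
        }
    return result


lattice_words = {
    38: {"serial", "number", "please"},
    41: {"drum", "unit", "send"},
}
important_words = {"drum"}
text_keywords = {25: {"serial", "number"}, 26: {"send", "engineer"}}

spotted = spot_words(lattice_words, important_words, text_keywords)
print(spotted[41])
```

Utterance 41 contains the important word "drum" even though no target text mentions it, which is exactly the situation the later determination step is meant to catch.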

[Association execution unit]
The association execution unit 4d associates each target text with one of the utterances.

FIG. 10 is a flowchart showing the processing executed by the association execution unit 4d. First, the association execution unit 4d sorts the target texts included in the text document in ascending order of sentence number (S101). Hereinafter, the i-th target text is referred to as target text [i].

The association execution unit 4d also sets up an utterance set (S102). That is, in S102 the association execution unit 4d sets all of the utterances represented by the call voice data as the elements of the utterance set.

The steps from S103 onward are then executed for each target text in order, starting with the first target text [1].

That is, the association execution unit 4d sorts the utterances in the utterance set in ascending order of utterance number (S103). Hereinafter, the j-th utterance is referred to as utterance [j]. The association execution unit 4d then executes steps S104 and S105 for each utterance in order, starting with the first utterance [1].

That is, the association execution unit 4d counts the number of keywords shared by target text [i] and utterance [j] (hereinafter, the keyword count) (S104). More specifically, in S104 the association execution unit 4d counts, as the keyword count, the number of keywords contained in both the set of keywords stored in the keyword storage unit 6a in association with the sentence number of target text [i] and the set of keywords stored in the word spotting result storage unit 6b in association with the utterance number of utterance [j].

The association execution unit 4d then stores the keyword count obtained in S104 in the main memory 6 in association with the utterance number of utterance [j] (S105). In this way, a keyword count is obtained for every utterance in the utterance set. After that, the association execution unit 4d identifies the utterance number associated with the largest of the keyword counts stored in the main memory 6 (S106), and stores the pair of the identified utterance number and the sentence number of target text [i] in the pair storage unit 6c (S107). If all of the keyword counts stored in the main memory 6 are 0, S106 and S107 are skipped.

The association execution unit 4d also deletes from the utterance set every utterance whose utterance number is equal to or smaller than the utterance number identified in S106 (S108).
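Steps S101 to S108 amount to a greedy, order-preserving matching, which the sketch below tries to capture. Several points are assumptions rather than statements from the text: the function name `associate` and its input dictionaries are hypothetical, ties on the keyword count are resolved in favor of the earliest utterance, and nothing is deleted in S108 when S106 was skipped.

```python
from typing import Dict, List, Set, Tuple


def associate(
    text_keywords: Dict[int, Set[str]],
    utterance_keywords: Dict[int, Set[str]],
) -> List[Tuple[int, int]]:
    """Return (utterance number, sentence number) pairs."""
    pairs: List[Tuple[int, int]] = []
    # S102/S103: the utterance set, kept in ascending utterance-number order.
    candidates = sorted(utterance_keywords)
    # S101: process target texts in ascending sentence-number order.
    for sentence_no in sorted(text_keywords):
        # S104/S105: keyword count for every utterance still in the set.
        counts = {
            u: len(text_keywords[sentence_no] & utterance_keywords[u])
            for u in candidates
        }
        if not counts or max(counts.values()) == 0:
            continue  # S106 and S107 are skipped when every count is 0
        # S106: utterance with the largest count (assumption: earliest on ties).
        best = max(counts, key=counts.get)
        pairs.append((best, sentence_no))  # S107
        # S108: drop the matched utterance and everything before it.
        candidates = [u for u in candidates if u > best]
    return pairs


text_keywords = {
    24: {"toner", "replace"},
    25: {"serial", "number"},
    26: {"engineer", "visit"},
}
utterance_keywords = {
    34: {"toner", "replace"},
    36: {"toner"},
    38: {"serial", "number"},
    41: set(),
    44: {"engineer", "visit"},
}
pairs = associate(text_keywords, utterance_keywords)
print(pairs)
```

Because S108 removes every utterance up to the matched one, later target texts can only be matched to later utterances, so the pairs never cross.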

FIG. 11 is a conceptual diagram showing how the association execution unit 4d associates target texts with utterances. Each arrow points to the utterance with which a target text is associated. In the figure, the target text with sentence number "24" is associated with the utterance with utterance number "34", the target text with sentence number "25" is associated with the utterance with utterance number "38", and the target text with sentence number "26" is associated with the utterance with utterance number "44".

[Important utterance specifying unit]
The important utterance specifying unit 4e identifies, among the utterances represented by the call voice data, the important utterances, that is, the utterances that include the voice of any of the important words. In this embodiment, the important utterance specifying unit 4e identifies the important utterances based on the contents of the word spotting result storage unit 6b. More specifically, the important utterance specifying unit 4e identifies, among the utterance numbers stored in the word spotting result storage unit 6b, the one or more utterance numbers that are associated with any of the important words.

Next, the first target text specifying unit 4f, the second target text specifying unit 4g, the important word presence/absence determination unit 4h, and the call summary display unit 4f are described. Hereinafter, the utterance with any one of the utterance numbers identified by the important utterance specifying unit 4e is referred to as important utterance X.

[第１対象テキスト特定部]
第１対象テキスト特定部４ｆは、重要発話音声Ｘより発話番号が前の発話音声、に関連づけられた対象テキスト（以下、第１対象テキストと表記する）を特定する。具体的には、第１対象テキスト特定部４ｆは、重要発話音声Ｘの発話番号より小さい発話番号を含むペアのうちで、最大の発話番号を含むペアを特定し、特定したペアに含まれる文番号を、第１対象テキストの文番号として特定する。例えば、図１１に示す発話番号「４１」の発話音声が重要発話音声Ｘである場合、発話番号「３８」の発話音声に関連づけられた、文番号「２５」の対象テキストが第１対象テキストとして特定される。 [First target text identification part]
The first target text specifying unit 4f specifies the target text (hereinafter referred to as the first target text) associated with the utterance voice whose utterance number is before the important utterance voice X. Specifically, the first target text specifying unit 4f specifies a pair including the largest utterance number among pairs including an utterance number smaller than the utterance number of the important utterance voice X, and a sentence included in the specified pair. The number is specified as the sentence number of the first target text. For example, when the utterance voice of the utterance number “41” shown in FIG. 11 is the important utterance voice X, the target text of the sentence number “25” associated with the utterance voice of the utterance number “38” is the first target text. Identified.

[Second target text specifying unit]
The second target text specifying unit 4g specifies the target text associated with an utterance voice whose utterance number follows that of the important utterance voice X (hereinafter referred to as the second target text). Specifically, among the pairs whose utterance number is larger than the utterance number of the important utterance voice X, the second target text specifying unit 4g specifies the pair with the smallest utterance number, and specifies the sentence number included in that pair as the sentence number of the second target text. For example, when the utterance voice with utterance number “41” shown in FIG. 11 is the important utterance voice X, the target text with sentence number “26”, which is associated with the utterance voice with utterance number “44”, is specified as the second target text.
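The selection of the first and second target texts can be sketched as a nearest-neighbor lookup over the stored pairs. The following Python is only an illustration under stated assumptions: the function names and the list-of-tuples layout are hypothetical, and returning `None` when no earlier or later pair exists is an assumption, since the text does not describe that case. Only the two pairs shown are taken from the FIG. 11 example.

```python
from typing import List, Optional, Tuple

# Each pair is (utterance number, sentence number).
# These two pairs are the ones mentioned in the FIG. 11 example.
PAIRS: List[Tuple[int, int]] = [(38, 25), (44, 26)]


def first_target_sentence(pairs: List[Tuple[int, int]], x: int) -> Optional[int]:
    """Sentence number of the pair with the largest utterance number below x."""
    earlier = [p for p in pairs if p[0] < x]
    if not earlier:
        return None  # assumption: behavior when no earlier pair exists is not specified
    return max(earlier, key=lambda p: p[0])[1]


def second_target_sentence(pairs: List[Tuple[int, int]], x: int) -> Optional[int]:
    """Sentence number of the pair with the smallest utterance number above x."""
    later = [p for p in pairs if p[0] > x]
    if not later:
        return None  # assumption: behavior when no later pair exists is not specified
    return min(later, key=lambda p: p[0])[1]


if __name__ == "__main__":
    # Important utterance voice X has utterance number 41.
    print(first_target_sentence(PAIRS, 41), second_target_sentence(PAIRS, 41))
```

With utterance number 41 as the important utterance voice X, the sketch selects sentence numbers 25 and 26, matching the example in the text.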

[Important word presence/absence determination unit]
The important word presence/absence determination unit 4h determines whether a target text including the character string of the important word uttered in the important utterance voice X exists between the first target text and the second target text.

FIG. 12 is a flowchart showing the processing executed by the important word presence/absence determination unit 4h. First, the important word presence/absence determination unit 4h sets the value of a flag stored in the main memory 6 to “0” (S201). It then determines whether the sentence number of the first target text and the sentence number of the second target text are consecutive (S202). If they are consecutive (YES in S202), the important word presence/absence determination unit 4h ends the processing.

On the other hand, if the sentence number of the first target text and the sentence number of the second target text are not consecutive (NO in S202), the important word presence/absence determination unit 4h sorts the target texts whose sentence numbers lie between those two sentence numbers in ascending order of sentence number (S203). Hereinafter, the i-th target text is denoted as target text [i].

Then, starting from the first target text [1], the important word presence/absence determination unit 4h determines in order, based on the call summary data and the important word list, whether target text [i] includes the character string of the important word stored in the word spotting result storage unit 6b in association with the utterance number of the important utterance voice X (S204). If target text [i] includes that character string (YES in S204), the important word presence/absence determination unit 4h updates the value of the flag to “1” (S205) and ends the processing.

The flag value “0” indicates that no target text including the character string of the important word uttered in the important utterance voice X exists between the first target text and the second target text. The flag value “1” indicates that such a target text does exist between the first target text and the second target text.
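The flow of FIG. 12 (S201 to S205) can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name, the dictionary mapping sentence numbers to sentence text, and the sample sentences are all hypothetical, and a plain substring test is assumed for “includes the character string”.

```python
from typing import Dict


def important_word_flag(sentences: Dict[int, str],
                        first_no: int,
                        second_no: int,
                        important_word: str) -> int:
    """Return 1 if a sentence strictly between first_no and second_no
    contains important_word, otherwise 0."""
    flag = 0                                    # S201: initialize the flag
    if second_no - first_no == 1:               # S202: consecutive sentence numbers
        return flag                             # nothing lies in between
    between = sorted(n for n in sentences
                     if first_no < n < second_no)   # S203: ascending sentence numbers
    for n in between:                           # S204: check each target text [i]
        if important_word in sentences[n]:
            flag = 1                            # S205: important word found
            break
    return flag


if __name__ == "__main__":
    # Hypothetical summary sentences, keyed by sentence number.
    summary = {
        25: "Customer reported a paper jam.",
        26: "Recommended the Fujisan x430.",
        27: "Customer agreed to a callback.",
    }
    print(important_word_flag(summary, 25, 27, "Fujisan x430"))
```

Note that when the two sentence numbers are consecutive the flag stays at “0”, so that case is handled the same way as a gap in which the important word is missing.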

[Call summary display unit]
The call summary display unit 4i displays an image 16 of the text document represented by the call summary data on the display unit 10. However, when the value of the flag is “0”, that is, when no target text including the character string of the important word uttered in the important utterance voice X exists between the first target text and the second target text, the call summary display unit 4i outputs information to that effect. Specifically, when the value of the flag is “0”, the call summary display unit 4i displays, as shown in FIG. 13, an image 18 (a figure, icon, window, or the like) containing information (here, a character string) indicating the important word associated with the utterance number of the important utterance voice X on the display unit 10 together with the image 16. Here, the character string “Fujisan x430” is the character string of the important word.

As described above, the information processing apparatus 2 determines whether an important word uttered by the operator during a call is recorded in the appropriate place in the text document. In addition, the administrator is notified when an important word uttered by the operator during the call is not recorded in the appropriate place in the text document.

Note that embodiments of the present invention are not limited to the embodiment described above.

[Modification]
For example, the important word presence/absence determination unit 4h may also determine whether a target text including an alternative character string that has a predetermined relationship with the character string of the important word exists between the first target text and the second target text. Here, the alternative character string is, for example, the character string of a broader or narrower term of the important word, the character string of a synonym of the important word, or a character string obtained by changing the case (uppercase or lowercase) of some of the characters in the character string of the important word. The modification is described below taking as an example the case where the alternative character string is “a character string obtained by changing the case of some of the characters in the character string of the important word”.

FIG. 14 is a flowchart showing the processing executed by the important word presence/absence determination unit 4h in the modification. As shown in the figure, in the modification, steps S206 and S207 are added to the processing shown in FIG. 12. That is, when target text [i] does not include the character string of the important word stored in association with the utterance number of the important utterance voice X (NO in S204), the important word presence/absence determination unit 4h further determines whether target text [i] includes the alternative character string (S206). If target text [i] includes the alternative character string (YES in S206), the important word presence/absence determination unit 4h saves the sentence number of target text [i] in the main memory 6 as sentence number X, updates the value of the flag to “2” (S207), and ends the processing.

The flag value “2” indicates that a target text including the alternative character string exists between the first target text and the second target text.
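The modified flow of FIG. 14 can be sketched by extending the loop with the alternative-string check of S206 and S207. The sketch below covers only the case-variant kind of alternative character string used as the example in the text, and it approximates it with a case-insensitive regular expression match (which ignores the case of every character, not just some); that choice, the function name, and the sample sentence are assumptions for illustration.

```python
import re
from typing import Dict, Optional, Tuple


def judge_with_alternative(sentences: Dict[int, str],
                           first_no: int,
                           second_no: int,
                           important_word: str) -> Tuple[int, Optional[int]]:
    """Return (flag, sentence number X). Flag is 1 for an exact match,
    2 for a case-variant match, 0 otherwise. Sentence number X is only
    set when the flag is 2."""
    flag, sentence_x = 0, None                          # S201
    if second_no - first_no == 1:                       # S202
        return flag, sentence_x
    # Assumption: a case-insensitive match stands in for the
    # "alternative character string" (case-changed variant).
    pattern = re.compile(re.escape(important_word), re.IGNORECASE)
    for n in sorted(k for k in sentences if first_no < k < second_no):  # S203
        text = sentences[n]
        if important_word in text:                      # S204 YES
            return 1, None                              # S205
        if pattern.search(text):                        # S206 YES
            return 2, n                                 # S207: remember sentence number X
    return flag, sentence_x


if __name__ == "__main__":
    summary = {26: "Recommended the FUJISAN X430."}     # hypothetical sentence
    print(judge_with_alternative(summary, 25, 27, "Fujisan x430"))
```

Because the exact-match test runs first, any sentence that reaches the case-insensitive test and matches must differ from the important word only in case.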

Further, in the modification, when the value of the flag is “2”, the call summary display unit 4i displays, as shown in FIG. 15, a correction guide image 20 (a figure, icon, window, or the like) containing information (here, character strings) indicating the important word associated with the utterance number of the important utterance voice X and its alternative character string on the display unit 10 together with the image 16. The character string “Fujisan x430” is the character string of the important word, and the character string “FUJISAN X430” is the alternative character string. As shown in FIG. 15, the correction guide image 20 includes a correction instruction image 22 and a cancel instruction image 24.

Further, in the modification, when an operation for selecting the correction instruction image 22 is performed, the call summary display unit 4i updates the call summary data as follows. That is, the call summary display unit 4i corrects the alternative character string included in the target text with sentence number X to the character string of the important word associated with the utterance number of the important utterance voice X.
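The correction applied to the target text with sentence number X can be sketched as a case-insensitive replacement. As before, this assumes the alternative character string is a case variant of the important word; the function name and the sample sentence are hypothetical.

```python
import re


def correct_sentence(text: str, important_word: str) -> str:
    """Replace every case variant of important_word in text with
    important_word itself."""
    pattern = re.compile(re.escape(important_word), re.IGNORECASE)
    # A function is used as the replacement so that backslashes in
    # important_word are not interpreted as escape sequences.
    return pattern.sub(lambda _match: important_word, text)


if __name__ == "__main__":
    print(correct_sentence("Recommended the FUJISAN X430.", "Fujisan x430"))
```

Other kinds of alternative character string mentioned in the text, such as synonyms or broader and narrower terms, would need a lookup table rather than a case-insensitive match.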

2: information processing apparatus, 4: control unit, 4a: keyword extraction unit, 4b: speech recognition unit, 4c: word spotting unit, 4d: association execution unit, 4e: important utterance voice specifying unit, 4f: first target text specifying unit, 4g: second target text specifying unit, 4h: important word presence/absence determination unit, 4i: call summary display unit, 6: main memory, 6a: keyword storage unit, 6b: word spotting result storage unit, 6c: pair storage unit, 8: hard disk, 8a: call voice data storage unit, 8b: call summary storage unit, 8c: important word storage unit, 10: display unit, 12: operation input unit, 14: call voice data, 16, 18: images, 20: correction guide image, 22: correction instruction image, 24: cancel instruction image.

Claims

An information processing apparatus comprising:
association means for associating each of a plurality of unit texts included in a text document in which the content of a call is recorded with one of a plurality of utterance voices uttered in the call and represented by voice data, based on a result of morphological analysis processing on each unit text and a result of speech recognition processing on each utterance voice; and
determination means for determining whether a unit text including the character string of a predetermined important word exists between a first text, which is the unit text associated with an utterance voice preceding an important utterance voice, the important utterance voice being the utterance voice, among the plurality of utterance voices, that includes the voice of the important word, and a second text, which is the unit text associated with an utterance voice following the important utterance voice.

The information processing apparatus according to claim 1, further comprising means for outputting, when it is determined that no unit text including the character string of the important word exists between the first text and the second text, information to that effect.

The information processing apparatus according to claim 2, further comprising display control means for displaying the text document on display means,
wherein, when no unit text including the character string of the important word exists between the first text and the second text, the display control means displays information indicating the important word on the display means together with the text document.

The information processing apparatus according to claim 3, further comprising means for determining whether a unit text including an alternative character string, which is a character string having a predetermined relationship with the character string of the important word, exists between the first text and the second text,
wherein, when a unit text including the alternative character string exists between the first text and the second text, the display control means displays information indicating the important word and the alternative character string on the display means together with the text document.

The information processing apparatus according to claim 4, further comprising means for updating the text document, when a unit text including the alternative character string exists between the first text and the second text, so that the alternative character string included in that unit text is corrected to the character string of the important word.

The information processing apparatus according to any one of claims 1 to 5, further comprising:
independent word extraction means for extracting, by morphological analysis processing, the independent words included in each unit text; and
independent word specifying means for specifying, by speech recognition processing, which of the independent words extracted by the independent word extraction means are uttered in each utterance voice,
wherein the association means associates each unit text with an utterance voice in which the same independent word as one included in that unit text is uttered.

The information processing apparatus according to claim 1, further comprising a voice acquisition unit that acquires the voice data indicating the plurality of uttered voices.

A program for causing a computer to function as:
association means for associating each of a plurality of unit texts included in a text document in which the content of a call is recorded with one of a plurality of utterance voices uttered in the call and represented by voice data, based on a result of morphological analysis processing on each unit text and a result of speech recognition processing on each utterance voice; and
determination means for determining whether a unit text including the character string of a predetermined important word exists between a first text, which is the unit text associated with an utterance voice preceding an important utterance voice, the important utterance voice being the utterance voice, among the plurality of utterance voices, that includes the voice of the important word, and a second text, which is the unit text associated with an utterance voice following the important utterance voice.