JP4272658B2

JP4272658B2 - Program for functioning a computer as an operator support system

Info

Publication number: JP4272658B2
Application number: JP2006036852A
Authority: JP
Inventors: 洋明岩山
Original assignee: Mitsubishi Electric Information Systems Corp
Current assignee: Mitsubishi Electric Information Systems Corp
Priority date: 2006-02-14
Filing date: 2006-02-14
Publication date: 2009-06-03
Anticipated expiration: 2025-03-30
Also published as: JP2006285212A

Description

The present invention relates to a program for causing a computer to function as an operator work support system.

In call center operations, shortening the handling time per call is a major challenge for reducing operating costs. For standardized work such as order taking (for example, in mail-order sales), various support tools have been developed and deployed, such as systems that pop up a response manual and customer information on the screen the moment a call arrives. In contrast, at technical support call centers, which handle inquiries about operating procedures and troubleshooting for home appliances, PCs, and the like, the content of responses varies widely and a single call can last several tens of minutes. As a result, response scenarios are hard to standardize, and much depends on the skill of the operator.

In this kind of work, after a call ends the operator normally writes up the user's inquiry, the answer given, and so on as a work log, one record per call. However, when explaining an operation to the user, the operator is often performing the same operation and checking the display while talking, so it is difficult to record the conversation point by point during the call. The work log therefore has to be written while recalling the conversation, which imposes a heavy burden and takes considerable time. For this reason, systems that convert the voice data of a call into character strings and record them have been developed. An example of such an operator work support system is disclosed in Patent Document 1.

Patent Document 1: JP 11-338494 A

However, conventional operator work support systems merely display the text obtained by speech recognition as is, and therefore do not improve the operator's work efficiency. For example, they provide neither a display function that helps the operator grasp the text quickly nor a function for obtaining information that the operator wants to consult in connection with the text.

The present invention has been made to solve these problems, and its object is to provide a program for causing a computer to function as an operator work support system that improves the operator's work efficiency.

To solve the above problems, a program according to the present invention causes a computer to function as an operator work support system that records calls made by telephone and supports the operator work associated with those calls. The program causes the computer to function as: a speech recognition unit that converts the voice data of a call into text data; a first dictionary that includes a speech recognition dictionary and keywords designated in advance in units of the words used for speech recognition; a second dictionary that includes keywords designated in advance; a response log processing unit that processes the text data; an output unit that displays the processed text data; and an input unit that accepts operations by the operator. The keywords in the second dictionary can be changed. When converting the voice data into text data, the speech recognition unit marks the keywords included in the first dictionary. The response log processing unit then additionally marks, in the marked text data, the keywords included in the second dictionary. The output unit highlights the marked keywords. The first dictionary and the second dictionary each further include explanatory text for their keywords. When the input unit accepts an operation designating a keyword, the output unit obtains the explanatory text for that keyword from either the first dictionary or the second dictionary and displays it. The output unit also lists the keywords, in the order in which they appear in the call, in an area separate from the text data display, and when the input unit accepts an operation designating a listed keyword, the output unit displays the text data containing the designated keyword in color.

The output unit may highlight a marked keyword by enclosing it in black lenticular brackets (【 】).
The output unit may further list the keywords, in the order in which they appear in the call, in an area separate from the text data display.
When the input unit accepts an operation designating a keyword, the output unit may further play back the voice data corresponding to the text data containing that keyword.
The program may further cause the computer to function as a recognition result storage unit that stores the text data in a storage unit of the computer. In this case the output unit further displays the text data in a text editing area, and when the input unit accepts an operation correcting the text data, the recognition result storage unit stores the corrected text data in the storage unit of the computer.
When the input unit accepts an operation designating a keyword, the output unit may further list those stored calls that contain the designated keyword.
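
The keyword list, the colored display of matching text, and the explanation lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes a marked keyword is wrapped in 【 】, that a repeated keyword is listed once at its first appearance, and that explanatory texts are held in plain dictionaries. All function names are hypothetical.

```python
import re

# Assumption: a marked keyword appears in the text as 【keyword】.
MARKED = re.compile(r"【([^】]+)】")


def keywords_in_order(utterances):
    """List marked keywords in the order they first appear in the call."""
    seen, ordered = set(), []
    for utterance in utterances:
        for keyword in MARKED.findall(utterance):
            if keyword not in seen:
                seen.add(keyword)
                ordered.append(keyword)
    return ordered


def utterances_containing(utterances, keyword):
    """Return the indexes of the utterances to color for the chosen keyword."""
    return [i for i, u in enumerate(utterances) if f"【{keyword}】" in u]


def explanation_for(keyword, first_dictionary, second_dictionary):
    """Look up explanatory text in the first dictionary, then in the second."""
    return first_dictionary.get(keyword) or second_dictionary.get(keyword)


# Hypothetical marked recognition results for one call.
call = [
    "my 【router】 keeps dropping the connection",
    "please check the 【firmware】 version",
    "after that, restart the 【router】",
]
```

With this sample, selecting "router" in the keyword list would color the first and third utterances, while "firmware" would color only the second.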

According to the program of the present invention for causing a computer to function as an operator work support system, keywords designated in advance are highlighted, so work efficiency can be improved.

Embodiments of the present invention are described below with reference to the accompanying drawings.
Embodiment 1.
FIG. 1 shows a configuration that includes an operator work support system 100, which comprises a speech recognition engine and an operator work support tool according to Embodiment 1 of the present invention.
A private branch exchange (PBX) 4 for the call center is connected to a public switched telephone network (PSTN) 2. The private branch exchange 4 is, for example, a Definity from Avaya.

A telephone 12, which is the operator's telephone for receiving calls addressed to an operator 20, is connected to the private branch exchange 4. The telephone 12 is in turn connected, via a plug adapter 19 and a switch box 14 serving as a switching unit, to a PC 16, which is the operator's PC used by the operator 20 for work, and to a headset 18.
The telephone 12 has a speaker output terminal 12a, which receives signals from the private branch exchange 4 and sends them to the switch box 14 via the plug adapter 19, and a microphone output terminal 12b, which receives signals from the switch box 14 and sends them to the private branch exchange 4. For example, a Callmaster IV from Avaya is used. Depending on the type of telephone, the plug adapter function may be built in, or an ordinary modular jack may be used.

The headset 18 has a speaker 18a, which converts signals from the switch box 14 into sound for the operator 20, and a microphone 18b, which converts the speech of the operator 20 into signals and sends them to the switch box 14. For example, the TuffSet series or the Passport series from VXI is used.

The PC 16 is a computer of well-known configuration having a processing unit and a storage device. It has a display and a speaker output terminal as output devices, and a keyboard, a mouse, and a line input terminal as input devices. For example, it has a Pentium (registered trademark) 4 processor running at a clock frequency of 3 GHz as the processing unit, and 1 GB of memory and a 160 GB hard disk drive as the storage device. The PC 16 may also have other well-known input/output devices, such as a network interface.
Through the line input terminal, the PC 16 can simultaneously accept two audio inputs, that is, signals carrying independent audio data such as the L and R channels of stereo audio, and store them in the storage device as a single file. Within this file the L channel and the R channel are recorded independently, in a format that allows each to be played back independently through the speaker output terminal later.

FIG. 3 outlines the logical configuration of the PC 16. The PC 16 includes a speech recognition engine 30 that accepts input audio and performs speech recognition.
The storage device of the PC 16 includes a business dictionary storage unit 40, which stores the speech recognition dictionaries needed for speech recognition processing and provides them according to the business concerned.
The business dictionary storage unit 40 holds, for example, multiple speech recognition dictionaries, one for each business handled by operators, in which language models suited to the call content are registered. Words needed in each business depending on the situation, such as company names and product names, are additionally registered, as are highlighting keywords and unnecessary-word keywords; explanatory text for the highlighting keywords may also be registered. These words, highlighting keywords, unnecessary-word keywords, and explanatory texts can easily be registered, changed, and deleted using a system or program that is not part of the operator work support tool 39.

The storage device of the PC 16 also has a keyword dictionary storage unit 45. The keyword dictionary storage unit 45 holds the highlighting keywords, unnecessary-word keywords, and immediate-reference keywords that are extracted after speech recognition processing. Explanatory text for the highlighting keywords may also be registered. These highlighting keywords, unnecessary-word keywords, immediate-reference keywords, and explanatory texts can easily be registered, changed, and deleted using a system or program that is not part of the operator work support tool 39.

In the speech recognition dictionary described above, keywords are registered in units of the words used for speech recognition, and matching keywords are extracted during speech recognition. In the keyword dictionary, by contrast, keywords are not limited to single words; keywords consisting of longer text can also be registered, and the dictionary is used to extract further keywords from the speech recognition result.
A highlighting keyword is used to highlight the corresponding keyword in the speech recognition result.
An unnecessary-word keyword is used to change the display format of that keyword or to remove the unnecessary word from the display.
An immediate-reference keyword is used to play back the audio that follows the keyword.
Explanatory text is used to explain a highlighting keyword.
The storage device of the PC 16 also includes an acoustic dictionary (not shown). The acoustic dictionary comprises multiple acoustic dictionaries, for example male, female, and shared male/female, and is used for phoneme-level recognition.
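
The two-stage keyword extraction described above, word-level keywords first and keyword-dictionary entries second, can be sketched as follows. This is an illustration under stated assumptions: the first stage is simulated on already recognized text (in the described system it happens inside the speech recognizer), marking is represented by 【 】, unnecessary words are simply removed, and the function names and sample dictionaries are hypothetical.

```python
import re

OPEN, CLOSE = "【", "】"  # assumed representation of a marked keyword


def mark_keywords(text, keywords):
    """Wrap each keyword occurrence in brackets, skipping spans already marked."""
    if not keywords:
        return text
    # Longest first, so a longer keyword wins over a shorter one it contains.
    longest_first = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in longest_first)
    # Group 1 matches an already marked span and is copied through unchanged.
    pattern = re.compile(f"({OPEN}[^{CLOSE}]*{CLOSE})|({alternatives})")
    return pattern.sub(
        lambda m: m.group(1) or f"{OPEN}{m.group(2)}{CLOSE}", text
    )


def drop_unnecessary_words(text, unnecessary_words):
    """Remove whole-word occurrences of unnecessary words and tidy the spacing."""
    for word in unnecessary_words:
        text = re.sub(rf"\b{re.escape(word)}\b", "", text)
    return " ".join(text.split())


recognized = "um please update the firmware and then do a factory reset of the router"
stage1 = mark_keywords(recognized, {"firmware", "router"})  # word-level keywords
stage2 = mark_keywords(stage1, {"factory reset"})           # keyword dictionary
shown = drop_unnecessary_words(stage2, {"um"})
```

Because already marked spans are passed through untouched, running the second stage never double-marks a keyword found in the first stage.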

The speech recognition engine 30 includes: an echo cancellation unit 31, which reduces the voice of the operator 20 that leaks into the output from the telephone 12 so as to make the user's voice clear; a recording unit 32, which records the output of the echo cancellation unit 31; an operator speech recognition unit 33, which performs speech recognition using as inputs the operator voice in the output of the echo cancellation unit 31, the business dictionary storage unit 40, and the acoustic dictionary (not shown); and a user speech recognition unit 34, which performs speech recognition using as inputs the user voice in the output of the echo cancellation unit 31, the business dictionary storage unit 40, and the acoustic dictionary (not shown).
The speech recognition engine 30 may be configured without the echo cancellation unit 31. For example, the echo cancellation unit 31 is unnecessary when the engine has only the operator speech recognition unit 33 and no user speech recognition unit 34, or when the user voice and the operator voice in the audio output from the telephone 12 or the switch box 14 are already separated well enough for speech recognition.

The storage device of the PC 16 includes a voice storage unit 41, an operator speech recognition result storage unit 42, and a user speech recognition result storage unit 43, which store the outputs of the recording unit 32, the operator speech recognition unit 33, and the user speech recognition unit 34, respectively.
The operator work support tool 39 includes a response log processing unit 35, a call management unit 38, an output unit 36, and an input unit 37.
The output unit 36 performs output to the operator 20, for example display on the display and audio playback to the speaker 18a via the switch box 14.
The input unit 37 accepts input from the operator 20, for example via the keyboard or the mouse.

The response log processing unit 35 displays, through the output unit 36, the response log screen described later with reference to FIGS. 8 to 10, and accepts user instructions through the input unit 37. For a single response log, either of the call in progress or of a past call, it refers to the voice storage unit 41; refers to and updates the operator speech recognition result storage unit 42, the user speech recognition result storage unit 43, and the call log management information storage unit 44; and refers to the keyword dictionary storage unit 45.
The call management unit 38 displays, through the output unit 36, the setting screen described later with reference to FIG. 11, accepts user instructions through the input unit 37, and manages the operator information stored in the call log management information storage unit 44. In the same way, it displays the dictionary setting screen described later with reference to FIG. 12 and manages the relationship between response logs and speech recognition dictionaries stored in the call log management information storage unit 44. It also displays the response log list screen described later with reference to FIG. 13 and manages the attribute information of each response log stored in the call log management information storage unit 44.

The storage device of the PC 16 also includes the call log management information storage unit 44, which stores the information that the call management unit 38 uses to manage call logs.
The operator and user voices that arrive in real time through the echo cancellation unit 31 are stored in the voice storage unit 41 by the recording unit 32 and, in parallel, sent to the operator speech recognition unit 33 and the user speech recognition unit 34. These units recognize the incoming audio immediately and send the results in real time to the response log processing unit 35 and, in parallel, to the operator speech recognition result storage unit 42 and the user speech recognition result storage unit 43. This allows the response log processing unit 35 to refer to the speech recognition results in real time during the call. In FIG. 3 the recognition results are sent directly from the operator speech recognition unit 33 and the user speech recognition unit 34 to the operator speech recognition result storage unit 42 and the user speech recognition result storage unit 43, but they may instead be sent to those storage units by way of the response log processing unit 35.

The audio input to the echo cancellation unit 31 is assigned so that the operator voice is on the R channel and the user voice is on the L channel. Both are sent from the echo cancellation unit 31 to the recording unit 32 and recorded in the voice storage unit 41 as a single file. In parallel, the operator voice is sent to the operator speech recognition unit 33, where it is recognized and a start time and an end time are attached to each sentence. Likewise in parallel, the user voice is sent to the user speech recognition unit 34, where it is recognized and a start time and an end time are attached to each sentence. With this configuration, time consistency is maintained among the operator and user voices stored in the voice storage unit 41, the start and end times in the recognition results stored in the operator speech recognition result storage unit 42, and the start and end times in the recognition results stored in the user speech recognition result storage unit 43.
In this way, the recording unit 32, the operator speech recognition unit 33, and the user speech recognition unit 34 receive the call in parallel.
The configuration may include only one of the operator speech recognition unit 33 and the user speech recognition unit 34. In general, the operator voice is clear because it comes from a microphone and is spoken by someone trained to be easily recognized, so its recognition rate is high; the user voice is unclear because it is telephone audio and varies considerably from speaker to speaker, so its recognition rate is low. In such cases the user speech recognition unit 34 may be omitted, leaving only the operator speech recognition unit 33.

The configuration of the PC 16 described above need not reside on a single PC; it may be distributed across multiple computers. Nor does it have to be implemented on the individual PCs at each operator's seat; the voices of all operators may instead be processed centrally on a server.

FIG. 4 shows the data formats stored in the voice storage unit 41, the operator speech recognition result storage unit 42, and the user speech recognition result storage unit 43.
The voice storage unit 41 stores audio files in WAVE format, whose file names end with, for example, the extension .wav. Each audio file contains the voice data of a call between the operator 20 and a user, and one file is created per call. Within a single file, the user voice is stored as L-channel data and the operator voice as R-channel data, in a format that allows each to be extracted independently.
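
Extracting the two voices independently from one stereo file can be sketched with Python's standard `wave` module. This is only an illustration: it assumes 16-bit PCM audio, and the function name `split_stereo` is hypothetical. The channel assignment (L = user, R = operator) follows the description above.

```python
import array
import sys
import wave


def split_stereo(path):
    """Return (user_samples, operator_samples) from a 16-bit stereo WAVE file.

    Assumes the user voice is on the L channel and the operator voice on the
    R channel, as in the description above.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAVE sample data is little-endian
    # Stereo frames are interleaved as L, R, L, R, ...
    return samples[0::2], samples[1::2]
```

Each returned array can then be passed to a separate recognizer or played back on its own, which is the property the single-file format is meant to preserve.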

The operator speech recognition result storage unit 42 stores files in text format (text files). Each text file contains the text data corresponding to the operator voice in a call, for example the text data of the speech recognition result, and one file is created per call. Each text file corresponds one-to-one to an audio file stored in the voice storage unit 41; for example, the operator voice data in the file "call 1.wav" in the voice storage unit 41 corresponds to the text in the file "call log 1 operator.dct" in the operator speech recognition result storage unit 42.
Like the operator speech recognition result storage unit 42, the user speech recognition result storage unit 43 stores the text data corresponding to the user voice in a call. Similarly, the user voice data in the file "call 1.wav" in the voice storage unit 41 corresponds, for example, to the text in the file "call log 1 user.dct" in the user speech recognition result storage unit 43.

FIG. 5 shows the data format of the call log management information stored in the call log management information storage unit 44.
The call log management information is stored as a table in which each row holds the data for one call. For the call concerned, each row contains the start and end times, the name of the operator 20 who handled it, the call category designated by the operator 20, the name of the dictionary used for speech recognition, the name of the audio file in which the voice data is stored, and the names of the files in which the speech recognition results for the operator voice and the user voice are stored.

FIG. 6 shows the data format of the operator information stored in the call log management information storage unit 44. Each row contains the operator name, which identifies the operator, and the operator's gender, and may contain other information such as a registration date and remarks. This information may be stored as CSV text or in a database.
FIG. 7 shows the data format of the call log data stored in the operator speech recognition result storage unit 42. Each row contains a start time, an end time, and the recognized text, and may contain other information. The start and end times are relative to the beginning of the voice data stored in the voice storage unit 41. They need not actually be times; any values will do as long as the start and end times shown on the screen can be calculated from them and they can serve as pointers into the voice data. This information may be stored as CSV text or in a database.
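
Because every row carries a start and end time relative to the beginning of the recording, the operator and user logs can be merged into one chronological transcript, and each row can be mapped back to a position in the audio. The sketch below assumes CSV rows of the form `start,end,text` with times in seconds; the column order, the time unit, and all names are assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Utterance:
    start: float  # seconds from the beginning of the recording (assumed unit)
    end: float
    speaker: str
    text: str


def parse_log(csv_text, speaker):
    """Parse rows of 'start,end,text' into Utterance records."""
    utterances = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        start, end, text = row[:3]
        utterances.append(Utterance(float(start), float(end), speaker, text))
    return utterances


def merge_logs(operator_log, user_log):
    """Interleave both logs in order of start time."""
    return sorted(operator_log + user_log, key=lambda u: u.start)


def frame_range(utterance, sample_rate):
    """Convert an utterance's times into audio frame indexes for playback."""
    return (round(utterance.start * sample_rate),
            round(utterance.end * sample_rate))


operator_csv = "0.0,2.5,Thank you for calling\n6.0,9.0,Please restart the router\n"
user_csv = "3.0,5.5,My connection dropped\n"
merged = merge_logs(parse_log(operator_csv, "operator"), parse_log(user_csv, "user"))
```

The frame range is what makes a row usable as a pointer into the voice data: seeking to the first index and reading up to the second yields the audio for that sentence.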

The connection between the switch box 14 and the PC 16 in FIG. 1 may be either an audio connection or a USB connection. With USB, digitization is performed inside the switch box, so the PC 16 does not need a sound card.

The operator work support system 100 comprises the switch box 14, the PC 16, and the headset 18. It lets the operator 20 switch between using the headset 18, the telephone 12, and the PC 16 to conduct a call via the telephone 12 and using them to transcribe a call stored on the PC 16. The switching is done by operating a switch 14a on the switch box 14. When the switch 14a is set to the "call" side, the headset 18, the speaker output terminal 12a of the telephone 12, the microphone input terminal 12b of the telephone 12, and the PC 16 are connected so as to conduct a call via the telephone 12. In this state the operator 20 can talk through the telephone 12 with the outside party, that is, the user, on a call assigned by the private branch exchange 4. When the switch 14a is set to the "transcription" side, the connections are configured for transcribing a call stored on the PC 16.
In this specification, the state in which the switch 14a is set to the call side is hereinafter referred to as "during a call", and the state in which it is set to the transcription side as "during transcription".

図２にスイッチボックス１４の構成を示す。
スイッチボックス１４は、ヘッドセット１８、電話機１２、およびＰＣ１６の接続を切り替えるための連動リレー１４ｂを備える。この連動リレー１４ｂは、ヘッドセットのスピーカ１８ａおよびマイク１８ｂを、スイッチ１４ａの状態に応じて、通話側および聞き起こし側のどちらかに切り替える。スピーカ１８ａおよびマイク１８ｂの切り替えは連動しており、通話時すなわちスピーカ１８ａが通話側に接続されているときは、マイク１８ｂも通話側に接続され、聞き起こし時すなわちスピーカ１８ａが聞き起こし側に接続されているときは、マイク１８ｂも聞き起こし側に接続される。
さらに、スイッチボックス１４は、その回路の各所において音声信号を増幅するための複数の増幅器１４ｃを備える。 FIG. 2 shows the configuration of the switch box 14.
The switch box 14 includes an interlocking relay 14b for switching the connections among the headset 18, the telephone 12, and the PC 16. This interlocking relay 14b switches the speaker 18a and the microphone 18b of the headset to either the call side or the listening side according to the state of the switch 14a. The speaker 18a and the microphone 18b are switched together: during a call, that is, when the speaker 18a is connected to the call side, the microphone 18b is also connected to the call side, and during listening, that is, when the speaker 18a is connected to the listening side, the microphone 18b is also connected to the listening side.
The switch box 14 further includes a plurality of amplifiers 14c for amplifying the audio signal at various points in the circuit.

図１のスイッチボックス１４に電話機１２のスピーカから入力される信号は、増幅器１４ｃで増幅された後、ＰＣ１６の音声入力のＬチャネルに出力される。また、この信号は、通話時にはヘッドセット１８のスピーカ１８ａに出力されるが、聞き起こし時にはスピーカ１８ａには出力されない。
スイッチボックス１４にＰＣ１６の音声出力のＬチャネルおよびＲチャネルからそれぞれ入力される信号は、まずそれぞれ増幅器１４ｃで増幅され、合成された後、さらに別の増幅器１４ｃによって増幅される。最終的に増幅された信号は、聞き起こし時にはヘッドセット１８のスピーカ１８ａに出力されるが、通話時にはスピーカ１８ａには出力されない。
スイッチボックス１４にヘッドセット１８のマイク１８ｂから入力される信号は、増幅器１４ｃによって増幅された後、ＰＣ１６の音声入力のＲチャネルに出力される。また、この信号は、通話時には電話機１２のマイクに出力されるが、聞き起こし時には電話機１２のマイクには出力されない。 A signal input to the switch box 14 of FIG. 1 from the speaker of the telephone 12 is amplified by the amplifier 14 c and then output to the L channel of the voice input of the PC 16. In addition, this signal is output to the speaker 18a of the headset 18 during a call, but is not output to the speaker 18a during listening.
The signals input to the switch box 14 from the L channel and the R channel of the audio output of the PC 16 are first amplified by separate amplifiers 14c, mixed, and then amplified again by another amplifier 14c. The resulting signal is output to the speaker 18a of the headset 18 during listening, but not during a call.
The signal input to the switch box 14 from the microphone 18b of the headset 18 is amplified by an amplifier 14c and then output to the R channel of the audio input of the PC 16. This signal is also output to the microphone of the telephone 12 during a call, but not during listening.
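As a non-limiting illustration, the signal routing described for the switch box 14 can be summarized as a small lookup. The Python sketch below is an assumption-laden model, not the circuit itself: the names `Mode` and `route` and the endpoint labels such as `pc_in_L` are hypothetical, and the amplifiers 14c are omitted.

```python
from enum import Enum


class Mode(Enum):
    CALL = "call"      # switch 14a on the call side
    LISTEN = "listen"  # switch 14a on the listening side


def route(mode):
    """Return, for each signal source, the list of destinations it reaches."""
    routes = {
        # The telephone speaker output always feeds the L input channel of the PC.
        "phone_speaker": ["pc_in_L"],
        # The headset microphone always feeds the R input channel of the PC.
        "headset_mic": ["pc_in_R"],
        # The mixed L/R output of the PC reaches the headset only during listening.
        "pc_out_mix": [],
    }
    if mode is Mode.CALL:
        routes["phone_speaker"].append("headset_speaker")
        routes["headset_mic"].append("phone_mic")
    else:
        routes["pc_out_mix"].append("headset_speaker")
    return routes
```

In this model both PC input channels receive a signal in either mode; only the headset speaker and the telephone microphone connections change with the switch.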

次に、図８〜１０を用いて、ＰＣ１６がオペレータ業務支援ツール３９の応対ログ処理部３５を実行することによって表示される画面の例である、応対ログ画面について説明する。
図８は、通話中でない場合の応対ログ画面５０を示す。応対ログ画面５０は、オペレータ２０が操作する（たとえばクリックする）ことによってＰＣ１６に指示を伝えるためのボタンである、通話開始ボタン５２、確定ボタン６０、再生ボタン６２、ポーズボタン６４、停止ボタン６６、拡大ボタン６８、縮小ボタン７０、フルボタン７２、一覧表示ボタン７４、および設定ボタン７６が含まれる。画面左上には、聞き起こし対象の応対ログに係る通話開始時間、通話終了時間、応対オペレータが表示される。 Next, a response log screen, which is an example of a screen displayed when the PC 16 executes the response log processing unit 35 of the operator work support tool 39, will be described with reference to FIGS.
FIG. 8 shows the response log screen 50 when no call is in progress. The response log screen 50 includes the following buttons, which the operator 20 operates (for example, clicks) to send instructions to the PC 16: a call start button 52, a confirm button 60, a play button 62, a pause button 64, a stop button 66, an enlarge button 68, a reduce button 70, a full button 72, a list display button 74, and a setting button 76. The upper left of the screen shows the call start time, the call end time, and the responding operator for the response log being listened back to.

また、応対ログ画面５０は、オペレータ２０が操作する（たとえばクリックして選択肢を表示させ、さらにクリックして選択を行う）ことによって応対ログの種類を指示する、カテゴリ選択ボックス８０を含む。さらに、応対ログ画面５０は、通話内容が文字によって行単位で表示される応対ログ表示部９２と、通話内容の音声波形が表示される波形表示部９４とを含む。応対ログ表示部９２は、時間欄９２ａと、ユーザ欄９２ｂと、オペレータ欄９２ｃと、スクロールバー９２ｄとを含み、波形表示部９４は、オペレータ欄９４ａと、ユーザ欄９４ｂと、スクロールバー９４ｄとを含む。
このように、応対ログ画面５０では、一方の軸に時間を他方の軸にユーザとオペレータを持つ音声認識結果表示領域である応対ログ表示部９２と、オペレータ音声に係る波形を表示するオペレータ音声領域およびユーザ音声に係る波形を表示するユーザ音声領域を含む波形表示部９４とが同一画面上に表示される。
また、応対ログ画面５０は、オペレータ名および音声認識辞書をそれぞれ選択する、オペレータ入力ボックス７７および音声認識辞書選択ボックス７８を含む。さらに、応対ログ画面５０は、応対ログおよび、通話のカテゴリを、オペレータ音声認識結果記憶部４２、ユーザ音声認識結果記憶部４３、および通話ログ管理情報記憶部４４に格納するための保存ボタン８１と、オペレータ業務支援ツール３９の実行を終了させるための終了ボタン９８とを含む。 The response log screen 50 includes a category selection box 80 that indicates the type of response log when operated by the operator 20 (for example, click to display options and then click to select). Furthermore, the reception log screen 50 includes a reception log display unit 92 that displays call contents in units of lines by characters, and a waveform display unit 94 that displays voice waveforms of the call contents. The response log display unit 92 includes a time column 92a, a user column 92b, an operator column 92c, and a scroll bar 92d. The waveform display unit 94 includes an operator column 94a, a user column 94b, and a scroll bar 94d. Including.
Thus, the response log screen 50 shows two areas on the same screen: the response log display unit 92, which is a voice recognition result display area with time on one axis and the user and the operator on the other; and the waveform display unit 94, which includes an operator voice area showing the waveform of the operator voice and a user voice area showing the waveform of the user voice.
The response log screen 50 includes an operator input box 77 and a voice recognition dictionary selection box 78 for selecting an operator name and a voice recognition dictionary, respectively. Furthermore, the reception log screen 50 includes a save button 81 for storing the reception log and the call category in the operator voice recognition result storage unit 42, the user voice recognition result storage unit 43, and the call log management information storage unit 44. And an end button 98 for ending the execution of the operator work support tool 39.

図９は、通話中かつ音声認識実行中である場合の応対ログ画面５０を示す。図８と比較して、通話開始ボタン５２、確定ボタン６０、および終了ボタン９８は表示されず、通話終了ボタン５８および中断ボタン５４が表示される。
図１０は、通話中かつ音声認識中断中である場合の応対ログ画面５０を示す。図８と比較して、通話開始ボタン５２、確定ボタン６０、および終了ボタン９８は表示されず、通話終了ボタン５８および再開ボタン５６が表示される。 FIG. 9 shows a reception log screen 50 when a call is in progress and voice recognition is being executed. Compared to FIG. 8, the call start button 52, the confirm button 60, and the end button 98 are not displayed, and the call end button 58 and the interruption button 54 are displayed.
FIG. 10 shows a response log screen 50 when a call is in progress and voice recognition is interrupted. Compared to FIG. 8, the call start button 52, the confirm button 60, and the end button 98 are not displayed, and the call end button 58 and the resume button 56 are displayed.
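The button changes among FIGS. 8 to 10 amount to a small state table. The following Python sketch is one possible way to express it; the state names, the button labels, and the assumption that clicking the call end button 58 returns the screen to the state of FIG. 8 are illustrative and are not stated in this form in the specification.

```python
# Buttons that are shown only in particular states (numerals from FIGS. 8 to 10).
STATE_BUTTONS = {
    "idle": {"call_start_52", "confirm_60", "exit_98"},   # FIG. 8
    "recognizing": {"call_end_58", "suspend_54"},         # FIG. 9
    "suspended": {"call_end_58", "resume_56"},            # FIG. 10
}

# Assumed transitions triggered by clicking a button.
TRANSITIONS = {
    ("idle", "call_start_52"): "recognizing",
    ("recognizing", "suspend_54"): "suspended",
    ("suspended", "resume_56"): "recognizing",
    ("recognizing", "call_end_58"): "idle",
    ("suspended", "call_end_58"): "idle",
}


def visible_buttons(state):
    """Return the state-dependent buttons for a state, sorted by label."""
    return sorted(STATE_BUTTONS[state])


def next_state(state, clicked):
    """Return the state after a click; unknown clicks leave the state unchanged."""
    return TRANSITIONS.get((state, clicked), state)
```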

以上のように構成されるオペレータ業務支援システム１００の動作を、オペレータ業務支援ツール３９の動作とともに、図８〜図１３に示す画面例、および、図１４に示すフローチャートを用いて、以下に説明する。
図１４のフローチャートは、外部のユーザからの通話にオペレータ２０が応対する際の、オペレータ業務支援ツール３９の処理の流れを示す。
電話機１２は通話中ではなく、ＰＣ１６のディスプレイには図８に示す応対ログ画面５０が表示されている（ステップＳ１１）。スイッチ１４ａは通話側に設定されている。 The operation of the operator work support system 100 configured as described above will be described below using the screen examples shown in FIGS. 8 to 13 and the flowchart shown in FIG. 14 together with the operation of the operator work support tool 39. .
The flowchart of FIG. 14 shows the flow of processing of the operator work support tool 39 when the operator 20 responds to a call from an external user.
The telephone 12 is not busy, and a response log screen 50 shown in FIG. 8 is displayed on the display of the PC 16 (step S11). The switch 14a is set to the call side.

外部のユーザからの通話が電話機１２に着信すると、オペレータ２０は、スイッチ１４ａが通話側に設定されていることを確認する。聞き起こし側に設定されていれば、通話側に切り替える。
オペレータ２０は、ヘッドセット１８によって通話を受けるとともに、通話開始ボタン５２をクリックする（ステップＳ１３）。この入力を受けて、出力部３６は、それまで応対ログ表示部９２および波形表示部９４に表示されていた通話内容および波形があればそれらを消去する（ステップＳ１４）。 When a call from an external user arrives at the telephone 12, the operator 20 confirms that the switch 14a is set to the call side. If it is set to the listening side, switch to the call side.
The operator 20 receives a call with the headset 18 and clicks the call start button 52 (step S13). In response to this input, the output unit 36 deletes any call contents and waveforms that have been displayed in the response log display unit 92 and the waveform display unit 94 so far (step S14).

続いて、録音部３２は、通話内容の記録を開始する（ステップＳ１６）。通話内容の記録において、録音部３２は、スイッチボックス１４から音声入力のＬチャネルに入力されるユーザ音声と、同じくＲチャネルに入力されるオペレータ音声とを、ステレオ音声の形式で記録すなわち録音し、さらに、後述の音声認識によって得られるテキストおよび関連する時刻も記録する。に示すエコーキャンセル部３１が、電話機１２からの出力に漏れこんだオペレータ音声をキャンセルしてユーザ音声を明瞭にし、録音部３２がオペレータ音声およびユーザ音声を音声記憶部４１に格納する。 Subsequently, the recording unit 32 starts recording the content of the call (step S16). In doing so, the recording unit 32 records, in stereo format, the user voice input from the switch box 14 to the L channel of the audio input and the operator voice input to the R channel, and also records the text obtained by the voice recognition described later together with the associated times. The echo cancellation unit 31 cancels the operator voice that has leaked into the output from the telephone 12 to make the user voice clearer, and the recording unit 32 stores the operator voice and the user voice in the voice storage unit 41.
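Because the user voice and the operator voice occupy the L and R channels of one stereo recording, they can be separated again by de-interleaving the samples. The sketch below assumes interleaved 16-bit PCM frames (L sample first), which the specification does not state; `split_stereo` is a hypothetical helper name.

```python
from array import array


def split_stereo(frames):
    """Split interleaved 16-bit stereo PCM bytes into (user, operator) samples."""
    samples = array("h")          # signed 16-bit integers
    samples.frombytes(frames)
    user = samples[0::2]          # L channel: user voice from the telephone
    operator = samples[1::2]      # R channel: operator voice from the headset microphone
    return user, operator
```

Keeping the two speakers on separate channels is what allows each voice to be recognized and displayed as a waveform independently.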

オペレータ音声認識部３３は、オペレータ２０の発話の区切り、たとえば３００ｍｓ以上の時間にわたってオペレータ音声の音量が一定未満である部分を検出する（ステップＳ１７）。この区切りを検出すると、オペレータ音声認識部３３は、直前に検出された区切りあるいは記録開始時点と、新たに検出された区切りとの間に入力されたオペレータ音声を一つの文として受け取り、これを単位として音声認識を行ってテキストのデータに変換するとともに、業務向け辞書記憶部４０に登録された、該当する業務にとって重要なキーワードと照合し、一致するキーワードが含まれていれば、そのキーワードの前後に予め定められた予約語を置くなどしてマーキングする（ステップＳ１８）。このようにして、音声認識とキーワード検出を行った結果は、オペレータ音声認識結果記憶部４２に格納される。 The operator voice recognition unit 33 detects a break in the utterance of the operator 20, for example, a portion in which the volume of the operator voice stays below a certain level for 300 ms or longer (step S17). On detecting this break, the operator voice recognition unit 33 takes the operator voice input between the previously detected break (or the start of recording) and the newly detected break as one sentence, performs voice recognition on it as a unit to convert it into text data, and checks the text against the keywords registered in the business dictionary storage unit 40 as important for the business concerned. If a matching keyword is found, it is marked, for example by placing predetermined reserved words before and after it (step S18). The results of this voice recognition and keyword detection are stored in the operator voice recognition result storage unit 42.
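The break detection of step S17 can be sketched as a scan over per-frame volume levels. In the example below, only the 300 ms figure comes from the description; the 10 ms frame length, the threshold value, and the function name `find_boundaries` are assumptions made for illustration.

```python
def find_boundaries(levels, frame_ms=10, threshold=500, min_silence_ms=300):
    """Return the frame indices at which a quiet run first reaches min_silence_ms.

    levels: one volume value per frame of frame_ms milliseconds.
    """
    needed = min_silence_ms // frame_ms   # quiet frames required for a break
    boundaries = []
    quiet = 0
    for index, level in enumerate(levels):
        quiet = quiet + 1 if level < threshold else 0
        if quiet == needed:               # report each quiet run only once
            boundaries.append(index)
    return boundaries
```

The audio between two consecutive boundaries (or between the start of recording and the first boundary) would then be passed to the recognizer as one sentence.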

応対ログ処理部３５は、キーワード辞書記憶部４５に記憶されたキーワードの検出を行ってステップＳ１８と同様にマーキングするとともに、出力部３６を介して、認識結果を応対ログ表示部９２のオペレータ欄９２ｃに表示する。このとき、直前の発話の区切りあるいは記録開始時点の時刻を、そのテキストと同じ行の時間欄に表示する（ステップＳ１９）。
このように、ＰＣ１６は、通話中に、通話の進行に伴ってリアルタイムで、オペレータ２０の発話内容を記憶し、かつディスプレイに表示する。
音声データ入力が一定間隔以上空くと、変換されたテキストデータもその時点で区切られ、時系列的に並んだ複数のテキストデータ列として管理されていく。 The response log processing unit 35 detects the keywords stored in the keyword dictionary storage unit 45 and marks them in the same manner as in step S18, and displays the recognition result in the operator column 92c of the response log display unit 92 via the output unit 36. At this time, the time of the immediately preceding utterance break, or of the start of recording, is displayed in the time column on the same line as the text (step S19).
In this way, during the call, the PC 16 stores the utterances of the operator 20 and shows them on the display in real time as the call progresses.
When the voice data input pauses for longer than a certain interval, the converted text data is also split at that point, and the result is managed as multiple text data strings arranged in chronological order.

また、応対ログ表示部９２のユーザ欄９２ｂには、同様にしてユーザ音声の音声認識結果であるテキストが表示される。ユーザ音声の音声認識は、ユーザ音声認識部３４によって行われ、結果であるテキストはユーザ音声認識結果記憶部４３に格納される。同様にして、ユーザ音声認識部３４および応対ログ処理部３５が、ユーザ音声のテキストにおけるキーワード検出を行う。
このとき、オペレータ音声のテキストとユーザ音声のテキストとは、１行おきに交互に表示されてもよいし、出力部３６が通話の状況に応じて表示行の制御を行ってもよい。さらに、上述のステップＳ１７〜１９と、後述のステップＳ２０とは、オペレータ音声に対応する処理と、ユーザ音声に対応する処理、二つの処理がそれぞれ独立して同時に実行されることになる。
また、表示される行は、応対ログ表示部９２の上から下へと進み、最下行に到達するとそれ以降は最新のテキストが常に最下行に表示され、それ以前の表示内容が順次上にスクロールして表示される。 Similarly, in the user column 92b of the reception log display unit 92, text that is a voice recognition result of the user voice is displayed. The voice recognition of the user voice is performed by the user voice recognition unit 34, and the resulting text is stored in the user voice recognition result storage unit 43. Similarly, the user voice recognition unit 34 and the response log processing unit 35 perform keyword detection in the text of the user voice.
At this time, the text of the operator voice and the text of the user voice may be displayed alternately on every other line, or the output unit 36 may control the display lines according to the state of the call. In addition, steps S17 to S19 described above and step S20 described later are executed as two independent, concurrent processes: one for the operator voice and one for the user voice.
The displayed lines advance from the top to the bottom of the response log display unit 92. Once the bottom line is reached, the latest text is always displayed on the bottom line, and the earlier content scrolls upward.
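The layout of the response log display unit 92 (time column, user column, operator column) can be illustrated by merging the two independently recognized streams into rows ordered by time. This is only a sketch: the `Segment` type, the one-row-per-utterance layout, and the `mm:ss` time format are assumptions, since the description leaves the display line control open.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the start of recording
    speaker: str   # "user" or "operator"
    text: str      # recognized text of one utterance


def format_time(seconds):
    """Format a relative time in seconds as mm:ss."""
    whole = int(seconds)
    return f"{whole // 60:02d}:{whole % 60:02d}"


def build_rows(segments):
    """Return (time, user_text, operator_text) rows in chronological order."""
    rows = []
    for seg in sorted(segments, key=lambda s: s.start):
        user_text = seg.text if seg.speaker == "user" else ""
        operator_text = seg.text if seg.speaker == "operator" else ""
        rows.append((format_time(seg.start), user_text, operator_text))
    return rows
```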

次に、出力部３６は、ステップＳ１８およびステップＳ１９において検出したキーワードを強調表示する（ステップＳ２０）。
強調表示は、単にキーワードを目立つように強調表示するだけでもよいが、強調表示された文字列が選択されると、説明文が表示されてもよい。また、詳しくは、聞き起こし支援処理において説明するが、強調表示された文字列が選択されると、直後の音声が再生できてもよい。
強調表示は、たとえば図８に示すように、キーワードを墨付きカッコで囲むことによってなされるが、これはキーワードの色、フォントの種類、大きさ等を変更したり、下線を付したり、太字や斜体にすることによってなされてもよい。前述の強調表示の種類に応じて、異なる形式で表示されてもよい。
登録されるキーワードの例としては以下のようなものがある。
・社名
・「ＤＶＤ−ＲＡＭ」等の、製品の一般名称
・製品名
・製品の型番
・「フォーマットできない」等の、現象を表す表現
・「お電話番号を頂戴したいのですが」「型番をお願いします」「お名前をお願いします」「お名前を復唱いたします」等の、その直後にユーザの個人情報や製品名等、重要な情報が現れることを示す文字列 Next, the output unit 36 highlights the keywords detected in Step S18 and Step S19 (Step S20).
The highlighting may simply make the keyword stand out, or an explanatory text may be displayed when the highlighted character string is selected. In addition, as described in detail for the listen-back support process, the voice immediately following a highlighted character string may be played back when that string is selected.
As shown in FIG. 8, for example, a keyword is highlighted by enclosing it in black lenticular brackets (【 】), but it may instead be highlighted by changing its color, font, or size, by underlining it, or by setting it in bold or italics. Different formats may be used for the different kinds of highlighting described above.
Examples of registered keywords include the following.
・Company names
・Generic product names, such as “DVD-RAM”
・Product names
・Product model numbers
・Expressions describing a symptom, such as “cannot format”
・Character strings indicating that important information, such as the user's personal information or a product name, will appear immediately afterward, such as “May I have your telephone number?”, “Please tell me the model number”, “May I have your name?”, and “Let me repeat your name”
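Marking registered keywords in recognized text, as in steps S18 to S20, can be sketched with a single regular-expression pass. The function name `mark_keywords` and the choice to prefer the longest keyword when several overlap are assumptions; the brackets follow the style shown in FIG. 8.

```python
import re


def mark_keywords(text, keywords, open_mark="【", close_mark="】"):
    """Enclose every registered keyword found in text in marker brackets."""
    if not keywords:
        return text
    # Try longer keywords first so that "DVD-RAM" wins over "DVD".
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = "|".join(re.escape(keyword) for keyword in ordered)
    return re.sub(pattern, lambda m: f"{open_mark}{m.group(0)}{close_mark}", text)
```

Doing the replacement in one pass avoids marking a short keyword again inside a longer keyword that has already been bracketed.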

なお、オペレータ２０は、音声認識の実行ならびにテキストおよび時刻の記録に影響を与えることなく、応対ログ表示部９２のスクロールバー９２ｄを操作して、過去に記録された所望の時刻のテキストを表示させることができる。
また、オペレータ２０は、上述のステップＳ１７〜ステップＳ２０の過程において、通話終了ボタン５８をクリックすることにより、ＰＣ１６内の各部に通話が終了したことを指示することができる。
さらに、オペレータ２０は、上述のステップＳ１７〜ステップＳ２０の過程において、カテゴリ選択ボックス８０を操作することにより、応対ログに関連して図３の通話ログ管理情報記憶部４４に記録されるカテゴリを指定することができる。指定されたカテゴリは後述するステップＳ２２の終了時に、通話ログ管理情報記憶部４４に格納する。カテゴリは、通常は、通話を開始する前には分からないので、運用の形態としては、通話の終了直前、および、聞き起こし時に入力する場合が多い。特に指定がなければデフォルトのカテゴリを格納する。 The operator 20 operates the scroll bar 92d of the reception log display unit 92 to display the text of a desired time recorded in the past without affecting the execution of voice recognition and the recording of text and time. be able to.
In addition, the operator 20 can instruct each part in the PC 16 that the call is ended by clicking the call end button 58 in the process of step S17 to step S20 described above.
Further, the operator 20 designates a category recorded in the call log management information storage unit 44 in FIG. 3 in relation to the reception log by operating the category selection box 80 in the process of the above-described steps S17 to S20. can do. The designated category is stored in the call log management information storage unit 44 at the end of step S22 described later. Since the category is usually not known before the call is started, the operation is often input immediately before the end of the call and at the time of listening. The default category is stored unless otherwise specified.

また、オペレータ２０は、上述のステップＳ１７〜ステップＳ２０の過程において、図９の中断ボタン５４をクリックすることにより、録音部３２、オペレータ音声認識部３３、およびユーザ音声認識部３４に一時中断指示を与え、通話内容の記録、音声認識、テキストの表示処理を一時中断させることができる。また、この一時中断した状態において、図１０の再開ボタン５６をクリックすることにより、再開指示を与え、これらの処理を再開させることができる。オペレータ音声認識部３３、およびユーザ音声認識部３４は、この一時中断および再開の操作がなされると、一時中断前と再開後との間に、オペレータ音声の区切りが検出されたものと認識する。 Further, in the process of steps S17 to S20, the operator 20 clicks the interruption button 54 in FIG. 9 to give a temporary interruption instruction to the recording unit 32, the operator voice recognition unit 33, and the user voice recognition unit 34. Giving, recording of call contents, voice recognition, and text display processing can be suspended. Further, in this temporarily interrupted state, by clicking the restart button 56 in FIG. 10, a restart instruction can be given and these processes can be restarted. The operator voice recognition unit 33 and the user voice recognition unit 34 recognize that a break of the operator voice is detected between the time before the temporary interruption and the time after the restart when the temporary interruption and resumption operations are performed.

この一時中断および再開は、たとえば、ユーザとの通話中に記録する必要のない会話が長時間にわたって続く場合等に使用される。また、この一時中断および再開は、オペレータ音声認識部３３、またはユーザ音声認識部３４が発話の区切りを誤って検出し、その区切り検出の誤りが連鎖的に続くことが想定されるときに、オペレータ音声認識部３３、およびユーザ音声認識部３４に強制的に発話の区切りを検出させることにより、所望のタイミングで発話の区切りを設定させるためにも使用される。 This suspension and resumption is used, for example, when a conversation that does not need to be recorded during a call with a user continues for a long time. In addition, the temporary interruption and resumption are performed when the operator voice recognition unit 33 or the user voice recognition unit 34 erroneously detects an utterance break and the error in the break detection is assumed to continue in a chain. It is also used to cause the speech recognition unit 33 and the user speech recognition unit 34 to forcibly detect the utterance break, thereby setting the utterance break at a desired timing.

ステップＳ２０の後、入力部３７は、オペレータ２０によって通話終了ボタン５８がクリックされたかどうかを判定する（ステップＳ２１）。通話開始ボタン５２が最後にクリックされてから通話終了ボタン５８がクリックされていない場合、すなわちまだ通話中である場合は、ＰＣ１６の処理はステップＳ１７に戻る。 After step S20, the input unit 37 determines whether or not the call end button 58 is clicked by the operator 20 (step S21). If the call end button 58 has not been clicked since the call start button 52 was last clicked, that is, if the call is still in progress, the processing of the PC 16 returns to step S17.

通話終了ボタン５８がクリックされていた場合、すなわちオペレータ２０から通話終了の指示があった場合は、録音部３２、オペレータ音声認識部３３、およびユーザ音声認識部３４は、ステップＳ１６において開始された通話内容の記録と音声認識を終了する（ステップＳ２２）。音声認識の結果として得られたテキストすなわち応対ログと、オペレータ音声およびユーザ音声の録音データとは、互いに関連付けられ、通話管理部３８によって、セットとして、通話ログ管理情報記憶部４４に格納される。なお、ステップＳ１７からステップ２１までの音声認識および結果表示の繰り返し処理と並行して、ステップＳ１６で開始した音声の録音は、途中で一時中断された期間を除き、記録終了であるステップＳ２２まで、実行される。なお、一時中断時は、録音部３２、オペレータ音声認識部３３、およびユーザ音声認識部３４すべてを一時中断すると、オペレータ音声認識結果記憶部４２、およびユーザ音声認識結果記憶部４３に格納される音声データは、一時中断中の相対時間は進まないことになる。また、一時中断時は、録音部３２は録音を続け、オペレータ音声認識部３３、およびユーザ音声認識部３４を一時中断する方式でもよく、この場合は、オペレータ音声認識結果記憶部４２、およびユーザ音声認識部４３に格納される音声データは、一時中断中も相対時間が進むことになる。 When the call end button 58 has been clicked, that is, when the operator 20 has given an instruction to end the call, the recording unit 32, the operator voice recognition unit 33, and the user voice recognition unit 34 end the recording of the call content and the voice recognition started in step S16 (step S22). The text obtained by voice recognition, that is, the response log, and the recorded data of the operator voice and the user voice are associated with each other and stored as a set in the call log management information storage unit 44 by the call management unit 38. In parallel with the repeated voice recognition and result display of steps S17 to S21, the voice recording started in step S16 continues until recording ends in step S22, except for any periods in which it was temporarily suspended. If the recording unit 32, the operator voice recognition unit 33, and the user voice recognition unit 34 are all suspended during a temporary suspension, the relative time of the voice data stored in the operator voice recognition result storage unit 42 and the user voice recognition result storage unit 43 does not advance during the suspension.
Alternatively, during a temporary suspension the recording unit 32 may continue recording while only the operator voice recognition unit 33 and the user voice recognition unit 34 are suspended. In this case, the relative time of the voice data stored in the operator voice recognition result storage unit 42 and the user voice recognition result storage unit 43 continues to advance during the suspension.
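The difference between the two suspension schemes is how elapsed time maps to the relative time stored with the data. The following sketch illustrates this under assumed names: `relative_time`, the `(start, end)` representation of suspensions, and the `recording_suspended` flag are hypothetical.

```python
def relative_time(elapsed_s, suspensions, recording_suspended=True):
    """Map elapsed seconds since call start to the relative time in the log.

    suspensions: list of (start_s, end_s) periods of temporary suspension.
    recording_suspended: True if the recording unit also stops while suspended,
        so that relative time does not advance during those periods.
    """
    if not recording_suspended:
        # Recording continues during suspension, so relative time keeps advancing.
        return elapsed_s
    suspended_total = sum(
        min(end, elapsed_s) - start
        for start, end in suspensions
        if start < elapsed_s
    )
    return elapsed_s - suspended_total
```

For example, with one suspension from 20 s to 50 s, a point 100 s into the call corresponds to a relative time of 70 s in the first scheme and 100 s in the second.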

次に、応対ログ処理部３５は、ステレオ音声を含む音声ファイル形式で記録されている通話内容を読み込み、オペレータ音声およびユーザ音声それぞれの波形を得る。得られた波形は、出力部３６を介して、それぞれ波形表示部９４のオペレータ欄９４ａおよびユーザ欄９４ｂに表示される（ステップＳ２３）。表示は時刻とともに波形表示部９４の左から右へと進むが、通話が長く波形表示部９４に収まらない場合は、その一部のみが表示される。オペレータ２０は、波形表示部９４のスクロールバー９４ｄを操作して、所望の時間帯の波形を表示させることができる。 Next, the reception log processing unit 35 reads the contents of the call recorded in the audio file format including the stereo audio, and obtains the waveforms of the operator audio and the user audio, respectively. The obtained waveforms are respectively displayed in the operator column 94a and the user column 94b of the waveform display unit 94 via the output unit 36 (step S23). The display proceeds from the left to the right of the waveform display unit 94 with the time, but if the call is long and does not fit in the waveform display unit 94, only a part of it is displayed. The operator 20 can display a waveform in a desired time zone by operating the scroll bar 94d of the waveform display unit 94.

その後、応対ログ処理部３５は、オペレータ２０の操作に応じて聞き起こし支援処理を行う（ステップＳ２４）。
オペレータ２０は、応対ログ表示部９２を確認しながら、オペレータ業務支援ツール３９に含まれないシステムまたはプログラムを使用し、通話に関する業務ログの入力を行う。この業務ログは、通話相手であるユーザの個人情報、ユーザからの問い合わせ内容、それに対するオペレータ２０の回答内容等が含まれる。この際、聞き起こし、すなわち通話時の音声の確認が必要となる場合があるが、この作業に伴って実行されるのがステップＳ２４の聞き起こし支援処理である。これは、応対ログ処理部３５、出力部３６、入力部３７が協働して行う。
なお、応対ログ画面５０において、任意のタイミングで通話カテゴリを選択し、保存ボタンにより、保存することができる。 Thereafter, the response log processing unit 35 performs the listen-back support process according to the operations of the operator 20 (step S24).
While checking the response log display unit 92, the operator 20 enters a work log for the call using a system or program that is not part of the operator work support tool 39. This work log includes the personal information of the user on the call, the content of the user's inquiry, the content of the operator 20's answer, and so on. During this work it may be necessary to listen back, that is, to check the voice recorded during the call; the listen-back support process of step S24 is executed to support this. It is performed jointly by the response log processing unit 35, the output unit 36, and the input unit 37.
On the response log screen 50, a call category can be selected at any time and saved with the save button.

以下、聞き起こし支援処理の内容を説明する。
まず、オペレータ２０は、スイッチ１４ａを聞き起こし側に切り替える。
オペレータ２０自身の発話内容を実音声で確認したい場合は、オペレータ２０は、まずスクロールバー９２ｄを操作して、応対ログの所望の部分を表示させる。次に、応対ログ表示部９２内の対応するセル（該当のテキストが表示されている行の、オペレータ欄９２ｃ）をクリックする。出力部３６は、図８に示すように、応対ログ表示部９２内のクリックされたセルの背景に着色表示をするとともに、音声ログ編集用ウィンドウ９３に対応するテキストを表示し、波形表示部９４においてそのテキストに対応する部分の音声波形の背景も着色表示する。なお、この着色表示がなされている部分は、図８においては斜線によって示す。この状態で再生ボタン６２がクリックされると、出力部３６は該当部分の音声を再生してオペレータ２０に提供する。これによってオペレータ２０は、オペレータ音声を簡単に聞き起こすことができる。
なお、オペレータ２０は、連続した複数のセルを指定することもできる。オペレータ２０が、応対ログ表示部９２のオペレータ欄９２ｃの連続した複数のセルを選択すると、出力部３６は、波形表示部９４のオペレータ欄９４ｂの対応する音声波形の背景を着色表示し、音声ログ編集用ウィンドウ９３に、最初のセルに対応するテキストを表示する。オペレータ２０が、応対ログ表示部９２のユーザ欄９２ｂの連続した複数のセルを選択すると、出力部３６は、波形表示部９４のユーザ欄９４ａの対応する音声波形の背景を着色表示し、音声ログ編集用ウィンドウ９３に、最初のセルに対応するテキストを表示する。オペレータ２０が、応対ログ表示部９２のユーザ欄９２ｂおよびオペレータ欄９２ｃの連続した複数のセルを選択すると、出力部３６は、波形表示部９４のユーザ欄９４ａおよびオペレータ欄９４ｂの対応する音声波形の背景を着色表示し、音声ログ編集用ウィンドウ９３に、最初のセルに対応するテキストを表示する。この状態で再生ボタン６２がクリックされると、出力部３６は該当部分の音声を再生してオペレータ２０に提供する。 In the following, the contents of the awakening support process will be described.
First, the operator 20 switches the switch 14a to the listening side.
When the operator 20 wants to check his or her own utterance in the actual recorded voice, the operator 20 first operates the scroll bar 92d to display the desired part of the response log, and then clicks the corresponding cell in the response log display unit 92 (the operator column 92c of the line in which the relevant text is displayed). As shown in FIG. 8, the output unit 36 colors the background of the clicked cell in the response log display unit 92, displays the corresponding text in the voice log editing window 93, and also colors the background of the part of the voice waveform in the waveform display unit 94 that corresponds to that text. The colored parts are indicated by hatching in FIG. 8. When the play button 62 is clicked in this state, the output unit 36 plays back the voice of that part for the operator 20. The operator 20 can thus easily listen back to the operator voice.
Note that the operator 20 can also specify a plurality of continuous cells. When the operator 20 selects a plurality of continuous cells in the operator column 92c of the response log display unit 92, the output unit 36 displays the background of the corresponding voice waveform in the operator column 94b of the waveform display unit 94 in a colored manner. The text corresponding to the first cell is displayed in the editing window 93. When the operator 20 selects a plurality of continuous cells in the user column 92b of the response log display unit 92, the output unit 36 displays the background of the corresponding audio waveform in the user column 94a of the waveform display unit 94 in a colored manner. The text corresponding to the first cell is displayed in the editing window 93. When the operator 20 selects a plurality of continuous cells in the user column 92b and the operator column 92c of the response log display unit 92, the output unit 36 displays the corresponding voice waveform in the user column 94a and the operator column 94b of the waveform display unit 94. The background is displayed in color, and the text corresponding to the first cell is displayed in the voice log editing window 93. When the reproduction button 62 is clicked in this state, the output unit 36 reproduces the sound of the corresponding part and provides it to the operator 20.

このように、それぞれのテキストデータ列は、その開始時間、終了時間が音声と紐付けられているため、１つのテキストデータを指定して、その部分に該当する録音音声を、一発で頭出し再生する機能を有する。 In this way, because the start time and end time of each text data string are linked to the voice, the system provides a function for specifying one piece of text data and immediately cueing and playing back the recorded voice for that part.
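Since each text data string carries a start time and an end time, cueing playback reduces to converting the selected cells into a sample range of the recording. A minimal sketch follows; the function name `playback_range` and the 8000 Hz default sample rate are assumptions not taken from the specification.

```python
def playback_range(selected, sample_rate=8000):
    """Return the (first, last) sample indices covering the selected cells.

    selected: non-empty list of (start_s, end_s) pairs, one per selected cell.
    """
    start = min(cell_start for cell_start, _ in selected)
    end = max(cell_end for _, cell_end in selected)
    return int(start * sample_rate), int(end * sample_rate)
```

Selecting several consecutive cells simply widens the range from the earliest start to the latest end.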

この際、オペレータ２０は、音声波形の背景が着色されている部分の右端あるいは左端をマウスポインタで指示し、ドラッグアンドドロップ操作によって移動させることにより、背景が着色表示される部分、すなわち再生される部分の範囲を変更することができる。
同様に、オペレータ２０は、応対ログ表示部９２のユーザ欄９２ｂおよびオペレータ欄９２ｃの右端あるいは左端をマウスポインタで指示し、ドラッグアンドドロップ操作によって移動させることにより、それぞれの表示幅を変更したり、一方を非表示にしたりできる。 At this time, the operator 20 designates the right end or the left end of the portion where the background of the audio waveform is colored with the mouse pointer and moves it by a drag and drop operation, thereby reproducing the portion where the background is displayed in color. The range of the part can be changed.
Similarly, the operator 20 can change the display widths of the user column 92b and the operator column 92c of the response log display unit 92, or hide one of them, by pointing at their right or left edge with the mouse pointer and moving it with a drag-and-drop operation.

オペレータ音声の再生中、オペレータ２０は、ポーズボタン６４をクリックすることによって、再生をポーズすなわち一時停止状態にすることができる。このポーズは、再びポーズボタン６４をクリックすることにより解除され、音声の再生が再開される。また、オペレータ２０は、停止ボタン６６をクリックすることによって、再生を停止することができる。 During playback of the operator voice, the operator 20 can pause playback by clicking on the pause button 64. The pause is canceled by clicking the pause button 64 again, and the sound reproduction is resumed. The operator 20 can stop the reproduction by clicking the stop button 66.

また、オペレータ２０は、応対ログすなわち音声認識の結果として表示されているテキストの内容を修正することができる。この修正を行う場合、まずオペレータ２０は上記のようにセルおよび該当する波形部分の背景を着色表示し、音声ログ編集用ウィンドウ９３に対応するテキストを表示させる。音声ログ編集用ウィンドウ９３は、テキストデータを編集するためのテキスト編集領域である。
この音声ログ編集用ウィンドウ９３では、応対ログ表示部９２において背景が着色されていたセルのテキストが編集可能である。オペレータ２０はキーボード等の入力装置を介してテキストの修正等の編集を行い、終了後に確定ボタンをクリックする。確定ボタンがクリックされると、編集されたテキストの内容を、応対ログ表示部９２の該当セルおよびオペレータ音声認識結果記憶部４２に反映する。 Further, the operator 20 can correct the content of the text displayed as a result of the reception log, that is, the voice recognition. When this correction is performed, first, the operator 20 displays the cell and the background of the corresponding waveform portion in a colored manner as described above, and displays the text corresponding to the voice log editing window 93. The voice log editing window 93 is a text editing area for editing text data.
In the voice log editing window 93, the text of the cell whose background is colored in the response log display section 92 can be edited. The operator 20 performs editing such as correction of text via an input device such as a keyboard, and clicks the confirmation button after the completion. When the confirmation button is clicked, the contents of the edited text are reflected in the corresponding cell of the response log display unit 92 and the operator voice recognition result storage unit 42.

また、ユーザの発話内容を実音声で確認したい場合も、同様にして、対応するセル（該当の時刻が表示されている行の、ユーザ欄９２ｂ）をクリックする。出力部３６は、クリックされたセルの背景を着色するとともに、対応する部分の音声波形も背景を着色表示させる。この状態で再生ボタン６２がクリックされると、出力部３６は、該当部分、すなわち直前のオペレータ音声の発声終了時点から、次のオペレータ音声の発声開始時点までの部分を再生する。これによってオペレータ２０は、前後のオペレータ音声に挟まれたユーザ音声部分を簡単に聞き起こすことができる。
図３の構成図において、ユーザ音声認識部３４を含まない場合についても、ユーザの発話内容を実音声で確認できる。この場合、応対ログ表示部９２のユーザ欄９２ｂは空白のまま表示されるが、波形表示部９４のユーザ欄９４ｂには、前述と同様にユーザ音声の波形が表示される。 Similarly, to check the user's utterance in the actual recorded voice, the operator 20 clicks the corresponding cell (the user column 92b of the row in which the relevant time is displayed). The output unit 36 colors the background of the clicked cell and also colors the background of the corresponding part of the voice waveform. When the play button 62 is clicked in this state, the output unit 36 plays back the corresponding part, that is, the part from the end of the immediately preceding operator utterance to the start of the next operator utterance. The operator 20 can thus easily listen back to the user voice between the preceding and following operator utterances.
In the configuration diagram of FIG. 3, even when the user speech recognition unit 34 is not included, the user's utterance content can be confirmed with real speech. In this case, the user column 92b of the response log display unit 92 is displayed blank, but the user voice waveform is displayed in the user column 94b of the waveform display unit 94 as described above.
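When user voice is located only by the surrounding operator utterances, the part to play back is the gap between the end of the preceding operator utterance and the start of the next one. The sketch below illustrates that lookup; `user_span`, the `(start, end)` tuples, and the fallback to the start or end of the call are assumptions.

```python
def user_span(operator_segments, clicked_time, call_end):
    """Return (start, end) of the gap between operator utterances around clicked_time.

    operator_segments: list of (start_s, end_s) operator utterances.
    call_end: end of the recording, used when no later operator utterance exists.
    """
    start = 0.0
    end = call_end
    for seg_start, seg_end in sorted(operator_segments):
        if seg_end <= clicked_time:
            start = seg_end        # latest operator utterance that has ended
        elif seg_start >= clicked_time:
            end = seg_start        # first operator utterance that follows
            break
    return start, end
```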

オペレータ２０は、ユーザの発話内容を実音声で確認したい場合は、オペレータ欄９２ｃに表示されたオペレータの解析結果を目印に、近接するユーザ欄９２ｂの再生したいセルをクリックする。出力部３６は、応対ログ表示部９２内のクリックされたセルの背景に着色表示をするとともに、音声ログ編集用ウィンドウ９３に対応するテキスト（ただし、空欄）を表示し、波形表示部９４においてそのテキストに対応する部分の音声波形の背景も着色表示する。この状態で再生ボタン６２がクリックされると、出力部３６は該当部分のユーザの音声を再生してオペレータ２０に提供する。また、オペレータ２０が、空欄となっている音声ログ編集用ウィンドウ９３にテキストの入力を行い、確定ボタンをクリックすると、入力されたテキストの内容を、応対ログ表示部９２の該当セルおよびユーザ音声認識結果記憶部４３に反映する。 To check the user's utterance in the actual recorded voice, the operator 20 uses the operator recognition results displayed in the operator column 92c as a guide and clicks the cell to be played back in the adjacent user column 92b. The output unit 36 colors the background of the clicked cell in the response log display unit 92, displays the corresponding text (which is blank in this case) in the voice log editing window 93, and also colors the background of the corresponding part of the voice waveform in the waveform display unit 94. When the play button 62 is clicked in this state, the output unit 36 plays back the user's voice for that part for the operator 20. When the operator 20 enters text in the blank voice log editing window 93 and clicks the confirm button, the entered text is reflected in the corresponding cell of the response log display unit 92 and in the user voice recognition result storage unit 43.

Alternatively, the voice that immediately follows may be made playable by selecting a character string (an immediate-reference keyword) indicating that important information, such as the user's personal information or a product name, appears right after it. Examples are "May I have your telephone number?", "Please tell me the model number", "May I have your name?", and "Let me repeat your name". In this case, the response log processing unit 35 acquires the end time of the sentence containing the selected immediate-reference keyword, and the output unit 36 colors the background of the corresponding part of the voice waveform in the waveform display unit 94 for the operator voice or user voice that immediately follows. When the playback button 62 is clicked in this state, the output unit 36 plays back that part of the voice to the operator 20.
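
The lookup described above (take the end time of the sentence containing the keyword, then locate the voice that follows it) could be sketched as below. This is only an illustration: the `Segment` type, the function name, and the sample data are assumptions, and a real system would work on the stored recognition results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    speaker: str  # "operator" or "user"
    start: float  # seconds from the start of the recording
    end: float
    text: str     # empty when the segment was not recognized


def segment_after_keyword(segments: List[Segment], keyword: str) -> Optional[Segment]:
    """Return the earliest segment that starts at or after the end of the
    first segment whose text contains `keyword`, or None if there is none."""
    for seg in segments:
        if keyword in seg.text:
            later = [s for s in segments if s is not seg and s.start >= seg.end]
            return min(later, key=lambda s: s.start) if later else None
    return None


segments = [
    Segment("operator", 0.0, 3.0, "May I have your model number?"),
    Segment("user", 3.2, 5.0, ""),
    Segment("operator", 5.5, 7.0, "Thank you."),
]
following = segment_after_keyword(segments, "model number")
```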

In the transcription support process of step S24, the operator 20 can also change how the waveform is displayed by clicking the enlarge button 68, the reduce button 70, and the full button 72. When the enlarge button 68 is clicked, the output unit 36 enlarges the displayed waveform horizontally, so that waveform changes over a shorter time are shown in more detail. When the reduce button 70 is clicked, the output unit 36 conversely shrinks the displayed waveform horizontally, so that waveform changes over a longer time are shown. When the full button 72 is clicked, the output unit 36 displays the entire waveform of the call being processed, from start to end, on the screen.
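
The behavior of the enlarge, reduce, and full buttons can be pictured as changes to the time window that is mapped onto the display width. The sketch below is one possible model, not the patented implementation: the class name, the zoom step of 2, the default window of 10 seconds, and the choice to keep the window centered are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class WaveformView:
    call_duration: float        # seconds, from start to end of the call
    view_start: float = 0.0     # left edge of the visible window
    view_length: float = 10.0   # seconds shown across the display width
    zoom_factor: float = 2.0    # assumed zoom step per click

    def _clamp(self) -> None:
        # Keep the window inside the recorded call.
        self.view_length = min(self.view_length, self.call_duration)
        latest_start = self.call_duration - self.view_length
        self.view_start = max(0.0, min(self.view_start, latest_start))

    def _resize(self, new_length: float) -> None:
        center = self.view_start + self.view_length / 2
        self.view_length = new_length
        self.view_start = center - new_length / 2
        self._clamp()

    def enlarge(self) -> None:
        """Show a shorter time span, i.e. stretch the waveform horizontally."""
        self._resize(self.view_length / self.zoom_factor)

    def reduce(self) -> None:
        """Show a longer time span, i.e. shrink the waveform horizontally."""
        self._resize(self.view_length * self.zoom_factor)

    def full(self) -> None:
        """Show the whole call from start to end."""
        self.view_start = 0.0
        self.view_length = self.call_duration


view = WaveformView(call_duration=60.0)
view.enlarge()
```
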
Also in the transcription support process of step S24, when the content of the transcribed inquiry does not match the currently set category, the operator 20 can change the category using the category selection box 80.

The above is the processing flow when transcription is performed immediately after a call from an external user ends, but the operator work support system 100 can also support transcription of calls recorded in the past, in response to operations by the operator 20.
To transcribe a call recorded in the past, the operator 20 sets the switch 14a to the transcription side in the same way as for the transcription described above, and then clicks the list display button 74 on the response log screen 50.

When the list display button 74 is clicked, the output unit 36 displays the response log list screen shown in FIG. 13 on the display. For each call recorded in the past, the response log list screen shows the recording start time, recording end time, call duration, name of the operator who handled the call, and the call category specified with the category selection box 80 described above. These items are part of the contents of the call log management information storage unit shown in FIG. 5. The operator 20 selects the call to be transcribed, which colors its background, and clicks the OK button. When the OK button is clicked, the output unit 36 closes the response log list screen, reads the record of the selected call, displays it on the response log screen 50, and performs the transcription support process in the same way as in step S24 described above.
Alternatively, the operator 20 can cancel the selection and return to the response log screen 50 by clicking the cancel button.

Except while a call is being recorded, the operator 20 can configure settings relating to the operator 20 at any time, regardless of the processing flow described above. This is started by clicking the setting button 76 on the response log screen 50.
When the setting button 76 is clicked, the output unit 36 displays the setting screen shown in FIG. 11 on the display. The setting screen shows the operator information stored in the call log management information storage unit 44, namely the operator name, which identifies the operator to the operator work support tool 39, and the operator's gender.

On this setting screen, the operator 20 can change the settings through an input device such as a keyboard or mouse by selecting the row for the operator to be edited, clicking the edit button, and then editing the operator name field or clicking a gender selection button. A new operator can also be added by clicking the new button.
When the changes are complete, the operator 20 clicks the OK button on the setting screen shown in FIG. 11. The call management unit 38 then applies the entered changes and uses the changed settings for subsequent execution of the operator work support tool 39.
Alternatively, the operator 20 can cancel the selection and return to the response log screen 50 by clicking the cancel button.
When the dictionary edit button is clicked on this setting screen, the dictionary setting screen shown in FIG. 12 is newly displayed. On this dictionary setting screen, selecting a desired dictionary and clicking the OK button selects which of the voice recognition dictionaries stored in the business dictionary storage unit 40 is to be used for voice recognition. Selecting a voice recognition dictionary and clicking the edit button allows the registered contents of that dictionary to be edited.
Alternatively, the operator 20 can cancel the selection and return to the setting screen of FIG. 11 by clicking the cancel button.

The operator and voice recognition dictionary that are set on the setting screen of FIG. 11 and the dictionary setting screen of FIG. 12, reached by clicking the setting button 76 on the response log screen of FIG. 8, take effect for subsequent calls. That is, when voice recognition is started with the call start button 52, the operator voice recognition unit 33 obtains the gender of the set operator from the operator information in the call log management information storage unit 44 and performs voice recognition using the acoustic dictionary that matches that gender together with the set voice recognition dictionary.
The user voice recognition unit 34 performs voice recognition using an acoustic dictionary shared by both genders together with the set voice recognition dictionary. When the end button 98 is clicked to finish, the set operator and voice recognition dictionary are stored in the call log management information of the call log management information storage unit 44.
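
The dictionary selection rule above (a gender-matched acoustic dictionary for the operator channel, a shared one for the user channel, and the same voice recognition dictionary for both) can be illustrated as follows. The operator names, dictionary identifiers, and the function `recognition_settings` are hypothetical placeholders chosen for this sketch.

```python
from typing import Dict

# Hypothetical operator information: operator name -> gender.
OPERATOR_INFO: Dict[str, str] = {"Sato": "female", "Suzuki": "male"}

# Hypothetical identifiers for the acoustic dictionaries.
ACOUSTIC_DICTIONARIES: Dict[str, str] = {
    "male": "acoustic-male",
    "female": "acoustic-female",
    "shared": "acoustic-shared",
}


def recognition_settings(channel: str, operator_name: str,
                         dictionary: str) -> Dict[str, str]:
    """Pick the acoustic and voice recognition dictionaries for one channel."""
    if channel == "operator":
        # The operator's gender is known from the stored operator information.
        acoustic = ACOUSTIC_DICTIONARIES[OPERATOR_INFO[operator_name]]
    elif channel == "user":
        # The caller's gender is unknown, so the shared dictionary is used.
        acoustic = ACOUSTIC_DICTIONARIES["shared"]
    else:
        raise ValueError(f"unknown channel: {channel}")
    return {"acoustic": acoustic, "dictionary": dictionary}


operator_settings = recognition_settings("operator", "Sato", "printers")
user_settings = recognition_settings("user", "Sato", "printers")
```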

As described above, the program for causing a computer to function as the operator work support system according to the present invention highlights keywords specified in advance when displaying the text of a call, so the operator 20 can recognize the important parts of the call in less time, which improves work efficiency.
After voice recognition has been performed in real time and the keywords to be highlighted have been extracted at the same time, further keywords can be highlighted in the voice recognition result. In addition, even when a keyword cannot be detected among the words that form the recognition units at recognition time, keywords can be detected after recognition in text that is independent of the recognition units.
Because unnecessary words can be made inconspicuous or deleted, the voice recognition result can be checked efficiently within a limited screen area.
Assigning the operator voice and the user voice to the channels of a stereo recording keeps the voice data and the recognition results consistent in time.
Because the voice data and the recognition results are consistent in time, the voice immediately following a predetermined keyword can be played back.
Because the voice data and the recognition results are consistent in time, the voice recognition result and the voice waveform can be displayed on the screen in synchronization. Even when the user's voice is not recognized, the operator's voice recognition result, the operator voice, and the user voice are consistent in time, so the user voice lying between operator utterances can be played back and the voice recognition result can be corrected.
Because the display width of the operator recognition results or the user recognition results can be changed, or the results can be removed from the display, an easy-to-read display is possible within a limited screen area.
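
The two-stage keyword marking mentioned above can be sketched as follows: a first pass marks individual recognized words found in the first dictionary, and a second pass marks phrases from the second dictionary in the resulting text, including phrases that span several recognition units. The square brackets used as markers, the function names, and the sample dictionaries are assumptions made for this illustration.

```python
import re
from typing import Iterable, List, Set


def mark_words(words: List[str], first_dictionary: Set[str]) -> str:
    """First pass: mark each recognized word that is a registered keyword."""
    return " ".join(f"[{w}]" if w in first_dictionary else w for w in words)


def mark_phrases(text: str, second_dictionary: Iterable[str]) -> str:
    """Second pass: mark phrases in the text, skipping spans already marked."""
    phrases = sorted(second_dictionary, key=len, reverse=True)
    if not phrases:
        return text
    pattern = re.compile("|".join(re.escape(p) for p in phrases))
    # Split so that already marked spans become separate, untouched parts.
    parts = re.split(r"(\[[^\]]*\])", text)
    for i, part in enumerate(parts):
        if not part.startswith("["):
            parts[i] = pattern.sub(lambda m: f"[{m.group(0)}]", part)
    return "".join(parts)


recognized = ["the", "printer", "shows", "error", "code", "E5"]
first_pass = mark_words(recognized, {"printer"})
second_pass = mark_phrases(first_pass, {"error code"})
```

Here "error code" is found only in the second pass because it spans two recognition units, which is the situation the paragraph above describes.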

Furthermore, because the operator voice and the user voice are recorded separately as data on different channels, each voice can be listened to and transcribed independently, avoiding the situation in which overlapping utterances are hard to transcribe. This improves the work efficiency of the operator 20.
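
To illustrate why separate channels keep the two voices independent yet time-aligned, the sketch below splits interleaved 16-bit stereo PCM into two channels. It assumes the operator is on the left channel and the user on the right, which the text does not specify, and the function names are hypothetical.

```python
from array import array
from typing import Tuple


def split_stereo(pcm: bytes) -> Tuple[array, array]:
    """Split interleaved 16-bit stereo PCM (native byte order) into channels.

    Assumption: left channel = operator voice, right channel = user voice.
    """
    samples = array("h")
    samples.frombytes(pcm)
    return samples[0::2], samples[1::2]


def sample_to_seconds(index: int, sample_rate: int) -> float:
    """Sample i of either channel corresponds to the same instant in time."""
    return index / sample_rate


pcm = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
operator_channel, user_channel = split_stereo(pcm)
```

Since both channels are cut from the same frames, a time computed from an index in one channel applies unchanged to the other.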

Furthermore, because what the operator 20 says is recognized in real time and automatically converted into text, the operator 20 does not need to type the text, which improves work efficiency. Even when the voice recognition result is wrong, the operator can select the relevant part while looking at the text and listen to just that part to check it, so there is no need to play back the entire call to find the relevant audio, which improves work efficiency.

In addition, the operator 20 can use the interrupt button 54 and the resume button 56 to suppress recording of parts of the call judged unnecessary, which saves the effort of processing unnecessary parts during transcription and improves work efficiency.
Furthermore, the interrupt button 54 and the resume button 56 make it possible to force an utterance boundary to be recognized and set in real time during a call, so accurate voice recognition results can be recorded and the efficiency of transcription work is improved.
In addition, because the call log management information storage unit 44 stores the correspondence between operator names and genders, voice recognition can use the acoustic dictionary matching the corresponding gender whenever an operator name is specified at recognition time.

Further, according to the program for causing a computer to function as the operator work support system 100 according to the present invention, the operator 20 can use a single headset 18 for both calls and transcription by switching with the switch 14a of the switch box 14. There is no need to put on a different headset for each task, which improves the work efficiency of the operator 20.

Embodiment 1 described above may additionally be able to perform the following functions.
For example, the output unit 36 may have a function of recognizing character strings that are unimportant for transcription as unnecessary words and automatically deleting them from the display. The unnecessary words are registered in the business dictionary storage unit 40 or the keyword dictionary storage unit 45 and are identified by matching against the recognized text in the same way as in step S18 or step S19. The output unit 36 performs this matching via the response log processing unit 35, the operator voice recognition unit 33, or the user voice recognition unit 34. Examples of unnecessary words include "Yes" and "Is that so?". The operator 20 may be able to change at any time whether unnecessary words are automatically deleted. Instead of being deleted, unnecessary words may be displayed in a format different from other text, for example in a pale color.
As a result, keywords that are unimportant for transcription either no longer appear on the screen or can be distinguished at a glance, which makes it easier for the operator 20 to recognize and understand the important parts and improves work efficiency.
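
The handling of unnecessary words described above (delete them, show them in a pale color, or leave them unchanged) might look like the following. This sketch assumes that a whole cell is compared against the registered unnecessary words after trimming and lowercasing; the function name, the mode names, and the style labels are illustrative only.

```python
from typing import List, Set, Tuple


def apply_filler_policy(cells: List[str], fillers: Set[str],
                        mode: str = "delete") -> List[Tuple[str, str]]:
    """Return (text, style) pairs to display.

    mode "delete": drop cells that are unnecessary words.
    mode "dim":    keep them but give them the "dim" style.
    mode "off":    show every cell in the "normal" style.
    """
    if mode not in ("delete", "dim", "off"):
        raise ValueError(f"unknown mode: {mode}")
    shown = []
    for text in cells:
        is_filler = mode != "off" and text.strip().lower() in fillers
        if is_filler and mode == "delete":
            continue
        shown.append((text, "dim" if is_filler else "normal"))
    return shown


cells = ["Yes", "The printer shows E5", "Is that so"]
fillers = {"yes", "is that so"}
deleted = apply_filler_policy(cells, fillers, "delete")
dimmed = apply_filler_policy(cells, fillers, "dim")
```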

Further, the system may have a function that, when a highlighted keyword in the displayed text is clicked, pops up information associated with that keyword, such as a manual, or a function that plays back the voice immediately following the keyword. This information is specified in advance in association with each keyword and stored in the business dictionary storage unit 40 or the keyword dictionary storage unit 45. This saves the operator 20 the effort of looking up information about the keyword and improves work efficiency.
The text display of the operator voice and the highlighting of keywords are performed in real time as described above. Therefore, for example, the operator 20 can say a product name during a call, click that product name when it is highlighted as a keyword immediately afterwards to display the manual, and refer to the manual while continuing the call.

The system may also have a function that, when a keyword is clicked, lists the past call records that contain that keyword. The list is presented, for example, in the same format as the response log list screen of FIG. 13. Selecting one call from it creates and displays a new response log display screen as in FIG. 8, where its contents can be checked. This makes it easier to work across multiple related calls and improves work efficiency.
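
Listing the past calls that contain a clicked keyword amounts to a filter over the stored call records. The sketch below assumes a simple in-memory `CallRecord` with the fields shown on the response log list screen plus the recognized text; the type, the field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CallRecord:
    start_time: str
    end_time: str
    operator_name: str
    category: str
    transcript: str  # recognized text of the whole call


def calls_containing(records: List[CallRecord], keyword: str) -> List[CallRecord]:
    """Return the recorded calls whose text contains the keyword, in stored order."""
    return [r for r in records if keyword in r.transcript]


records = [
    CallRecord("10:00", "10:12", "Sato", "printer", "The printer shows error E5."),
    CallRecord("10:30", "10:41", "Suzuki", "network", "The router does not connect."),
    CallRecord("11:05", "11:20", "Sato", "printer", "Error E5 appeared again."),
]
matches = calls_containing(records, "E5")
```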

Some of the keywords may also be distinguished from the others as immediate-reference keywords. An immediate-reference keyword is a character string indicating that important information appears in the user voice immediately afterwards, for example "May I have your telephone number?", "Please tell me the model number", or "May I have your name?". When an immediate-reference keyword is clicked, the playback start point is determined based on the utterance boundary corresponding to the end of the text in that cell, and the user voice from that point onward is played back. The playback start point is, for example, the utterance boundary itself.
In this way, the operator 20 can check the user voice containing important information by starting from fixed character strings that are highlighted and easy to find, so there is no need to reread the entire response log display unit 92, and work efficiency improves. Furthermore, even when the converted text data is ambiguous, the audio before and after it can easily be listened to and transcribed, so the content of the call can be grasped accurately.

Further, in addition to being highlighted in the response log display unit 92, the keywords may be displayed as a list on a separate screen. The listed keywords may also be linked to the response log display unit 92. For example, a keyword list screen may be displayed in which the keywords appearing in the call are listed in order of appearance. When the operator 20 clicks a keyword, the background of the corresponding cell in the response log display unit 92 and of the corresponding part in the waveform display unit 94 is colored, and the audio of the corresponding waveform part is played back automatically. The operator can easily register, change, and delete keywords.
In this way, the part containing the information the operator needs can be checked immediately, so the operator does not need to reread the entire response log display unit 92, and work efficiency improves.
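
A keyword list linked to the response log, as described above, needs for each keyword occurrence the row of the cell and the time span of the audio. The following sketch builds such an index in order of appearance. The `Cell` type and the function `keyword_index` are assumptions, and only the first occurrence of each keyword within a cell is recorded.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Cell:
    start: float  # seconds from the start of the recording
    end: float
    text: str


def keyword_index(cells: List[Cell],
                  keywords: Iterable[str]) -> List[Tuple[str, int, float, float]]:
    """List (keyword, row, start, end) entries in the order the keywords appear."""
    keywords = list(keywords)
    entries = []
    for row, cell in enumerate(cells):
        # Order the hits inside one cell by their position in the text.
        hits = sorted((cell.text.find(k), k) for k in keywords if k in cell.text)
        for _, keyword in hits:
            entries.append((keyword, row, cell.start, cell.end))
    return entries


cells = [
    Cell(0.0, 2.0, "This is about the printer"),
    Cell(2.5, 4.0, ""),
    Cell(4.5, 7.0, "Error E5 on the printer means a paper jam"),
]
index = keyword_index(cells, ["printer", "E5", "paper jam"])
```

Clicking a list entry would then use the row to color the cell and the time span to color and play the waveform part.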

A diagram showing a configuration that includes the operator work support system 100 according to Embodiment 1 of the present invention.
A diagram showing the configuration of the switch box 14 according to Embodiment 1 of the present invention.
A diagram showing the configuration of the operator work support system 100.
A diagram showing the data format stored in the voice storage unit 41, the operator voice recognition result storage unit 42, and the user voice recognition result storage unit 43.
A diagram showing the data format of the call log management information stored in the call log management information storage unit 44.
A diagram showing the data format of the operator information stored in the call log management information storage unit 44.
A diagram showing the data format of the operator information stored in the operator voice recognition result storage unit 42.
A diagram showing the response log screen 50 displayed in Embodiment 1 of the present invention when no call is in progress.
A diagram showing the response log screen 50 displayed in Embodiment 1 of the present invention when a call is in progress and voice recognition is running.
A diagram showing the response log screen 50 displayed in Embodiment 1 of the present invention when a call is in progress and voice recognition is interrupted.
A diagram showing the setting screen displayed in Embodiment 1 of the present invention.
A diagram showing the dictionary setting screen displayed in Embodiment 1 of the present invention.
A diagram showing the response log list screen displayed in Embodiment 1 of the present invention.
A flowchart showing the processing flow of the operator work support tool according to Embodiment 1 of the present invention.

Explanation of symbols

12: telephone; 20: operator; 33: voice recognition unit (operator voice recognition unit); 34: voice recognition unit (user voice recognition unit); 35: response log processing unit; 36: output unit; 37: input unit; 100: operator work support system.

Claims

A program for causing a computer to function as an operator work support system that records a call made through a telephone and supports operator work associated with the call, the program causing the computer to function as:
a voice recognition unit that converts voice data of the call into text data;
a first dictionary including a voice recognition dictionary and keywords designated in advance for each word used for voice recognition;
a second dictionary containing keywords designated in advance;
a response log processing unit that processes the text data;
an output unit that displays the processed text data; and
an input unit that receives operations of the operator,
wherein the keywords included in the second dictionary are changeable,
the voice recognition unit converts the voice data into the text data,
when the voice recognition unit converts the voice data into the text data, the keywords included in the first dictionary are marked,
in the text data that has been subjected to the marking, the response log processing unit further marks the keywords included in the second dictionary,
the output unit highlights the marked keywords,
the first dictionary further includes explanatory text relating to the keywords,
the second dictionary further includes explanatory text relating to the keywords, and
the output unit further:
when the input unit accepts an operation designating a keyword, acquires the explanatory text relating to the keyword from either the first dictionary or the second dictionary and displays it;
lists the keywords, in an area different from the display of the text data, in the order in which they appear in the call; and
when the input unit accepts an operation designating a keyword displayed in the list, displays in color the text data including the designated keyword.

The program for causing the computer to function as an operator business support system according to claim 1, wherein the output unit highlights the marked keyword by enclosing it in black brackets.

The program for causing a computer to function as the operator work support system according to claim 1, wherein, when the input unit accepts an operation designating a keyword, the output unit further reproduces the voice data corresponding to the text data including the keyword.

The program for causing a computer to function as the operator work support system according to claim 1, the program further causing the computer to function as a recognition result storage unit that stores the text data in a storage unit included in the computer,
wherein the output unit further displays the text data in a text editing area, and
when the input unit receives an operation correcting the text data, the recognition result storage unit stores the corrected text data in the storage unit of the computer.

The program for causing a computer to function as the operator work support system according to claim 1, wherein, when the input unit further receives an operation designating the keyword, the output unit displays a list of stored calls including the designated keyword.