JP2012189908A

JP2012189908A - Sound recognition device and sound recognition processing method

Info

Publication number: JP2012189908A
Application number: JP2011054764A
Authority: JP
Inventors: Atsushi Koinuma; 敦鯉沼
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2011-03-11
Filing date: 2011-03-11
Publication date: 2012-10-04

Abstract

PROBLEM TO BE SOLVED: To provide a sound recognition device and a sound recognition processing method capable of performing sound recognition processing according to priority level.

SOLUTION: A sound recognition device 300 acquires sound data from the outside and stores the acquired sound data in a primary output database 303. It compares the sound data stored in the primary output database 303 with the conditions held in a priority level memory 305, which stores the conditions associated with each priority level, and performs sound recognition processing on sound data that matches a condition according to the priority level of that condition. The sound recognition result obtained through the sound recognition processing is stored in a sound recognition result database 300D.

Description

Embodiments of the present invention relate to priority-based processing in a speech recognition device used, for example, in a call center system.

In call center systems that handle mail-order sales, product and service support desks, requests for information materials, and the like, calls between users and operators are recorded, the recorded audio is converted into text by a speech recognition device, and the content can be checked as text after the call ends. Call center systems of this kind perform speech recognition in the order in which the calls were recorded. Because many calls are recorded simultaneously at a call center, recognition does not run fast enough to keep pace with the calls. Speech recognition for one day's recordings is completed, for example, on the following day or the day after that. Speech recognition in a call center system is therefore slow, and a considerable time passes before the content of a call can be checked as text.

Meanwhile, with regard to speech recognition in call centers, a system is known that recognizes a caller's speech during the period before the caller is connected to an operator terminal and, when the recognition result matches an NG word (a prohibited word), moves that waiting call to the head of the queue.

JP 2008-219741 A

However, such a system cannot prioritize the speech recognition processing itself, so it cannot accommodate cases where the content of a call needs to be checked as text right away, or where certain calls should be recognized ahead of others.

An object is therefore to provide a speech recognition device and a speech recognition processing method that can perform speech recognition processing according to priority.

A speech recognition device for achieving the above object comprises: voice acquisition means for acquiring voice data from outside; a primary output database in which the voice data is recorded by the voice acquisition means; a priority memory that stores conditions corresponding to priorities for processing; speech recognition processing means for recognizing voice data and converting it into text; priority control means for comparing the voice data recorded in the primary output database with the conditions stored in the priority memory and outputting voice data that matches a condition to the speech recognition processing means according to the priority of that condition; and storage processing means for storing the speech recognition result produced by the speech recognition processing means in a speech recognition result database.

A speech recognition processing method for achieving the above object comprises: a voice acquisition step of acquiring voice data from outside; a primary recording step of recording the voice data acquired in the voice acquisition step in a primary output database; a speech recognition processing step of recognizing voice data and converting it into text; a priority control step of comparing the voice data recorded in the primary output database with conditions stored in a priority memory, which stores conditions corresponding to priorities for processing, and causing speech recognition processing to be performed on voice data that matches a condition according to the priority of that condition; and a storage processing step of storing the speech recognition result produced in the speech recognition processing step in a speech recognition result database.

FIG. 1 is a block diagram showing the configuration of a call center system according to one embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the speech recognition device provided in the call center system of FIG. 1.
FIG. 3 is a diagram showing an example of the management information for voice data recorded in the primary output database of the speech recognition device of FIG. 2.
FIG. 4 is a diagram showing an example of speech recognition results recorded in the speech recognition result database of FIG. 1.
FIG. 5 is a sequence diagram showing the operation of priority-based speech recognition processing in the speech recognition device of FIG. 2.

The speech recognition device and the speech recognition processing method according to the present embodiment are described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a call center system according to one embodiment. The call center system is, for example, an IP-based system and includes an IP exchange device 100, a call recording device 200, a call voice database 200D, a speech recognition device 300, a speech recognition result database 300D, a customer database 400D, and telephone terminals 11 and 12 (hereinafter referred to collectively as telephone terminals 10).

The IP exchange device 100 is connected to an IP network 400 and to a LAN (local area network), and routes calls arriving from the IP network 400 to the telephone terminals 10. The IP exchange device 100 also connects calls placed from the telephone terminals 10 to their destinations. In addition, the IP exchange device 100 sends the voice data of calls involving the telephone terminals 10 to the call recording device 200 via the LAN.

The call recording device 200 receives the voice data of a call from the IP exchange device 100 via the LAN and records it in the call voice database 200D. The call recording device 200 also associates management information, such as the operator identification number and the calling number, with each item of voice data and records it in the call voice database 200D. This management information, that is, the operator identification number and the calling number, is sent from the IP exchange device 100 to the call recording device 200 together with the voice data. Alternatively, the operator identification number, the calling number, and the business number described later may be sent to the call recording device 200 in response to the operator's operation of a PC 20. Voice data and its management information are recorded in the call voice database 200D in the order of recording.

The speech recognition device 300 obtains the voice data recorded in the call voice database 200D by issuing a call voice acquisition request to the call recording device 200. The speech recognition device 300 then recognizes the acquired voice data, converts it into text, and records the text in the speech recognition result database 300D.

The customer database 400D records information on the customers served by this call center. The customer information is read out by a PC (personal computer) 20, described later.

Each telephone terminal 10 is connected to the IP exchange device 100 via the LAN and is used by an operator for calls with outside parties. A telephone terminal 10 may be associated in advance with a PC 20 connected to the LAN. When a telephone terminal 10 and a PC 20 are associated, the operator handling a call can operate the PC 20 to view or update the customer information recorded in the customer database 400D.

Although FIG. 1 shows only two telephone terminals 10, many telephone terminals are connected to the LAN.

FIG. 2 is a functional block diagram showing the configuration of the speech recognition device 300. The speech recognition device 300 includes a LAN interface (I/F) unit 301, a voice acquisition processing unit 302, a primary output database 303, a priority control unit 304, a priority memory 305, a speech recognition processing unit 306, a storage processing unit 307, and a registration processing unit 308.

The voice acquisition processing unit 302, for example, periodically issues a call voice acquisition request to the call recording device 200, receives via the LAN interface unit 301 the voice data and the associated management information that the call recording device 200 sends in response, and writes them to the primary output database 303. When voice data is acquired again, the management information for the newly acquired voice data is written after the management information for any voice data still remaining in the primary output database 303. In other words, the voice data is arranged in the order of acquisition.

The information written to the primary output database 303 is the same as the information recorded in the call voice database 200D. FIG. 3 shows an example of the information recorded in the primary output database 303. Each record holds a call identification number such as "call 1" that identifies the voice data, the calling number, the business number described later, and the operator identification number. For example, the record for "call 1" shows that the calling number is 030-1111-2222, the business number is 21, and the operator is Mr. Sato (operator identification number 011452). "Call 2" and subsequent calls are recorded in the same way.
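The record layout and the append-in-acquisition-order behavior described above can be sketched as follows. This is only an illustration: the names `ManagementInfo` and `PrimaryOutputDB` and the in-memory list are assumptions, not part of the disclosure, and only the "call 1" values are taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ManagementInfo:
    """One row of management information (field names are hypothetical)."""
    call_id: str          # call identification number, e.g. "call 1"
    caller_number: str    # calling number
    business_number: int  # type of business handled in the call
    operator_id: str      # operator identification number


class PrimaryOutputDB:
    """In-memory stand-in for the primary output database 303."""

    def __init__(self) -> None:
        self._records: List[ManagementInfo] = []

    def append_batch(self, batch: Iterable[ManagementInfo]) -> None:
        # Newly acquired records go after whatever is still waiting,
        # so the stored order is always the acquisition order.
        self._records.extend(batch)

    def records(self) -> List[ManagementInfo]:
        return list(self._records)


db = PrimaryOutputDB()
db.append_batch([ManagementInfo("call 1", "030-1111-2222", 21, "011452")])
```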

The priority memory 305 stores information about the voice data that should be prioritized, and the priority control unit 304 refers to this priority information. The information is, for example, a business number indicating the type of call handled at the call center, the calling number of an important customer, or the identification number of a newly hired operator. Here, a business number is a number assigned according to what the operator is handling, such as a call about a campaign, a call about a repair, or a call about an inquiry. Prioritizing a specific business number is useful when results for a particular campaign are to be tallied and the tally deadline is approaching, because the voice data for calls about that campaign can be recognized first. It also becomes possible to prioritize recognition of calls from an important customer's number so that the customer's needs can be acted on quickly, or to prioritize recognition of calls handled by a particular operator for the purpose of training new operators.
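One way to picture the contents of the priority memory is as a list of field/value conditions that are checked against a call's management information. The sketch below assumes that representation; the dictionary keys, the helper names `matches` and `is_priority`, and every condition value are hypothetical (the text does not say which business number, caller, or operator would be prioritized).

```python
from typing import Mapping, Sequence

Condition = Mapping[str, object]


def matches(condition: Condition, info: Mapping[str, object]) -> bool:
    # Every field named in the condition must equal the call's value.
    # Note: an empty condition would match every call.
    return all(info.get(field) == wanted for field, wanted in condition.items())


def is_priority(info: Mapping[str, object], priority_memory: Sequence[Condition]) -> bool:
    # A call is prioritized if it satisfies at least one stored condition.
    return any(matches(condition, info) for condition in priority_memory)


# Hypothetical contents of the priority memory.
priority_memory = [
    {"business_number": 21},            # assumed: a campaign about to be tallied
    {"caller_number": "030-9999-0000"}, # assumed: an important customer
    {"operator_id": "000777"},          # assumed: a newly hired operator
]

call_1 = {
    "call_id": "call 1",
    "caller_number": "030-1111-2222",
    "business_number": 21,
    "operator_id": "011452",
}
```

With these assumed conditions, `is_priority(call_1, priority_memory)` is true because the business number matches the first condition.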

The priority control unit 304 compares the information about priority calls stored in the priority memory 305 with the calling number, business number, operator identification number, and so on of the voice data written to the primary output database 303, and determines, in the order of the management information written to the primary output database 303, whether each item of voice data is to be processed with priority. The priority control unit 304 reads the voice data it has judged to be priority from the primary output database 303 and outputs it, in order, to the speech recognition processing unit 306. If no priority voice data remains, it outputs the non-priority voice data to the speech recognition processing unit 306 in the order in which it was written to the primary output database 303. Whenever the priority control unit 304 outputs voice data from the primary output database 303 to the speech recognition processing unit 306, it also adds priority information to the management information of that voice data and outputs it to the storage processing unit 307.
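The ordering rule in this paragraph amounts to a stable two-way partition: priority items first, then the rest, each group keeping its stored order. A minimal sketch follows, assuming the records are already in stored order and that a predicate decides priority; the function name `dispatch_order` is hypothetical, and the returned flag stands in for the priority information forwarded to the storage processing unit 307.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def dispatch_order(
    records: Iterable[T], is_priority: Callable[[T], bool]
) -> List[Tuple[T, bool]]:
    """Return (record, priority_flag) pairs in the order they would be output."""
    flagged = [(record, is_priority(record)) for record in records]
    # Priority items come first; within each group the stored order is kept.
    first = [pair for pair in flagged if pair[1]]
    rest = [pair for pair in flagged if not pair[1]]
    return first + rest
```

For example, with records `[1, 2, 3, 4]` and a predicate that treats even numbers as priority, the output order is 2, 4, 1, 3.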

The priority control unit 304 may include a timer and may output non-priority voice data to the speech recognition processing unit 306 only for a fixed period. In that case, when the timer times out, the priority control unit 304 may notify the voice acquisition processing unit 302 of the timeout, and the voice acquisition processing unit 302 may issue a call voice acquisition request to the call recording device 200 in response to that notification.

Limiting the output of non-priority voice data to the speech recognition processing unit 306 to a fixed period by means of the timer, and acquiring the next recorded voice data when the timer expires, ensures that priority voice data is processed first. Without the timer, non-priority voice data cannot be recognized until no priority voice data remains in the primary output database 303. With the timer linked to voice data acquisition, non-priority voice data can still be recognized a little at a time, at lower priority.

The speech recognition processing unit 306 has an internal buffer into which the priority control unit 304 writes voice data. The speech recognition processing unit 306 recognizes the voice data in the buffer in the order in which it was written and outputs the resulting text data to the storage processing unit 307.

The storage processing unit 307 matches the text data received from the speech recognition processing unit 306 as the recognition result against the management information, with priority information added, sent from the priority control unit 304. It classifies the text data by customer, by business number, by operator, or the like, and within each classification records it in the speech recognition result database 300D in such a way that the user can tell which results were processed with priority and which were not. For example, prioritized and non-prioritized results are managed in separate folders. FIG. 4 shows an example of how speech recognition results are recorded in the speech recognition result database 300D.
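As a rough illustration of the "separate folders" idea, the sketch below builds a storage location from a classification, a key within that classification, and the priority flag. The layout and the names (`result_path`, `priority`, `normal`, the `.txt` suffix) are assumptions made for illustration; the actual arrangement shown in FIG. 4 is not reproduced in the text.

```python
from pathlib import PurePosixPath


def result_path(category: str, key: str, prioritized: bool, call_id: str) -> PurePosixPath:
    """Build a hypothetical location for one recognition result.

    category: how results are grouped, e.g. "customer", "business", "operator"
    key:      the value within that grouping, e.g. a business number
    """
    # Prioritized and non-prioritized results go to different folders
    # so that a reader can tell them apart at a glance.
    bucket = "priority" if prioritized else "normal"
    return PurePosixPath(category) / key / bucket / f"{call_id}.txt"
```

Under this assumed layout, a prioritized result for business number 21 would be placed at `business/21/priority/<call id>.txt`.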

Classifying the results and recording them in the speech recognition result database 300D so that the user can tell whether each was prioritized makes it easier to see which results should be checked first when voice data is to be reviewed as text.

The registration processing unit 308 stores priority information in the priority memory 305 in response to a priority information registration request received via the LAN interface unit 301. For example, an operator operates the PC 20 to send a priority information registration request to the speech recognition device 300. To register priority information for a particular customer, the operator refers to the customer database 400D from the PC 20 and looks up the customer's information.

FIG. 5 is a sequence diagram showing the operation of priority-based speech recognition processing in the speech recognition device.

First, the voice acquisition processing unit 302 issues a call voice acquisition request to the call recording device 200, receives the voice data and its management information via the LAN interface unit 301 (S1), and writes them to the primary output database 303 (S2). The priority control unit 304 compares the management information of the voice data written to the primary output database 303 with the priority memory 305 and determines whether there is anything to be processed with priority (S3). If there is (Yes in step S3), it judges priority in the order of the management information (S4). If an item is judged to be priority ("priority" in step S4), the priority control unit 304 outputs that voice data to the speech recognition processing unit 306 (S5) and, at the same time, outputs the management information of that voice data to the storage processing unit 307. It then checks again whether anything remains to be processed with priority (S3). If an item is judged not to be priority ("non-priority" in step S4), the voice data is left in the primary output database 303 and the priority judgment continues with the next item of voice data (S6).

If there is nothing to be processed with priority in step S3 (No in step S3), non-priority voice data is taken from the primary output database 303 and output to the speech recognition processing unit 306 (S7). At this time, the priority control unit 304 outputs the management information of that voice data to the storage processing unit 307. It is then determined whether the timer has timed out (S8). If it has not (No in step S8), the output of non-priority voice data to the speech recognition processing unit 306 in step S7 continues. If it has (Yes in step S8), processing returns to step S1 and voice data is acquired from the call recording device 200.
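Steps S1 to S8 can be read as one cycle of a loop. The sketch below is one possible rendering under stated assumptions: calls are represented as plain strings, `fetch`, `is_priority`, and `recognize` are hypothetical callables supplied by the caller, and the timer is assumed to start when non-priority output begins (the text does not say when it starts, or what happens if the non-priority data runs out before the timeout).

```python
import time
from collections import deque
from typing import Callable, Deque, Iterable


def run_cycle(
    fetch: Callable[[], Iterable[str]],
    pending: Deque[str],
    is_priority: Callable[[str], bool],
    recognize: Callable[[str, bool], None],
    timeout_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Run one pass of S1 to S8; call again to return to S1."""
    # S1, S2: acquire new calls and append them in acquisition order.
    pending.extend(fetch())

    # S3 to S6: scan in stored order, output priority calls,
    # and leave non-priority calls where they are.
    kept: Deque[str] = deque()
    while pending:
        call = pending.popleft()
        if is_priority(call):
            recognize(call, True)    # S5: output with priority flag
        else:
            kept.append(call)        # S6: not taken out
    pending.extend(kept)

    # S7, S8: output non-priority calls until the timer expires.
    deadline = clock() + timeout_s   # assumption: timer starts here
    while pending:
        recognize(pending.popleft(), False)  # S7
        if clock() >= deadline:              # S8: timed out, back to S1
            break
```

Passing the clock as a parameter is a design choice for this sketch only: it lets the timeout behavior be exercised with a fake clock instead of real waiting. Calls left in `pending` when the timer expires are picked up again on the next cycle, after any newly acquired priority calls.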

As described above, conditions under which speech recognition should be performed with priority are stored in the speech recognition device in advance, and voice data matching those conditions is recognized first, so that speech recognition is performed according to priority. In addition, because the recognition results are recorded in the speech recognition result database 300D in a way that lets the user tell whether each was prioritized, the priority results can be reviewed together when the text is checked.

The present invention is not limited to the configuration above, and various modifications are possible. For example, in the embodiment above, the priority control unit 304 of the speech recognition device 300 judges the priority of the voice data recorded in the primary output database 303 and then takes the voice data out of the primary output database 303 and outputs it to the speech recognition processing unit 306. Alternatively, the priority control unit 304 may have an internal memory, acquire the management information of the voice data (or the management information together with the voice data) from the primary output database, and return to the primary output database the management information (or the management information together with the voice data) of any voice data judged to be non-priority. Also, while the embodiment above treats priority as either priority or non-priority, it may be divided into three or more levels according to the degree of priority, with processing proceeding from the highest level down.
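The multi-level variation mentioned last can be sketched with a heap keyed on the priority level. The text does not specify a data structure, so the use of Python's `heapq`, the function name `by_priority_level`, and the convention that a smaller number means a higher priority are all assumptions. The sequence number in each heap entry keeps calls of the same level in their original order.

```python
import heapq
from typing import Callable, Iterable, Iterator


def by_priority_level(
    calls: Iterable[str], level_of: Callable[[str], int]
) -> Iterator[str]:
    """Yield calls from the highest priority level down.

    Assumption: level 0 is the highest priority, larger numbers are lower.
    """
    # (level, arrival sequence, call): the sequence number breaks ties so
    # that calls at the same level come out in arrival order.
    heap = [(level_of(call), seq, call) for seq, call in enumerate(calls)]
    heapq.heapify(heap)
    while heap:
        _, _, call = heapq.heappop(heap)
        yield call
```

With three levels, calls at level 0 are yielded first, then level 1, then level 2, and two calls at the same level keep the order in which they arrived.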

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents.

DESCRIPTION OF SYMBOLS: 10, 11, 12 ... telephone terminal; 20 ... PC; 100 ... IP exchange device; 200 ... call recording device; 200D ... call voice database; 300 ... speech recognition device; 300D ... speech recognition result database; 301 ... LAN interface unit; 302 ... voice acquisition processing unit; 303 ... primary output database; 304 ... priority control unit; 305 ... priority memory; 306 ... speech recognition processing unit; 307 ... storage processing unit; 308 ... registration processing unit; 400 ... IP network.

Claims

1. A speech recognition device comprising:
voice acquisition means for acquiring voice data from outside;
a primary output database in which the voice data is recorded by the voice acquisition means;
a priority memory that stores conditions corresponding to priorities for processing;
speech recognition processing means for recognizing voice data and converting it into text;
priority control means for comparing the voice data recorded in the primary output database with the conditions stored in the priority memory, and outputting voice data that matches a condition to the speech recognition processing means according to the priority of that condition; and
storage processing means for storing a speech recognition result produced by the speech recognition processing means in a speech recognition result database.

2. The speech recognition device according to claim 1, wherein the priority control means notifies the storage processing means of priority information for the voice data when outputting the voice data to the speech recognition processing means according to the priority of the condition, and
the storage processing means stores the speech recognition result in the speech recognition result database by priority, based on the priority information notified by the priority control means.

3. The speech recognition device according to claim 1, further comprising registration processing means for registering, in the priority memory, a priority for processing and a condition corresponding to that priority in response to an instruction from outside.

4. A speech recognition processing method comprising:
a voice acquisition step of acquiring voice data from outside;
a primary recording step of recording the voice data acquired in the voice acquisition step in a primary output database;
a speech recognition processing step of recognizing voice data and converting it into text;
a priority control step of comparing the voice data recorded in the primary output database with conditions stored in a priority memory, which stores conditions corresponding to priorities for processing, and causing speech recognition processing to be performed on voice data that matches a condition according to the priority of that condition; and
a storage processing step of storing a speech recognition result produced in the speech recognition processing step in a speech recognition result database.

5. The speech recognition processing method according to claim 4, wherein the storage processing step stores the speech recognition result in the speech recognition result database by priority.

6. The speech recognition processing method according to claim 4, further comprising a registration processing step of registering, in the priority memory, a priority for processing and a condition corresponding to that priority in response to an instruction from outside.