JP2021179639A - Voice recognition device, management system, management program, and voice recognition method

Info

Publication number: JP2021179639A
Application number: JP2021131722A
Authority: JP
Inventors: 秀憲川村; Hidenori Kawamura; 紘也永田; Hiroya Nagata
Original assignee: Til Inc
Current assignee: Til Inc
Priority date: 2019-11-12
Filing date: 2021-08-12
Publication date: 2021-11-18
Anticipated expiration: 2039-11-12
Also published as: JP2021076758A; JP7288701B2; JP6933397B2

Abstract

To provide a voice recognition device, a management system, a management program, and a management method that enable an administrator to manage the utterances of a subject without checking the subject's voice in detail. SOLUTION: A voice recognition device 20 according to the present invention has: an utterance data generation part 211 that converts voice data, received from a microphone device 10 capable of converting a subject's voice into voice data, into text to generate utterance data of the subject; a totalizing part 212 that compares the utterance data with a prohibited word list 221 stored in advance in a storage part 22 and totals the frequency of matches or similarities; and a display part 24 that displays the totals from the totalizing part 212 to the administrator. SELECTED DRAWING: Figure 1

Description

本発明は、音声認識装置、管理システム、管理プログラム及び音声認識方法に関する。より詳しくは、音声データをテキスト化して発言データを生成し、予め定められた禁止ワードデータと比較することで会話から生ずるトラブルを予防及び把握可能な装置、システム、プログラム及び方法に関する。 The present invention relates to a voice recognition device, a management system, a management program and a voice recognition method. More specifically, the present invention relates to a device, a system, a program and a method capable of preventing and grasping troubles caused by conversation by converting voice data into text to generate speech data and comparing it with predetermined prohibited word data.

労働現場、特に生活者と直接接触する現場などでは、会話からトラブルが生ずることがある。不適切な発言がハラスメント行為となるトラブルや、契約内容等に関する発言の有無や内容の子細が後になって争われるトラブルがしばしばある。会話によるトラブルの中には、言動による脅迫行為や、犯罪の示唆、教唆といった深刻なケースもある。 Conversations can cause trouble at work sites, especially in direct contact with consumers. There are often troubles in which inappropriate remarks become harassment acts, and troubles in which the presence or absence of remarks regarding the contents of contracts and the details of the contents are disputed later. Among the troubles caused by conversation, there are serious cases such as intimidation by words and actions, suggestion of crime, and incitement.

労働現場に限らず、日常生活においても、会話から生じるトラブルは重要な問題である。こうしたトラブルにおいて、会話を収音して分析することは、トラブルの予防及び事後処理に役立ち重要である。従来より、航空機のコクピットには乗務員の会話を収音して録音するボイスレコーダーが備え付けられている。万が一のインシデント及び事故において、このボイスレコーダーに録音された音声は再生され分析され、インシデント及び事故の原因の究明に役立つ。コールセンターのオペレーターにおいては、オペレーターの顧客に対する応対をマイク装置が収音して記録した音声データを、管理者が再生し分析してトラブル対応に役立てる管理方法が、しばしば採用されている。 Trouble caused by conversation is an important problem not only in the workplace but also in daily life. In such troubles, it is important to collect and analyze the conversation, which is useful for trouble prevention and post-processing. Traditionally, the cockpit of an aircraft has been equipped with a voice recorder that picks up and records crew conversations. In the unlikely event of an incident or accident, the voice recorded on this voice recorder is played back and analyzed, which is useful for investigating the cause of the incident and accident. In call center operators, a management method is often adopted in which the administrator reproduces and analyzes the voice data recorded by the microphone device picking up the response to the operator's customer to help troubleshoot.

一方、小型のマイク装置と録音装置とを組み合わせて携帯可能としたボイスレコーダーが考案され、利用されている（特許文献１）。特許文献１に記載の発明によると、従業員による外出先での業務中の会話がボイスレコーダーに収音して録音されるため、管理者は、事後処理のために、当該ボイスレコーダーに録音された会話を再生して分析できる。 On the other hand, portable voice recorders that combine a small microphone device with a recording device have been devised and are in use (Patent Document 1). According to the invention described in Patent Document 1, conversations that an employee has while working away from the office are picked up and recorded by the voice recorder, so the administrator can play back and analyze the recorded conversations for post-incident handling.

特開２００７−３２５３９２号公報 Japanese Unexamined Patent Publication No. 2007-325392

しかしながら、特許文献１に記載の管理方法においては、ボイスレコーダーへの録音時間が膨大となる。そして、この録音時間は、対象者の数が増えるにつれて、増大する。また、この録音時間は、管理の対象とする時間が増えるにつれても、増大する。したがって、管理者は、録音された音声データからトラブルの発生を示す目的の音声データを探し出すにあたって、多大な労力と時間とを要する。また、管理者は、録音された会話を活用したトラブル対応を行うにあたって、録音された音声データを回収し、発生したトラブルに対応する音声データを探し出し、さらに、探し出された音声データを再生して音声データの内容を検査するという手順を踏む必要がある。そのため、管理者は、トラブルの発生から対応までに時間を要する。したがって、こうしたボイスレコーダー等により対象者の会話を録音する管理は、管理者が、会話によるトラブルに対して早期の対応を行うという点において、また、分析に要する時間と労力の点において、不十分な点があった。 However, in the management method described in Patent Document 1, the amount of audio recorded on the voice recorder becomes enormous. The recording time grows as the number of subjects increases, and it also grows as the period under management lengthens. The administrator therefore needs a great deal of labor and time to find, within the recorded voice data, the target voice data that indicates that trouble has occurred. In addition, to handle trouble using recorded conversations, the administrator must collect the recorded voice data, search for the voice data corresponding to the trouble, and then play back the located voice data to inspect its content. The administrator therefore needs time between the occurrence of trouble and the response. Consequently, management that records a subject's conversations with such a voice recorder or the like is inadequate both in enabling the administrator to respond early to trouble arising from conversation and in the time and labor required for analysis.

管理者が、対象者の会話によるトラブルに対して早期の対応を行うためには、「金を出せ」「殺すぞ」などといったトラブルの発生を直接的に示す発言を、会話から探し出すことが重要である。さらに、トラブルの疑いがある発言について、管理者が、その頻度を把握することも重要である。例えば「お母さん」という発言は、それ単独でトラブルの発生を明示するものではない。しかし、管理者が、「お母さん」という発言の頻度を把握できれば、管理者は、「お母さんに言うぞ」「お母さんに知られてもいいのか」などの言い回しを繰り返す脅迫が行われている状況を把握できる。したがって、管理者は、早期の対応を行える。 In order for the administrator to respond early to trouble arising from a subject's conversation, it is important to find, in the conversation, remarks that directly indicate trouble, such as "hand over the money" or "I'll kill you". It is also important for the administrator to grasp the frequency of remarks that are suspected of indicating trouble. For example, the remark "mother" does not by itself indicate trouble. However, if the administrator can grasp how often "mother" is said, the administrator can recognize a situation in which threats are being made through repeated phrases such as "I'll tell your mother" or "Do you want your mother to find out?". The administrator can therefore respond early.

収音し録音された対象者の会話を用いた管理において、管理者は、録音された音声データの大部分を再生して音声データの内容を確認する作業を通じて発言の頻度を把握する。管理者が、その担当する多数の対象者に対して逐一こうした確認作業を行うことには、労力と時間の両方において、大きな困難が伴う。したがって、対象者の会話を収音して分析する管理方法において、収音された会話を録音して管理者が内容を確認する従来の方式には、なお一層の改善の余地がある。 In management that uses the picked-up and recorded conversations of subjects, the administrator grasps the frequency of remarks by playing back most of the recorded voice data and checking its content. Performing this checking one by one for the many subjects an administrator is responsible for involves great difficulty in both labor and time. Therefore, among management methods that pick up and analyze a subject's conversation, the conventional approach of recording the conversation and having the administrator check its content leaves considerable room for improvement.

本発明は、このような考えに基づいてなされたものであり、マイク装置から受信した音声データをテキスト化し、予め定められた禁止ワードデータと比較し、一致又は類似する頻度を集計して表示することで、管理者が対象者の音声を逐一確認することなく対象者の発言を管理可能な音声認識装置、管理システム、管理プログラム及び音声認識方法を提供することを目的とする。 The present invention has been made based on such an idea, and the voice data received from the microphone device is converted into text, compared with a predetermined prohibited word data, and the matching or similar frequency is aggregated and displayed. It is an object of the present invention to provide a voice recognition device, a management system, a management program, and a voice recognition method capable of managing a subject's remarks without the administrator confirming the subject's voice one by one.

本発明者らは、上記課題を解決するために鋭意検討した結果、マイク装置から受信した音声データをテキスト化し、予め定められた禁止ワードデータと比較し、一致又は類似する頻度を集計して表示することで、上記の目的を達成できることを見出し、本発明を完成させるに至った。具体的に、本発明は以下のものを提供する。 As a result of diligent studies to solve the above problems, the present inventors have converted the voice data received from the microphone device into text, compared it with predetermined prohibited word data, and aggregated and displayed the matching or similar frequency. By doing so, it was found that the above object can be achieved, and the present invention has been completed. Specifically, the present invention provides the following.

第１の発明に係る特徴は、対象者の音声を音声データに変換可能なマイク装置と接続され、前記マイク装置から受信した前記音声データをテキスト化可能な音声認識装置であって、前記マイク装置から受信した前記音声データをテキスト化し、前記対象者の発言データを生成する発言データ生成部と、前記発言データと予め定められた禁止ワードデータとを比較し、一致又は類似する頻度を集計する集計部と、前記集計部による集計の結果を管理者に表示する表示部とを有する、音声認識装置である。 A first feature of the invention is a voice recognition device that is connected to a microphone device capable of converting a subject's voice into voice data and that can convert the voice data received from the microphone device into text, the voice recognition device having: a speech data generation unit that converts the voice data received from the microphone device into text to generate speech data of the subject; an aggregation unit that compares the speech data with predetermined prohibited word data and aggregates the frequency of matches or similarities; and a display unit that displays the aggregation result of the aggregation unit to the administrator.

第１の特徴に係る発明によれば、対象者の音声に含まれる禁止ワードの頻度が、自動的に集計され、表示部に表示される。例えば、管理者が「殺すぞ」などのトラブルの発生を示すフレーズを禁止ワードに定めることで、対象者がそうしたフレーズを含む発言を行った頻度が集計され、表示部を通じて管理者に表示される。管理者は、表示部の表示を通じて、対象者による禁止ワードを含む発言の頻度を把握する。管理者は、表示された禁止ワードを含む発言の頻度を用いて対象者周辺でのトラブルの発生を把握し、対象者への警告や警察への通報などの管理を行う。管理者は、対象者の音声を逐一確認することなく、対象者の発言に関するこの一連の管理を行える。 According to the invention of the first feature, the frequency of prohibited words contained in the subject's voice is automatically aggregated and displayed on the display unit. For example, when the administrator designates a phrase that indicates trouble, such as "I'll kill you", as a prohibited word, the frequency with which the subject made remarks containing that phrase is aggregated and displayed to the administrator through the display unit. Through this display, the administrator grasps how often the subject makes remarks containing prohibited words. Using the displayed frequency, the administrator identifies trouble arising around the subject and takes management action such as warning the subject or reporting to the police. The administrator can carry out this whole series of management tasks without checking the subject's voice one by one.
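The aggregation described for the first feature can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `count_prohibited`, the similarity threshold of 0.8, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions, since the description does not specify how "similar" is determined.

```python
from collections import Counter
from difflib import SequenceMatcher


def count_prohibited(utterances, prohibited_words, threshold=0.8):
    """Count utterances that contain, or closely resemble, each prohibited word.

    `threshold` is a hypothetical similarity cutoff (0.0 to 1.0).
    """
    counts = Counter()
    for text in utterances:
        for word in prohibited_words:
            if word in text:
                counts[word] += 1  # exact match
                continue
            # Similar match: slide a window of the word's length over the text
            # and compare each window with the prohibited word.
            n = len(word)
            windows = (text[i:i + n] for i in range(max(len(text) - n + 1, 1)))
            if any(SequenceMatcher(None, word, w).ratio() >= threshold
                   for w in windows):
                counts[word] += 1
    return counts


if __name__ == "__main__":
    transcribed = ["I'll kill you", "pay the money now", "I will kil you"]
    print(count_prohibited(transcribed, ["kill you", "pay the money"]))
```

Each utterance is counted at most once per prohibited word, so the result corresponds to "number of remarks containing the word" rather than the raw number of occurrences; a real system might prefer the latter.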

よって、第１の特徴にかかる発明によると、管理者が対象者の音声を逐一確認することなく対象者の発言を管理可能な音声認識装置を提供できる。 Therefore, according to the invention according to the first feature, it is possible to provide a voice recognition device capable of managing the remarks of the target person without the administrator confirming the voice of the target person one by one.

第２の特徴に係る発明は、第１の特徴の発明に係る発明であって、前記集計部による集計の結果、前記発言データと前記禁止ワードデータとが一致又は類似する頻度が所定の頻度以上である場合に、前記管理者に警告する警告部をさらに有する、音声認識装置を提供する。 The invention according to the second feature is the invention according to the first feature, and provides a voice recognition device further having a warning unit that warns the administrator when, as a result of the aggregation by the aggregation unit, the frequency with which the speech data and the prohibited word data match or are similar is equal to or higher than a predetermined frequency.

第２の特徴に係る発明によれば、発言データと禁止ワードデータとが一致又は類似する頻度が所定の頻度以上である場合に、警告部が管理者に警告する。警告部が警告を行うため、管理者は、いち早くトラブルに対応できる。また、所定の頻度以上である場合に警告部による警告が行われることで、いたずらな警告や、発音が良く似たワードを一致又は類似するものとして集計したことによる誤警告が防がれる。したがって、発言管理装置は、必要とされる警告を、誤警告に紛れることなく、管理者のもとにいち早く届けられる。管理者は、より適切かつ迅速に、トラブルの発生を示す発言を把握し、管理できる。 According to the invention of the second feature, the warning unit warns the administrator when the frequency with which the speech data and the prohibited word data match or are similar is equal to or higher than a predetermined frequency. Because the warning unit issues a warning, the administrator can respond to trouble promptly. In addition, because the warning is issued only at or above the predetermined frequency, needless warnings are prevented, as are false warnings caused by counting similar-sounding words as matches or similarities. The speech management device can therefore deliver the warnings that are needed to the administrator promptly, without their being buried among false warnings. The administrator can grasp and manage remarks that indicate trouble more appropriately and quickly.
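The threshold behavior of the warning unit can be illustrated with a short Python sketch. The function name `words_to_warn` and the default threshold of 3 are hypothetical; the description only requires that a warning be issued when a count reaches "a predetermined frequency".

```python
def words_to_warn(counts, min_count=3):
    """Return the prohibited words whose aggregated count reached the threshold.

    `counts` maps each prohibited word to its aggregated frequency.
    `min_count` stands in for the "predetermined frequency" (an assumed value).
    """
    return sorted(word for word, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    aggregated = {"mother": 4, "kill": 1, "money": 3}
    for word in words_to_warn(aggregated):
        print(f"WARNING: '{word}' reached the warning threshold")
```

A single stray match (here "kill" with a count of 1) does not trigger a warning, which mirrors the stated goal of suppressing false warnings from occasional misrecognition.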

第３の特徴に係る発明は、第２の特徴の発明に係る発明であって、前記音声データと予め記憶された声紋データとを比較し、前記音声データの発信元が前記声紋データの発声元であるか否かを判別する発信元判別部をさらに有し、前記警告部は、前記音声データの発信元が前記声紋データの発声元であるとき前記管理者に警告する、音声認識装置を提供する。 The invention according to the third feature is the invention according to the second feature, and provides a voice recognition device further having a source determination unit that compares the voice data with voiceprint data stored in advance and determines whether the source of the voice data is the speaker of the voiceprint data, wherein the warning unit warns the administrator when the source of the voice data is the speaker of the voiceprint data.

第３の特徴に係る発明によれば、発信元判別部が音声データの発声元を判別することで、予め記憶された声紋データと一致する音声データである場合にのみ、警告部が管理者に警告する動作が行われる。例えば、管理者が、対象者の声紋と「絶対に儲かる」などの不当な勧誘を示す禁止ワードとを登録しておくことにより、対象者による不当な勧誘行為があったことが管理者に警告される。これにより、対象者以外の発話者の音声、特に、対象者との会話に参加していない対象者の周囲に居合わせた第三者の音声、に含まれる禁止ワードが集計され、管理者に警告される誤警告を避けられる。 According to the invention of the third feature, because the source determination unit determines the speaker of the voice data, the warning unit warns the administrator only when the voice data matches the voiceprint data stored in advance. For example, if the administrator registers the subject's voiceprint together with a prohibited word that indicates improper solicitation, such as "guaranteed to make money", the administrator is warned when the subject engages in improper solicitation. This avoids false warnings in which prohibited words in the voices of speakers other than the subject, in particular third parties who happen to be near the subject but are not taking part in the conversation with the subject, are aggregated and reported to the administrator.
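A minimal Python sketch of gating the warning on the speaker check follows. The description does not say how voiceprints are represented or compared, so everything here is an assumption: voiceprints are modeled as plain numeric feature vectors, cosine similarity is used as the comparison, and the names `cosine_similarity`, `should_warn`, and the 0.9 cutoff are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_warn(voice_vector, enrolled_vector, match_count,
                min_count=1, min_similarity=0.9):
    """Warn only if the speaker matches the enrolled voiceprint AND the
    prohibited-word count reached the threshold.

    Both thresholds are assumed values for illustration.
    """
    is_subject = cosine_similarity(voice_vector, enrolled_vector) >= min_similarity
    return is_subject and match_count >= min_count


if __name__ == "__main__":
    enrolled = [1.0, 0.0]
    print(should_warn([1.0, 0.0], enrolled, match_count=2))  # same speaker
    print(should_warn([0.0, 1.0], enrolled, match_count=5))  # bystander
```

In practice the feature vectors would come from a speaker-verification model; the point of the sketch is only that prohibited words spoken by a non-enrolled voice never reach the warning step.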

第４の特徴に係る発明は、第１から第３のいずれかの特徴に係る発明であって、前記発言データ生成部により生成された発言データを前記音声データの発信元ごとに分類して蓄積して保存する発言データ保存部をさらに有する、音声認識装置を提供する。 The invention according to the fourth feature is an invention according to any one of the first to third features, and the speech data generated by the speech data generation unit is classified and stored for each source of the voice data. Provided is a voice recognition device further having a speech data storage unit for storing the speech data.

第４の特徴に係る発明によれば、生成された発言データが発信元ごとに分類され蓄積され保存される。これにより、管理者は、保存された発言データを読み出して、その内容を精査できる。不正行為が疑われる場合などで、当事者間で発言の有無や発言の内容についての説明が食い違うことがしばしばある。こうした当事者間で発言の有無や発言の内容についての説明が食い違う状況において、管理者は、保存された発言データという証拠を精査することで、状況を客観的に把握できる。また、対象者が当事者である労働審判において、管理者は、保存された発言データを証拠として提示できる。 According to the invention according to the fourth feature, the generated speech data is classified, accumulated and stored for each source. As a result, the administrator can read the stored remark data and scrutinize its contents. In cases such as when cheating is suspected, there are often discrepancies between the parties regarding the presence or absence of remarks and the explanation of the content of the remarks. In a situation where the presence or absence of a statement and the explanation of the content of the statement differ between the parties, the administrator can objectively grasp the situation by examining the evidence of the stored statement data. In addition, in a labor trial in which the subject is a party, the manager can present the stored statement data as evidence.

また、第４の特徴に係る発明によれば、発言データが保存されることにより、管理者は、保存された発言データをビッグデータとして解析し、利用できる。管理者は、このような解析により、対象者の発言傾向を分析したり、トラブルの発生前に出現しやすいワードを発見して禁止ワードに加えたりできる。 Further, according to the invention according to the fourth feature, by storing the remark data, the administrator can analyze and use the stored remark data as big data. By such an analysis, the administrator can analyze the speech tendency of the target person, find words that are likely to appear before the trouble occurs, and add them to the prohibited words.

第５の特徴に係る発明は、第４の特徴に係る発明であって、前記表示部は、前記発言データ保存部に保存された複数の発言データの中から特定の発信元の発言データを抽出して表示可能である、音声認識装置を提供する。 The invention according to the fifth feature is the invention according to the fourth feature, and the display unit extracts speech data of a specific sender from a plurality of speech data stored in the speech data storage section. To provide a voice recognition device that can be displayed.

第５の特徴に係る発明によれば、表示部は、特定の発信元の発言データを抽出して表示できる。これにより、管理者は、営業成績が良いなど管理者にとって好ましい特徴を有する発信元の発言傾向を分析して利用できる。あるいは、管理者は、営業成績が悪い、クレームが多いなど、管理者にとって好ましくない特徴を有する発信元の発言傾向を分析して利用できる。 According to the invention according to the fifth feature, the display unit can extract and display the speech data of a specific sender. As a result, the manager can analyze and use the remark tendency of the sender, which has favorable characteristics for the manager such as good sales performance. Alternatively, the manager can analyze and use the remark tendency of the sender having characteristics that are unfavorable to the manager, such as poor sales performance and many complaints.
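The storage and extraction behavior of the fourth and fifth features can be sketched as a small in-memory store in Python. The class name `UtteranceStore` and its methods are hypothetical, and a real system would presumably persist the data rather than hold it in memory.

```python
from collections import defaultdict


class UtteranceStore:
    """Keeps transcribed speech data grouped by its source (speaker)."""

    def __init__(self):
        self._by_source = defaultdict(list)

    def save(self, source_id, text):
        """Append one transcribed utterance under the given source."""
        self._by_source[source_id].append(text)

    def for_source(self, source_id):
        """Return a copy of all utterances stored for one source."""
        return list(self._by_source.get(source_id, []))


if __name__ == "__main__":
    store = UtteranceStore()
    store.save("employee-A", "This plan is guaranteed to make money")
    store.save("employee-B", "Thank you for your time today")
    store.save("employee-A", "Please sign here")
    print(store.for_source("employee-A"))
```

Returning a copy from `for_source` keeps the stored record from being altered by the caller, which matters if the data may later be presented as evidence.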

第６の特徴に係る発明は、第１から第５のいずれかの特徴に係る発明であって、前記集計部は、予め定められ、前記音声データの発信元が属する属性において頻繁に用いられる特定音声データと前記音声データとを比較し、前記特定音声データとは異なる前記音声データについて前記集計を行う、音声認識装置を提供する。 The invention according to the sixth feature is an invention according to any one of the first to fifth features, wherein the aggregation unit is predetermined and is frequently used in the attribute to which the source of the voice data belongs. Provided is a voice recognition device that compares voice data with the voice data and performs the aggregation for the voice data different from the specific voice data.

第６の特徴に係る発明によれば、管理者は、対象者が属する属性において、しばしば発生する大きな音を、特定音声データとして定められる。これにより、その大きな音は、集計部で処理されなくなる。例えば、対象者が水道工事作業員であり、周辺で頻繁に工具の音が鳴る場合に、管理者は、工具の音を特定音声データとして登録する。これにより、しばしば発生する工具の音が集計部で処理されなくなる。したがって、集計部の負荷が軽減される。 According to the invention according to the sixth feature, the manager defines a loud sound that often occurs in the attribute to which the subject belongs as specific voice data. As a result, the loud sound is not processed by the tabulation unit. For example, when the target person is a waterworks worker and the sound of the tool is frequently heard in the vicinity, the administrator registers the sound of the tool as specific voice data. As a result, the frequently generated tool sounds are not processed by the tabulation unit. Therefore, the load on the tabulation unit is reduced.

第７の特徴に係る発明は、対象者の音声を音声データに変換可能なマイク装置と、上記の音声認識装置とを備え、前記マイク装置は、前記音声データに含まれる音の音量を計測する音量計測部と、前記音量が所定量以上である音声データを前記音声認識装置に送信する送信部とを有する、管理システムを提供する。 The invention according to the seventh feature includes a microphone device capable of converting a subject's voice into voice data and the above-mentioned voice recognition device, and the microphone device measures the volume of sound included in the voice data. Provided is a management system including a volume measuring unit and a transmitting unit that transmits voice data having a volume equal to or higher than a predetermined amount to the voice recognition device.

第７の特徴に係る発明によれば、音量計測部が音声データに含まれる音の音量を計測することで、音量が所定量以上である音声データが音声認識装置に送信される。これにより、音量が小さく、したがって、テキスト化され比較され集計される発言が含まれない音声データが、発言データ生成部及び集計部で処理されることが避けられる。したがって、発言データ生成部及び集計部における処理量が減り、管理システムの負荷が軽減される。 According to the invention according to the seventh feature, the volume measuring unit measures the volume of the sound included in the voice data, so that the voice data whose volume is equal to or higher than a predetermined amount is transmitted to the voice recognition device. As a result, it is possible to prevent the speech data generation unit and the aggregation unit from processing the voice data that is low in volume and therefore does not include the speech that is converted into text, compared, and aggregated. Therefore, the amount of processing in the speech data generation unit and the aggregation unit is reduced, and the load on the management system is reduced.

第８の特徴に係る発明は、対象者の音声を音声データに変換可能なマイク装置と、上記の音声認識装置とを備え、前記マイク装置は、該マイク装置の現在位置を測位する測位部を有し、前記禁止ワードデータは、前記マイク装置の存在位置に応じて予め定められており、前記集計部は、前記マイク装置から受信した前記現在位置に基づいて前記発言データと前記禁止ワードデータとを比較し、一致又は類似する頻度を集計する、管理システムを提供する。 The invention according to the eighth feature includes a microphone device capable of converting a subject's voice into voice data and the above-mentioned voice recognition device, and the microphone device provides a positioning unit for positioning the current position of the microphone device. The prohibited word data is predetermined according to the existing position of the microphone device, and the aggregation unit includes the speech data and the prohibited word data based on the current position received from the microphone device. Provides a management system that compares and aggregates matches or similar frequencies.

第８の特徴に係る発明によれば、集計部は、マイク装置の存在位置に応じた禁止ワードデータを参照するため、その存在位置を含む施設に応じた不適切な発言が行われた頻度を集計できる。この施設に応じた不適切な発言は、例えば、駅などの人が多い公共の場所において社外秘の情報を発言すること等を含む。また、集計部は、その存在位置を含む施設に応じた適切な発言が行われた頻度も、集計できる。例えば、管理者が、対象者の業務に関連する単語を禁止ワードとして登録する。そして、管理システムが、対象者が業務を行う場所から離れた場所にいるときに、業務に関係する単語を含む発言を行っている頻度を集計し表示する。これにより、管理者は、対象者が、業務の一部として業務を行う場所から離れたのか、あるいは単に業務を放棄して業務を行う場所から離れたのかを、業務に関連する単語を含む発言の多寡を指標に用いて判断できる。 According to the invention of the eighth feature, because the aggregation unit refers to prohibited word data that depends on the location of the microphone device, it can aggregate the frequency of remarks that are inappropriate for the facility at that location. Such remarks include, for example, speaking about confidential company information in a crowded public place such as a train station. The aggregation unit can also aggregate the frequency of remarks that are appropriate for the facility at that location. For example, the administrator registers words related to the subject's work as prohibited words. The management system then aggregates and displays how often the subject makes remarks containing work-related words while away from the place of work. Using the number of such remarks as an indicator, the administrator can judge whether the subject left the place of work as part of the job or simply abandoned the job.
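Selecting a prohibited word list by the microphone device's position can be sketched in Python as below. This is an assumption-laden illustration: the description does not define how locations map to word lists, so the `Zone` type, its latitude/longitude bounding box, and the function `words_for_position` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A rectangular area with its own prohibited word list (assumed model)."""
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    prohibited_words: tuple

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def words_for_position(zones, lat, lon, default=()):
    """Return the word list of the first zone containing the position,
    or the default list if no zone matches."""
    for zone in zones:
        if zone.contains(lat, lon):
            return zone.prohibited_words
    return tuple(default)


if __name__ == "__main__":
    zones = [Zone("station", 35.0, 35.1, 139.0, 139.1, ("confidential",))]
    print(words_for_position(zones, 35.05, 139.05))
    print(words_for_position(zones, 36.0, 139.05, default=["kill"]))
```

The returned list would then be passed to the aggregation step in place of a single global prohibited word list.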

第９及び第１０の特徴に係る発明は、第１の特徴に係る発明のカテゴリ違いである。 The inventions according to the ninth and tenth features are different categories of the inventions according to the first feature.

本発明によれば、マイク装置から受信した音声データをテキスト化し、予め定められた禁止ワードデータと比較し、一致又は類似する頻度を集計して表示することで、管理者が対象者の音声を逐一確認することなく対象者の発言を管理できる音声認識装置を提供できる。 According to the present invention, the voice data received from the microphone device is converted into text, compared with a predetermined prohibited word data, and the matching or similar frequency is aggregated and displayed, so that the administrator can display the voice of the target person. It is possible to provide a voice recognition device that can manage the remarks of the target person without checking each time.

図１は、本実施形態における音声認識装置２０を用いた管理システム１のハードウェア構成とソフトウェア構成を概略的に示すブロック図である。 FIG. 1 is a block diagram schematically showing a hardware configuration and a software configuration of a management system 1 using the voice recognition device 20 in the present embodiment.
図２は、本実施形態における管理システム１を使用した管理の流れを示すフローチャート図である。 FIG. 2 is a flowchart showing a flow of management using the management system 1 in the present embodiment.
図３は、本実施形態における禁止ワードリスト２２１の一例を示す図である。 FIG. 3 is a diagram showing an example of the prohibited word list 221 in the present embodiment.
図４は、本実施形態における集計リスト２２２の一例を示す図である。 FIG. 4 is a diagram showing an example of the aggregation list 222 in the present embodiment.
図５は、本実施形態における発言保存リスト２２３の一例を示す図である。 FIG. 5 is a diagram showing an example of the speech storage list 223 in the present embodiment.
図６は、本実施形態における管理システム１を使用して表示部２４に集計結果等を示す管理画面を表示したときの一例を示す図である。 FIG. 6 is a diagram showing an example of a management screen showing aggregation results and the like, displayed on the display unit 24 using the management system 1 in the present embodiment.
図７は、変形例２における特定音声リスト２２５の一例を示す図である。 FIG. 7 is a diagram showing an example of the specific voice list 225 in Modification 2.
図８は、変形例２において、特定音声リスト２２５を参照し、マイク装置１０が収音した音声データから、しばしば発生する大きな音の音声データを外して集計する処理の一例を示す概念図である。 FIG. 8 is a conceptual diagram showing an example of processing in Modification 2 in which the specific voice list 225 is referred to and voice data of frequently occurring loud sounds is excluded from the voice data picked up by the microphone device 10 before aggregation.

以下、本発明を実施するための好適な形態の一例について図を参照しながら説明する。なお、これはあくまでも一例であって、本発明の技術的範囲はこれに限られるものではない。 Hereinafter, an example of a suitable embodiment for carrying out the present invention will be described with reference to the drawings. It should be noted that this is only an example, and the technical scope of the present invention is not limited to this.

＜管理システム１＞ <Management system 1>
図１は、本実施形態における音声認識装置２０を用いた管理システム１のハードウェア構成とソフトウェア構成を概略的に示すブロック図である。 FIG. 1 is a block diagram schematically showing a hardware configuration and a software configuration of a management system 1 using the voice recognition device 20 in the present embodiment.

管理システム１は、対象者の音声を音声データに変換可能なマイク装置１０と、マイク装置１０と接続され、管理システム１を運用するためにデータ管理やデータ処理、画面表示や管理者への警告等を行う音声認識装置２０と、を含んで構成される。 The management system 1 includes a microphone device 10 capable of converting a subject's voice into voice data, and a voice recognition device 20 that is connected to the microphone device 10 and performs data management, data processing, screen display, warnings to the administrator, and the like in order to operate the management system 1.

〔マイク装置１０〕 [Microphone device 10]
マイク装置１０は、マイク装置１０の動作を制御する制御部１１と、制御部１１のマイクロコンピューターで実行される制御プログラム等が記憶される記憶部１２と、音声認識装置２０その他の機器と通信を行う通信部１３と、マイク装置１０の現在位置を測位する測位部１５とを備える。 The microphone device 10 includes a control unit 11 that controls the operation of the microphone device 10, a storage unit 12 that stores a control program and the like executed by the microcomputer of the control unit 11, a communication unit 13 that communicates with the voice recognition device 20 and other devices, and a positioning unit 15 that measures the current position of the microphone device 10.

必須ではないが、マイク装置１０が固有の識別情報を持ち、この識別情報を音声認識装置２０に送信することが好ましい。固有の識別情報が送信されれば、音声認識装置２０は、この識別情報を受信して利用することで、マイク装置１０と発信元とを容易に紐付けられる。それによって、音声認識装置２０は、発信元を速やかに特定できる。この固有の識別情報として、例えば、マイク装置１０のＩＰｖ６アドレスや、マイク装置１０が備えるネットワークカードのＭＡＣアドレスなどが利用可能である。 Although not essential, it is preferable that the microphone device 10 has unique identification information and this identification information is transmitted to the voice recognition device 20. When the unique identification information is transmitted, the voice recognition device 20 receives and uses the identification information, so that the microphone device 10 and the source can be easily associated with each other. Thereby, the voice recognition device 20 can quickly identify the source. As this unique identification information, for example, the IPv6 address of the microphone device 10, the MAC address of the network card included in the microphone device 10, and the like can be used.
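Linking a microphone device to its source by a unique identifier can be illustrated with a small Python lookup. The registry contents, the function names `normalize_mac` and `source_for_device`, and the choice of a MAC address as the key are assumptions for illustration; the description mentions MAC and IPv6 addresses only as examples of usable identifiers.

```python
def normalize_mac(mac):
    """Normalize a MAC address to lowercase, colon-separated form."""
    return mac.replace("-", ":").lower()


def source_for_device(registry, mac):
    """Look up the source (subject) registered for a device identifier.

    `registry` maps normalized MAC addresses to source IDs (hypothetical data).
    Returns None when the device is not registered.
    """
    return registry.get(normalize_mac(mac))


if __name__ == "__main__":
    registry = {"00:1a:2b:3c:4d:5e": "employee-A"}
    print(source_for_device(registry, "00-1A-2B-3C-4D-5E"))
```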

制御部１１は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）等を備える。 The control unit 11 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like.

記憶部１２は、データやファイルが記憶される装置であって、ハードディスクや半導体メモリ、記録媒体、メモリカード等による、データのストレージ部を有する。記憶部１２には、制御部１１のマイクロコンピューターで実行される制御プログラム等が記憶されている。 The storage unit 12 is a device for storing data and files, and has a data storage unit such as a hard disk, a semiconductor memory, a recording medium, and a memory card. The storage unit 12 stores a control program or the like executed by the microcomputer of the control unit 11.

通信部１３は、マイク装置１０その他の機器と通信可能にするためのデバイス、例えばイーサネット（登録商標）規格に対応したネットワークカード、携帯電話ネットワークに対応した無線装置等を有する。 The communication unit 13 has a device for enabling communication with the microphone device 10 and other devices, for example, a network card compatible with the Ethernet (registered trademark) standard, a wireless device compatible with a mobile phone network, and the like.

測位部１５の構成は、特に限定されない。測位部１５として、例えば、ＧＰＳ衛星からの信号を受信して測位するＧＰＳ受信機を用いた測位システムや、携帯電話の基地局からの情報を用いて測位するシステム等が挙げられる。 The configuration of the positioning unit 15 is not particularly limited. Examples of the positioning unit 15 include a positioning system using a GPS receiver for positioning by receiving a signal from a GPS satellite, a system for positioning using information from a base station of a mobile phone, and the like.

〔音声認識装置２０〕 [Voice recognition device 20]
音声認識装置２０は、音声認識装置２０の動作を制御する制御部２１と、制御部２１のマイクロコンピューターで実行される制御プログラム等や、発言データの処理に用いられる各種データ等や、発言データを処理して生成されるデータ等が格納される記憶部２２と、マイク装置１０との通信を行う通信部２３と、集計結果等を表示する表示部２４とを備える。発言データの処理に用いられるデータには、禁止ワードデータ等を含む。発言データを処理して生成されるデータには、禁止ワードと一致又は類似する発言を集計した結果等が含まれる。 The voice recognition device 20 includes a control unit 21 that controls the operation of the voice recognition device 20; a storage unit 22 that stores a control program and the like executed by the microcomputer of the control unit 21, various data used for processing speech data, and data generated by processing speech data; a communication unit 23 that communicates with the microphone device 10; and a display unit 24 that displays aggregation results and the like. The data used for processing speech data includes prohibited word data and the like. The data generated by processing speech data includes, for example, the result of aggregating remarks that match or are similar to prohibited words.

表示部２４の種類は、特に限定されない。表示部２４として、例えば、モニタ、タッチパネル等が挙げられる。 The type of the display unit 24 is not particularly limited. Examples of the display unit 24 include a monitor, a touch panel, and the like.

制御部２１は、所定のプログラムを読み込み、必要に応じて記憶部２２及び／又は通信部２３と協働することで、管理システム１におけるソフトウェア構成の要素である発言データ生成部２１１、集計部２１２、警告部２１３、発言データ保存部２１４等を実現する。 The control unit 21 reads a predetermined program and, cooperating with the storage unit 22 and/or the communication unit 23 as necessary, implements the speech data generation unit 211, the aggregation unit 212, the warning unit 213, the speech data storage unit 214, and the like, which are elements of the software configuration of the management system 1.

制御部２１、記憶部２２、通信部２３のハードウェア構成は、それぞれ制御部１１、記憶部１２、通信部１３のハードウェア構成と同様である。また、記憶部２２には、禁止ワードデータを列挙した禁止ワードリスト２２１、禁止ワードと一致あるいは類似する発言データの頻度を集計した集計リスト２２２等が記憶されている。加えて、記憶部２２は、マイク装置１０から受信した対象者の音声データをセット可能に構成されている。 The hardware configurations of the control unit 21, the storage unit 22, and the communication unit 23 are the same as the hardware configurations of the control unit 11, the storage unit 12, and the communication unit 13, respectively. Further, the storage unit 22 stores a prohibited word list 221 that lists prohibited word data, a summary list 222 that aggregates the frequency of speech data that matches or is similar to the prohibited words, and the like. In addition, the storage unit 22 is configured to be able to set the voice data of the target person received from the microphone device 10.

<Management procedure using management system 1>
FIG. 2 is an example of a flowchart showing a procedure for managing the speech of the target person with the management system 1 that uses the voice recognition device 20. The preferred software configuration of the management system 1 is described in more detail below with reference to FIG. 2.

[Step S10: Voice conversion]
First, the microphone device 10 picks up the voice of the target person. The control unit 11 of the microphone device 10 then cooperates with the storage unit 12 to convert the collected voice into voice data.

A well-known format such as WAV, MP3, or FLAC is used for the voice data. To reduce consumption of communication bandwidth, a format that uses lossy compression, such as MP3, is preferable.

[Step S11: Volume measurement]
Next, the control unit 11 cooperates with the storage unit 12 to execute the volume measuring unit 111 and measures the volume of the voice data.

[Step S12: Determining whether the volume is equal to or higher than a predetermined amount]
If the volume measured in step S11 is equal to or higher than the predetermined amount, the process proceeds to step S13. If the volume is less than the predetermined amount, the process ends and the device returns to the standby state, waiting for voice input.

According to the management system 1 of the present embodiment, steps S11 and S12 ensure that voice data is transmitted to the voice recognition device 20 only when the volume is equal to or higher than the predetermined amount.
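The volume gate of steps S11 and S12 can be sketched as follows. This is only an illustration in Python: the root-mean-square measure, the function names, and the threshold value 0.1 are assumptions, since the embodiment does not specify how the volume is measured or what the predetermined amount is.

```python
import math

# Hypothetical "predetermined amount": RMS level on samples normalized to [-1.0, 1.0].
VOLUME_THRESHOLD = 0.1


def measure_volume(samples):
    """Step S11 (assumed measure): root-mean-square level of the samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_transmit(samples, threshold=VOLUME_THRESHOLD):
    """Step S12: continue to S13/S14 only when the volume reaches the threshold."""
    return measure_volume(samples) >= threshold


quiet = [0.01, -0.02, 0.01, -0.01]   # background-level input
loud = [0.5, -0.4, 0.6, -0.5]        # speech-level input
```

With these sample values, only `loud` passes the gate; `quiet` would send the device back to the standby state.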

[Step S13: Positioning of the current position]
The control unit 11 cooperates with the positioning unit 15 to measure the current position of the microphone device 10 and stores it in the storage unit 12 as current position data.

Although it is not essential, it is also preferable to store in the storage unit 12, as the current position data, the current position that the positioning unit 15 measured at a pre-scheduled timing, and to read out this current position data in place of positioning in this step. This avoids measuring the current position each time voice is converted and waiting for the positioning process to complete.
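A minimal sketch of this scheduled positioning, assuming a fixed refresh interval, is shown below. The class name `PositionCache`, the `locate` callable standing in for the positioning unit 15, and the injectable clock are all hypothetical and serve only to illustrate reading a stored position instead of positioning on every conversion.

```python
import time


class PositionCache:
    """Holds the last measured position so step S13 can read it without waiting."""

    def __init__(self, locate, interval_s, clock=time.monotonic):
        self._locate = locate          # stand-in for the positioning unit 15
        self._interval_s = interval_s  # assumed schedule: refresh every interval_s seconds
        self._clock = clock
        self._position = None
        self._measured_at = None

    def refresh_if_due(self):
        """Called on the schedule; measures only when the interval has elapsed."""
        now = self._clock()
        if self._measured_at is None or now - self._measured_at >= self._interval_s:
            self._position = self._locate()
            self._measured_at = now

    def current_position(self):
        """Step S13 alternative: read the stored position, no positioning performed."""
        return self._position
```

Here `current_position()` returns immediately, so the transmission in step S14 is not delayed by the positioning process.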

[Step S14: Voice data transmission]
The control unit 11 cooperates with the storage unit 12 and the communication unit 13 to execute the transmission unit 112 and transmits the voice data and the current position data to the voice recognition device 20.

According to the management system 1 of the present embodiment, steps S10 to S14 transmit the voice data to the voice recognition device 20, and steps S13 to S14 allow the current position to be transmitted to the voice recognition device 20.

[Step S20: Voice data reception]
When the voice data and the current position data are transmitted in step S14, the communication unit 23 of the voice recognition device 20 receives them. The received voice data and current position data are stored in the storage unit 22.

[Step S21: Speech data generation]
When the voice data is received in step S20, the control unit 21 cooperates with the storage unit 22 to execute the speech data generation unit 211, converts the received voice data into text, and generates speech data. The generated speech data is stored in the storage unit 22. A well-known speech recognition technique, such as speech recognition using a statistical model, is used to convert the voice data into text.

[Step S23: Comparison with prohibited words]
When the speech data is generated in step S21, the control unit 21 cooperates with the storage unit 22 to execute the aggregation unit 212 and compares the speech data with each of the prohibited words stored in the prohibited word list 221. If the speech data matches a prohibited word, the process proceeds to step S24. If the speech data matches no prohibited word, the process proceeds to step S26.

FIG. 3 is a diagram showing an example of the prohibited word list 221 in the present embodiment. As shown by "source" and "prohibited word" in FIG. 3, the prohibited word list 221 lists pairs of a source and a prohibited word associated with that source, each together with an ID. In the example, ID: 1 in FIG. 3 indicates that the phrase "mother" is set as a prohibited word for the source "Taro Tanaka", who is the target person.

Because the source and the prohibited word are associated, the administrator can set prohibited words for each source. This allows fine-grained management of remarks that are inappropriate for a particular source. For example, the phrase "beautiful hair" can lead to sexual harassment, so it is normally set as a prohibited word and aggregated; when the source is a hairdresser, however, it is an ordinary remark that accompanies the source's work, so it is neither set as a prohibited word nor aggregated.

Although it is not essential, it is also preferable that the prohibited word list 221 includes area information associated with each prohibited word, as shown by "area" in FIG. 3. When the prohibited word list 221 includes an area, the aggregation unit 212 can compare this area with the current position data and refer to the prohibited words that apply at the current position. This makes it possible to aggregate prohibited words in finer detail according to the current position.

Prohibited words that depend on the current position are described using IDs 1 to 3 and 101 to 104 of FIG. 3 as an example. In this example, seven pairs of a prohibited word and an area are defined for "Taro Tanaka", who is engaged in waterworks. IDs 1 and 101 define the phrase "mother" as a prohibited word for the source "Taro Tanaka", the target person, in the two areas "station" and "residential area". The phrase "mother" indicates trouble such as intimidation regardless of the area in which it is spoken, so it is defined for both areas. ID 2 defines the phrase "water charges" as a prohibited word for the "station" area. For Taro Tanaka, who is engaged in waterworks, the station is not a place where he normally works. However, "water charges" is a work-related phrase, so if he repeatedly makes remarks including "water charges" at the station, the administrator can determine that Taro Tanaka visited the station to do business. IDs 102 to 104 define the phrases "beautiful woman", "baby face", and "beautiful hair", which refer to appearance, as prohibited words for the "residential area" area. Referring to appearance in the residential area where Taro Tanaka works indicates possible sexual harassment by Taro Tanaka. From the frequency with which Taro Tanaka makes remarks including these prohibited words in the residential area, the administrator can discover sexual harassment by Taro Tanaka.
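The source-specific and area-specific lookup described above can be sketched as follows. This is an illustration under stated assumptions: the `ProhibitedEntry` type and `matching_entries` function are hypothetical names, the assignment of ID 1 to "station" and ID 101 to "residential area" and of the individual phrases to IDs 102 to 104 is assumed, ID 3 is omitted because the text does not give its content, and only exact substring matching is shown (the "similar" case is not modeled).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProhibitedEntry:
    entry_id: int
    source: str
    word: str
    area: Optional[str] = None  # None: applies in every area (assumed convention)


# Entries following the FIG. 3 description; the exact ID-to-row mapping is assumed.
PROHIBITED_WORD_LIST = [
    ProhibitedEntry(1, "Taro Tanaka", "mother", "station"),
    ProhibitedEntry(2, "Taro Tanaka", "water charges", "station"),
    ProhibitedEntry(101, "Taro Tanaka", "mother", "residential area"),
    ProhibitedEntry(102, "Taro Tanaka", "beautiful woman", "residential area"),
    ProhibitedEntry(103, "Taro Tanaka", "baby face", "residential area"),
    ProhibitedEntry(104, "Taro Tanaka", "beautiful hair", "residential area"),
]


def matching_entries(speech_text: str, source: str, area: str,
                     entries: List[ProhibitedEntry] = PROHIBITED_WORD_LIST):
    """Return the entries for this source and area whose word occurs in the text."""
    return [
        e for e in entries
        if e.source == source
        and (e.area is None or e.area == area)
        and e.word in speech_text
    ]
```

A remark containing "water charges" is therefore matched at the station but not in the residential area, which mirrors the per-area behavior of IDs 2 and 102 to 104.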

Although it is not essential, it is also preferable that the prohibited word list 221 includes attribute information that groups a plurality of sources, as shown by "attribute" in FIG. 3. With attribute information that groups a plurality of sources, the administrator can manage more efficiently, for example by using an attribute to change the prohibited words of a plurality of sources at once.

[Step S24: Aggregation process]
Return to FIG. 2. When a match between a prohibited word and the speech data is detected in step S23, the control unit 21 cooperates with the storage unit 22 to increment the frequency stored in the summary list 222 for that prohibited word.

FIG. 4 is a diagram showing an example of the summary list 222 in the present embodiment. As shown in FIG. 4, the summary list 222 stores the source, the prohibited word, and the frequency in association with one another, together with an ID. ID: 1 in FIG. 4 records that the source "Taro Tanaka" has made no remarks including the prohibited word "mother". ID: 2 in FIG. 4 records that the source "Taro Tanaka" made remarks including the prohibited word "water charges" 886 times.
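The increment of step S24 can be illustrated with a counter keyed by source and prohibited word. This is a sketch only: modeling the summary list 222 as a `collections.Counter` and the name `record_match` are assumptions, and the ID column of FIG. 4 is left out.

```python
from collections import Counter

# Summary list 222 modeled as a counter keyed by (source, prohibited word).
summary_list = Counter()


def record_match(summary, source, word):
    """Step S24: increment the frequency for this source and prohibited word."""
    summary[(source, word)] += 1
    return summary[(source, word)]


# Three matching remarks by the same source.
for _ in range(3):
    record_match(summary_list, "Taro Tanaka", "water charges")
```

A pair that has never matched simply reads as frequency 0, which corresponds to the ID: 1 row of FIG. 4.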

Because the source, the prohibited word, and the frequency are associated with one another, the display unit 24 can show the administrator, for each source, how often that source made remarks including a prohibited word. From this display, the administrator can judge in detail, for each source, whether trouble has occurred or whether there are signs of trouble.

Although it is not essential, it is also preferable that the summary list 222 stores the current position data and/or the area containing the current position data. With the current position data or the area stored in the summary list 222, the display unit 24 can show the administrator the frequency of remarks including prohibited words for each area containing a stored current position. From this display, the administrator can manage how often the target person made the inappropriate remarks defined for each area.

[Step S26: Saving speech data]
Return to FIG. 2. The control unit 21 cooperates with the storage unit 22 to execute the speech data storage unit 214 and saves the speech data in the speech storage list 223.

FIG. 5 is a diagram showing an example of the speech storage list 223 in the present embodiment. As shown by "source", "speech data", and "voice data" in FIG. 5, the speech storage list 223 stores the source, the speech data converted into text, and the voice data in association with one another, together with an ID. ID: 1 in FIG. 5 records that the source "Taro Tanaka" said "This month's water charges ..." in Chiyoda-ku, Tokyo, at exactly 12:00 on December 16, 2018, along with the voice data [sound1] from that time.

Because the source and the speech data are associated, the administrator can manage the speech data for each source. Because the speech data and the voice data are associated, the administrator can play back the voice data to obtain more detailed information about the speech, such as the target person's tone and loudness.

Although it is not essential, it is also preferable that position information indicating the target person's current position at the time of the remark is associated with the speech data and saved in the speech storage list 223. With the position information saved, the administrator can identify where the remark was made. This allows the administrator to conduct additional investigations, such as making inquiries near the position where the remark was made.

Although it is not essential, it is also preferable that time information indicating the date and time of the remark is associated with the speech data and saved in the speech storage list 223. With the time information saved, the administrator can identify when the remark was made. This allows the administrator to judge whether trouble that occurred at a specific date and time is related to the speech data. The administrator can also search the speech data by this time information and efficiently find speech data related to trouble that occurred at a specific date and time. Furthermore, the administrator can conduct additional investigations, such as making inquiries based on the date and time of the remark.
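One possible shape for a row of the speech storage list 223, together with the time-based search mentioned above, is sketched below. The `SpeechRecord` type, its field names, and `find_by_time` are hypothetical; the single sample row follows the ID: 1 example of FIG. 5, and the voice data is represented only by its label.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SpeechRecord:
    record_id: int
    source: str
    speech_text: str                      # speech data converted into text
    voice_data_ref: str                   # label of the associated voice data
    location: Optional[str] = None        # optional position information
    spoken_at: Optional[datetime] = None  # optional time information


speech_storage_list = [
    SpeechRecord(1, "Taro Tanaka", "This month's water charges ...", "sound1",
                 "Chiyoda-ku, Tokyo", datetime(2018, 12, 16, 12, 0)),
]


def find_by_time(records, start, end):
    """Return the records whose time information falls within [start, end]."""
    return [
        r for r in records
        if r.spoken_at is not None and start <= r.spoken_at <= end
    ]
```

Searching the hour around noon on December 16, 2018 returns the sample row, while records without time information are simply skipped.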

[Step S27: Comparison with a predetermined frequency]
Return to FIG. 2. The control unit 21 cooperates with the storage unit 22 to execute the warning unit 213 and compares the frequency of remarks including prohibited words with a predetermined frequency. If the frequency of remarks is equal to or higher than the predetermined frequency, the process proceeds to step S28. If the frequency of remarks is less than the predetermined frequency, the process proceeds to step S29.

[Step S28: Warning process]
If the frequency of remarks is equal to or higher than the predetermined frequency in step S27, the control unit 21 cooperates with the storage unit 22 to warn the administrator. This warning preferably displays on the display unit 24 that the frequency of remarks has reached the predetermined frequency.

Because the administrator is warned that the frequency of remarks has reached the predetermined frequency, the administrator notices the occurrence of trouble, or its signs, promptly. This lets the administrator promptly warn the target person, report to the police, and so on. In addition, because a warning is issued only when the frequency is equal to or higher than the predetermined frequency, needless warnings are prevented, as are false warnings caused by counting similar-sounding words as matching or similar.

Although it is not essential, it is also preferable that this warning includes the source. When the warning includes the source, the administrator can quickly determine which target person requires a trouble response.

Although it is not essential, it is also preferable that this warning includes the speech data. When the warning includes the speech data, the administrator can determine what kind of trouble response is needed without searching for the corresponding speech data.
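The threshold comparison of step S27 and a warning that optionally carries the source and the speech data can be sketched as follows. The function name `build_warning`, the message wording, and the threshold value 5 are assumptions; the embodiment does not fix the predetermined frequency or the warning format.

```python
from typing import Optional

WARNING_THRESHOLD = 5  # hypothetical "predetermined frequency"


def build_warning(source: str, word: str, frequency: int,
                  speech_text: Optional[str] = None,
                  threshold: int = WARNING_THRESHOLD) -> Optional[str]:
    """Steps S27/S28: return a warning message, or None below the threshold."""
    if frequency < threshold:
        return None  # step S29 (display only) would follow
    message = f"Warning: {source} has said '{word}' {frequency} times"
    if speech_text is not None:
        # Optional: attach the speech data so the administrator need not search for it.
        message += f' (latest speech: "{speech_text}")'
    return message
```

Returning `None` below the threshold keeps the decision and the message in one place, so a caller could route the same message to the display unit 24, e-mail, or another channel.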

Although it is not essential, it is also preferable that the control unit 21 cooperates with the storage unit 22 and the communication unit 23 to warn the administrator by sending an e-mail and/or an instant message. Because an e-mail and/or an instant message is sent, the administrator receives the warning even when away from the voice recognition device 20. The administrator thus learns promptly that the frequency of a prohibited word has reached the predetermined frequency and can respond to the trouble.

Although it is also not essential, it is preferable that the control unit 21 cooperates with the storage unit 22 and the communication unit 23 to issue an additional audible warning, such as a warning sound or a telephone call. With an audible warning such as a warning sound or a telephone call, the administrator can receive the warning without constantly monitoring the display unit 24.

[Step S29: Display process]
The control unit 21 cooperates with the storage unit 22 to display the aggregation result on the display unit 24. The displayed aggregation result includes the source, the prohibited word, and the frequency of remarks including the prohibited word.

Because the source, the prohibited word, and the frequency of remarks including the prohibited word are displayed on the display unit 24, the administrator can grasp how often each source makes remarks including prohibited words. Using the displayed frequency, the administrator identifies trouble occurring around the source and performs management such as warning the target person or reporting to the police. The administrator can perform this series of management tasks without checking the source's voice utterance by utterance.

Although it is not essential, it is also possible to display on the display unit 24 a predetermined number of summaries of the most recent speech data saved by the speech data storage unit 214, each summary consisting of the source, the date and time, the place, and a means of accessing the conversation data and/or the voice data. By viewing the summaries of the most recent speech data, the administrator can identify the target person who is currently speaking. Because each summary includes a means of accessing the conversation data and/or the voice data, the administrator can promptly check the speech data, voice data, and the like of the target person who is currently speaking.

<Usage example of management system 1>
Next, an example of using the management system 1 with the voice recognition device 20 of the present embodiment is described.

First, the administrator connects the microphone device 10 and the voice recognition device 20. For example, the administrator registers the IPv6 address of the voice recognition device 20 in the storage unit 12 of the microphone device 10, registers the IPv6 address of the microphone device 10 in the storage unit 22 of the voice recognition device 20, and configures the two devices so that they can communicate with each other via the Internet.

Next, the administrator makes it possible to convert the target person's voice into voice data, for example by having the target person carry the microphone device 10 or by installing the microphone device 10 on the target person's desk. The volume of the converted voice data is measured, and when the volume is equal to or higher than the predetermined amount, the microphone device 10 transmits the voice data to the voice recognition device 20.

The voice recognition device 20 performs a series of operations: it receives the voice data from the microphone device 10, converts the voice data into text to generate speech data, compares the speech data with predetermined prohibited words, aggregates the frequency of matching or similar remarks, and displays the result on the display unit 24. The voice recognition device 20 can also classify the generated speech data by source and save it in the speech storage list 223.

FIG. 6 is a display example of the aggregation result shown on the display unit 24. Under "Keyword detection count (Taro Tanaka)", a predetermined number of entries are displayed, each consisting of a prohibited word and the frequency of remarks that match or are similar to it. "(Taro Tanaka)" in FIG. 6 indicates that the prohibited words and frequencies are displayed for "source: Taro Tanaka". By specifying a source, the administrator can display the prohibited words and frequencies for any source in the same way. By operating "See more", the administrator can display more pairs of prohibited word and frequency.

As shown in FIG. 6, it is also possible to display a bar graph whose bars are sized according to the frequency of remarks. With such a bar graph, the administrator can grasp more intuitively the frequency of remarks that match or are similar to each prohibited word.
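As a rough text-only illustration of such a bar graph, the sketch below scales each bar to the largest frequency. The function `render_bars`, the column widths, and the `#` character are assumptions and do not reproduce the actual layout of FIG. 6; the two sample frequencies are the ones given for FIG. 4.

```python
def render_bars(counts, width=20):
    """Return one text line per prohibited word, with a bar proportional to its frequency."""
    peak = max(counts.values(), default=0)
    lines = []
    for word, frequency in counts.items():
        # Scale to the largest frequency; an all-zero table draws no bars.
        bar_length = round(width * frequency / peak) if peak else 0
        lines.append(f"{word:<15}{frequency:>5} " + "#" * bar_length)
    return lines


# Frequencies for the source "Taro Tanaka" as described for FIG. 4.
for line in render_bars({"mother": 0, "water charges": 886}):
    print(line)
```

Scaling to the peak keeps the longest bar at a fixed width regardless of how large the counts grow.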

As shown in FIG. 6 by "Latest usage status list" and the table beneath it, it is also possible to display a predetermined number of summaries of the most recently saved speech data, each summary consisting of the target person as source, the date and time, the place, and a means of accessing the speech data and/or the voice data. By viewing the summaries of the most recent speech data, the administrator can identify the target person who is currently speaking and can promptly check that person's speech data, voice data, and the like. By operating "Details" in the "Conversation content (text)" column, the administrator can access the saved speech data. By operating "Details" in the "Conversation content (voice)" column, the administrator can access the saved voice data. By operating "See more", the administrator can display more summaries.

The administrator looks at the aggregation result shown on the display unit 24 and judges from the frequency of remarks whether trouble has occurred or whether there are signs of trouble. For a target person judged to be in trouble or showing signs of trouble, the administrator performs management including trouble response, such as warning the target person or reporting to the police. The administrator also refers to the current position associated with the displayed voice data and performs management including trouble response, such as heading to the site where the trouble is occurring or contacting a person in charge near that site.

In addition, the voice recognition device 20 can warn the administrator through the display unit 24 or the like when the frequency of remarks that match or are similar to a prohibited word is equal to or higher than the predetermined frequency. An administrator who receives the warning can perform management including the same trouble response as described above.

This series of operations of the management system 1 is performed automatically, without the administrator checking the target person's voice utterance by utterance. The management system 1 of the present embodiment therefore lets the administrator manage the target person's remarks without checking the voice in detail, and with less effort. Because less effort is needed, the administrator can manage the remarks of more target persons at the same time without increasing the total effort. In addition, because warnings are issued, the administrator can respond more quickly to trouble that occurs around the target person.

<Can be provided as a management program>
So far, the present invention has been described as provided in the form of the management system 1, but it is not limited to this. The invention described in this embodiment can also be provided as a management program executable by a computer. The management program causes the computer to execute: a speech data generation step in which the speech data generation unit 211 converts the voice data received from the microphone device 10 into text and generates the speech data of the target person; an aggregation step in which the aggregation unit 212 compares the speech data with predetermined prohibited word data and aggregates the frequency of matches or similarities; and a display step in which the display unit 24 displays the result of the aggregation by the aggregation unit 212 to the administrator.
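The three steps that the management program executes can be sketched as a single function. This is an illustration only: `run_management_program` is a hypothetical name, the `transcribe` and `display` callables stand in for the speech recognition technique and the display unit 24 (neither is specified here), and only exact substring matches are counted.

```python
def run_management_program(voice_data, transcribe, prohibited_words, counts, display):
    """Run the generation, aggregation, and display steps once for one piece of voice data."""
    # Speech data generation step: convert the voice data into text.
    speech_text = transcribe(voice_data)

    # Aggregation step: count each prohibited word that occurs in the speech data.
    for word in prohibited_words:
        if word in speech_text:
            counts[word] = counts.get(word, 0) + 1

    # Display step: hand the aggregation result to the display.
    display(counts)
    return speech_text
```

Passing the recognizer and the display in as callables keeps the sketch independent of any particular speech recognition engine or output device.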

<Modification examples>
Various modifications of the invention described in the present embodiment are illustrated below.

[Modification 1] Source determination
It is also preferable that the storage unit 22 includes a voiceprint data list 224 that stores voiceprint data in advance, and that the control unit 21 includes a source determination unit 215 that compares the voice data with the voiceprint data stored in the voiceprint data list 224 and determines whether the source of the voice data is the speaker of the voiceprint data. In this case, the speech data is preferably classified by source and saved in the speech storage list 223. Furthermore, the warning unit 213 may warn the administrator when the source of the voice data is the source of the voiceprint data.

By storing the voiceprint data of the target person in the voiceprint data list 224 and having the source determination unit 215 determine the source, the voice recognition device 20 can aggregate, and warn the administrator about, only the target person's own remarks. That is, prohibited words contained in the voices of third parties other than the target person, in particular bystanders near the target person who are not taking part in the conversation with the target person, are no longer aggregated or reported to the administrator. The administrator thus obtains a more accurate aggregation of the prohibited word frequencies that are actually needed.
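One way to picture the source determination is to compare feature vectors. The sketch below is purely an assumption: the embodiment does not say how voiceprints are represented or compared, so the fixed-length vectors, the cosine similarity measure, the threshold 0.9, and the names `cosine_similarity` and `identify_source` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_source(features, voiceprint_list, threshold=0.9):
    """Return the best-matching registered source, or None for an unregistered speaker."""
    best_name, best_score = None, threshold
    for name, voiceprint in voiceprint_list.items():
        score = cosine_similarity(features, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Voiceprint data list 224 with one registered target person (vector values are made up).
voiceprint_data_list = {"Taro Tanaka": [1.0, 0.0, 0.2]}
```

Voice data for which `identify_source` returns `None` would be treated as a third party's voice and left out of the aggregation.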

Furthermore, this source determination allows the speech data storage unit 214 to associate the speech data with its source more appropriately, classify it, and save it in the speech storage list 223. The administrator can then specify a source, read out the saved speech data for that source, and examine what the source said. When the parties give conflicting accounts of whether something was said or what was said, the administrator can grasp the situation objectively by examining the saved speech data of the source as evidence. In a labor tribunal proceeding to which the target person is a party, the administrator can also present the saved speech data of the source as evidence.

Furthermore, because the speech data is stored in more appropriate association with its source, the administrator can analyze and use the stored speech data as big data more effectively. Through such analysis, the administrator can, for example, specify a source and analyze that source's speaking tendencies, or discover words that tend to appear before trouble occurs and add them to the prohibited word list 221.

[Modification 2] Reducing load using specific voice data
The storage unit 22 preferably also includes a specific voice list 225 that stores specific voice data defined according to the attributes of the target person. With the specific voice list 225, the aggregation unit 212 can compare the voice data with the specific voice data stored in the list for the target person's attribute, and perform aggregation only on the voice data that differs from the specific voice data.

FIG. 7 shows an example of the specific voice list 225. The specific voice list 225 stores each source attribute in association with the specific voice data corresponding to that attribute. FIG. 7 shows that, for the attribute "water works", the voice data [noise1] (a "gan-gan" knocking sound) is defined as specific voice data. Although not essential, as shown under "Remarks" in FIG. 7, the specific voice list 225 can include text, such as a description, associated with each item of specific voice data. Because the list includes such text, the administrator can identify and easily manage the defined specific voice data without playing back each item.
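One possible in-memory representation of the list in FIG. 7 is sketched below. The field names, the `SpecificVoiceEntry` class, and the helper functions are assumptions made for illustration; only the "water works" attribute, the [noise1] identifier, and the optional remarks text come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecificVoiceEntry:
    attribute: str      # attribute of the source, e.g. the type of work
    voice_data_id: str  # identifier of the stored specific voice data
    remarks: str = ""   # optional free-text description (not essential)


# Hypothetical contents mirroring the single row described for FIG. 7.
SPECIFIC_VOICE_LIST = [
    SpecificVoiceEntry("water works", "noise1", "knocking sound (gan-gan)"),
]


def entries_for(attribute):
    """Return the specific voice entries registered for an attribute."""
    return [e for e in SPECIFIC_VOICE_LIST if e.attribute == attribute]


def describe(attribute):
    """List the entries as text so that no audio playback is needed."""
    return [f"{e.voice_data_id}: {e.remarks}" for e in entries_for(attribute)]


summary = describe("water works")
```

Keeping the remarks alongside the identifier is what lets an administrator review the list as text instead of replaying each recording.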

Aggregation using this specific voice data is described with reference to the conceptual diagram of FIG. 8. In this example, the target person is a worker engaged in water works, and a "gan-gan" knocking sound from the water line occurs frequently near the target person.

The top row, a) voice data, shows that the example voice data transmitted from the microphone device 10 contains, in addition to utterances that may be subject to aggregation (shown in FIG. 8 as "this water line", "it's wet", and "this is bad"), the knocking sound of the water line (shown in FIG. 8 as "gan-gan").

The second row, b) specific voice data, shows the "gan-gan" knocking sound that corresponds to the water works attribute and is stored as specific voice data in the specific voice list 225 of FIG. 7.

Returning to FIG. 8, the third row, c) processing, shows the control unit 21 executing the aggregation unit 212 to compare the specific voice data shown in b) with the example voice data. In c), the portions of the voice data shown with a different background color are those found by this comparison to match the specific voice data.

As a result of this comparison, as shown in the bottom row, d) after processing, the portions matching the specific voice data are removed from the example voice data, and only the content of the target person's utterances, shown as "this water line", "it's wet", and "this is bad", is aggregated as speech data. Because only the portions that differ from the specific voice data are aggregated, the processing load of executing the aggregation unit 212 is reduced.
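The four rows of FIG. 8 can be mimicked with a small sketch. It rests on a strong simplification: each segment of voice data is represented by a text label, whereas an actual device would compare acoustic data. The function name, the segment labels, and their ordering are illustrative assumptions.

```python
def remove_specific_voice(segments, specific_voice):
    """Split segments into those to aggregate and those matching specific voice data."""
    matched = [s for s in segments if s in specific_voice]
    remaining = [s for s in segments if s not in specific_voice]
    return remaining, matched


# a) voice data: utterances interleaved with the knocking sound (order is illustrative)
segments = ["this water line", "gan-gan", "it's wet", "gan-gan", "this is bad"]

# b) specific voice data registered for the "water works" attribute
specific_voice = {"gan-gan"}

# c) comparison and d) result after processing
remaining, matched = remove_specific_voice(segments, specific_voice)
```

Only `remaining` would be passed on for comparison with the prohibited word data, so the two knocking-sound segments incur no further processing.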

Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments. Further, the effects described in the embodiments merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

Further, the above embodiments have been described in detail to explain the present invention in an easy-to-understand manner, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.

Some or all of the above configurations, functions, and processing units may be implemented in hardware (for example, as an integrated circuit). They may also be implemented in software, with a processor interpreting and executing programs that realize the respective functions. Information such as programs, tables, and files that realize each function can be stored in a memory, in a recording device such as a hard disk or SSD (Solid State Drive), or on a recording medium such as an IC card, SD card, or DVD.

1 Management system
10 Microphone device
11 Control unit
111 Volume measurement unit
112 Transmission unit
12 Storage unit
13 Communication unit
15 Positioning unit
20 Voice recognition device
21 Control unit
211 Speech data generation unit
212 Aggregation unit
213 Warning unit
214 Speech data storage unit
215 Source determination unit
22 Storage unit
221 Prohibited word list
222 Aggregate list
223 Speech storage list
224 Voiceprint data list
225 Specific voice list
23 Communication unit
24 Display unit

Claims

A voice recognition device that is connected to a microphone device capable of converting the voice of a target person into voice data, and that is capable of converting the voice data received from the microphone device into text, the voice recognition device comprising:
a speech data generation unit that converts the voice data received from the microphone device into text to generate speech data of the target person;
an aggregation unit that compares the speech data with predetermined prohibited word data and aggregates the frequency of matches or similarities; and
a display unit that displays the result of aggregation by the aggregation unit to an administrator.

The voice recognition device according to claim 1, further comprising a warning unit that warns the administrator when, as a result of aggregation by the aggregation unit, the frequency at which the speech data matches or is similar to the prohibited word data is equal to or higher than a predetermined frequency.

The voice recognition device according to claim 2, further comprising a source determination unit that compares the voice data with voiceprint data stored in advance and determines whether the source of the voice data is the source of the voiceprint data,
wherein the warning unit warns the administrator when the source of the voice data is the source of the voiceprint data.

The voice recognition device according to any one of claims 1 to 3, further comprising a speech data storage unit that classifies and stores the speech data generated by the speech data generation unit for each source of the voice data.

The voice recognition device according to claim 4, wherein the display unit is capable of extracting and displaying the speech data of a specific source from among the plurality of items of speech data stored by the speech data storage unit.

The voice recognition device according to any one of claims 1 to 5, wherein the aggregation unit compares the voice data with specific voice data that is predetermined as occurring frequently for the attribute to which the source of the voice data belongs, and performs the aggregation on the voice data that differs from the specific voice data.

A management system comprising a microphone device capable of converting the voice of a target person into voice data, and the voice recognition device according to any one of claims 1 to 6 connected to the microphone device,
wherein the microphone device comprises:
a volume measurement unit that measures the volume of sound included in the voice data; and
a transmission unit that transmits, to the voice recognition device, voice data whose volume is equal to or higher than a predetermined level.

A management system comprising a microphone device capable of converting the voice of a target person into voice data, and the voice recognition device according to any one of claims 1 to 6 connected to the microphone device,
wherein the microphone device comprises a positioning unit that determines the current position of the microphone device,
the prohibited word data is predetermined according to the current position of the microphone device, and
the aggregation unit compares the speech data with the prohibited word data that is based on the current position received from the microphone device, and aggregates the frequency of matches or similarities.

A management program executed by a computer of a voice recognition device that comprises a speech data generation unit, an aggregation unit, and a display unit, is connected to a microphone device capable of converting the voice of a target person into voice data, and is capable of converting the voice data received from the microphone device into text, the management program causing the computer to execute:
a speech data generation step in which the speech data generation unit converts the voice data received from the microphone device into text to generate speech data of the target person;
an aggregation step in which the aggregation unit compares the speech data with predetermined prohibited word data and aggregates the frequency of matches or similarities; and
a display step in which the display unit displays the result of aggregation by the aggregation unit to an administrator.

A method in a voice recognition device that is connected to a microphone device capable of converting the voice of a target person into voice data and that is capable of converting the voice data received from the microphone device into text, the method comprising:
a speech data generation step in which a speech data generation unit converts the voice data received from the microphone device into text to generate speech data of the target person;
an aggregation step in which an aggregation unit compares the speech data with predetermined prohibited word data and aggregates the frequency of matches or similarities; and
a display step in which a display unit displays the result of aggregation by the aggregation unit to an administrator.