JP7288701B2

JP7288701B2 - voice recognition device

Info

Publication number: JP7288701B2
Application number: JP2021131722A
Authority: JP
Inventors: 秀憲川村; 紘也永田
Original assignee: ティ・アイ・エル株式会社
Priority date: 2019-11-12
Filing date: 2021-08-12
Publication date: 2023-06-08
Anticipated expiration: 2039-11-12
Also published as: JP2021179639A; JP6933397B2; JP2021076758A

Description

The present invention relates to a speech recognition device, a management system, a management program, and a speech recognition method. More specifically, it relates to a device, system, program, and method that convert voice data into text to generate utterance data and compare that data with predetermined prohibited word data, making it possible to prevent and detect trouble arising from conversation.

In workplaces, especially those where staff deal directly with members of the public, conversations can lead to trouble. Inappropriate remarks often amount to harassment, and disputes often arise later over whether something was said about, for example, the terms of a contract, and over exactly what was said. Trouble caused by conversation also includes serious cases such as verbal threats and the suggestion or instigation of crimes.

Trouble arising from conversation is an important problem in daily life as well as in the workplace. Picking up and analyzing conversations is important because it helps both to prevent such trouble and to deal with it afterwards. Aircraft cockpits have long been equipped with voice recorders that pick up and record crew conversations. In the event of an incident or accident, the recorded audio is played back and analyzed to help determine the cause. In call centers, a management method is often adopted in which a microphone device picks up and records an operator's dealings with customers, and an administrator plays back and analyzes the recorded voice data to help handle trouble.

Meanwhile, portable voice recorders that combine a small microphone device with a recording device have been devised and put to use (Patent Document 1). According to the invention described in Patent Document 1, conversations that employees hold while working away from the office are picked up and recorded by the voice recorder, so an administrator can play back and analyze the recorded conversations for after-the-fact handling.

Patent Document 1: JP 2007-325392 A

However, with the management method described in Patent Document 1, the amount of audio recorded on the voice recorder becomes enormous. The recording time grows as the number of subjects increases and as the period under management lengthens. The administrator therefore needs a great deal of time and effort to find, within the recorded voice data, the portion that indicates the occurrence of trouble. Moreover, to handle trouble using recorded conversations, the administrator must collect the recorded voice data, locate the voice data corresponding to the trouble, and then play back the located data to inspect its content. As a result, time passes between the occurrence of trouble and the response to it. Management that records subjects' conversations with a voice recorder or the like is thus inadequate both for responding early to trouble caused by conversation and in terms of the time and effort that analysis requires.

For an administrator to respond early to trouble arising from a subject's conversation, it is important to find in the conversation utterances that directly indicate trouble, such as "hand over the money" or "I'll kill you". It is also important for the administrator to know how often utterances that merely suggest trouble occur. For example, the word "mother" does not by itself indicate trouble. However, if the administrator can see how frequently "mother" is uttered, the administrator can recognize a situation in which threats are being made by repeating phrases such as "I'll tell your mother" or "Do you want your mother to find out?", and can therefore respond early.

In management based on picked-up and recorded conversations, the administrator learns how often something was said only by playing back most of the recorded voice data and checking its content. Doing this one by one for the many subjects an administrator is responsible for is very demanding in both effort and time. The conventional approach of recording the picked-up conversation and having an administrator check its content therefore leaves considerable room for improvement.

The present invention was made in view of the above. Its object is to provide a speech recognition device, a management system, a management program, and a speech recognition method that convert voice data received from a microphone device into text, compare the text with predetermined prohibited word data, and tally and display the frequency of matches or similarity, so that an administrator can manage a subject's utterances without checking the subject's voice one by one.

As a result of intensive study aimed at solving the above problem, the present inventors found that the above object can be achieved by converting voice data received from a microphone device into text, comparing it with predetermined prohibited word data, and tallying and displaying the frequency of matches or similarity, and thereby completed the present invention. Specifically, the present invention provides the following.

The invention according to a first feature is a speech recognition device that is connected to a microphone device capable of converting a subject's voice into voice data and that can convert the voice data received from the microphone device into text, the speech recognition device comprising: an utterance data generation unit that converts the voice data received from the microphone device into text and generates utterance data of the subject; an aggregation unit that compares the utterance data with predetermined prohibited word data and tallies the frequency of matches or similarity; and a display unit that displays the result of the tallying by the aggregation unit to an administrator.

According to the invention of the first feature, the frequency of prohibited words contained in the subject's voice is tallied automatically and shown on the display unit. For example, if the administrator designates phrases that indicate trouble, such as "I'll kill you", as prohibited words, the frequency with which the subject makes utterances containing such phrases is tallied and displayed to the administrator. From the display, the administrator learns how often the subject makes utterances containing prohibited words, uses that frequency to detect trouble occurring around the subject, and takes management action such as warning the subject or reporting to the police. The administrator can carry out this whole sequence of management without checking the subject's voice one by one.
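The comparison and tallying described for the first feature can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation the patent has in mind: the utterance data is assumed to be whitespace-tokenized text, the "match or similarity" test is approximated with Python's `difflib.SequenceMatcher`, and the function name `tally_prohibited` and the `min_ratio` threshold are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def tally_prohibited(utterances, prohibited_words, min_ratio=0.8):
    """Count how often each prohibited word matches, or nearly matches, the text.

    min_ratio is an assumed similarity cutoff; the source does not specify one.
    """
    counts = Counter()
    for text in utterances:
        tokens = text.lower().split()
        for word in prohibited_words:
            target = word.lower()
            n = len(target.split())
            # Slide a window with the same number of tokens as the phrase.
            for i in range(len(tokens) - n + 1):
                window = " ".join(tokens[i:i + n])
                if SequenceMatcher(None, window, target).ratio() >= min_ratio:
                    counts[word] += 1
    return counts
```

An exact occurrence and a slightly misrecognized one (for example "kil you") are both counted toward the same prohibited word, which is the behavior the phrase "match or similarity" suggests.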

The invention of the first feature can thus provide a speech recognition device with which an administrator can manage a subject's utterances without checking the subject's voice one by one.

The invention according to a second feature is the invention according to the first feature, and provides a speech recognition device further comprising a warning unit that warns the administrator when, as a result of the tallying by the aggregation unit, the frequency with which the utterance data matches or is similar to the prohibited word data is equal to or greater than a predetermined frequency.

According to the invention of the second feature, the warning unit warns the administrator when the frequency with which the utterance data matches or is similar to the prohibited word data is equal to or greater than a predetermined frequency. Because the warning unit issues a warning, the administrator can respond to trouble quickly. In addition, because a warning is issued only when the frequency reaches the predetermined value, needless warnings are avoided, as are false warnings caused by counting similar-sounding words as matching or similar. The utterance management device can therefore deliver the warnings that are actually needed to the administrator promptly, without their being buried among false warnings. The administrator can identify and manage utterances that indicate trouble more appropriately and more quickly.
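The threshold check performed by the warning unit can be illustrated with a short sketch. It assumes the tallied frequencies are available as a mapping from prohibited word to count; the function name `words_needing_warning` and the way the threshold is passed in are assumptions made for illustration.

```python
from collections import Counter


def words_needing_warning(counts, threshold):
    """Return the prohibited words whose tallied frequency reaches the threshold."""
    return sorted(word for word, n in counts.items() if n >= threshold)


# Example tally: one word appears often, another only once.
example = Counter({"kill you": 3, "mother": 1})
warnings = words_needing_warning(example, threshold=2)
```

With a threshold of 2, a single stray or misrecognized occurrence of "mother" does not trigger a warning, while the repeated phrase does.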

The invention according to a third feature is the invention according to the second feature, and provides a speech recognition device further comprising a source determination unit that compares the voice data with pre-stored voiceprint data and determines whether the source of the voice data is the speaker of the voiceprint data, wherein the warning unit warns the administrator when the source of the voice data is the speaker of the voiceprint data.

According to the invention of the third feature, the source determination unit determines who uttered the voice data, so the warning unit warns the administrator only when the voice data matches the pre-stored voiceprint data. For example, if the administrator registers the subject's voiceprint together with prohibited words that indicate improper solicitation, such as "guaranteed to make money", the administrator is warned when the subject engages in improper solicitation. This avoids false warnings caused by tallying prohibited words in the voices of speakers other than the subject, in particular third parties who happen to be near the subject but are not part of the subject's conversation.
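The gating of warnings by voiceprint can be sketched as below. The source does not say how voiceprints are represented or compared, so this example assumes, purely for illustration, that both the stored voiceprint and the incoming voice data have already been reduced to numeric feature vectors and that cosine similarity with a hypothetical `min_similarity` cutoff decides whether they come from the same speaker.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_registered_speaker(features, voiceprint, min_similarity=0.9):
    """Assumed test: the voice data is attributed to the registered speaker."""
    return cosine_similarity(features, voiceprint) >= min_similarity


def should_warn(features, voiceprint, match_count, threshold):
    """Warn only if the frequency threshold is met AND the speaker is the registered one."""
    return match_count >= threshold and is_registered_speaker(features, voiceprint)
```

Under this sketch, prohibited words spoken by a bystander whose features do not resemble the registered voiceprint never produce a warning, regardless of how often they occur.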

The invention according to a fourth feature is the invention according to any one of the first to third features, and provides a speech recognition device further comprising an utterance data storage unit that classifies the utterance data generated by the utterance data generation unit by source of the voice data, and accumulates and stores it.

According to the invention of the fourth feature, the generated utterance data is classified by source, accumulated, and stored. The administrator can therefore read out the stored utterance data and examine its content closely. When misconduct is suspected, for example, the parties often give conflicting accounts of whether something was said and of what was said. In such a situation, the administrator can gain an objective understanding by examining the stored utterance data as evidence. In labor tribunal proceedings to which the subject is a party, the administrator can also present the stored utterance data as evidence.

Furthermore, according to the invention of the fourth feature, because the utterance data is stored, the administrator can analyze and use it as big data. Through such analysis, the administrator can study a subject's speaking tendencies, or discover words that tend to appear before trouble occurs and add them to the prohibited words.

The invention according to a fifth feature is the invention according to the fourth feature, and provides a speech recognition device in which the display unit can extract and display the utterance data of a specific source from among the plural pieces of utterance data stored in the utterance data storage unit.

According to the invention of the fifth feature, the display unit can extract and display the utterance data of a specific source. The administrator can thus analyze and make use of the speaking tendencies of sources with characteristics the administrator finds desirable, such as good sales performance, or of sources with undesirable characteristics, such as poor sales performance or many complaints.
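Storage by source (fourth feature) and extraction of one source's utterances (fifth feature) can be sketched with an in-memory structure. The class name `UtteranceStore`, its methods, and the use of a string identifier for each source are assumptions; a real system would presumably use persistent storage.

```python
from collections import defaultdict


class UtteranceStore:
    """Keeps utterance text grouped by the source (speaker or device) it came from."""

    def __init__(self):
        self._by_source = defaultdict(list)

    def save(self, source_id, text):
        """Append one utterance to the list kept for this source."""
        self._by_source[source_id].append(text)

    def extract(self, source_id):
        """Return a copy of the utterances stored for one source (empty if unknown)."""
        return list(self._by_source.get(source_id, []))

    def sources(self):
        """Return the identifiers of all sources that have stored utterances."""
        return sorted(self._by_source)
```

Returning a copy from `extract` means that code displaying or analyzing the data cannot accidentally alter what was stored, which matters if the stored utterances are later used as evidence.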

The invention according to a sixth feature is the invention according to any one of the first to fifth features, and provides a speech recognition device in which the aggregation unit compares the voice data with predetermined specific voice data that occurs frequently for the attribute to which the source of the voice data belongs, and performs the tallying on voice data that differs from the specific voice data.

According to the invention of the sixth feature, the administrator can designate loud sounds that occur frequently for the attribute to which the subject belongs as specific voice data. Those loud sounds are then no longer processed by the aggregation unit. For example, if the subject is a plumbing worker and tool noise is frequent in the surroundings, the administrator registers the tool noise as specific voice data. The frequently occurring tool noise is then not processed by the aggregation unit, which reduces the load on the aggregation unit.

The invention according to a seventh feature provides a management system comprising a microphone device capable of converting a subject's voice into voice data and the speech recognition device described above, wherein the microphone device has a volume measurement unit that measures the volume of the sound contained in the voice data and a transmission unit that transmits voice data whose volume is equal to or greater than a predetermined amount to the speech recognition device.

According to the invention of the seventh feature, the volume measurement unit measures the volume of the sound contained in the voice data, and only voice data whose volume is equal to or greater than a predetermined amount is transmitted to the speech recognition device. Voice data that is quiet, and therefore contains no utterance to be converted into text, compared, and tallied, is thus not processed by the utterance data generation unit or the aggregation unit. This reduces the amount of processing in those units and lightens the load on the management system.
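The volume gate on the microphone side can be sketched as follows. The source does not define how volume is measured, so this example assumes root-mean-square (RMS) amplitude over a block of signed PCM samples; the function names and the `min_rms` value are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a block of PCM samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_transmit(samples, min_rms=500.0):
    """Assumed gate: send the block only if its RMS volume reaches min_rms."""
    return rms(samples) >= min_rms
```

Blocks that fail the gate are simply dropped at the microphone device, so neither the network nor the speech recognition device spends resources on near-silence.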

The invention according to an eighth feature provides a management system comprising a microphone device capable of converting a subject's voice into voice data and the speech recognition device described above, wherein the microphone device has a positioning unit that measures the current position of the microphone device, the prohibited word data is predetermined according to the location of the microphone device, and the aggregation unit compares the utterance data with the prohibited word data on the basis of the current position received from the microphone device and tallies the frequency of matches or similarity.

According to the invention of the eighth feature, the aggregation unit refers to prohibited word data that depends on the location of the microphone device, so it can tally how often utterances that are inappropriate for the facility at that location are made. Such utterances include, for example, speaking about company-confidential information in a crowded public place such as a train station. The aggregation unit can also tally how often utterances appropriate to the facility at that location are made. For example, the administrator registers words related to the subject's work as prohibited words. The management system then tallies and displays how often the subject makes utterances containing work-related words while away from the place of work. Using the number of such utterances as an indicator, the administrator can judge whether the subject left the place of work as part of the job or simply abandoned the job.
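Selecting prohibited words by location can be illustrated with a small lookup. The source does not describe how locations are mapped to word lists, so the rectangular latitude/longitude `Zone`, its field names, and the function `prohibited_words_for` are all assumptions made for this sketch; the coordinates in the usage line are arbitrary example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A rectangular area with the prohibited words that apply inside it."""
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    prohibited_words: tuple

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def prohibited_words_for(lat, lon, zones, default=()):
    """Return the word list of the first zone containing the position, else default."""
    for zone in zones:
        if zone.contains(lat, lon):
            return zone.prohibited_words
    return default


station = Zone("station", 35.68, 35.69, 139.76, 139.77, ("confidential",))
words_here = prohibited_words_for(35.681, 139.767, [station])
```

The returned word list would then be passed to the comparison and tallying step in place of a single global list.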

The inventions according to the ninth and tenth features are the invention according to the first feature expressed in different claim categories.

According to the present invention, voice data received from a microphone device is converted into text and compared with predetermined prohibited word data, and the frequency of matches or similarity is tallied and displayed. This makes it possible to provide a speech recognition device with which an administrator can manage a subject's utterances without checking the subject's voice one by one.

FIG. 1 is a block diagram schematically showing the hardware and software configuration of a management system 1 that uses a speech recognition device 20 according to this embodiment.
FIG. 2 is a flowchart showing the flow of management using the management system 1 in this embodiment.
FIG. 3 is a diagram showing an example of the prohibited word list 221 in this embodiment.
FIG. 4 is a diagram showing an example of the tally list 222 in this embodiment.
FIG. 5 is a diagram showing an example of the statement save list 223 in this embodiment.
FIG. 6 is a diagram showing an example of a management screen, presenting tally results and the like, displayed on the display unit 24 using the management system 1 in this embodiment.
FIG. 7 is a diagram showing an example of the specific voice list 225 in Modification 2.
FIG. 8 is a conceptual diagram showing an example of processing in Modification 2 in which, with reference to the specific voice list 225, voice data of frequently occurring loud sounds is excluded from the voice data picked up by the microphone device 10 before tallying.

An example of a preferred embodiment of the present invention is described below with reference to the drawings. This is merely an example, and the technical scope of the present invention is not limited to it.

<Management system 1>
FIG. 1 is a block diagram schematically showing the hardware and software configuration of a management system 1 that uses a speech recognition device 20 according to this embodiment.

The management system 1 includes a microphone device 10 capable of converting a subject's voice into voice data, and a speech recognition device 20 that is connected to the microphone device 10 and performs the data management, data processing, screen display, warnings to the administrator, and other functions needed to operate the management system 1.

[Microphone device 10]
The microphone device 10 includes a control unit 11 that controls the operation of the microphone device 10, a storage unit 12 that stores control programs and the like executed by the microcomputer of the control unit 11, a communication unit 13 that communicates with the speech recognition device 20 and other equipment, and a positioning unit 15 that measures the current position of the microphone device 10.

Although not essential, it is preferable that the microphone device 10 has unique identification information and transmits it to the speech recognition device 20. By receiving and using this identification information, the speech recognition device 20 can easily associate the microphone device 10 with a source and can therefore identify the source quickly. The IPv6 address of the microphone device 10 or the MAC address of its network card, for example, can serve as this identification information.
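Associating a device identifier with a source can be sketched as a simple lookup keyed by MAC address. Everything here is illustrative: the registry contents, the names `normalize_mac` and `source_for_device`, and the choice to normalize the address format before lookup are assumptions, not details from the source.

```python
def normalize_mac(mac):
    """Convert a MAC address such as 'AA-BB-CC-00-11-22' to 'aa:bb:cc:00:11:22'."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def source_for_device(mac, registry):
    """Look up the source registered for a device; None if the device is unknown."""
    return registry.get(normalize_mac(mac))


# Hypothetical registry mapping device identifiers to sources (subjects).
registry = {"aa:bb:cc:00:11:22": "worker-a"}
```

Normalizing first means the same device is recognized whether its address arrives with hyphens, colons, or in upper case.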

The control unit 11 includes a CPU (Central Processing Unit), RAM (Random Access Memory), ROM (Read Only Memory), and the like.

The storage unit 12 is a device that stores data and files, and has a data storage section implemented with a hard disk, semiconductor memory, recording medium, memory card, or the like. The storage unit 12 stores control programs and the like executed by the microcomputer of the control unit 11.

The communication unit 13 has devices that enable communication between the microphone device 10 and other equipment, for example a network card compliant with the Ethernet (registered trademark) standard or a wireless device for a mobile phone network.

The configuration of the positioning unit 15 is not particularly limited. Examples include a positioning system that uses a GPS receiver to determine position from GPS satellite signals, and a system that determines position using information from mobile phone base stations.

[Speech recognition device 20]
The speech recognition device 20 includes a control unit 21 that controls the operation of the speech recognition device 20; a storage unit 22 that stores control programs and the like executed by the microcomputer of the control unit 21, the various data used to process utterance data, and the data generated by processing utterance data; a communication unit 23 that communicates with the microphone device 10; and a display unit 24 that displays tally results and the like. The data used to process utterance data includes prohibited word data and the like. The data generated by processing utterance data includes, for example, the results of tallying utterances that match or are similar to prohibited words.

The type of the display unit 24 is not particularly limited. Examples include a monitor and a touch panel.

The control unit 21 reads a predetermined program and, cooperating with the storage unit 22 and/or the communication unit 23 as needed, implements the software components of the management system 1: the utterance data generation unit 211, the aggregation unit 212, the warning unit 213, the utterance data storage unit 214, and so on.

The hardware configurations of the control unit 21, the storage unit 22, and the communication unit 23 are the same as those of the control unit 11, the storage unit 12, and the communication unit 13, respectively. The storage unit 22 also stores a prohibited word list 221 that lists the prohibited word data, a tally list 222 that tallies the frequency of utterance data that matches or is similar to prohibited words, and the like. In addition, the storage unit 22 is configured to be able to hold the subject's voice data received from the microphone device 10.

<Management procedure using management system 1>
FIG. 2 is an example of a flowchart showing the procedure for managing a subject's utterances using the management system 1 with the speech recognition device 20. A preferred software configuration of the management system 1 is described in more detail below with reference to FIG. 2.

[Step S10: Voice conversion]
First, the microphone device 10 picks up the subject's voice. The control unit 11 of the microphone device 10 then cooperates with the storage unit 12 to convert the picked-up voice into voice data.

Well-known formats such as WAV, MP3, and FLAC can be used for the voice data. To limit consumption of communication bandwidth, a format that uses lossy compression, such as MP3, is preferable.
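As an illustration of packaging picked-up audio in one of the formats named above, the following sketch writes 16-bit mono PCM samples into an in-memory WAV file using Python's standard `wave` module. WAV is chosen only because the standard library supports it directly; producing MP3, which the text prefers for bandwidth reasons, would require a separate encoder that is not shown. The function name `pcm_to_wav_bytes` and the 16 kHz default sample rate are assumptions.

```python
import io
import struct
import wave


def pcm_to_wav_bytes(samples, sample_rate=16000):
    """Encode signed 16-bit mono PCM samples as the bytes of a WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        # Pack samples as little-endian signed 16-bit integers.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()
```

The resulting bytes could be handed to whatever transport the communication unit uses; WAV is uncompressed, so it trades bandwidth for simplicity.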

[Step S11: Volume measurement]
Next, the control unit 11 cooperates with the storage unit 12 to execute the volume measurement unit 111 and measure the volume of the voice data.

〔ステップＳ１２：音量が所定量以上か判定〕 [Step S12: Determining whether the volume is equal to or greater than a predetermined amount]
ステップＳ１１で計測された音量が所定量以上なら、ステップＳ１３へ進む。音量が所定量未満なら、処理を終了し、音声の入力を待つ待機状態に戻る。 If the volume measured in step S11 is equal to or greater than the predetermined amount, the process proceeds to step S13. If the volume is less than the predetermined amount, the processing is terminated and the device returns to the standby state to wait for voice input.

本実施形態に係る管理システム１によると、ステップＳ１１及びステップＳ１２に係る段階を経ることで、音量が所定量以上である場合にのみ音声データを音声認識装置２０に送信できる。 According to the management system 1 according to the present embodiment, through steps S11 and S12, it is possible to transmit voice data to the voice recognition device 20 only when the volume is equal to or greater than a predetermined amount.
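The volume gate of steps S11 and S12 can be sketched as follows. This is only an illustration under stated assumptions: the description does not say how the volume measurement unit 111 measures volume, so a root-mean-square (RMS) level over normalised samples is assumed here, and the function names and the threshold value 0.1 are hypothetical.

```python
import math
from typing import Sequence


def rms_volume(samples: Sequence[float]) -> float:
    """Assumed volume measure (step S11): RMS of samples normalised to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_transmit(samples: Sequence[float], threshold: float = 0.1) -> bool:
    """Step S12: proceed to transmission only if the volume is at or above the threshold.

    The default threshold is a hypothetical value, not one given in the description.
    """
    return rms_volume(samples) >= threshold
```

With such a gate, quiet input is discarded on the microphone device 10 and never reaches the speech recognition device 20.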

〔ステップＳ１３：現在位置の測位〕 [Step S13: Measurement of current position]
制御部１１は、測位部１５と協働してマイク装置１０の現在位置を測位し、現在位置データとして記憶部１２に記憶する。 The control unit 11 measures the current position of the microphone device 10 in cooperation with the positioning unit 15 and stores it in the storage unit 12 as current position data.

必須の態様ではないが、予めスケジュールされたタイミングで測位部１５が測位した現在位置を、現在位置データとして記憶部１２に記憶し、この現在位置データを読み出すことで、このステップにおける現在位置の測位に代えることも好ましい。これにより、音声を変換するたびに現在位置を測位し、測位処理の完了を待つことを避けられる。 Although not essential, it is also preferable to store the current position measured by the positioning unit 15 at pre-scheduled times in the storage unit 12 as current position data, and to read out this stored data in place of measuring the current position in this step. This avoids measuring the current position, and waiting for the positioning process to complete, every time voice is converted.

〔ステップＳ１４：音声データ送信〕 [Step S14: Voice data transmission]
制御部１１は、記憶部１２及び通信部１３と協働して、送信部１１２を実行し、音声データ及び現在位置データを音声認識装置２０に送信する。 The control unit 11 cooperates with the storage unit 12 and the communication unit 13 to execute the transmission unit 112 and transmit the voice data and the current position data to the speech recognition device 20.

本実施形態に係る管理システム１によると、ステップＳ１０からステップＳ１４に係る段階を経ることで、音声データを音声認識装置２０に送信する。また、本実施形態に係る管理システム１によると、ステップＳ１３からステップＳ１４に係る段階を経ることで、現在位置を音声認識装置２０に送信できる。 According to the management system 1 according to the present embodiment, voice data is transmitted to the voice recognition device 20 through steps S10 to S14. Further, according to the management system 1 according to the present embodiment, the current position can be transmitted to the speech recognition device 20 through steps S13 to S14.

〔ステップＳ２０：音声データ受信〕 [Step S20: Voice data reception]
ステップＳ１４で音声データ及び現在位置データが送信されると、音声認識装置２０の通信部２３が、それを受信する。受信された音声データ及び現在位置データは、記憶部２２に記憶される。 When the voice data and the current position data are transmitted in step S14, the communication unit 23 of the speech recognition device 20 receives them. The received voice data and current position data are stored in the storage unit 22.

〔ステップＳ２１：発言データ生成〕 [Step S21: Generation of utterance data]
ステップＳ２０で音声データが受信されると、制御部２１は、記憶部２２と協働して発言データ生成部２１１を実行し、受信された音声データをテキスト化し、発言データを生成する。生成された発言データは記憶部２２に記憶される。音声データのテキスト化には、統計モデルを用いた音声認識等、周知の音声認識技術が用いられる。 When the voice data is received in step S20, the control unit 21 cooperates with the storage unit 22 to execute the utterance data generation unit 211, convert the received voice data into text, and generate utterance data. The generated utterance data is stored in the storage unit 22. A well-known speech recognition technique, such as speech recognition using a statistical model, is used to convert the voice data into text.

〔ステップＳ２３：禁止ワードとの比較〕 [Step S23: Comparison with prohibited words]
ステップＳ２１で発言データが生成されると、制御部２１は、記憶部２２と協働して集計部２１２を実行し、発言データと、禁止ワードリスト２２１に記憶された禁止ワードのそれぞれとを比較する。発言データと禁止ワードとが一致するならば、ステップＳ２４へ進む。発言データと禁止ワードとが一致しないならば、ステップＳ２６へ進む。 When the utterance data is generated in step S21, the control unit 21 cooperates with the storage unit 22 to execute the tallying unit 212, and compares the utterance data with each of the prohibited words stored in the prohibited word list 221. If the utterance data matches a prohibited word, the process proceeds to step S24. If the utterance data matches no prohibited word, the process proceeds to step S26.

図３は、本実施形態における禁止ワードリスト２２１の一例を示す図である。図３に「発信元」と「禁止ワード」とで示されるように、禁止ワードリスト２２１には、発信元と、発信元に紐付けられた禁止ワードとからなる組が、ＩＤとともに列挙されている。例示する図３のＩＤ：１は、対象者である発信元の「田中太郎」について、「お母さん」というフレーズを禁止ワードに定めたことを示している。 FIG. 3 is a diagram showing an example of the prohibited word list 221 in this embodiment. As indicated by "sender" and "prohibited word" in FIG. 3, the prohibited word list 221 lists, together with IDs, pairs each consisting of a sender and a prohibited word associated with that sender. For example, ID: 1 in FIG. 3 indicates that the phrase "mother" is defined as a prohibited word for the sender "Taro Tanaka", who is the subject.

発信元と禁止ワードとが紐付けられることにより、管理者は、発信元ごとに禁止ワードを設定できる。これにより、発信元に応じた内容の不適切な発言を、きめ細かく管理できる。例えば、「髪がきれい」というフレーズは、セクシュアル・ハラスメントにつながりうる発言であるから、禁止ワードに定めて集計を行うが、発信元が美容師である場合には発信元の業務に付随する通常の発言であり、禁止ワードに定めず、集計もしない、等の管理を行える。 Associating senders with prohibited words lets the administrator set prohibited words for each sender, so that inappropriate remarks can be managed in detail according to who made them. For example, the phrase "your hair is beautiful" can lead to sexual harassment, so it is normally defined as a prohibited word and tallied. When the sender is a hairdresser, however, it is an ordinary remark incidental to the sender's work, so it can be left off the prohibited word list and not tallied.

必須ではないが、禁止ワードリスト２２１は、図３に「エリア」で示されるように、禁止ワードと紐付けられたエリア情報を含むことも好ましい。禁止ワードリスト２２１がエリアを含むことにより、集計部２１２は、このエリアと現在位置データとを比較して、現在位置に応じた禁止ワードを参照することが可能となる。これにより、現在位置に応じた、よりきめ細かな禁止ワードの集計が可能となる。 Although not essential, the prohibited word list 221 preferably also includes area information associated with the prohibited words, as indicated by "Area" in FIG. By including the area in the prohibited word list 221, the tabulating unit 212 can compare this area with the current position data and refer to the prohibited word corresponding to the current position. As a result, it is possible to collect prohibited words more precisely according to the current position.

例示する図３のＩＤ：１～３及び１０１～１０４を用いて現在位置に応じた禁止ワードについて説明する。この例では、水道工事に従事する「田中太郎」について、７組の禁止ワードとエリアとの組を定めている。ＩＤ：１及び１０１では、対象者である発信元の「田中太郎」について、「駅」と「住宅地」の２つのエリアについて、「お母さん」というフレーズを禁止ワードに定めている。「お母さん」というフレーズは、発言が行われたエリアによらず、脅迫等のトラブルを示すため、両方のエリアに定められている。ＩＤ：２では、「駅」のエリアについて、「水道代」というフレーズを禁止ワードに定めている。水道工事に従事する田中太郎において、駅は通常の業務を行う場所ではない。しかし、「水道代」は業務に関連するフレーズであるため、駅において「水道代」を含む発言を繰り返していれば、管理者は、田中太郎が業務を行うために駅を訪れたと判断できる。ＩＤ：１０２～１０４は、「住宅地」のエリアについて、「美人」「童顔」「髪がきれい」という、容姿に言及するフレーズを禁止ワードに定めている。田中太郎が業務を行う住宅地において容姿に言及することは、田中太郎によるセクシュアル・ハラスメントの可能性を示す。管理者は、田中太郎が住宅地においてこれらの禁止ワードを含む発言を行った頻度から、田中太郎によるセクシュアル・ハラスメントを発見できる。 Prohibited words that depend on the current position are described using IDs 1 to 3 and 101 to 104 of FIG. 3 as an example. In this example, seven pairs of prohibited word and area are defined for "Taro Tanaka", who is engaged in plumbing work. IDs 1 and 101 define the phrase "mother" as a prohibited word for the sender "Taro Tanaka", the subject, in the two areas "station" and "residential area". The phrase "mother" is defined for both areas because it indicates trouble such as threats regardless of the area in which it is spoken. ID 2 defines the phrase "water bill" as a prohibited word for the "station" area. For Taro Tanaka, who is engaged in plumbing work, a station is not a place where he normally works. However, because "water bill" is a work-related phrase, if he repeatedly makes remarks containing "water bill" at a station, the administrator can conclude that he visited the station in order to work. IDs 102 to 104 define the appearance-related phrases "beautiful woman", "baby face", and "beautiful hair" as prohibited words for the "residential area" area. Remarks about appearance in a residential area where Taro Tanaka works indicate possible sexual harassment by him. From the frequency with which Taro Tanaka makes remarks containing these prohibited words in residential areas, the administrator can detect sexual harassment by him.
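The sender- and area-dependent lookup described above can be sketched as follows. The entries mirror the IDs that the description spells out for FIG. 3; ID 3 is omitted because its content is not described, and the assignment of the three appearance-related phrases to IDs 102 to 104 in this order is an assumption. The class and function names are hypothetical, and plain substring matching is assumed for the comparison, since the description leaves the matching method open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProhibitedWord:
    entry_id: int
    sender: str
    word: str
    area: Optional[str] = None  # None: the word applies regardless of area


# Entries described for FIG. 3 (ID 3 omitted; 102-104 ordering assumed).
PROHIBITED_WORDS = [
    ProhibitedWord(1, "田中太郎", "お母さん", "駅"),
    ProhibitedWord(2, "田中太郎", "水道代", "駅"),
    ProhibitedWord(101, "田中太郎", "お母さん", "住宅地"),
    ProhibitedWord(102, "田中太郎", "美人", "住宅地"),
    ProhibitedWord(103, "田中太郎", "童顔", "住宅地"),
    ProhibitedWord(104, "田中太郎", "髪がきれい", "住宅地"),
]


def match_prohibited(sender: str, area: str, utterance: str,
                     entries: List[ProhibitedWord] = PROHIBITED_WORDS
                     ) -> List[ProhibitedWord]:
    """Return the entries for this sender and area whose word occurs in the utterance."""
    return [
        e for e in entries
        if e.sender == sender
        and (e.area is None or e.area == area)
        and e.word in utterance  # assumed: simple substring match
    ]
```

Under these assumptions, an utterance containing 「水道代」 matches ID 2 when the current position falls in the "station" area, but matches nothing in the "residential area" area.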

必須ではないが、禁止ワードリスト２２１は、図３に「属性」で示されるように、複数の発信元を束ねた属性情報を含むことも好ましい。禁止ワードリスト２２１が複数の発信元を束ねた属性情報を持つことにより、管理者は、属性を利用して複数の発信元の禁止ワードを一括で変更する等の、より効率的な管理を行える。 Although not essential, the prohibited word list 221 preferably includes attribute information that bundles multiple senders, as indicated by "attribute" in FIG. Since the prohibited word list 221 has attribute information that bundles multiple senders, the administrator can perform more efficient management, such as collectively changing the prohibited words of multiple senders using attributes. .

〔ステップＳ２４：集計処理〕 [Step S24: Tallying]
図２に戻る。ステップＳ２３が禁止ワードと発言データとの一致を検出すると、制御部２１は、記憶部２２と協働して集計リスト２２２に保存された禁止ワードに対応する頻度を増分する。 Returning to FIG. 2: when a match between a prohibited word and the utterance data is detected in step S23, the control unit 21 cooperates with the storage unit 22 to increment the frequency stored in the tally list 222 for that prohibited word.

図４は、本実施形態における集計リスト２２２の一例を示す図である。図４に示すように、集計リスト２２２には、発信元と、禁止ワードと、頻度とが紐付けられてＩＤとともに保存される。図４のＩＤ：１に、発信元「田中太郎」が、禁止ワード「お母さん」を含む発言を行っていないことが保存されている。図４のＩＤ：２に、発信元の「田中太郎」が、禁止ワード「水道代」を含む発言を８８６回行ったことが保存されている。 FIG. 4 is a diagram showing an example of the tally list 222 in this embodiment. As shown in FIG. 4, in the tally list 222, senders, prohibited words, and frequencies are associated with IDs and stored. ID: 1 in FIG. 4 stores that the originator "Taro Tanaka" has not made any remarks including the prohibited word "mother". ID: 2 in FIG. 4 stores that the caller "Taro Tanaka" made 886 remarks including the prohibited word "water bill".

発信元と、禁止ワードと、頻度とが紐付けられることにより、表示部２４は、発信元ごとに禁止ワードを含む発言を行った頻度を、管理者に表示できる。この表示により、管理者は、発信元ごとにトラブルの発生や、トラブルの前兆をきめ細かく判断できる。 By associating the originator, the prohibited word, and the frequency, the display unit 24 can display to the administrator the frequency of utterances containing the prohibited word for each originator. With this display, the administrator can finely judge the occurrence of troubles and the signs of troubles for each source.
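The tally list 222 can be pictured as a counter keyed by sender and prohibited word. The following is a minimal sketch under that assumption; the function names are hypothetical, and the description does not prescribe any particular data structure.

```python
from collections import Counter
from typing import Dict, Tuple

TallyKey = Tuple[str, str]  # (sender, prohibited word)


def record_match(tally: Counter, sender: str, word: str) -> int:
    """Step S24: increment the frequency for this sender/word pair and return it."""
    tally[(sender, word)] += 1
    return tally[(sender, word)]


def frequencies_for(tally: Counter, sender: str) -> Dict[str, int]:
    """Frequencies per prohibited word for one sender, as shown to the administrator."""
    return {word: n for (s, word), n in tally.items() if s == sender}
```

Keying on the pair keeps each sender's counts separate, which is what allows the per-sender display described above.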

必須ではないが、集計リスト２２２には、現在位置データ及び／又は現在位置データを含むエリアが保存されることも好ましい。集計リスト２２２に現在位置データやエリアが保存されることにより、表示部２４は、保存された現在位置を含むエリアごとの禁止ワードを含む発言の頻度を、管理者に表示できる。この表示により、管理者は、対象者が、エリアごとに定められた不適切な発言を行った頻度を、管理できる。 Although not required, the summary list 222 also preferably stores the current location data and/or the area containing the current location data. By storing the current location data and areas in the tally list 222, the display unit 24 can display to the administrator the frequency of utterances containing prohibited words for each area including the stored current locations. With this display, the administrator can manage the frequency with which the target person made inappropriate remarks defined for each area.

〔ステップＳ２６：発言データの保存〕 [Step S26: Saving utterance data]
図２に戻る。制御部２１は、記憶部２２と協働して発言データ保存部２１４を実行し、発言保存リスト２２３に発言データを保存する。 Returning to FIG. 2: the control unit 21 cooperates with the storage unit 22 to execute the utterance data storage unit 214 and save the utterance data in the utterance storage list 223.

図５は、本実施形態における発言保存リスト２２３の一例を示す図である。図５に「発信元」「発言データ」「音声データ」でそれぞれ示されるように、発言保存リスト２２３には、発信元と、テキスト化された発言データと、音声データとが紐付けられて、ＩＤとともに保存される。図５のＩＤ：１には、発信元「田中太郎」が、東京都千代田区で、２０１８年１２月１６日の１２時ちょうどに、「今月の水道代が・・・」という発言を行ったことと、その際の音声データ［ｓｏｕｎｄ１］とが、保存されている。 FIG. 5 is a diagram showing an example of the utterance storage list 223 in this embodiment. As indicated by "sender", "utterance data", and "voice data" in FIG. 5, the utterance storage list 223 stores, together with an ID, the sender, the utterance data converted into text, and the voice data in association with one another. ID: 1 in FIG. 5 records that the sender "Taro Tanaka" said "This month's water bill is..." in Chiyoda Ward, Tokyo, at exactly 12:00 on December 16, 2018, together with the voice data [sound1] of that utterance.

発信元と発言データとが紐付けられることにより、管理者は、発信元ごとに発言データを管理できる。また、発言データと音声データとが紐付けられることにより、管理者は、音声データを再生して、対象者の語調や声の大きさと言った、発言に関するより詳しい情報を得られる。 By associating the caller and the message data, the administrator can manage the message data for each caller. In addition, by linking the utterance data and the voice data, the administrator can reproduce the voice data and obtain more detailed information about the utterance, such as the subject's tone of voice and voice volume.

必須ではないが、発言が行われた時点での対象者の現在位置を示す位置情報と、発言データとが紐付けられて発言保存リスト２２３に保存されることもまた好ましい。位置情報が保存されることにより、管理者は、発言が行われた場所を把握できる。これにより、管理者は、発言が行われた位置周辺での聞き込み等、追加の調査を行える。 Although it is not essential, it is also preferable that position information indicating the current position of the target person at the time the statement was made and the statement data be associated with each other and stored in the statement storage list 223 . By storing the position information, the administrator can grasp the place where the statement was made. This allows the administrator to conduct additional research, such as asking around the location where the statement was made.

必須ではないが、発言が行われた日時を示す時間情報が発言データと紐付けられて発言保存リスト２２３に保存されることもまた好ましい。時間情報が保存されることにより、管理者は、発言が行われた日時を把握できる。これにより、管理者は、特定の日時に生じたトラブルと、発言データとが互いに関連するものかどうかを判断できる。また、管理者は、この時間情報を用いて発言データを検索し、特定の日時に生じたトラブルに関連する発言データを効率よく探し出せる。さらに、管理者は、発言が行われた日時を用いて聞き込みを行うなどの、追加の調査を行える。 Although not essential, it is also preferable that the time information indicating the date and time when the statement was made is associated with the statement data and stored in the statement storage list 223 . By storing the time information, the administrator can grasp the date and time when the statement was made. This allows the administrator to determine whether the trouble that occurred on a specific date and time and the message data are related to each other. In addition, the administrator can search for statement data using this time information, and can efficiently find statement data related to trouble that occurred on a specific date and time. In addition, the administrator can conduct additional research, such as asking questions using the date and time the statement was made.
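One record of the utterance storage list 223, with the optional position and time information, can be sketched as follows, along with the search by date and time mentioned above. The field and function names are hypothetical; the sample values in the usage note are those described for ID: 1 of FIG. 5.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class UtteranceRecord:
    sender: str
    text: str                              # utterance data (text)
    audio_ref: str                         # reference to the stored voice data
    location: Optional[str] = None         # optional position information
    spoken_at: Optional[datetime] = None   # optional time information


def find_between(records: List[UtteranceRecord],
                 start: datetime, end: datetime) -> List[UtteranceRecord]:
    """Return records whose timestamp lies in [start, end]; untimed records are skipped."""
    return [r for r in records
            if r.spoken_at is not None and start <= r.spoken_at <= end]
```

For example, a record `UtteranceRecord("田中太郎", "今月の水道代が・・・", "sound1", "東京都千代田区", datetime(2018, 12, 16, 12, 0))` would be returned by a search covering midday on December 16, 2018.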

〔ステップＳ２７：所定の頻度との比較〕 [Step S27: Comparison with predetermined frequency]
図２に戻る。制御部２１は、記憶部２２と協働して警告部２１３を実行し、禁止ワードを含む発言の頻度と所定の頻度とを比較する。発言の頻度が所定の頻度以上ならば、ステップＳ２８へ進む。発言の頻度が所定の頻度未満ならば、ステップＳ２９へ進む。 Returning to FIG. 2: the control unit 21 cooperates with the storage unit 22 to execute the warning unit 213, and compares the frequency of utterances containing prohibited words with a predetermined frequency. If the frequency of utterances is equal to or higher than the predetermined frequency, the process proceeds to step S28. If the frequency of utterances is less than the predetermined frequency, the process proceeds to step S29.

〔ステップＳ２８：警告処理〕 [Step S28: Warning processing]
ステップＳ２７において、発言の頻度が所定の頻度以上であれば、制御部２１は、記憶部２２と協働して、管理者に警告する。この警告は、発言の頻度が所定の頻度以上となったことを表示部２４に表示するものであることが好ましい。 In step S27, if the frequency of utterances is equal to or higher than the predetermined frequency, the control unit 21 cooperates with the storage unit 22 to warn the administrator. This warning preferably displays on the display unit 24 that the frequency of utterances has reached or exceeded the predetermined frequency.

発言の頻度が所定の頻度以上となったことが警告されることにより、管理者は、いち早くトラブルの発生又はその前兆に気づける。これにより、管理者は、対象者への警告や警察への通報等をいち早く行える。また、所定の頻度以上である場合に警告が行われることで、いたずらな警告や、発音が良く似たワードを一致又は類似するものとして集計することによる誤警告が防がれる。 By being warned that the frequency of utterances has reached or exceeded the predetermined frequency, the administrator can quickly notice the occurrence of trouble or its precursor. As a result, the administrator can promptly warn the subject, report to the police, or the like. Further, because a warning is issued only when the frequency is equal to or higher than the predetermined frequency, needless warnings, and false warnings caused by tallying similar-sounding words as matching or similar, are prevented.
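The comparison of steps S27 and S28 can be sketched as a filter over the tally. This assumes the tally is a mapping from (sender, prohibited word) to a count; the function name and the threshold used in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple

Tally = Dict[Tuple[str, str], int]  # (sender, prohibited word) -> frequency


def entries_to_warn(tally: Tally, threshold: int) -> List[Tuple[str, str, int]]:
    """Step S27: select entries whose frequency is at or above the threshold.

    Results are ordered from most to least frequent so the most pressing
    entries come first in the warning of step S28.
    """
    hits = [(sender, word, n) for (sender, word), n in tally.items() if n >= threshold]
    return sorted(hits, key=lambda h: h[2], reverse=True)
```

With the FIG. 4 values (886 utterances of 「水道代」 and none of 「お母さん」 for 田中太郎) and a hypothetical threshold of 10, only the 「水道代」 entry would trigger a warning. Each returned tuple already carries the sender, which supports including the sender in the warning.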

必須ではないが、この警告は、発信元を含むことも好ましい。警告が、発信元を含むことで、管理者は、どの対象者に対してトラブル対応を行うべきかをいち早く判断できる。 Although not required, this alert also preferably includes the originator. By including the sender of the warning, the administrator can quickly determine which target person should be dealt with.

必須ではないが、この警告は、発言データを含むことも好ましい。警告が、発言データを含むことで、管理者は、対応する発言データを探すことなく、どのようなトラブル対応を行うべきかを判断できる。 Although not required, this alert also preferably includes utterance data. By including the message data in the warning, the administrator can determine what kind of troubleshooting should be done without searching for the corresponding message data.

必須の態様ではないが、制御部２１が記憶部２２及び通信部２３と協働して管理者にメール及び／又はインスタントメッセージを送信することで、管理者に、警告する態様もまた好ましい。管理者にメール及び／又はインスタントメッセージが送信されることで、管理者が音声認識装置２０から離れた場所にいる場合であっても、管理者は、警告を受け取れる。これにより、管理者は、いち早く禁止ワードの頻度が所定の頻度以上となったことを知り、トラブルに対応できる。 Although not an essential aspect, it is also preferable that the control unit 21 cooperate with the storage unit 22 and the communication unit 23 to send an email and/or an instant message to the administrator to warn the administrator. By sending an email and/or an instant message to the administrator, the administrator can be alerted even when the administrator is away from the speech recognition device 20 . As a result, the administrator can quickly know that the frequency of prohibited words exceeds a predetermined frequency, and can deal with the trouble.

また、必須の態様ではないが、制御部２１が記憶部２２及び通信部２３と協働して、警告音や電話による呼び出し等の音声を用いた追加の警告を行う態様もまた好ましい。警告音や電話による呼び出し等の音声を用いることで、管理者は、表示部２４を常時監視すること無く、警告を受け取れる。 Moreover, although not an essential mode, it is also preferable that the control section 21 cooperates with the storage section 22 and the communication section 23 to give an additional warning using a warning sound or a voice such as a telephone call. The administrator can receive the warning without always monitoring the display unit 24 by using the warning sound or the voice such as calling by telephone.

〔ステップＳ２９：表示処理〕 [Step S29: Display processing]
制御部２１は、記憶部２２と協働して、集計の結果を表示部２４に表示する。表示される集計の結果は、発信元と、禁止ワードと、禁止ワードを含む発言の頻度とを含む。 The control unit 21 cooperates with the storage unit 22 to display the tally results on the display unit 24. The displayed tally results include the sender, the prohibited words, and the frequency of utterances containing the prohibited words.

発信元と、禁止ワードと、禁止ワードを含む発言の頻度とが表示部２４に表示されることにより、管理者は、発信元による禁止ワードを含む発言の頻度を把握する。管理者は、表示された禁止ワードを含む発言の頻度を用いて発信元周辺でのトラブルの発生を把握し、対象者への警告や警察への通報などの管理を行う。管理者は、発信元の音声を逐一確認することなく、発信元の発言に関するこの一連の管理を行える。 By displaying the caller, the prohibited words, and the frequency of statements containing the prohibited words on the display unit 24, the administrator can grasp the frequency of the statements containing the prohibited words by the callers. The administrator uses the displayed frequency of utterances containing prohibited words to grasp the occurrence of troubles around the originator, and performs management such as issuing a warning to the target person and reporting to the police. The manager can manage this series of statements of the caller without checking the caller's voice one by one.

必須ではないが、発言データ保存部２１４が保存した直近の発言データについて、発信元、日時、場所、会話データ及び／又は音声データへのアクセス手段を含む一組の概要を、予め定められた数だけ組を列挙して、表示部２４に表示することも可能である。直近の発言データの概要が表示されることにより、管理者は、現在発言している対象者を把握できる。概要に会話データ及び／又は音声データへのアクセス手段が含まれるため、管理者は、現在発言している対象者の発言データや音声データ等を速やかに確認できる。 Although not essential, for the most recent utterance data stored in the utterance data storage unit 214, a set of outlines including caller, date and time, place, means of access to conversation data and/or voice data are stored in a predetermined number. It is also possible to enumerate only the sets and display them on the display unit 24 . By displaying a summary of the most recent speech data, the administrator can grasp the target person who is currently speaking. Since the summary includes means for accessing conversation data and/or voice data, the administrator can quickly check the speech data, voice data, etc. of the subject who is currently speaking.
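Listing a predetermined number of the most recent summaries can be sketched as a sort by timestamp followed by a slice. The record layout, the names, and the default limit of 5 are assumptions made for illustration only.

```python
from datetime import datetime
from typing import List, NamedTuple


class Summary(NamedTuple):
    sender: str
    spoken_at: datetime
    place: str
    text_ref: str    # means of access to the utterance (text) data
    audio_ref: str   # means of access to the voice data


def latest_summaries(records: List[Summary], limit: int = 5) -> List[Summary]:
    """Return up to `limit` summaries, newest first."""
    return sorted(records, key=lambda r: r.spoken_at, reverse=True)[:limit]
```

Because the newest entries come first, the head of the list indicates which subjects are speaking at present.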

＜管理システム１の使用例＞ <Usage example of management system 1>
続いて、本実施形態における音声認識装置２０を用いた管理システム１の使用例を説明する。 Next, a usage example of the management system 1 using the speech recognition device 20 according to this embodiment will be described.

まず、管理者は、マイク装置１０と音声認識装置２０とを接続する。この接続は、例えば、管理者が、マイク装置１０の記憶部１２に音声認識装置２０を示すＩＰｖ６アドレスを登録し、さらに、音声認識装置２０の記憶部２２にマイク装置１０を示すＩＰｖ６アドレスを登録し、互いにインターネットを介した通信を行えるよう構成すること等によって行われる。 First, the administrator connects the microphone device 10 and the speech recognition device 20 . For this connection, for example, the administrator registers an IPv6 address indicating the voice recognition device 20 in the storage unit 12 of the microphone device 10, and further registers an IPv6 address indicating the microphone device 10 in the storage unit 22 of the voice recognition device 20. This is done by, for example, configuring them so that they can communicate with each other via the Internet.

続いて、管理者は、対象者にマイク装置１０を携帯させる、マイク装置１０を対象者のデスクに据え付ける等して、対象者の音声を音声データに変換可能な状態にする。変換された音声データの音量が計測され、音量が所定以上のとき、マイク装置１０は、音声認識装置２０に音声データを送信する。 Subsequently, the administrator causes the subject to carry the microphone device 10, installs the microphone device 10 on the subject's desk, or the like, so that the subject's voice can be converted into voice data. The volume of the converted voice data is measured, and the microphone device 10 transmits the voice data to the voice recognition device 20 when the volume is greater than or equal to a predetermined value.

音声認識装置２０は、マイク装置１０から音声データを受信し、音声データをテキスト化して発言データを生成し、予め定められた禁止ワードと比較し、一致又は類似する発言の頻度を集計して表示部２４に表示する一連の動作を行う。また、音声認識装置２０は、生成された発言データを、発信元ごとに分類し、発言保存リスト２２３に保存できる。 The speech recognition device 20 receives speech data from the microphone device 10, converts the speech data into text to generate utterance data, compares it with predetermined prohibited words, counts and displays the frequency of matching or similar utterances. A series of operations to be displayed on the unit 24 are performed. Further, the speech recognition device 20 can classify the generated utterance data according to the originator and store them in the utterance storage list 223 .

図６は、表示部２４に表示される集計の結果の表示例である。「キーワード検出数（田中太郎）」の下に、禁止ワードそれぞれについて、禁止ワードと、禁止ワードと一致又は類似する発言の頻度とからなる一組の情報が、予め定められた数だけ表示される。図６の「（田中太郎）」は、「発信元：田中太郎」について、禁止ワード及び発言の頻度が表示されていることを示す。管理者は、発信元を指定することで、任意の発信元について、禁止ワードと発言の頻度とを、同様に表示させられる。管理者は、「さらに見る」を操作することで、より多くの禁止ワードと発言の頻度との組を表示させられる。 FIG. 6 is a display example of the totalization result displayed on the display unit 24 . Under the "number of detected keywords (Taro Tanaka)", for each prohibited word, a set of information consisting of the prohibited word and the frequency of statements that match or are similar to the prohibited word is displayed for a predetermined number. . "(Taro Tanaka)" in FIG. 6 indicates that prohibited words and frequency of utterances are displayed for "source: Taro Tanaka". By specifying the sender, the administrator can similarly display the prohibited words and the frequency of utterances for arbitrary senders. By operating "See more", the administrator can display more combinations of prohibited words and frequency of remarks.

図６に示すように、発言の頻度に応じた大きさの棒グラフを表示することも可能である。発言の頻度に応じた大きさの棒グラフを表示することにより、管理者は、より直観的に禁止ワードと一致又は類似する発言の頻度を把握できる。 As shown in FIG. 6, it is also possible to display a bar graph whose size corresponds to the frequency of utterances. By displaying a bar graph whose size corresponds to the frequency of utterances, the administrator can more intuitively grasp the frequency of utterances that match or are similar to prohibited words.

図６に「最新利用状況一覧」とその下の表として示すように、保存された直近の発言データについて、対象者の発信元、日時、場所、発言データ及び／又は音声データへのアクセス手段を含む一組の概要を、予め定められた数だけ組を列挙して表示することも可能である。直近の発言データの概要が表示されることにより、管理者は、現在発言している対象者を把握できる。概要が表示されることにより、管理者は、現在発言している対象者の発言データや音声データ等を速やかに確認できる。管理者は、「会話内容（テキスト）」の列の「詳細」を操作することで、保存された発言データにアクセスできる。また、管理者は、「会話内容（音声）」の列の「詳細」を操作することで、保存された音声データにアクセスできる。管理者は、「さらに見る」を操作することで、より多くの概要を表示させられる。 As shown in FIG. 6 as a "latest usage list" and a table below it, for the most recent saved utterance data, the sender, date and time, place, and means of access to the utterance data and/or voice data of the target person are displayed. It is also possible to display a set of summaries including a predetermined number of sets listed. By displaying a summary of the most recent speech data, the administrator can grasp the target person who is currently speaking. By displaying the summary, the administrator can quickly check the utterance data, voice data, etc. of the target person who is currently speaking. The administrator can access the saved statement data by operating "details" in the "conversation content (text)" column. Also, the administrator can access the saved voice data by operating "details" in the "conversation content (voice)" column. The administrator can display more overviews by operating "See more".

管理者は、表示部２４に表示された集計の結果を見て、発言の頻度からトラブルの発生の有無又はトラブルの予兆の有無を判断する。そして、トラブルが発生している、あるいは、トラブルの予兆があると判断した対象者について、対象者に警告する、警察へ通報するなどのトラブル対応を含む管理を行う。また、管理者は、表示された音声データに紐付けられた現在位置を参照して、トラブルが発生している現場に向かう、トラブルが発生している現場近くの担当者に連絡する等のトラブル対応を含む管理を行う。 The administrator looks at the tally results displayed on the display unit 24 and judges, from the frequency of utterances, whether trouble has occurred or whether there is a sign of trouble. For a subject judged to be in trouble or to show a sign of trouble, the administrator then performs management including trouble handling, such as warning the subject or reporting to the police. The administrator can also refer to the current position linked to the displayed voice data and perform management including trouble handling, such as going to the site where the trouble is occurring or contacting a person in charge near that site.

また、音声認識装置２０は、禁止ワードと一致又は類似する発言の頻度が所定の頻度以上である場合に、表示部２４等を通じて管理者に警告を行える。警告を受けた管理者は、上記と同様のトラブル対応を含む管理を行える。 In addition, the speech recognition device 20 can warn the administrator through the display unit 24 or the like when the frequency of statements that match or are similar to prohibited words is equal to or higher than a predetermined frequency. The administrator who received the warning can perform management including troubleshooting in the same manner as above.

管理システム１のこの一連の動作は、管理者が対象者の音声を逐一確認することなく、自動的に行われる。したがって、本実施形態における管理システム１により、管理者は、対象者の音声を逐一確認することなく対象者の発言を管理できる。これにより、管理者は、対象者の発言の管理を、より少ない労力で行える。より少ない労力で発言の管理を行えることにより、管理者は、発言の管理に要する労力を増やすことなく、より多くの対象者の発言を同時に管理できる。また、警告が行われることにより、管理者は、より迅速に対象者の周囲で生じたトラブルに対応できる。 This series of operations of the management system 1 is automatically performed without the administrator checking the target person's voice one by one. Therefore, with the management system 1 of this embodiment, the administrator can manage the utterances of the subject without checking the voice of the subject one by one. As a result, the administrator can manage the utterances of the target person with less effort. By managing utterances with less effort, the manager can simultaneously manage utterances of more subjects without increasing the effort required to manage utterances. In addition, the warning allows the administrator to more quickly deal with troubles that occur around the target person.

＜管理プログラムとして提供可能であること＞ <Can be provided as a management program>
これまで、本発明を管理システム１として提供することについて説明したが、これに限るものではない。本実施形態に記載の発明は、コンピュータに実行させることの可能な管理プログラムとしても提供可能である。当該管理プログラムは、発言データ生成部２１１が、マイク装置１０から受信した音声データをテキスト化し、対象者の発言データを生成する、発音データ生成ステップと、集計部２１２が、発言データと予め定められた禁止ワードデータとを比較し、一致又は類似する頻度を集計する、集計ステップと、表示部２４が、集計部２１２による集計の結果を管理者に表示する表示ステップとを実行させる。 So far, the present invention has been described as being provided as the management system 1, but it is not limited to this. The invention described in this embodiment can also be provided as a management program that can be executed by a computer. The management program causes the computer to execute: a pronunciation data generation step in which the utterance data generation unit 211 converts the voice data received from the microphone device 10 into text and generates utterance data of the subject; a tallying step in which the tallying unit 212 compares the utterance data with predetermined prohibited word data and tallies the frequency of matches or similarities; and a display step in which the display unit 24 displays the results tallied by the tallying unit 212 to the administrator.

＜変形例＞ <Modifications>
以下、本実施形態に記載の発明における種々の変形例を例示する。 Various modifications of the invention described in this embodiment are exemplified below.

〔変形例１〕発信元の判別 [Modification 1] Identifying the sender
記憶部２２は、声紋データを予め記憶する声紋データリスト２２４を備え、さらに、制御部２１は、音声データと声紋データリスト２２４に記憶された声紋データとを比較し、音声データの発信元が声紋データの発声元であるか否かを判別する発信元判別部２１５を備えることも好ましい。このとき、発言データは、発信元ごとに分類されて、発言保存リスト２２３に保存されることが好ましい。さらに、警告部２１３は、音声データの発信元が声紋データの発信元であるとき、管理者に警告しても良い。 It is also preferable that the storage unit 22 includes a voiceprint data list 224 that stores voiceprint data in advance, and that the control unit 21 includes a sender determination unit 215 that compares the voice data with the voiceprint data stored in the voiceprint data list 224 and determines whether the sender of the voice data is the speaker of the voiceprint data. In this case, the utterance data is preferably classified by sender and saved in the utterance storage list 223. Furthermore, the warning unit 213 may warn the administrator when the sender of the voice data is the sender of the voiceprint data.

対象者の声紋データを声紋データリスト２２４に記憶し、発信元判別部２１５が発信元を判別することで、音声認識装置２０は、対象者本人の発言のみを集計し、管理者に警告できる。すなわち、対象者以外の第三者の音声、特に、対象者との会話に参加していない対象者の周囲に居合わせた第三者の音声、に含まれる禁止ワードが、集計され、管理者に警告されることがなくなる。管理者は、必要とされる禁止ワードの頻度の、より正確な集計結果を、得られる。 By storing the subject's voiceprint data in the voiceprint data list 224 and having the sender determination unit 215 identify the sender, the speech recognition device 20 can tally only the subject's own utterances and warn the administrator about them. In other words, prohibited words contained in the voices of third parties other than the subject, in particular bystanders near the subject who are not taking part in the subject's conversation, are no longer tallied or reported to the administrator. The administrator thus obtains a more accurate tally of the prohibited word frequencies that are actually needed.

さらに、この発信元の判別により、発言データ保存部２１４は、発言データを発信元とより適切に対応付け、分類し、発言保存リスト２２３に保存できる。これにより、管理者は、発信元を指定して、指定された発信元に対応する保存された発言データを読み出して、発信元による発言データの内容を精査できる。当事者間で発言の有無や発言の内容についての説明が食い違う状況において、管理者は、保存された発信元による発言データという証拠を精査することで、状況を客観的に把握できる。また、対象者が当事者である労働審判において、管理者は、保存された発信元による発言データを証拠として提示できる。 Further, by identifying the originator, the utterance data storage unit 214 can more appropriately associate the utterance data with the originator, classify the utterance data, and store the data in the utterance storage list 223 . Thereby, the administrator can specify a caller, read out the saved message data corresponding to the specified caller, and examine the contents of the caller's message data. In a situation where there is a discrepancy between the parties regarding the presence or absence of statements and the content of the statements, the administrator can objectively grasp the situation by carefully examining the evidence of the saved statement data by the originator. Also, in a labor tribunal trial in which the subject is a party, the administrator can present the saved utterance data of the sender as evidence.

さらに、発言データが発信元とより適切に対応付けられて保存されることにより、管理者は、より効果的に、保存された発言データをビッグデータとして解析し、利用できる。管理者は、このような解析により、発信元を指定して、その発信元の発言傾向を分析したり、トラブルの発生前に出現しやすいワードを発見して、禁止ワードリスト２２１に加えたりできる。 Furthermore, since the utterance data is stored in a more appropriate correspondence with the originator, the administrator can more effectively analyze and use the stored utterance data as big data. Through such analysis, the manager can specify a source and analyze the utterance tendency of the source, find words that tend to appear before trouble occurs, and add them to the prohibited word list 221. .

〔変形例２〕特定音声データを用いた負荷の軽減 [Modification 2] Reducing load using specific voice data
記憶部２２は、対象者の属性に応じて定められた特定音声データを記憶する、特定音声リスト２２５を備えることも好ましい。特定音声リスト２２５を備えることにより、集計部２１２は、特定音声リスト２２５に記憶されている対象者の属性に応じた特定音声データと音声データとを比較し、特定音声データと異なる音声データについて集計を行える。 The storage unit 22 preferably also includes a specific voice list 225 that stores specific voice data determined according to the attributes of the subject. With the specific voice list 225, the tallying unit 212 can compare the voice data with the specific voice data stored in the specific voice list 225 for the subject's attribute, and tally only the voice data that differs from the specific voice data.

FIG. 7 shows an example of the specific voice list 225. In the specific voice list 225, an attribute of the source and the specific voice data corresponding to that attribute are stored in association with each other. FIG. 7 shows that, for the attribute "plumbing work", the voice data [noise1] (a "gan-gan" knocking sound) is defined as specific voice data. Although not essential, as shown under "Remarks" in FIG. 7, the specific voice list 225 can include, as text, a description or other notes associated with each item of specific voice data. Because the specific voice list 225 includes such text, the administrator can grasp and easily manage the defined specific voice data without playing back each item.
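The list of FIG. 7 can be pictured as a small table keyed by attribute. The following is a sketch under stated assumptions: the names `SpecificVoiceEntry` and `specific_voice_for` are hypothetical, and the voice data is represented only by its identifier (`noise1`) rather than by actual audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecificVoiceEntry:
    attribute: str      # attribute of the source, e.g. "plumbing work"
    voice_data_id: str  # identifier of the specific voice data
    remarks: str = ""   # optional text description for the administrator


# Hypothetical contents mirroring the single row described for FIG. 7.
SPECIFIC_VOICE_LIST = [
    SpecificVoiceEntry("plumbing work", "noise1", 'knocking sound ("gan-gan")'),
]


def specific_voice_for(attribute, entries=SPECIFIC_VOICE_LIST):
    """Return the identifiers of specific voice data tied to an attribute."""
    return [e.voice_data_id for e in entries if e.attribute == attribute]
```

Keeping the remarks as plain text alongside each entry reflects the point made above: the administrator can review the list without playing back the audio itself.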

Aggregation using this specific voice data is explained by way of example using the conceptual diagram of FIG. 8. In this example, the subject is a worker engaged in plumbing work, and a "gan-gan" knocking sound from the water pipes occurs frequently near the subject.

The top row, a) voice data, shows that the example voice data transmitted from the microphone device 10 contains, in addition to utterances that can be subject to aggregation (shown in FIG. 8 as "this water pipe", "it's wet", and "this is bad"), the knocking sound of the water pipes (shown in FIG. 8 as "gan-gan").

The second row, b) specific voice data, shows the "gan-gan" knocking sound that corresponds to the plumbing work attribute and is stored as specific voice data in the specific voice list 225 of FIG. 7.

Returning to FIG. 8, the third row, c) processing, shows the control unit 21 executing the aggregation unit 212 to compare the specific voice data shown in b) with the example voice data. In c), the portions of the voice data shown with a different background color are the portions found by this comparison to match the specific voice data.

As a result of this comparison, as shown in the bottom row, d) after processing, the portions matching the specific voice data are removed from the example voice data, and only the content of the subject's utterances, shown as "this water pipe", "it's wet", and "this is bad", is aggregated as utterance data. Because only the portions that differ from the specific voice data are aggregated in this way, the processing load of executing the aggregation unit 212 is reduced.
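The flow of rows a) to d) in FIG. 8 can be sketched as a filter followed by a count. This is an illustrative sketch, not the method of the embodiment: the `Segment` type, the string `fingerprint` field (a stand-in for whatever acoustic comparison an actual device would perform), and the prohibited word "bad" are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    fingerprint: str  # hypothetical stand-in for acoustic features
    text: str         # what the segment contains, as shown in FIG. 8


def remove_specific_voice(segments, specific_fingerprints):
    """Rows c) and d): drop segments that match specific voice data."""
    return [s for s in segments if s.fingerprint not in specific_fingerprints]


def aggregate(segments, prohibited_words):
    """Count segments containing each prohibited word (substring match)."""
    counts = Counter()
    for s in segments:
        for word in prohibited_words:
            if word in s.text:
                counts[word] += 1
    return counts


# Row a): example voice data with utterances and knocking sounds mixed.
voice_data = [
    Segment("speech-1", "this water pipe"),
    Segment("noise1", "gan-gan"),
    Segment("speech-2", "it's wet"),
    Segment("noise1", "gan-gan"),
    Segment("speech-3", "this is bad"),
]

# Row b): specific voice data for the "plumbing work" attribute.
remaining = remove_specific_voice(voice_data, {"noise1"})
result = aggregate(remaining, ["bad"])
```

In this sketch the aggregation step examines three segments instead of five, which is the load reduction the passage describes: the two segments matching `noise1` are discarded before any comparison with prohibited words takes place.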

Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments. Moreover, the effects described in the embodiments merely list the most favorable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

The embodiments above have been described in detail to make the present invention easy to understand, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.

Each of the configurations, functions, and processing units described above may be implemented partly or wholly in hardware (for example, an integrated circuit). Each of them may also be implemented in software, by a processor interpreting and executing a program that realizes the respective function. Information such as the programs, tables, and files that realize each function can be stored in memory, in a recording device such as a hard disk or SSD (Solid State Drive), or on a recording medium such as an IC card, SD card, or DVD.

1 management system
10 microphone device
11 control unit
111 volume measurement unit
112 transmission unit
12 storage unit
13 communication unit
15 positioning unit
20 speech recognition device
21 control unit
211 utterance data generation unit
212 aggregation unit
213 warning unit
214 utterance data storage unit
215 source determination unit
22 storage unit
221 prohibited word list
222 aggregation list
223 utterance storage list
224 voiceprint data list
225 specific voice list
23 communication unit
24 display unit

Claims

A speech recognition device that is connected to a microphone device capable of converting a subject's voice into voice data and that is capable of converting the voice data received from the microphone device into text, the speech recognition device comprising:
an utterance data generation unit that converts the voice data received from the microphone device into text and generates utterance data of the subject;
an aggregation unit that compares the utterance data with predetermined prohibited word data and aggregates the frequency of matches or similarities;
a display unit that displays the aggregation result of the aggregation unit to an administrator; and
an utterance data storage unit that classifies, accumulates, and stores the utterance data generated by the utterance data generation unit for each source of the voice data,
wherein the aggregation unit compares the voice data with specific voice data that is predetermined as corresponding to sounds that occur frequently in the vicinity of persons having the same attribute as an attribute of the subject who is the source of the voice data and that is stored in association with the attribute, and performs the aggregation on the voice data that differs from the specific voice data.