JP2021092713A

JP2021092713A - Correction candidate specification device, correction candidate specification method and correction candidate specification program

Info

Publication number: JP2021092713A
Application number: JP2019224221A
Authority: JP
Inventors: 智洋細川; Tomohiro Hosokawa
Original assignee: Mitsubishi Electric Information Systems Corp
Current assignee: Mitsubishi Electric Information Systems Corp
Priority date: 2019-12-12
Filing date: 2019-12-12
Publication date: 2021-06-17
Anticipated expiration: 2039-12-12
Also published as: JP6830148B1

Abstract

PROBLEM TO BE SOLVED: To appropriately identify correction candidates while suppressing the calculation load. SOLUTION: A word extraction unit 22 extracts words included in text data composed of a plurality of sentences. A similarity calculation unit 25 takes each of the plurality of sentences as a target sentence, takes at least one of the sentences before and after the target sentence as a partner sentence, and calculates the similarity between a word extracted from the target sentence and a word extracted from the partner sentence as a time-series similarity. A candidate identification unit 26 takes each of the plurality of sentences as a target sentence and identifies the target sentence as a correction candidate when the time-series similarity calculated for the target sentence is lower than a first threshold. SELECTED DRAWING: Figure 2

Description

The present invention relates to a technique for estimating the locations that need correction in text data composed of a plurality of sentences.

Conversations are converted into text data using voice recognition technology. For example, a plurality of voice collecting devices are installed in a conference room, and voice data of the conversation in the conference room is collected and converted into text data. In this case, the outputs of the voice collecting devices are aggregated into a single piece of voice data by a voice mixer device and then converted into text data using voice recognition technology. As a result, what was said in the meeting is saved as text data.

However, the text data may contain errors. Such errors include misrecognition of speech and misconversion of correctly recognized speech into a homophone. Therefore, to obtain correct text data, the text data must be checked and corrected manually.

Patent Document 1 describes a technique that identifies correction candidates in text data recognized with a first language model and corrects them with a second language model.
In Patent Document 1, a set of feature words is extracted based on the frequency of appearance of words. The similarity between each feature word and each peculiar word, that is, a word not included in the feature words, is then calculated, and a peculiar word with low similarity is identified as a correction candidate.

Japanese Unexamined Patent Publication No. 2012-18201

The technique described in Patent Document 1 calculates the frequency of appearance of each word included in the text data. The calculation load is therefore high.
An object of the present invention is to make it possible to appropriately identify correction candidates while suppressing the calculation load.

The correction candidate identification device according to the present invention includes:
a word extraction unit that extracts words included in text data composed of a plurality of sentences;
a similarity calculation unit that takes each of the plurality of sentences as a target sentence, takes at least one of the sentences before and after the target sentence as a partner sentence, and calculates, as a time-series similarity, the similarity between a word extracted from the target sentence by the word extraction unit and a word extracted from the partner sentence by the word extraction unit; and
a candidate identification unit that takes each of the plurality of sentences as a target sentence and identifies the target sentence as a correction candidate when the time-series similarity calculated for the target sentence by the similarity calculation unit is lower than a first threshold.

The similarity calculation unit takes sentences other than the correction candidate identified by the candidate identification unit as comparison sentences, and calculates, as a combination similarity, the similarity between a word extracted from the correction candidate and a word extracted from each of the comparison sentences.
The candidate identification unit identifies the word extracted from the correction candidate as a correction target when its combination similarity with every word extracted from the comparison sentences is lower than a second threshold.

The text data is generated by combining pieces of partial text data, each obtained by converting the voice data collected by one of a plurality of voice collecting devices.

The similarity calculation unit calculates the combination similarity using, as the comparison sentences, sentences included in partial text data other than the partial text data that includes the correction candidate.

The correction candidate identification device further includes
a vector generation unit that takes each word extracted by the word extraction unit as a target word and generates an expression vector representing the meaning of the target word.
The similarity calculation unit calculates the similarity using the expression vectors generated by the vector generation unit.

The similarity calculation unit calculates, as the similarity, the inner product of the expression vector for a word extracted from one sentence and the expression vector for a word extracted from the other sentence.

In the correction candidate identification method according to the present invention:
a word extraction unit extracts words included in text data composed of a plurality of sentences;
a similarity calculation unit takes each of the plurality of sentences as a target sentence, takes at least one of the sentences before and after the target sentence as a partner sentence, and calculates, as a time-series similarity, the similarity between a word extracted from the target sentence and a word extracted from the partner sentence; and
a candidate identification unit takes each of the plurality of sentences as a target sentence and identifies the target sentence as a correction candidate when the time-series similarity calculated for the target sentence is lower than a first threshold.

The correction candidate identification program according to the present invention causes a computer to function as a correction candidate identification device that performs:
a word extraction process of extracting words included in text data composed of a plurality of sentences;
a similarity calculation process of taking each of the plurality of sentences as a target sentence, taking at least one of the sentences before and after the target sentence as a partner sentence, and calculating, as a time-series similarity, the similarity between a word extracted from the target sentence by the word extraction process and a word extracted from the partner sentence by the word extraction process; and
a candidate identification process of taking each of the plurality of sentences as a target sentence and identifying the target sentence as a correction candidate when the time-series similarity calculated for the target sentence by the similarity calculation process is lower than a first threshold.

In the present invention, each of a plurality of sentences is taken as a target sentence, at least one of the sentences before and after the target sentence is taken as a partner sentence, and correction candidates are identified by calculating the time-series similarity between them. This makes it possible to appropriately identify correction candidates while suppressing the calculation load.

FIG. 1 is a configuration diagram of the voice recognition system 1 according to Embodiment 1.
FIG. 2 is a configuration diagram of the correction candidate identification device 10 according to Embodiment 1.
FIG. 3 is a flowchart showing the operation of the correction candidate identification device 10 according to Embodiment 1.
FIG. 4 is an explanatory diagram of the partial text data 51 according to Embodiment 1.
FIG. 5 is an explanatory diagram of the Skip-Gram model.
FIG. 6 is an explanatory diagram of the composite text data 54 according to Embodiment 1.
FIG. 7 is an explanatory diagram of the time-series similarity calculation process according to Embodiment 1.
FIG. 8 is an explanatory diagram of the candidate identification process according to Embodiment 1.
FIG. 9 is a configuration diagram of the correction candidate identification device 10 according to Modification 4.
FIG. 10 is a flowchart showing the operation of the correction candidate identification device 10 according to Embodiment 2.
FIG. 11 is an explanatory diagram of the combination similarity calculation process according to Embodiment 2.

Embodiment 1.
*** Explanation of configuration ***
The configuration of the voice recognition system 1 according to Embodiment 1 will be described with reference to FIG. 1.
The voice recognition system 1 includes a correction candidate identification device 10, a plurality of voice collecting devices 41, and a plurality of voice recognition devices 42.
The correction candidate identification device 10 is a computer that identifies correction candidates in text data produced by voice recognition. A correction candidate is a location where the voice recognition may be in error and correction is presumed to be necessary. Here, a voice recognition error is at least one of a misrecognition of speech and a misconversion of recognized speech into a homophone. Each voice collecting device 41 is a device that collects voice data; a specific example is a microphone. Each voice recognition device 42 is a device that converts voice data into text data using voice recognition technology.

In FIG. 1, a voice recognition device 42 is provided for each voice collecting device 41. However, the configuration is not limited to this, and a single voice recognition device 42 may be provided for a plurality of voice collecting devices 41.

In Embodiment 1, it is assumed that a voice collecting device 41 is provided for each person and that the person corresponding to each voice collecting device 41 speaks into it. One example is a web conference in which the participants are in different places. Another is a conference room in which a voice collecting device 41 is provided at each seat.

The configuration of the correction candidate identification device 10 according to Embodiment 1 will be described with reference to FIG. 2.
The correction candidate identification device 10 includes, as hardware, a processor 11, a memory 12, a storage 13, and a communication interface 14. The processor 11 is connected to the other hardware via signal lines and controls the other hardware.

The processor 11 is an IC (Integrated Circuit) that performs processing. Specific examples of the processor 11 are a CPU (Central Processing Unit), a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).

The memory 12 is a storage device that temporarily stores data. Specific examples of the memory 12 are SRAM (Static Random Access Memory) and DRAM (Dynamic Random Access Memory).

The storage 13 is a storage device that retains data. A specific example of the storage 13 is an HDD (Hard Disk Drive). The storage 13 may also be a portable recording medium such as an SD (registered trademark, Secure Digital) memory card, CF (CompactFlash, registered trademark), NAND flash, a flexible disk, an optical disc, a compact disc, a Blu-ray (registered trademark) disc, or a DVD (Digital Versatile Disk).

The communication interface 14 is an interface for communicating with external devices. Specific examples of the communication interface 14 are Ethernet (registered trademark), USB (Universal Serial Bus), and HDMI (registered trademark, High-Definition Multimedia Interface) ports.

The correction candidate identification device 10 includes, as functional components, a data acquisition unit 21, a word extraction unit 22, a vector generation unit 23, a text composition unit 24, a similarity calculation unit 25, and a candidate identification unit 26. The function of each functional component of the correction candidate identification device 10 is realized by software.
The storage 13 stores a program that realizes the function of each functional component of the correction candidate identification device 10. This program is read into the memory 12 by the processor 11 and executed by the processor 11. The function of each functional component of the correction candidate identification device 10 is realized in this way.

FIG. 2 shows only one processor 11. However, there may be a plurality of processors 11, and the plurality of processors 11 may cooperate to execute the programs that realize the respective functions.

*** Explanation of operation ***
The operation of the correction candidate identification device 10 according to Embodiment 1 will be described with reference to FIGS. 3 to 8.
The operation procedure of the correction candidate identification device 10 according to Embodiment 1 corresponds to the correction candidate identification method according to Embodiment 1. The program that realizes the operation of the correction candidate identification device 10 according to Embodiment 1 corresponds to the correction candidate identification program according to Embodiment 1.

(Step S1 in FIG. 3: Data acquisition process)
The data acquisition unit 21 acquires partial text data 51, which is the text data obtained when each voice recognition device 42 converts the voice data collected by each voice collecting device 41.
Specifically, each voice collecting device 41 collects voice data of the speech uttered by the person assigned to that voice collecting device 41. The voice data collected by each voice collecting device 41 is then converted by the voice recognition device 42 corresponding to that voice collecting device 41, and partial text data 51 is generated. The data acquisition unit 21 acquires the partial text data 51 generated by each voice recognition device 42.
As shown in FIG. 4, each piece of partial text data 51 includes a plurality of sentences 52. When the interval between utterances is equal to or longer than a reference time, the utterances are recognized as separate sentences 52 in the generated partial text data 51. Each sentence 52 is assigned the time at which the utterance started.
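The pause-based sentence boundary described above can be illustrated with a minimal sketch. It assumes the recognizer yields timed segments as `(start, end, text)` tuples; that input shape, the `Sentence` class, the function name `split_by_pause`, and the 2-second reference time are all hypothetical choices for illustration and are not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    start: float  # time at which the utterance started, in seconds
    text: str


def split_by_pause(segments, reference_time=2.0):
    """Group timed segments into sentences.

    A new sentence begins whenever the silence between the end of the
    previous segment and the start of the next one is at least
    reference_time seconds (an assumed value).
    """
    sentences = []
    prev_end = None
    for start, end, text in segments:
        if prev_end is None or start - prev_end >= reference_time:
            sentences.append(Sentence(start=start, text=text))
        else:
            sentences[-1].text += " " + text
        prev_end = end
    return sentences


# Hypothetical recognizer output for one voice collecting device.
segments = [
    (0.0, 1.2, "thank you"),
    (1.5, 2.4, "for your help"),
    (6.0, 7.0, "it looks like rain today"),
]
result = split_by_pause(segments)
```

Here the first two segments are separated by only 0.3 seconds and are joined into one sentence, while the third follows a 3.6-second pause and becomes a separate sentence with its own start time.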

(Step S2 in FIG. 3: Word extraction process)
The word extraction unit 22 takes each piece of partial text data 51 acquired in step S1 as the target partial text data 51 and extracts the words included in it.
Specifically, the word extraction unit 22 uses an existing word extraction technique to extract the words included in the target partial text data 51. As a specific example, the word extraction unit 22 performs morphological analysis on the target partial text data 51 and extracts specific parts of speech, such as nouns, verbs, and adjectives, thereby extracting the words included in the target partial text data 51.
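The part-of-speech filtering step can be sketched as follows. The sketch assumes that a morphological analyzer has already produced `(surface, part_of_speech)` pairs; the tag names, the `KEEP_POS` set, and the function `extract_words` are hypothetical, and no particular analyzer is implied by the source.

```python
# Parts of speech to keep; the tag names are assumptions for illustration.
KEEP_POS = {"noun", "verb", "adjective"}


def extract_words(tagged_tokens, keep_pos=KEEP_POS):
    """Return the surface forms whose part of speech is in keep_pos."""
    return [surface for surface, pos in tagged_tokens if pos in keep_pos]


# Hypothetical analyzer output for one sentence.
tagged = [
    ("the", "determiner"),
    ("long", "adjective"),
    ("meeting", "noun"),
    ("starts", "verb"),
    ("soon", "adverb"),
]
words = extract_words(tagged)
```

In this example `words` holds "long", "meeting", and "starts"; the determiner and the adverb are dropped.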

(Step S3 in FIG. 3: Vector generation process)
The vector generation unit 23 takes each word extracted in step S2 as a target word and generates an expression vector 53 representing the meaning of the target word.
Specifically, the vector generation unit 23 generates the expression vector 53 for the target word using a technique such as the Skip-Gram model or CBoW (Continuous Bag-of-Words). Embodiment 1 is described using the Skip-Gram model as an example.

A method of generating the expression vector 53 using the Skip-Gram model is described below.
As advance preparation, the vector generation unit 23 gives the Skip-Gram model the text data of a large number of sentences containing words that are expected to appear in the partial text data 51. For example, when the partial text data 51 is obtained by converting voice data of a conversation in a meeting, the text data of the minutes of past meetings could be given to the Skip-Gram model.
As a result, for each word included in the text data given to the Skip-Gram model, a vector representing the words predicted to appear around that word is generated. Specifically, as shown in FIG. 5, a vector is generated for each word whose elements are the probabilities that other words appear around that word. FIG. 5 shows a vector with three elements for each word. In practice, however, a vector with a large number of elements, such as several hundred, is generated for each word.
Similar words tend to have similar words appearing around them. Vectors with similar elements are therefore generated for similar words.

The vector generation unit 23 stores the vector for each word in the storage 13. The vector generation unit 23 may instead store the vector for each word in a storage device external to the correction candidate identification device 10.
When words are extracted in step S2, the vector generation unit 23 searches the storage 13 for the vector of each extracted word, that is, each target word. The vector generation unit 23 sets the vector found by the search as the expression vector 53 for the target word. The expression vector 53 for the target word is generated in this way.

(Step S4 in FIG. 3: Text composition process)
The text composition unit 24 combines the pieces of partial text data 51 acquired in step S1 to generate composite text data 54.
Specifically, the text composition unit 24 generates the composite text data 54 by arranging the sentences 52 included in each piece of partial text data 51 in chronological order and combining them. As a specific example, suppose that the two pieces of partial text data 51 shown in FIG. 4, obtained by converting voice data of a conversation between two people, are combined. Then, as shown in FIG. 6, composite text data 54 is generated in which the sentences 52 are arranged in the order in which they were spoken. Each sentence 52 of the composite text data 54 is given an original identifier 55 indicating the partial text data 51 it came from. In FIG. 6, sentences from the partial text data 51 on the left side of FIG. 4 are given the original identifier "01", and sentences from the partial text data 51 on the right side of FIG. 4 are given the original identifier "02".
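The chronological merge with original identifiers can be sketched as below. The data layout (a dictionary from original identifier to a list of `(start_time, text)` pairs), the `Sentence` class, the function name `merge_partial_texts`, and the sample utterances are assumptions made for illustration; only the idea of sorting by utterance start time and tagging each sentence with its source comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    start: float     # utterance start time, in seconds
    text: str
    original_id: str  # identifies the partial text data the sentence came from


def merge_partial_texts(partials):
    """Merge partial text data into one chronologically ordered list.

    partials maps an original identifier (e.g. "01") to a list of
    (start_time, text) pairs for that voice collecting device.
    """
    merged = [
        Sentence(start=start, text=text, original_id=original_id)
        for original_id, sentences in partials.items()
        for start, text in sentences
    ]
    # sorted() is stable, so sentences with equal start times keep input order.
    return sorted(merged, key=lambda s: s.start)


# Hypothetical partial text data for two speakers.
partials = {
    "01": [(0.0, "Thank you for your help, this is A."), (9.5, "It looks like rain today.")],
    "02": [(4.0, "Thank you for your help, this is B."), (14.0, "Yes, I brought an umbrella.")],
}
composite = merge_partial_texts(partials)
```

The resulting `composite` list alternates between identifiers "01" and "02" in the order the sentences were spoken.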

(Step S5 in FIG. 3: Time-series similarity calculation process)
The similarity calculation unit 25 takes each of the plurality of sentences 52 included in the composite text data 54 generated in step S4 as a target sentence, and sets at least one of the sentences 52 before and after the target sentence as a partner sentence. In Embodiment 1, when there are sentences 52 both before and after the target sentence, the similarity calculation unit 25 sets the sentence 52 before the target sentence and the sentence 52 after it each as a partner sentence.
The similarity calculation unit 25 then calculates, as the time-series similarity, the similarity between the words extracted from the target sentence in step S2 and the words extracted from the partner sentence in step S2. When a plurality of words have been extracted from at least one of the target sentence and the partner sentence, the similarity calculation unit 25 calculates the similarity for each combination of a word extracted from the target sentence and a word extracted from the partner sentence. The similarity calculation unit 25 then takes the average or the median of the similarities over all combinations as the time-series similarity.

Suppose that words have been extracted from the composite text data 54 as shown in FIG. 7, where the underlined words are the ones extracted in step S2. Then, for example, when the sentence 52 of ID001 is the target sentence, the sentence 52 of ID002 is set as the partner sentence. When the sentence 52 of ID002 is the target sentence, the sentence 52 of ID001 and the sentence 52 of ID003 are each set as a partner sentence.
When the sentence 52 of ID001 is the target sentence, the time-series similarity with the partner sentence, the sentence 52 of ID002, is calculated as follows. The words "A" and "care" (お世話) are extracted from the sentence 52 of ID001, and the words "B" and "care" are extracted from the sentence 52 of ID002. The similarity is therefore calculated for each of four combinations: "A" and "B", "A" and "care", "care" and "B", and "care" and "care". The average of the similarities of the four combinations is then taken as the time-series similarity.
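The pairwise calculation for ID001 and ID002 can be sketched as follows. The word-level similarity is taken to be the inner product of expression vectors, as the description states for this embodiment. The two-element vector values are invented for illustration (the source gives no vectors for these words), and the names `dot` and `time_series_similarity` are hypothetical.

```python
from itertools import product
from statistics import mean, median


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def time_series_similarity(target_words, partner_words, vectors, reduce=mean):
    """Reduce the similarities of all word pairs to a single value.

    Every word of the target sentence is paired with every word of the
    partner sentence; reduce is either mean or median.
    """
    sims = [dot(vectors[t], vectors[p]) for t, p in product(target_words, partner_words)]
    return reduce(sims)


# Invented expression vectors for the words in the example.
vectors = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "care": [0.5, 0.5],
}

id001_words = ["A", "care"]
id002_words = ["B", "care"]

avg_sim = time_series_similarity(id001_words, id002_words, vectors)          # mean of 0.0, 0.5, 0.5, 0.5
med_sim = time_series_similarity(id001_words, id002_words, vectors, median)  # median of the same four values
```

With these invented vectors the four pair similarities are 0.0, 0.5, 0.5, and 0.5, so the average is 0.375 and the median is 0.5. The median is less sensitive to a single unrelated pair, which is one reason the two reductions can give different results.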

The similarity calculation unit 25 calculates the similarity using the expression vector 53 generated in step S3 for a word extracted from the target sentence and the expression vector 53 generated in step S3 for a word extracted from the partner sentence. Specifically, the similarity calculation unit 25 calculates, as the similarity, the inner product of the expression vector 53 for the word extracted from the target sentence and the expression vector 53 for the word extracted from the partner sentence.
As a specific example, the calculation of the similarity between the word "打合せ" (uchiawase, a briefing or preparatory meeting) and the word "会議" (kaigi, a meeting or conference) is described. As shown in FIG. 5, the elements of the expression vector 53 for "打合せ" are {0.3, 0.2, 0.6}, and the elements of the expression vector 53 for "会議" are {0.5, 0.1, 0.8}. The similarity is therefore 0.3 × 0.5 + 0.2 × 0.1 + 0.6 × 0.8 = 0.15 + 0.02 + 0.48 = 0.65.
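The arithmetic in this example can be checked with a short sketch. The vector values are the three-element illustrations quoted from FIG. 5; the function name `inner_product` and the variable names are hypothetical helpers, not part of the described device.

```python
import math


def inner_product(u, v):
    """Return the inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


# Three-element expression vectors quoted in the text (FIG. 5).
uchiawase = [0.3, 0.2, 0.6]  # "打合せ"
kaigi = [0.5, 0.1, 0.8]      # "会議"

similarity = inner_product(uchiawase, kaigi)

# 0.15 + 0.02 + 0.48 equals 0.65 up to floating-point rounding.
assert math.isclose(similarity, 0.65)
```

Because the inner product is not normalized, its value depends on vector length as well as direction, so a threshold chosen for one set of vectors may not carry over to another.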

(Step S6 in FIG. 3: Candidate identification process)
The candidate identification unit 26 takes each of the plurality of sentences 52 included in the composite text data 54 generated in step S4 as a target sentence and determines whether the time-series similarity calculated for the target sentence in step S5 is lower than a first threshold. The candidate identification unit 26 identifies the target sentence as a correction candidate when the time-series similarity is lower than the first threshold. When there are sentences 52 both before and after the target sentence, the candidate identification unit 26 identifies the target sentence as a correction candidate only when both the time-series similarity with the preceding sentence 52 and the time-series similarity with the following sentence 52 are lower than the first threshold.
As shown in FIG. 8, when the time-series similarity has been calculated for each sentence 52 and the first threshold is 0.6, the sentence 52 of ID004 is identified as a correction candidate.
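The thresholding rule can be sketched as follows. The first threshold of 0.6 comes from the example in the text, but the similarity values below are invented (the values in FIG. 8 are not reproduced in the text), and the function name `find_correction_candidates` and the data layout are hypothetical.

```python
def find_correction_candidates(neighbor_sims, first_threshold=0.6):
    """Return the IDs of sentences whose time-series similarities are all low.

    neighbor_sims maps a sentence ID to the list of time-series similarities
    with its neighbors: one value for the first or last sentence, two values
    (previous, next) otherwise. A sentence is a candidate only when every
    available value is below first_threshold.
    """
    return [
        sentence_id
        for sentence_id, sims in neighbor_sims.items()
        if sims and all(s < first_threshold for s in sims)
    ]


# Invented time-series similarities for six consecutive sentences.
neighbor_sims = {
    "ID001": [0.8],
    "ID002": [0.8, 0.7],
    "ID003": [0.7, 0.3],
    "ID004": [0.3, 0.2],
    "ID005": [0.2, 0.9],
    "ID006": [0.9],
}
candidates = find_correction_candidates(neighbor_sims)
```

Only ID004 is returned here. ID003 and ID005 each have one low similarity (with ID004), but each is still similar to its other neighbor, so neither is flagged.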

*** Effect of Embodiment 1 ***
As described above, the correction candidate identification device 10 according to Embodiment 1 calculates the similarity between sentences 52 that are adjacent in time series and identifies a sentence 52 with low similarity as a correction candidate. This makes it possible to identify the sentences 52 that are correction candidates without obtaining the frequency of appearance of every word included in the composite text data 54. As a result, correction candidates can be identified appropriately while the calculation load is suppressed.

When a plurality of people have a conversation, each utterance is expected to be related in content to the sentence 52 before it or the sentence 52 after it. Therefore, correction candidates can be identified appropriately by calculating the similarity between sentences 52 that are adjacent in time series and identifying a sentence 52 with low similarity as a correction candidate.

In the first embodiment, a voice collecting device 41 is provided for each person. In this case, the voices of several people are unlikely to be recorded on top of one another, so the accuracy of voice recognition is high. On the other hand, because each utterance is converted to text without considering what other people said, erroneous conversion to homonyms is more likely. For example, the sentence 52 of ID003 and the sentence 52 of ID004 in FIG. 8 are sentences 52 contained in different partial text data 51. Therefore, when the sentence 52 of ID004 is generated, the text conversion is performed without considering the sentence 52 of ID003. As a result, an erroneous conversion to 飴 ("ame", candy) instead of 雨 ("ame", rain) is likely to occur.
The correction candidate identification device 10 according to the first embodiment calculates the similarity between sentences 52 in time series, so that correction candidates are identified from the relationship with the preceding and following sentences 52 in other partial text data 51. It is therefore particularly effective when a voice collecting device 41 is provided for each person.

*** Other configurations ***
<Modification 1>
In the first embodiment, the similarity calculation unit 25 calculates the inner product of the expression vectors 53 as the time-series similarity. However, the similarity calculation unit 25 may instead calculate, as the time-series similarity, the cosine similarity between the expression vector 53 for a word extracted from the target sentence and the expression vector 53 for a word extracted from the partner sentence.
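The difference between the two measures can be illustrated with a short sketch. The function names and the two-dimensional vectors are assumptions made for this example; actual expression vectors 53 would have many more dimensions. Returning 0.0 for a zero vector is a convention chosen here, not something the text specifies.

```python
import math


def inner_product(u: list[float], v: list[float]) -> float:
    """Inner (dot) product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Inner product divided by the product of the vector norms."""
    norm_u = math.sqrt(inner_product(u, u))
    norm_v = math.sqrt(inner_product(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention of this sketch for zero vectors
    return inner_product(u, v) / (norm_u * norm_v)


# Two hypothetical vectors pointing in the same direction.
u = [3.0, 4.0]
v = [6.0, 8.0]

print(inner_product(u, v))      # 50.0
print(cosine_similarity(u, v))  # 1.0
```

The inner product grows with the length of the vectors, whereas the cosine similarity depends only on their direction and stays within the range -1 to 1, which can make a fixed threshold easier to choose.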

<Modification 2>
In step S3 of FIG. 3, the vector for some words may not be stored, so that the expression vector 53 cannot be generated. When there is a word for which the expression vector 53 cannot be generated, the sentence 52 containing that word may be identified as a sentence 52 that contains a word for which the expression vector 53 cannot be generated.
A word for which the expression vector 53 cannot be generated may be a misrecognized word. Therefore, by identifying sentences 52 that contain such a word, candidates that should be corrected can be identified appropriately.

In this case, the correction candidates identified in step S6 of FIG. 3 and the sentences 52 that contain a word for which the expression vector 53 cannot be generated may be presented separately.
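Modification 2 can be sketched as a lookup against the stored vectors. This is a hedged illustration: the function name, the dictionary-based vector table, and the example words and vectors are all hypothetical.

```python
def sentences_with_unknown_words(
    sentences: dict[str, list[str]],
    vector_table: dict[str, list[float]],
) -> dict[str, list[str]]:
    """Map each sentence ID to its words that have no stored vector.

    Sentences in which every word has a vector are omitted from the
    result.
    """
    flagged = {}
    for sentence_id, words in sentences.items():
        unknown = [w for w in words if w not in vector_table]
        if unknown:
            flagged[sentence_id] = unknown
    return flagged


# Hypothetical vector table and extracted words.
vector_table = {"tomorrow": [0.8, 0.1], "weather": [0.3, 0.2]}
sentences = {
    "ID003": ["tomorrow", "weather"],
    "ID004": ["recently", "weather"],
}

print(sentences_with_unknown_words(sentences, vector_table))
# {'ID004': ['recently']}
```

Keeping this result separate from the list produced by the threshold test of step S6 matches the option of presenting the two kinds of sentences separately.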

<Modification 3>
In the first embodiment, at least one of the sentences 52 before and after the target sentence is set as the partner sentence. The partner sentence is not limited to the sentence 52 immediately before or immediately after the target sentence, and a plurality of sentences 52 may be set as partner sentences. For example, a sentence 52 that precedes the target sentence and has a source identifier 55 different from that of the target sentence may be used as the partner sentence. Likewise, a sentence 52 that follows the target sentence and has a source identifier 55 different from that of the target sentence may be used as the partner sentence.
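One way to select such partner sentences is sketched below. The text does not say which of the qualifying sentences should be chosen; picking the nearest one in each direction is an assumption of this example, as are the function name and the source identifier values.

```python
from typing import Optional


def partner_sentences(
    sentences: list[tuple[str, str]],
    target_index: int,
) -> tuple[Optional[str], Optional[str]]:
    """Return the IDs of the nearest preceding and following sentences
    whose source identifier differs from that of the target sentence.

    sentences is a time-ordered list of (sentence ID, source identifier).
    None is returned for a direction in which no such sentence exists.
    """
    target_source = sentences[target_index][1]
    before = next(
        (sid for sid, src in reversed(sentences[:target_index])
         if src != target_source),
        None,
    )
    after = next(
        (sid for sid, src in sentences[target_index + 1:]
         if src != target_source),
        None,
    )
    return before, after


# Hypothetical time-ordered sentences with their source identifiers.
ordered = [
    ("ID001", "A"),
    ("ID002", "B"),
    ("ID003", "B"),
    ("ID004", "B"),
    ("ID005", "A"),
]

print(partner_sentences(ordered, 2))  # ('ID001', 'ID005')
```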

<Modification 4>
In the first embodiment, each functional component is realized by software. However, as Modification 4, each functional component may be realized by hardware. The differences between Modification 4 and the first embodiment are described below.

The configuration of the correction candidate identification device 10 according to Modification 4 is described with reference to FIG. 9.
When each functional component is realized by hardware, the correction candidate identification device 10 includes an electronic circuit 15 instead of the processor 11, the memory 12, and the storage 13. The electronic circuit 15 is a dedicated circuit that realizes the functions of each functional component, the memory 12, and the storage 13.

The electronic circuit 15 is assumed to be a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
The functional components may be realized by one electronic circuit 15, or may be distributed across a plurality of electronic circuits 15.

<Modification 5>
As Modification 5, some of the functional components may be realized by hardware and the other functional components may be realized by software.

The processor 11, the memory 12, the storage 13, and the electronic circuit 15 are collectively referred to as processing circuitry. That is, the function of each functional component is realized by the processing circuitry.

Embodiment 2.
The second embodiment differs from the first embodiment in that the parts to be corrected are further narrowed down from the correction candidates. This difference is described in the second embodiment, and description of the points in common is omitted.

*** Explanation of operation ***
The operation of the correction candidate identification device 10 according to the second embodiment is described with reference to FIGS. 10 and 11.
The operation procedure of the correction candidate identification device 10 according to the second embodiment corresponds to the correction candidate identification method according to the second embodiment. The program that realizes the operation of the correction candidate identification device 10 according to the second embodiment corresponds to the correction candidate identification program according to the second embodiment.

The processing of steps S1 to S6 is the same as in the first embodiment.

(Step S7 in FIG. 10: Combination similarity calculation process)
The similarity calculation unit 25 sets each correction candidate identified in step S6 as the target correction candidate. Using the sentences 52 other than the target correction candidate as comparison sentences, the similarity calculation unit 25 calculates, as the combination similarity, the similarity between a word extracted from the target correction candidate and each word extracted from each of the comparison sentences. When a plurality of words have been extracted from the target correction candidate, the similarity calculation unit 25 calculates, as the combination similarity, the similarity between each word extracted from the target correction candidate and each word extracted from each of the comparison sentences.
As shown in FIG. 11, suppose that the sentence 52 of ID004 has been identified as a correction candidate. In this case, the similarity calculation unit 25 sets the sentences 52 of ID001, ID002, ID003, ID005, and ID006 as comparison sentences. Then, for each of the words 最近 ("recently") and 飴 ("candy") extracted from the sentence 52 of ID004, the similarity with each word extracted from the comparison sentences is calculated as the combination similarity. As in step S5, the similarity calculation unit 25 calculates the similarity using the expression vector 53 for the word extracted from the correction candidate and the expression vector 53 for the word extracted from the comparison sentence.
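The pairwise calculation of step S7 can be sketched as a nested loop over the words of the correction candidate and the words of every comparison sentence. The inner product follows the first embodiment; the function names, the English stand-ins for the extracted words, and the two-dimensional vectors are assumptions made for this example.

```python
def dot(u: list[float], v: list[float]) -> float:
    """Inner product of two expression vectors."""
    return sum(a * b for a, b in zip(u, v))


def combination_similarities(
    candidate_words: list[str],
    comparison_sentences: dict[str, list[str]],
    vectors: dict[str, list[float]],
) -> dict[tuple[str, str, str], float]:
    """Compute the similarity for every (candidate word, comparison
    sentence ID, comparison word) combination."""
    table = {}
    for candidate_word in candidate_words:
        for sentence_id, words in comparison_sentences.items():
            for word in words:
                table[(candidate_word, sentence_id, word)] = dot(
                    vectors[candidate_word], vectors[word]
                )
    return table


# Hypothetical expression vectors (illustrative values only).
vectors = {
    "recently": [1.0, 0.0],
    "candy": [0.0, 1.0],
    "tomorrow": [0.8, 0.1],
    "weather": [0.3, 0.2],
}
comparison = {"ID003": ["tomorrow", "weather"]}

table = combination_similarities(["recently", "candy"], comparison, vectors)
for key, value in table.items():
    print(key, round(value, 2))
```

The number of combinations is the product of the number of candidate words and the total number of words in the comparison sentences, so the cost stays small as long as only a few sentences were identified as correction candidates in step S6.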

(Step S8 in FIG. 10: Correction target identification process)
The candidate identification unit 26 sets each word extracted from the correction candidate as the target word. For each word extracted from the comparison sentences, the candidate identification unit 26 determines whether the combination similarity with the target word calculated in step S7 is lower than the second threshold value. When the combination similarity with the target word is lower than the second threshold value for all the words extracted from the comparison sentences, the candidate identification unit 26 identifies the target word as a correction target.

In the case of FIG. 11, the similarity between 最近 ("recently") and 明日 ("tomorrow") in the sentence 52 of ID003 is higher than the second threshold value. On the other hand, the similarity between 飴 ("candy") and every other word is lower than the second threshold value. As a result, 飴 ("candy") in the sentence 52 of ID004 is identified as a correction target.
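The decision of step S8 can be sketched as an "all below the threshold" test per target word. The function name, the English stand-ins for the words, the similarity values, and the second threshold value of 0.5 are assumptions made for this example. A word with no similarities at all is not flagged, which is a choice of this sketch.

```python
def correction_targets(
    similarities_by_word: dict[str, list[float]],
    second_threshold: float,
) -> list[str]:
    """Return the target words whose combination similarity is below
    the second threshold for ALL words of the comparison sentences."""
    return [
        word
        for word, sims in similarities_by_word.items()
        if sims and all(s < second_threshold for s in sims)
    ]


# Hypothetical combination similarities for the words of one
# correction candidate (illustrative values only).
similarities = {
    "recently": [0.8, 0.3],  # similar to at least one other word
    "candy": [0.1, 0.2],     # dissimilar to every other word
}

print(correction_targets(similarities, 0.5))  # ['candy']
```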

*** Effect of Embodiment 2 ***
As described above, the correction candidate identification device 10 according to the second embodiment further narrows down the parts to be corrected from the correction candidates. This makes it possible to reduce the effort required for the correction work compared with the first embodiment.

For example, when a sentence 52 whose content relates to a sentence 52 earlier than the immediately preceding sentence 52 has become a correction candidate, the words contained in the correction candidate may be similar to words in that earlier sentence 52. In the second embodiment, for each correction candidate sentence 52, all the other sentences 52 are used as comparison sentences and the combination similarity is calculated for all the words.
In this way, the correction candidate identification device 10 according to the second embodiment also has the effect that it can identify correction candidates with higher accuracy, because it calculates the combination similarity between the correction candidate sentence 52 and all the other sentences 52.

*** Other configurations ***
<Modification 6>
In step S8 of FIG. 10, the candidate identification unit 26 identifies, as a correction target, a word whose combination similarity is lower than the second threshold value for all the words extracted from the comparison sentences. Alternatively, after calculating the combination similarity between the target word and all the words extracted from the comparison sentences, the candidate identification unit 26 may operate so as to exclude from the correction candidates any word whose combination similarity with at least one word is higher than the second threshold value. In this case, the words that are not excluded become the correction candidates.
The candidate identification unit 26 may also calculate the combination similarity between the target word and all the words extracted from the comparison sentences and determine the maximum combination similarity for each target word. It may then operate so as to identify the word with the lowest maximum combination similarity as the correction candidate. Alternatively, it may operate so as to exclude a fixed number of words from the correction candidates in descending order of the maximum combination similarity.
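The two ranking variants at the end of Modification 6 can be sketched as follows. The function names, the English stand-ins for the words, and the similarity values are assumptions made for this example. Both functions assume that at least one word has at least one similarity value.

```python
def max_similarity_by_word(
    similarities_by_word: dict[str, list[float]],
) -> dict[str, float]:
    """Maximum combination similarity for each target word."""
    return {w: max(s) for w, s in similarities_by_word.items() if s}


def lowest_max_word(similarities_by_word: dict[str, list[float]]) -> str:
    """Variant 1: the word whose maximum similarity is the lowest."""
    max_by_word = max_similarity_by_word(similarities_by_word)
    return min(max_by_word, key=max_by_word.get)


def exclude_highest(
    similarities_by_word: dict[str, list[float]],
    count: int,
) -> list[str]:
    """Variant 2: drop `count` words in descending order of their
    maximum similarity and return the words that remain."""
    max_by_word = max_similarity_by_word(similarities_by_word)
    ranked = sorted(max_by_word, key=max_by_word.get, reverse=True)
    return ranked[count:]


# Hypothetical combination similarities (illustrative values only).
similarities = {
    "recently": [0.8, 0.3],
    "candy": [0.1, 0.2],
    "umbrella": [0.5, 0.4],
}

print(lowest_max_word(similarities))     # candy
print(exclude_highest(similarities, 1))  # ['umbrella', 'candy']
```

Unlike the fixed second threshold value, both variants are relative: they always return a result, even when every word is fairly similar to some other word.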

<Modification 7>
In the second embodiment, the similarity calculation unit 25 uses the sentences 52 other than the target correction candidate as comparison sentences. However, among the sentences 52 other than the target correction candidate, the similarity calculation unit 25 may use as comparison sentences only the sentences 52 to which a source identifier 55 different from that of the target correction candidate is attached.
As a result, only sentences 52 generated from voice data uttered by a person other than the person who uttered the voice data underlying the target correction candidate become comparison sentences. Some people have idiosyncratic intonation for certain words. For this reason, sentences 52 generated from voice data uttered by the same person may contain the same error. For example, when a particular person says "ame" meaning 雨 (rain), it may be misrecognized as 飴 (candy) again and again.
Consequently, if other sentences 52 of the person who uttered the voice data underlying the target correction candidate were used as comparison sentences, the comparison sentences would contain the same error and the correction target might not be identified appropriately. With the method according to Modification 7, the correction target can be identified appropriately.
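The restriction of Modification 7 can be sketched as a filter on the source identifier 55. The function name and the identifier values such as "mic_A" are hypothetical.

```python
def select_comparison_sentences(
    source_by_sentence: dict[str, str],
    candidate_id: str,
) -> list[str]:
    """Return the IDs of sentences whose source identifier differs from
    that of the correction candidate.

    The candidate itself is excluded automatically, because it shares
    its own source identifier.
    """
    candidate_source = source_by_sentence[candidate_id]
    return [
        sentence_id
        for sentence_id, source in source_by_sentence.items()
        if source != candidate_source
    ]


# Hypothetical assignment of sentences to voice collecting devices.
source_by_sentence = {
    "ID001": "mic_A",
    "ID002": "mic_B",
    "ID003": "mic_A",
    "ID004": "mic_B",
    "ID005": "mic_A",
    "ID006": "mic_B",
}

print(select_comparison_sentences(source_by_sentence, "ID004"))
# ['ID001', 'ID003', 'ID005']
```

With this filter, a repeated misrecognition by the same speaker (for example in ID002 or ID006 above) cannot raise the combination similarity of the erroneous word and hide it from step S8.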

The embodiments and modifications of the present invention have been described above. Several of these embodiments and modifications may be combined and carried out. Any one or several of them may also be carried out in part. The present invention is not limited to the above embodiments and modifications, and various changes can be made as needed.

10 correction candidate identification device, 11 processor, 12 memory, 13 storage, 14 communication interface, 15 electronic circuit, 21 data acquisition unit, 22 word extraction unit, 23 vector generation unit, 24 text synthesis unit, 25 similarity calculation unit, 26 candidate identification unit, 41 voice collecting device, 42 voice recognition device, 51 partial text data, 52 sentence, 53 expression vector, 54 synthetic text data, 55 source identifier.

Claims

A correction candidate identification device comprising:
a word extraction unit that extracts words contained in text data composed of a plurality of sentences;
a similarity calculation unit that, with each of the plurality of sentences as a target sentence and at least one of the sentences before and after the target sentence as a partner sentence, calculates, as a time-series similarity, the similarity between a word extracted from the target sentence by the word extraction unit and a word extracted from the partner sentence by the word extraction unit; and
a candidate identification unit that, with each of the plurality of sentences as the target sentence, identifies the target sentence as a correction candidate when the time-series similarity calculated by the similarity calculation unit for the target sentence is lower than a first threshold value.

The correction candidate identification device according to claim 1, wherein
the similarity calculation unit, using sentences other than the correction candidate identified by the candidate identification unit as comparison sentences, calculates, as a combination similarity, the similarity between a word extracted from the correction candidate and each word extracted from each of the comparison sentences, and
the candidate identification unit identifies the word extracted from the correction candidate as a correction target when the combination similarity is lower than a second threshold value for every word extracted from the comparison sentences.

The correction candidate identification device according to claim 2, wherein the text data is generated by synthesizing partial text data obtained by converting voice data collected by each of a plurality of voice collecting devices.

The correction candidate identification device according to claim 3, wherein the similarity calculation unit calculates the combination similarity using a sentence included in the partial text data other than the partial text data including the correction candidate as the comparison sentence.

The correction candidate identification device according to any one of claims 1 to 4, further comprising
a vector generation unit that, with each word extracted by the word extraction unit as a target word, generates an expression vector representing the meaning of the target word, wherein
the similarity calculation unit calculates the similarity using the expression vector generated by the vector generation unit.

The correction candidate identification device according to claim 5, wherein the similarity calculation unit calculates, as the similarity, the inner product of the expression vector for a word extracted from one sentence and the expression vector for a word extracted from the other sentence.

A correction candidate identification method, wherein
a word extraction unit extracts words contained in text data composed of a plurality of sentences,
a similarity calculation unit, with each of the plurality of sentences as a target sentence and at least one of the sentences before and after the target sentence as a partner sentence, calculates, as a time-series similarity, the similarity between a word extracted from the target sentence and a word extracted from the partner sentence, and
a candidate identification unit, with each of the plurality of sentences as the target sentence, identifies the target sentence as a correction candidate when the time-series similarity calculated for the target sentence is lower than a first threshold value.

A correction candidate identification program that causes a computer to function as a correction candidate identification device that performs:
a word extraction process of extracting words contained in text data composed of a plurality of sentences;
a similarity calculation process of, with each of the plurality of sentences as a target sentence and at least one of the sentences before and after the target sentence as a partner sentence, calculating, as a time-series similarity, the similarity between a word extracted from the target sentence by the word extraction process and a word extracted from the partner sentence by the word extraction process; and
a candidate identification process of, with each of the plurality of sentences as the target sentence, identifying the target sentence as a correction candidate when the time-series similarity calculated by the similarity calculation process for the target sentence is lower than a first threshold value.