JP2009187349A

JP2009187349A - Text correction support system, text correction support method and program for supporting text correction

Info

Publication number: JP2009187349A
Application number: JP2008027471A
Authority: JP
Inventors: Hirotaka Muramatsu; 広丘村松
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-02-07
Filing date: 2008-02-07
Publication date: 2009-08-20

Abstract

PROBLEM TO BE SOLVED: To correct the large volume of character information output by voice recognition processing into more easily understandable information in a short time, and to provide it.

SOLUTION: A data processor, to which character string information (a textual representation of speech obtained by voice recognition processing) is input, includes: a notation level correction means for generating a corrected text in which words that may be erroneous at the notation level are corrected in the text indicated by the input character string information; a corrected text display means for displaying the corrected text generated by the notation level correction means on the screens of the reader terminals and the operator terminals; a correction instruction receiving means for receiving, from an operator terminal, a correction instruction issued by the operator for the text being displayed; and a screen updating means for reflecting the correction made by the received correction instruction on the screens of the reader terminals and the operator terminals in such a way that the corrected part can be identified.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a text correction support system, a text correction support method, and a text correction support program for presenting text indicated by character string information, which is a textual representation of speech obtained by voice recognition processing, in a corrected form that readers can easily understand.

One way to implement voice recognition that provides spoken information as character strings (speech exchanged in a meeting, or speech from media such as radio and television) is to analyze voice information input through a microphone or the like and output the result to an editor screen or similar. This makes it possible to provide more accurate information to people who have difficulty obtaining information by voice, such as hearing-impaired persons, elderly persons, and foreigners who are still learning Japanese.

In computer processing of voice data, however, obstacles such as ambient noise and the fact that spoken communication produces many words in a short time make speech difficult to recognize.

As one example of a technique for correcting character strings that have been misrecognized for such various reasons, Patent Document 1 describes a character data correction device in which an operator intervenes to make corrections. In that device, for a character string obtained by converting the audio of program content into text data by voice recognition, a mismatched portion selecting means selects, through operator operation, the portions that do not match the audio. Corrected text data, corrected according to the content of each selected mismatched portion, is then input and added to the original text data to generate and output correction-added text data.

Patent Document 2, for example, describes a speech recognition method that corrects errors in speech recognition results by referring to constraint information based on semantic relationships between words. This method stores the semantic relationships between words together with a degree of association indicating the strength of each relationship, and selects the most probable word string based on that degree of association.

Patent Document 1: JP 2004-151614 A
Patent Document 2: JP 2007-256836 A

However, with a method that simply has an operator intervene, as in Patent Document 1, the operator must follow every character string entrusted to them and select the mismatched portions. When the operator has to follow the text for a long time, the resulting "wavering" of the text becomes pronounced. Here, "wavering" refers to a state in which the text contains expressions, such as homonyms, that are correct in pronunciation but inappropriate as written text. As a result, it becomes hard to grasp the main points of the information, and correct comprehension in real time is hindered. Such a method is of no use to, for example, a hearing-impaired person who wants to take part in spoken communication.

With a method that makes corrections automatically by computer analysis, as in Patent Document 2, the results may be of low accuracy. Moreover, if all spoken voice information is to be analyzed by computer analysis alone and always output accurately, the analysis takes longer, which again hinders correct comprehension in real time.

An object of the present invention is therefore to correct the large volume of character information output by voice recognition processing into more easily understandable information in a short time, and to provide it.

A text correction support system according to the present invention presents text indicated by character string information, which is a textual representation of speech obtained by voice recognition processing, in a form that readers can easily understand. The system comprises a data processing device to which the character string information is input, one or more reader terminals used by readers, and one or more operator terminals used by operators. The data processing device comprises: notation level correction means for generating a corrected text in which words that may be erroneous at the notation level are corrected in the text indicated by the input character string information; corrected text display means for displaying the corrected text generated by the notation level correction means on the screens of the reader terminals and the operator terminals; correction instruction receiving means for receiving, from an operator terminal, a correction instruction issued by operator operation for the text being displayed; and screen updating means for reflecting the correction made by the correction instruction received by the correction instruction receiving means on the screens of the reader terminals and the operator terminals in such a way that the corrected location can be identified.

A text correction support method according to the present invention presents text indicated by character string information, which is a textual representation of speech obtained by voice recognition processing, in a form that readers can easily understand. In this method, a data processing device to which the character string information is input: generates a corrected text in which words that may be erroneous at the notation level are corrected in the text indicated by the input character string information; displays the generated corrected text on the screens of the reader terminals and the operator terminals; receives, from an operator terminal, a correction instruction issued by operator operation for the text being displayed; and reflects the correction made by the correction instruction received from the operator terminal on the screens of the reader terminals and the operator terminals in such a way that the corrected location can be identified.

A text correction support program according to the present invention presents text indicated by character string information, which is a textual representation of speech obtained by voice recognition processing, in a form that readers can easily understand. The program is applied to a data processing device to which the character string information is input, and causes a computer to execute: a process of generating a corrected text in which words that may be erroneous at the notation level are corrected in the text indicated by the input character string information; a process of displaying the generated corrected text on the screens of the reader terminals and the operator terminals; a process of receiving, from an operator terminal, a correction instruction issued by operator operation for the text being displayed; and a process of reflecting the correction made by the correction instruction received from the operator terminal on the screens of the reader terminals and the operator terminals in such a way that the corrected location can be identified.

According to the present invention, the large volume of character information output by voice recognition processing can be corrected into more easily understandable information in a short time and provided.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of the text correction support system according to the present invention. The system shown in FIG. 1 corrects text indicated by character string information, which is a textual representation of speech obtained by voice recognition processing, into text that readers can easily understand, and presents it. It comprises a data processing device 1 to which the character string information is input, one or more reader terminals 2-1 used by readers, and one or more operator terminals 2-2 used by operators.

The data processing device 1 includes notation level correction means 11, corrected text display means 12, correction instruction receiving means 13, and screen updating means 14.

The notation level correction means 11 generates a corrected text in which words that may be erroneous at the notation level are corrected in the text indicated by the input character string information. For example, the notation level correction means 11 may check each word in the text against a word dictionary registered in advance and, when the word is not registered in the dictionary, treat the word as possibly erroneous at the notation level and generate the corrected text by converting the word into hiragana. Alternatively, the notation level correction means 11 may check each word against the word dictionary and, when the dictionary contains a word with the same reading but a different notation, treat the word as possibly erroneous at the notation level and generate the corrected text by converting the word into hiragana.
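The two dictionary checks described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the invention: the `Token` type, the `correct_notation` function, and the dictionary layout (written form mapped to hiragana reading) are assumptions made for the example, and the sketch applies both checks together although the description presents them as alternatives. It also assumes that each recognized word arrives with its reading already in hiragana.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One recognized word (hypothetical structure for this sketch)."""
    surface: str  # written form output by the recognizer
    reading: str  # hiragana reading of the word


def correct_notation(tokens, dictionary):
    """Return corrected text; `dictionary` maps written form -> hiragana reading."""
    # Group registered written forms by reading so homophones can be detected.
    surfaces_by_reading = {}
    for surface, reading in dictionary.items():
        surfaces_by_reading.setdefault(reading, set()).add(surface)

    corrected = []
    for token in tokens:
        unregistered = token.surface not in dictionary
        # Registered words that share the reading but are written differently.
        homophones = surfaces_by_reading.get(token.reading, set()) - {token.surface}
        if unregistered or homophones:
            # Possibly wrong at the notation level: fall back to hiragana.
            corrected.append(token.reading)
        else:
            corrected.append(token.surface)
    return "".join(corrected)


dictionary = {"会議": "かいぎ", "の": "の", "機械": "きかい", "機会": "きかい"}
tokens = [Token("会議", "かいぎ"), Token("の", "の"), Token("機械", "きかい")]
print(correct_notation(tokens, dictionary))
```

In this example "機械" is replaced by its reading because the dictionary also holds the homophone "機会", so the reader sees a neutral hiragana form rather than a possibly wrong kanji choice.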

The corrected text display means 12 displays the corrected text generated by the notation level correction means 11 on the screens of the reader terminals and the operator terminals.

The correction instruction receiving means 13 receives, from the operator terminal 2-2, a correction instruction issued by operator operation for the text being displayed.

The screen updating means 14 reflects the correction made by the correction instruction received by the correction instruction receiving means 13 on the screens of the reader terminals 2-1 and the operator terminals 2-2 in such a way that the corrected location can be identified.
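One possible way to reflect a correction so that its location remains identifiable is sketched below. The description does not specify how the corrected location is indicated, so the bracket markers, the `CorrectionInstruction` type, and the `apply_correction` function are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionInstruction:
    """Operator's correction (hypothetical format for this sketch)."""
    start: int        # index of the first character to replace
    end: int          # index one past the last character to replace
    replacement: str  # text entered by the operator


def apply_correction(text, instruction, open_mark="[", close_mark="]"):
    """Return (plain corrected text, display text with the corrected part marked)."""
    if not 0 <= instruction.start <= instruction.end <= len(text):
        raise ValueError("correction range is outside the displayed text")
    before = text[:instruction.start]
    after = text[instruction.end:]
    plain = before + instruction.replacement + after
    # The marked version is what a terminal could show to highlight the change.
    marked = before + open_mark + instruction.replacement + close_mark + after
    return plain, marked


plain, marked = apply_correction("会議のきかい", CorrectionInstruction(3, 6, "機会"))
print(plain)
print(marked)
```

A real terminal would more likely use color or underlining than brackets; the point of the sketch is only that the update carries enough information to show where the text changed.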

FIG. 2 is a block diagram showing another configuration example of the text correction support system according to the present invention. As shown in FIG. 2, the data processing device 1 may further include text processing means 15, processed text display means 16, and mode setting means 17.

The text processing means 15 generates processed text by processing the corrected text generated by the notation level correction means 11 according to predetermined display modes.

The processed text display means 16 displays, in association with the corrected text being displayed, processed text matching the display mode set for each reader on the screen of the reader terminal 2-1 that the reader is using. It also displays, on the screen of the operator terminal 2-2, at least the processed text for the display modes shown on the reader terminals.

In this example, the correction instruction receiving means 13 receives, from the operator terminal, a correction instruction issued by operator operation for the processed text. The screen updating means 14 reflects the correction made by the correction instruction for the processed text, as received by the correction instruction receiving means 13, on the screens of the reader terminals and the operator terminals in such a way that the corrected location can be identified.

The mode setting means 17 receives, from the reader terminal 2-1, a display mode switching instruction issued by reader operation. For example, the mode setting means 17 may provide three display modes and allow switching among them according to the reader's level: a detailed mode that presents content nearly identical to what was spoken, a caption mode that removes redundancy from what was spoken, and a simple mode that presents the main points of what was spoken as a bulleted list.

When the mode setting means 17 receives a display mode switching instruction, the processed text display means 16 may display, at least on the requesting reader terminal, processed text matching the display mode set for that reader.

A more specific embodiment is described below. FIG. 3 is a block diagram showing a more specific configuration example of the text correction support system. The text correction support system 1000 shown in FIG. 3 comprises a voice recognition device 100, a data processing device 200, and one or more character input/output devices 300.

The voice recognition device 100 is based on existing technology. It is realized by a microphone device for inputting speech, an analog-to-digital converter device that converts the analog speech signal into a digital signal, and software that implements voice recognition processing (more specifically, a processor such as a CPU operating according to that software).

The voice recognition device 100 in this embodiment at least converts input speech into a character string and outputs it to the data processing device 200. The voice recognition result it outputs may include not only the character string representing the words expressed in the input speech but also grammatical information such as phrase boundaries in that character string.

The data processing device 200 performs control for outputting the character string of the voice recognition result input from the voice recognition device 100 to each character input/output device 300, correcting it as necessary. The data processing device 200 includes recognition result correction means 210, mode setting means 220, text processing means 230, text editing means 240, and screen control means 250.

The recognition result correction means 210 corrects notation-level errors, such as misconversions, in the character string information obtained by the voice recognition device 100. The recognition result correction means 210 is the processing means that realizes the function of the notation level correction means 11 described above.

The mode setting means 220 sets a text display mode for each reader. This embodiment provides three text display modes: a detailed mode, a caption (summary) mode, and a simple mode. The detailed mode displays the text expressed in the input speech as it is. The caption mode displays the text expressed in the input speech in a summarized form, like captions. The simple mode displays a list of the keywords that are the key points for obtaining information from the text expressed in the input speech. The mode setting means 220 sets a reader's text display mode, for example, in response to a request from that reader. The mode setting means 220 is the processing means that realizes the function of the mode setting means 17 described above.

The text processing means 230 processes text according to each mode. The text processing means 230 is the processing means that realizes the function of the text processing means 15 described above.

The text editing means 240 edits the text processed by the text processing means 230 in response to correction operations by the operator. The text editing means 240 is the processing means that realizes the function of the correction instruction receiving means 13 described above.

The screen control means 250 performs processing, at predetermined timings, for displaying the text generated from the character string of the voice recognition result on each character input/output device 300. This processing includes causing each character input/output device 300 to update the portions edited (corrected) by the text editing means 240. In this embodiment, the screen control means 250 causes each character input/output device 300 to display the text when correction by the recognition result correction means 210 is complete. Thereafter, when a text display mode is set for a reader or when the text is edited by the text editing means 240, it redisplays the text so that the change and its location can be recognized. The screen control means 250 is the processing means that realizes the functions of the corrected text display means 12, the screen updating means 14, and the processed text display means 16 described above.

The character input/output device 300 is an input/output device for displaying the text corrected, processed, and edited by the data processing device 200 and for inputting operations on that text. It is realized by, for example, a personal computer (PC) with a network communication function, equipped with an output device such as a display and input devices such as a mouse and keyboard. This embodiment is described using an example with two PCs: an operator supports text editing on one, and a user who has difficulty obtaining information by voice (a reader) views the corrected text on the other. Two or more PCs may be provided for operators, and two or more for readers. A configuration in which one of the CPUs of a PC realizing a character input/output device 300 also realizes the data processing device 200 is also possible.

Next, the operation of this embodiment is described. FIG. 4 is a flowchart showing an operation example of the text correction support system 1000. In the example shown in FIG. 4, the data processing device 200 runs a loop that repeats the processing of steps S10 to S70 for each unit of character string information (for example, character string information for text divided into segments of roughly the length a person speaks before pausing for breath when speaking in public). Steps S60 and S70 may instead be taken out of the loop and processed at timings determined by operator operations.

Although not shown in the figure, speech uttered by a person is first input to the voice recognition device 100. The voice recognition device 100 converts the input speech, by voice recognition processing, into character string information that is a textual representation of the speech, and outputs it to the data processing device 200. The character string information is assumed to include not only the character string obtained when the speech is written as text but also information such as readings and phrase boundaries.

In the data processing device 200, the recognition result correction means 210 takes the character string information output from the voice recognition device 100 as input and corrects the text that it indicates, that is, the recognition result of the voice recognition device 100 (step S10). The character string information indicating the text corrected by the recognition result correction means 210 is input to the text processing means 230 as the source data for displaying processed text.

The recognition result correction means 210 corrects notation-level errors, such as misconversions, in the text indicated by the character string information. A notation-level error here means a word, such as a homonym, that exists as a Japanese word but is wrong in the written text.

In this example, the recognition result correction means 210 has a word dictionary and corrects notation-level errors by checking against it. Only corrections that can be made by dictionary lookup are performed; corrections that require judging from the surrounding context are not. Such corrections are supplied by applying existing techniques in the text processing means 230 or by correction instructions from the operator.

When the recognition result correction means 210 has finished correcting a series of character string information, the screen control means 250 displays the corrected text obtained by the recognition result correction means 210 on the screen of each character input/output device 300 as the primary text (step S20). For example, each character input/output device 300 may access the data processing device 200 in advance, and the screen control means 250 may display the corrected text on each screen by sending a screen update request message, containing the character string information of the corrected text, to the terminal application software downloaded at that time. Alternatively, screen information for a Web page that incorporates periodic automatic updating may be prepared in advance, and each character input/output device 300 may keep that Web page open. The corrected text is then displayed by reflecting (posting) it on the Web page in response to automatic update requests from each character input/output device 300.
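The second approach, in which terminals periodically request updates, could be sketched roughly as follows. The description does not define a message format or transport, so the `SentenceBoard` class, its method names, and the JSON layout are hypothetical, and networking is left out entirely.

```python
import json


class SentenceBoard:
    """Holds published sentences and answers periodic update requests (sketch)."""

    def __init__(self):
        self._sentences = []

    def publish(self, text):
        """Store a newly corrected sentence and return its sequence number."""
        self._sentences.append(text)
        return len(self._sentences)

    def updates_since(self, last_seen):
        """Return, as JSON, every sentence published after `last_seen`."""
        if last_seen < 0:
            raise ValueError("last_seen must not be negative")
        payload = {
            "latest": len(self._sentences),            # number to send next time
            "sentences": self._sentences[last_seen:],  # sentences not yet shown
        }
        return json.dumps(payload, ensure_ascii=False)


board = SentenceBoard()
board.publish("会議のきかい")
first_poll = json.loads(board.updates_since(0))
board.publish("明日は中止です")
second_poll = json.loads(board.updates_since(first_poll["latest"]))
print(second_poll)
```

Each terminal remembers the `latest` value from its previous request, so a poll returns only the sentences it has not displayed yet.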

The subsequent processing by the mode setting means 220 and the text editing means 240 requires intervention (operation) by a reader or operator, and is therefore likely to cause a time loss of around several tens of seconds in obtaining information. For this reason, the text is displayed as the primary text without waiting for reader or operator operations, so that the situation (the main points of what is being said) can be grasped as early as possible. FIG. 5 is an explanatory diagram showing an example of a screen displayed on the character input/output device 300. As shown in FIG. 5, for example, the screen control means 250 displays the primary text in the upper part of the screen and then displays the processed or edited text as the secondary text in the lower part, a layout that preserves real-time responsiveness while making the correct information easy to grasp.

Next, the mode setting means 220 sets a text display mode for each reader in response to the reader's request (step S30). This example provides three selectable text display modes: a "simple mode" that provides rough information, a "caption mode" at a level comparable to captions, and a "detailed mode" in which the recognition result is further refined. The mode setting means 220 receives the text display mode specified by the reader on the character input/output device 300 as an event, and sets or changes the per-reader text display mode managed (held) by the data processing device 200. It then performs processing for outputting processed text according to the set mode, for example by notifying the text processing means 230 that a given character input/output device 300 has set or changed its mode.
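The per-reader mode handling in step S30 can be sketched as a small registry that stores each reader's mode and notifies a listener when it changes. The names `DisplayMode`, `ModeSettings`, and `handle_mode_event`, the string values of the modes, and the choice of the detailed mode as the default are assumptions for illustration; the description does not state a default mode or an event format.

```python
from enum import Enum


class DisplayMode(Enum):
    DETAILED = "detailed"
    CAPTION = "caption"
    SIMPLE = "simple"


class ModeSettings:
    """Keeps the display mode of each reader (hypothetical sketch)."""

    def __init__(self, on_change=None, default=DisplayMode.DETAILED):
        self._modes = {}             # reader id -> DisplayMode
        self._on_change = on_change  # e.g. a callback into the text processing step
        self._default = default      # assumed default; not stated in the description

    def handle_mode_event(self, reader_id, mode_name):
        """Set or change a reader's mode from an event sent by the terminal."""
        mode = DisplayMode(mode_name)  # raises ValueError for an unknown mode name
        previous = self._modes.get(reader_id)
        self._modes[reader_id] = mode
        # Notify only when the mode was newly set or actually changed.
        if mode is not previous and self._on_change is not None:
            self._on_change(reader_id, mode)
        return mode

    def mode_for(self, reader_id):
        return self._modes.get(reader_id, self._default)


notifications = []
settings = ModeSettings(on_change=lambda reader, mode: notifications.append((reader, mode)))
settings.handle_mode_event("reader-1", "simple")
settings.handle_mode_event("reader-1", "simple")  # unchanged, so no second notification
print(settings.mode_for("reader-1"), notifications)
```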

A character input/output device 300 used as an operator terminal may be allowed to have more than one mode set at the same time. For example, when text is edited on a single operator terminal, the processed text for every mode is displayed on that one terminal. Alternatively, the operator may check only the detail mode, and corrections made to the detail mode may be reflected in the other modes.

Next, the text processing means 230 processes the text (here, the text corrected by the recognition result correction means 210) according to the mode set for each reader by the mode setting means 220 (step S40). For the "simple mode", for example, the keyword extraction method disclosed in Japanese Unexamined Patent Application Publication No. 2005-107793 may be applied to generate character string information written as a list of keywords that can serve as key points for obtaining information. For the "subtitle mode", for example, the automatic summarization device disclosed in Japanese Unexamined Patent Application Publication No. 2007-233823 may be applied to generate character string information written in a summarized, subtitle-like form. For the "detail mode", for example, the text processing device disclosed in Japanese Unexamined Patent Application Publication No. H11-171058 may be applied to generate character string information in which the recognition result is refined to the point where it reads as correct Japanese.

The processed text for each mode may be generated in advance, without waiting for a notification from the mode setting means 220. When the notification arrives, the processed text for the mode set for that reader is then output to the screen control means 250. Alternatively, an initial mode may be set in advance, and processing may proceed in the currently set mode until a switching request (event) arrives from the reader. When a reader issues a switching request, the processing from step S30 onward is performed again.

When the text processing means 230 finishes processing a series of character string information according to the mode set by a given reader, the screen control means 250 displays the resulting processed text for that reader's text display mode as the secondary text on the screen of the character input/output device 300 that the reader is using (step S50). The processing for displaying the secondary text may be the same as in step S20. For example, the screen control means 250 may send a screen update request message containing the character string information of the processed text to the terminal application software installed on the target character input/output device 300, thereby adding the processed text to that device's screen. Alternatively, the processed text may be added by reflecting (posting) it on the Web page in response to an automatic update request issued by the target character input/output device 300 in accordance with the screen information of that Web page. When the secondary text is displayed, it is preferably displayed in association with the primary text that is already on screen.

Next, the text editing means 240 accepts a text correction instruction from a character input/output device 300 used as an operator terminal and edits the text according to that instruction (step S60). For example, when the text editing means 240 receives information indicating a range set on the displayed text by an operator operation, it may store the range information and prompt for input of a corrected character string for the characters within that range. It then receives the corrected character string entered for the set range. For example, the screen information displayed on the operator terminal may incorporate a function that accepts a range setting operation on the displayed text and lets the operator enter a corrected character string; the necessary information is sent to the data processing device 200 in response to the screen operations, and the text editing means 240 receives it. The text editing means 240 may also perform the screen control processing of steps S20 and S50 for the operator terminal.

When a correction is confirmed, the text editing means 240 instructs the screen control means 250 to update the screen for the portion of the text edited by that correction.

The screen control means 250 updates the screen display of each character input/output device 300 so that the portion edited by the text editing means 240 in response to the operator's correction instruction is identifiable (step S70). For example, the screen control means 250 may send a screen update request message containing information about the edited portion (for example, the corrected content, the start position and length of the corrected range, and the send counter of the text concerned), so that each character input/output device 300 displays the edited text with the edited portion marked. The send counter is information that identifies a text in the unit in which it is processed as one series; for example, it is a number assigned to identify the character string information each time character string information is sent in step S20. Each character input/output device 300 uses the send counter sent from the data processing device 200 to locate the processed text to be updated, and then overwrites the characters in the range given by the sent start position and length with the operator's correction. The overwritten characters are rendered so as to be distinguishable from the other characters (for example, in a different color).
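
The terminal-side handling of such an update message can be sketched as follows. This is an illustration under assumptions, not the message format of the specification: the `ScreenUpdate` class, its field names, and the `apply_update` function are hypothetical, and the displayed texts are assumed to be kept in a dictionary keyed by send counter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenUpdate:
    send_counter: int  # identifies which displayed text to update
    start: int         # start position of the corrected range, in characters
    length: int        # number of characters being replaced
    corrected: str     # replacement text entered by the operator


def apply_update(displayed, update):
    """Overwrite the corrected range in place.

    `displayed` maps send counter -> text currently on screen.
    Returns the (start, end) span of the new text so the caller can
    render it in a distinguishing color.
    """
    text = displayed[update.send_counter]
    end = update.start + update.length
    if update.start < 0 or end > len(text):
        raise ValueError("correction range lies outside the displayed text")
    displayed[update.send_counter] = (
        text[:update.start] + update.corrected + text[end:]
    )
    return update.start, update.start + len(update.corrected)
```

Returning the span rather than styled text keeps the sketch independent of how a given terminal draws highlighted characters.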

Alternatively, the screen control means 250 may update the screen display by reflecting (posting) the latest edited text on the Web page, with the edited portion highlighted, in response to an automatic update request issued in accordance with the screen information of the Web page displayed on each character input/output device 300.

The operation of this embodiment is described below using a concrete example. FIG. 6 is an explanatory diagram showing an example of how the text correction support system 1000 may be deployed. In the example shown in FIG. 6, a server connected to a microphone serves as both the speech recognition device 100 and the data processing device 200. A PC connected to the server via a network such as a LAN is used as each reader's character input/output device 300 (reader terminal). Likewise, a PC connected to the server via the network is used as each operator's character input/output device 300 (operator terminal). In this example, the speech recognition device 100 is realized by existing software on the server that has a speech word processor function (a function that converts speech into a character string), or more precisely by a CPU operating according to that software. The data processing device 200 is realized by software on the server that has a text correction function (the functions of the processing means of the data processing device 200). The software with the text correction function is assumed to be implemented as a plug-in to the software with the speech word processor function.

Each PC that realizes a character input/output device 300 is assumed to be capable of communication based on a communication protocol such as TCP/IP. Client software that communicates with the server to update the screen display and to send correction instructions may be installed on it. It is also possible to use existing software such as a browser instead of preparing dedicated client software.

On the server, audio output from a speaker is input through the microphone, and the software with the speech word processor function, which realizes the speech recognition device 100, converts it into character string information by performing speech recognition processing. The converted character string information is input successively to the software with the text correction function, which realizes the data processing device 200.

When character string information is input, the recognition result correction means 210, realized by the software with the text correction function, performs correction processing. The following example assumes that the utterance was 「すでにつうちぶんなどでごれんらくのとおり、ろくがつよっかげつようびごぜんじゅうじごふんから、にせんななねんどほんしゃびるぼうさいくんれんをじっしいたします。」 ("As already communicated in the notice and elsewhere, the FY2007 head office building disaster drill will be held from 10:05 a.m. on Monday, June 4.") and that the speech recognition result output for it was character string information representing the text 「素手に通知文などでご連絡のとうり、６月４日月曜日午前１０時５分から、２００７年度御社ビル防災訓練を実施いたします。」, in which, among other things, 「既に」 (sude ni, "already") has become 「素手に」 ("with bare hands"), 「とおり」 is misspelled 「とうり」, and 「本社」 (honsha, "head office") has become 「御社」 (onsha, "your company").

FIG. 7 is a flowchart showing an example of the correction processing of the recognition result correction means 210. For each word contained in the character string information output from the speech recognition device 100, the recognition result correction means 210 performs the series of steps in the flowchart of FIG. 7 and thereby generates the primary text, which is the corrected text (more precisely, character string information representing the primary text). Here, each word contained in the character string information is collated with a word dictionary to determine whether the Japanese is used correctly as written text. Words that are erroneous as Japanese, and words for which it cannot be determined whether they are erroneous, are converted to hiragana for the time being, and the primary text is generated in this way.

FIG. 8 is an explanatory diagram showing an example of the format of the word dictionary. In the word dictionary, the reading and the written character string of each word are registered in association with each other. In the example shown in FIG. 8, the information for each word is registered in the form word[][][]. The first array subscript of word[][][] is a number that identifies the first character of the reading. For example, words beginning with 「あ」 (a) are registered as word[11][][], words beginning with 「い」 (i) as word[12][][], words beginning with 「お」 (o) as word[15][][], words beginning with 「か」 (ka) as word[21][][], and words beginning with 「さ」 (sa) as word[31][][]. That is, in the example shown in FIG. 8, the tens digit of the first subscript corresponds to the position of the consonant row in gojūon (Japanese syllabary) order, and the ones digit corresponds to the position of the vowel. Instead of separating consonants and vowels, each sound may simply be numbered consecutively in gojūon order and that number used as the subscript.
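
As an illustration of this numbering, the following Python sketch derives the first subscript from the first kana of a reading. The function name `first_subscript` and the table `GOJUON_ROWS` are hypothetical. The source gives values only for the full five-vowel rows (11, 12, 15, 21, 31, and 33 for 「す」); the treatment of the や row and the omission of the わ row, 「ん」, and voiced kana are assumptions of this sketch.

```python
# Rows of the gojūon table; "_" marks a cell with no kana (assumed layout).
GOJUON_ROWS = [
    "あいうえお", "かきくけこ", "さしすせそ", "たちつてと", "なにぬねの",
    "はひふへほ", "まみむめも", "や_ゆ_よ", "らりるれろ",
]


def first_subscript(reading):
    """Return row * 10 + vowel position for the first kana of `reading`."""
    if not reading or reading[0] == "_":
        raise ValueError("reading must start with a hiragana character")
    head = reading[0]
    for row_no, row in enumerate(GOJUON_ROWS, start=1):
        col = row.find(head)
        if col != -1:
            return row_no * 10 + (col + 1)
    # わ row, ん, and voiced kana are not covered by the source's example.
    raise ValueError(f"unsupported first kana: {head!r}")
```

With this mapping, a lookup for a word read 「すで」 only needs to scan the word[33][][] group, which is the narrowing effect the numbering is meant to provide.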

The second array subscript is a number that identifies, among the words whose reading begins with a given character, the first written character of the word. For example, words whose first written character is 「愛」 are registered in word[11][1][], and words whose first written character is 「藍」 are registered in word[11][2][]. Instead of using an array subscript per first written character, the entries may simply be registered in gojūon order of their readings.

The word dictionary may also register misuse entries: for a word whose spoken form and written form differ in reading, the reading of the spoken form is registered in association with the correct written character string. The example in FIG. 8 shows a misuse entry in which the spoken reading 「とうり」 is associated with the written character string 「とおり」.

The word dictionary may also contain, for each word, information on words and expressions that are highly likely to combine with it, such as the words that modify it and the words it modifies.

First, as shown in FIG. 7, the recognition result correction means 210 fetches the text (character string information) output by the speech recognition device 100 into an input buffer to use it as input information (step S101). At fetch time the text is split into words, and the following scan is performed for each word. That is, the recognition result correction means 210 loops over the scan processing of steps S103 to S108 once for each of the split words (step S102). The word currently being scanned is referred to below as word X.

As the scan processing inside the loop, the recognition result correction means 210 first identifies the first character of word X (step S103). It then collates word X with the word dictionary on the basis of the identified character (step S104). For example, when the word X being scanned is 「素手」 (sude, "bare hands"), the first character is identified as 「す」 in step S103, and the dictionary is collated on the basis of 「す」. In the example shown in FIG. 8, only the word[33][][] portion, which holds the registered words whose first character is 「す」, needs to be scanned.

If words with a matching first character are registered (Yes in step S105), the registered words beginning with that character are scanned in order and the following processing is performed (step S106). Here, the recognition result correction means 210 checks whether word X, the word being collated, matches the registered word currently being compared (step S107). If they do not match, it moves on to the next registered word and returns to step S107 to repeat the same processing (step S108).

If word X matches the registered word being compared, the recognition result correction means 210 determines whether the word dictionary contains a homophone of that registered word (step S109). If a homophone exists, word X is regarded as ambiguous as Japanese and is converted to hiragana (step S110). If no homophone exists, the notation of the registered word is used as is (step S111).

For example, the recognition result correction means 210 determines whether the array group being collated (here, word[33][][]) contains an entry in which the same word as word X is registered and, if it does, further determines whether another entry with the same reading exists. In the example shown in FIG. 8, when word X is 「素手」, the entry 「素手」 registered as word[33][1][0] is found to match word X, and the entry 「既」 registered as word[33][2][0] is found to be a homophone of word X. In such a case, the 「素手」 in the recognition result is replaced with the hiragana 「すで」.

If, in step S104, collation by the first character of word X finds no match (No in step S105), word X is regarded as not being correct Japanese in the first place. Because a kanji rendering of such a word would make the text hard to read, it is converted to hiragana (step S112). It is also possible to deliberately leave difficult kanji words out of the word dictionary, even when they are correct Japanese, so that they are converted to hiragana.

Likewise, if the scan of steps S106 to S108 ends without finding any entry that matches word X in the array group being collated (the group of words whose first character matches that of word X), word X is regarded as not being correct Japanese and is converted to hiragana (step S112).

When the above series of processing is finished for word X, the next split word becomes the word being collated (word X), and the same processing is repeated by returning to step S103 (step S113).
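
The loop of steps S101 to S113 can be summarized in a simplified Python sketch. It rests on several assumptions that go beyond the text: each input word arrives as a (written form, reading) pair, the dictionary is keyed directly by the first kana of the reading instead of the numeric subscripts of FIG. 8, and misuse entries are consulted only when no registered word matches. The function name `correct_words` and all data shapes are hypothetical.

```python
def correct_words(words, dictionary, misuse=None):
    """Return the primary text for a list of (written, reading) pairs.

    dictionary: first kana of reading -> list of (reading, written) entries
    misuse:     spoken reading -> correct written form (e.g. とうり -> とおり)
    """
    misuse = misuse or {}
    result = []
    for written, reading in words:                       # S102 / S113 loop
        bucket = dictionary.get(reading[0], [])          # S103 to S105
        match = next(
            (entry for entry in bucket if entry[1] == written), None
        )                                                # S106 to S108
        if match is None:
            # Not found: fall back to hiragana, or to a registered
            # correction for a known misuse (S112).
            result.append(misuse.get(reading, reading))
        elif any(r == match[0] and w != written for r, w in bucket):
            result.append(reading)                       # homophone: S110
        else:
            result.append(written)                       # keep as is: S111
    return "".join(result)
```

For example, with 「素手」 and 「既」 both registered under the reading 「すで」, the pair (「素手」, 「すで」) comes out as the hiragana 「すで」, while a word with no homophone such as 「通知文」 keeps its kanji notation.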

It is assumed here that the text resulting from speech recognition by the speech recognition device 100 is basically expressed in mixed kanji and kana, and that a hiragana-only version of the text is also held as internal information. In the processing of steps S110 and S112, the conversion can therefore be performed by using pattern matching to extract the part of the hiragana-only text that corresponds to the kanji in question. If the speech recognition result is expressed in hiragana, step S107 may instead collate against the readings of the registered words.

For a habitual misspelling such as 「とうり」, the fact that 「とうり」 is correctly written 「とおり」 is registered in the word dictionary in advance and used in collation.

In this example, the parts that should be written 「既に」 (sude ni, "already"), 「とおり」 (tōri, "as"), and 「防災」 (bōsai, "disaster prevention") have been recognized as 「素手に」 ("with bare hands"), 「とうり」, and 「亡妻」 (bōsai, "deceased wife"), respectively, which makes the Japanese hard to understand. For the phrase 「素手に」, the recognition result correction means 210 confirms from the word dictionary that the reading can be converted in two ways, 「素手に」 and 「既に」. It recognizes that neither can be judged correct or erroneous as Japanese without checking the context, and therefore converts the phrase to the hiragana 「すでに」 for the time being.

The phrase 「とうり」 is confirmed not to be registered in the word dictionary as an ordinary word and is recognized from the dictionary as a misspelling of 「とおり」, so it is converted to 「とおり」.

As for 「亡妻」, the recognition result correction means 210 recognizes that the following word is 「訓練」 (kunren, "drill") and, by collation with the word dictionary, converts it to 「防災」. In the end, the recognition result correction means 210 corrects the text to 「既に通知文などでご連絡のとおり、６月４日月曜日午前１０時５分から、２００７年度御社ビル防災訓練を実施いたします。」. At this stage, the misrecognition of 「本社」 (head office) as 「御社」 (your company) remains uncorrected.

When the correction by the recognition result correction means 210 is complete, the screen control means 250 causes each character input/output device 300 to display the corrected text as the primary text. As shown in FIG. 9, it is also possible to display the speech recognition result as is (as a zeroth-order text) at the moment it is obtained from the speech recognition device 100, and to display the corrected text immediately as the detail mode (the default mode). In that case, the processing by the text processing means 230 for the detail mode is omitted.

If the text in either the upper or the lower part of the display screen becomes too long for its display area, that area is scrolled upward and new text is displayed from the bottom, which makes the screen easier to read.

Suppose now that a reader selects "simple mode" or "subtitle mode" from the mode list provided in the center of the screen. The character input/output device 300 serving as that reader's terminal notifies the server (the data processing device 200) of a mode change event in response to the reader's screen operation. In the data processing device 200, the mode setting means 220 sets the mode on the basis of the mode change event notification, and text suited to the new mode is displayed on that character input/output device 300.

On receiving the mode setting made by the mode setting means 220, the text processing means 230 recognizes the state of the set mode and outputs text processed according to that mode.

FIG. 10 is an explanatory diagram showing an example of the screen display on the character input/output device 300 when the subtitle mode is set. FIG. 11 is an explanatory diagram showing an example of the screen display on the character input/output device 300 when the simple mode is set.

The text editing means 240 starts operating when triggered by the output of processed text from the text processing means 230. In this example, the reader terminals, the operator terminals, and the server have been made able to communicate in advance through a LAN connection or the like, so each time processed text is output to the screen of a reader terminal, the same processed text is also displayed on the operator terminal. Text is sent in units on which one series of processing is performed, and the send counter is incremented by one each time a transmission takes place.

The text editing means 240 or the screen control means 250 causes the operator terminal to display the processed text shown in the lower part of the screen in an editable form. The text editing means 240 then performs editing processing according to the processing flow shown in FIG. 12.

FIG. 12 is a flowchart showing an example of the editing processing by the text editing means 240. When the operator looks at the processed text displayed on the screen of the operator terminal and judges that some portion needs correction, the operator selects that portion as a range with a mouse or similar device. In response to the operator's screen operation, the operator terminal notifies the server (the data processing device 200) of a range setting notification event. For example, it sends the data processing device 200 a range setting notification event containing the start position (in characters) and the length of the selected range and the send counter of the processed text concerned. FIG. 13 is an explanatory diagram showing an example of the screen image for the range setting operation on the operator terminal.

In the data processing device 200, the text editing means 240 receives the range setting notification event (step S201). The text editing means 240 returns a response acknowledging the range setting to the operator terminal and causes the operator terminal to display a screen for entering the correct text for that range (step S202).

The operator terminal then accepts the operator's input of the correct text for the set range and sends the accepted corrected text to the server. FIG. 14 is an explanatory diagram showing an example of the screen image for the corrected text input operation on the operator terminal.

In the example shown in FIG. 14, the operator confirms that the entered text contains no errors and presses the Enter key on the keyboard, whereupon the corrected text for the selected range is sent to the data processing device 200. Here, the operator is instructing that the portion written 「御社」 (your company) be corrected to 「本社」 (head office).

On receiving the corrected text for the selected range (step S203), the text editing means 240 of the data processing device 200 edits the text by reflecting the corrected text in the processed text (step S204). If the corrected portion contains characters that are also displayed in other modes, the text editing means 240 edits the corresponding portion in those modes as well. It then causes the screen control means 250 to update the screen to match the edited text (step S205).
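
Steps S203 and S204, including the propagation to other modes, might look like the following Python sketch. The specification does not say how the corresponding portion in another mode is located; this sketch simply searches for the first occurrence of the original characters, which is an assumption, as are the function name `apply_correction` and the mode keys.

```python
def apply_correction(texts_by_mode, edited_mode, start, length, corrected):
    """Apply an operator correction and mirror it into other modes.

    texts_by_mode: mode name -> processed text for one send counter.
    Returns a new dict; the input is left unchanged.
    """
    source = texts_by_mode[edited_mode]
    original = source[start:start + length]
    updated = dict(texts_by_mode)
    updated[edited_mode] = source[:start] + corrected + source[start + length:]
    for mode, text in texts_by_mode.items():
        # Assumed rule: the "corresponding portion" is the first place
        # where the same original characters appear in the other mode.
        if mode != edited_mode and original and original in text:
            updated[mode] = text.replace(original, corrected, 1)
    return updated
```

A first-occurrence search can pick the wrong position when the same characters appear more than once, so a real system would more likely track character alignments between the modes; the sketch only shows the overall data flow.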

For example, the screen control means 250 tracks the processed text to be updated based on the content of the correction edited by the text editing means 240, the start position and length of the range, and the text transmission counter. It then seeks to the portion corresponding to the character-unit start position and length information transmitted to each character input/output device 300. From the sought position, it overwrites the text with the operator's correction in a color different from the original (for example, red).
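The seek-and-overwrite step can be sketched as follows. This is a hedged illustration only: representing the screen as a list of `(text, color)` tuples, keying transmitted texts by an integer send counter, and the function names `overwrite` and `apply_update` are assumptions introduced for the example.

```python
from itertools import groupby


def overwrite(spans, start, length, corrected, color="red"):
    """Replace `length` characters at offset `start` with `corrected` in `color`.

    `spans` is a list of (text, color) tuples; offsets count characters
    across the concatenated span texts.
    """
    # Flatten to one (character, color) cell per displayed character.
    cells = [(ch, c) for text, c in spans for ch in text]
    if start < 0 or length < 0 or start + length > len(cells):
        raise ValueError("range lies outside the displayed text")
    # Seek to the start position and overwrite the range in the marker color.
    cells[start:start + length] = [(ch, color) for ch in corrected]
    # Regroup adjacent cells that share a color back into spans.
    return [("".join(ch for ch, _ in group), c)
            for c, group in groupby(cells, key=lambda cell: cell[1])]


def apply_update(sent, counter, start, length, corrected):
    """Find the transmitted text by its send counter and overwrite the range."""
    sent[counter] = overwrite(sent[counter], start, length, corrected)
    return sent[counter]
```

For instance, with `sent = {7: [("御社で会議を行います", "black")]}`, calling `apply_update(sent, 7, 0, 2, "本社")` leaves the first two characters replaced and marked red while the rest stays black.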

The operator's editing operation is performed in a state that is entirely invisible on the screens of the reader terminals. This creates an environment in which readers can concentrate solely on reading the currently displayed text. Once the operator's editing operation is confirmed, the portion to be corrected is redisplayed in a color-coded form. Although FIG. 14 shows an example using highlighted (reverse-video) display, the corrected screen image is also applied to the reader terminals through the screen update processing of the screen control means 250.

In addition, if the updated portion has been scrolled out of view on the screen of a reader terminal, the reader may be notified that updated text exists, for example by displaying a 「※」 mark to the right of the corrected-text display field.
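One way to decide when to show the 「※」 mark is sketched below. The line-based viewport model and the function names are assumptions made for illustration; the specification does not state how scroll position is tracked.

```python
def is_hidden(updated_line, first_visible_line, visible_lines):
    """Return True when the updated line is outside the reader's viewport."""
    last_visible_line = first_visible_line + visible_lines - 1
    return not (first_visible_line <= updated_line <= last_visible_line)


def update_mark(updated_lines, first_visible_line, visible_lines):
    """Return the marker to show beside the corrected-text display field."""
    hidden = [n for n in updated_lines
              if is_hidden(n, first_visible_line, visible_lines)]
    # Show the mark only if at least one update is scrolled out of view.
    return "※" if hidden else ""
```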

In this example, the operator should be a person who is in an environment where the original speech input to the speech recognition device 100 can be heard, and who can largely understand the meaning, background, and situation from the speech alone.

Operators may also be assigned to each of the simple, subtitle (summary), and detailed modes and made to perform the editing work for their mode. For example, as shown in FIG. 15, the operator in charge of the subtitle mode may also check how the text reads as a subtitle. In the example shown in FIG. 15, the operator instructs that the ending 「いたします」 of 「実施いたします」 be corrected to 「します」, replacing the humble verb form with the plain polite form.

As described above, according to this embodiment, even when speech recognition is prone to misconversion, for example because noise other than the sound source easily enters the microphone, that variation (「ゆれ」) is absorbed as far as possible through processing such as text editing support, so that text of higher accuracy can be output. The present invention also proposes, as one configuration, a minimal human interface that works through a supporting operator. Including manual support makes clearly erroneous text easy to identify, while display control presents the text without impairing real-time performance. As a result, even in an environment where information is available only as speech, quick assessment of the situation becomes easier, and in particular a foundation is formed on which hearing-impaired people can readily communicate on their own initiative.

For example, in an environment where information is available only as speech, a hearing-impaired person cannot grasp even such things as what is happening where, who is laughing now and at what, or who is stating what opinion about what, and must take action each time to have the people nearby explain the situation, for example in writing. With the present invention, the information needed to assess the situation can be obtained without impairing real-time performance.

Note that attempting to judge accurately whether text is erroneous by computer processing alone raises problems such as requiring text of a certain length or taking processing time. In this embodiment, ambiguous items are first converted to hiragana and the operator's correction instructions are then accepted, so text that excludes conversion results likely to reduce readability can be output in a shorter span.
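The hiragana fallback can be illustrated with the following sketch, which follows the dictionary collation stated in the claims: a word is written in hiragana when it is not registered in the word dictionary, or when the dictionary holds a different notation with the same reading. The input format (a list of `(surface, reading)` pairs with hiragana readings) and the dictionary layout (`{surface: reading}`) are assumptions; the actual word dictionary format of the embodiment is not reproduced here.

```python
def correct_notation(words, dictionary):
    """Return the corrected sentence for `words`, a list of (surface, reading).

    `dictionary` maps a registered surface form to its hiragana reading.
    """
    # Collect every registered notation for each reading.
    notations = {}
    for surface, reading in dictionary.items():
        notations.setdefault(reading, set()).add(surface)

    out = []
    for surface, reading in words:
        # Rule 1: the word is not registered in the dictionary.
        unregistered = surface not in dictionary
        # Rule 2: another notation with the same reading is registered.
        homophone = bool(notations.get(reading, set()) - {surface})
        out.append(reading if unregistered or homophone else surface)
    return "".join(out)


# Hypothetical dictionary: 会議 and 懐疑 share the reading かいぎ.
dictionary = {"会議": "かいぎ", "懐疑": "かいぎ", "を": "を", "実施": "じっし"}
words = [("本日", "ほんじつ"), ("会議", "かいぎ"), ("を", "を"), ("実施", "じっし")]
corrected = correct_notation(words, dictionary)
```

Here 「本日」 is unregistered and 「会議」 has a registered homophone, so both are shown in hiragana and left for the operator to confirm, while 「実施」 is kept as written.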

In addition, ambiguity, redundancy, and the like can be removed from the character string information output in large quantities by speech recognition processing, so information can be provided in a more easily understood form. The level at which uttered text is displayed can be set at the discretion of the person concerned. Depending on that person's needs, skills, and so on, the text can be condensed to the level of movie subtitles or, more simply, to the level of bullet points. This makes it possible to form an environment in which the person concerned acquires information in a natural manner, without accumulating stress, even where only speech is available.

The present invention is not limited to text indicated by character string information output as a speech recognition result, and can be suitably applied to any system that handles text updated in real time.

FIG. 1 is a block diagram showing a configuration example of the text correction support system according to the present invention.
FIG. 2 is a block diagram showing another configuration example of the text correction support system according to the present invention.
FIG. 3 is a block diagram showing a more specific configuration example of the text correction support system.
FIG. 4 is a flowchart showing an operation example of the text correction support system 1000.
FIG. 5 is an explanatory diagram showing an example of a screen displayed on the character input/output device 300.
FIG. 6 is an explanatory diagram showing an example of how the text correction support system 1000 is deployed.
FIG. 7 is a flowchart showing an example of the correction processing of the recognition result correction means 210.
FIG. 8 is an explanatory diagram showing an example of the format of the word dictionary.
FIG. 9 is an explanatory diagram showing an example of a screen displayed on the character input/output device 300.
FIG. 10 is an explanatory diagram showing an example of the screen display on the character input/output device 300 when the subtitle mode is set.
FIG. 11 is an explanatory diagram showing an example of the screen display on the character input/output device 300 when the simple mode is set.
FIG. 12 is a flowchart showing an example of the editing processing by the text editing means 240.
FIG. 13 is an explanatory diagram showing an example of a screen image for the range setting operation on the operator terminal.
FIG. 14 is an explanatory diagram showing an example of a screen image for the corrected-text input operation on the operator terminal.
FIG. 15 is an explanatory diagram showing an example of a screen image for text correction on the operator terminal.

Explanation of symbols

1 Data processing device
11 Notation level correction means
12 Corrected text display means
13 Correction instruction receiving means
14 Screen updating means
15 Text processing means
16 Processed text display means
17 Mode setting means
100 Speech recognition device
200 Data processing device
210 Recognition result correction means
220 Mode setting means
230 Text processing means
240 Text editing means
250 Screen control means
300 Character input/output device

Claims

A sentence correction support system for presenting a sentence indicated by character string information, which is text-form expression data of speech obtained by speech recognition processing, as a sentence that is easy for a reader to understand, the system comprising:
a data processing device to which the character string information, which is the text-form expression data of the speech, is input; one or more reader terminals used by readers; and one or more operator terminals used by operators,
wherein the data processing device includes:
notation level correction means for generating, for the sentence indicated by the input character string information, a corrected sentence in which a word that may be erroneous at the notation level is corrected;
corrected text display means for displaying the corrected sentence generated by the notation level correction means on the screen of each reader terminal and each operator terminal;
correction instruction receiving means for receiving, from an operator terminal, a correction instruction for the text being displayed, issued by an operator's operation; and
screen updating means for reflecting the correction made by the correction instruction received by the correction instruction receiving means on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.

The sentence correction support system, further comprising: text processing means for generating processed text obtained by processing the corrected sentence according to a predetermined display mode; and
processed text display means for displaying, in correspondence with the corrected sentence being displayed, the processed text for the display mode set for each reader on the screen of the reader terminal used by that reader, and for displaying, on the screen of each operator terminal, at least the processed text for the display modes displayed on the reader terminals,
wherein the correction instruction receiving means receives, from an operator terminal, a correction instruction for the processed text issued by an operator's operation, and
the screen updating means reflects the correction made by the correction instruction for the processed text received by the correction instruction receiving means on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.

The sentence correction support system according to claim 1, wherein the notation level correction means collates each word included in the sentence indicated by the character string information with a word dictionary registered in advance and, when a word is not registered in the word dictionary, generates the corrected sentence by converting that word into hiragana as a word that may be erroneous at the notation level.

The sentence correction support system according to any one of claims 1 to 3, wherein the notation level correction means collates each word included in the sentence indicated by the character string information with a word dictionary registered in advance and, when a word with the same reading but a different notation is registered in the word dictionary, generates the corrected sentence by converting that word into hiragana as a word that may be erroneous at the notation level.

The sentence correction support system according to claim 1, further comprising mode setting means for receiving, from a reader terminal, a display mode switching instruction issued by a reader's operation,
wherein, when the mode setting means receives a display mode switching instruction, the processed text display means displays at least the processed text for the display mode set for that reader on the requesting reader terminal.

The sentence correction support system according to any one of items 5 to 6, wherein the display modes include a detailed mode that presents content almost identical to the utterance, a subtitle mode that removes redundancy from the utterance, and a simple mode that presents the main points of the utterance as bullet points.

A sentence correction support method for presenting a sentence indicated by character string information, which is text-form expression data of speech obtained by speech recognition processing, as a sentence that is easy for a reader to understand,
wherein a data processing device to which the character string information, which is the text-form expression data of the speech, is input:
generates, for the sentence indicated by the input character string information, a corrected sentence in which a word that may be erroneous at the notation level is corrected;
displays the generated corrected sentence on the screen of each reader terminal and each operator terminal;
receives, from an operator terminal, a correction instruction for the text being displayed, issued by an operator's operation; and
reflects the correction made by the correction instruction received from the operator terminal on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.

The sentence correction support method according to claim 7, wherein processed text obtained by processing the corrected sentence according to a predetermined display mode is generated;
in correspondence with the corrected sentence being displayed, the processed text for the display mode set for each reader is displayed on the screen of the reader terminal used by that reader, and at least the processed text for the display modes displayed on the reader terminals is displayed on the screen of each operator terminal;
a correction instruction for the processed text, issued by an operator's operation, is received from an operator terminal; and
the correction made by the correction instruction for the processed text received by the correction instruction receiving means is reflected on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.

A sentence correction support program for presenting a sentence indicated by character string information, which is text-form expression data of speech obtained by speech recognition processing, as a sentence that is easy for a reader to understand,
the program being applied to a data processing device to which the character string information, which is the text-form expression data of the speech, is input,
and causing a computer to execute:
a process of generating, for the sentence indicated by the input character string information, a corrected sentence in which a word that may be erroneous at the notation level is corrected;
a process of displaying the generated corrected sentence on the screen of each reader terminal and each operator terminal;
a process of receiving, from an operator terminal, a correction instruction for the text being displayed, issued by an operator's operation; and
a process of reflecting the correction made by the correction instruction received from the operator terminal on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.

The sentence correction support program according to claim 9, further causing the computer to execute:
a process of generating processed text obtained by processing the corrected sentence according to a predetermined display mode;
a process of displaying, in correspondence with the corrected sentence being displayed, the processed text for the display mode set for each reader on the screen of the reader terminal used by that reader, and of displaying, on the screen of each operator terminal, at least the processed text for the display modes displayed on the reader terminals;
a process of receiving, from an operator terminal, a correction instruction for the processed text issued by an operator's operation; and
a process of reflecting the correction made by the correction instruction for the processed text received by the correction instruction receiving means on the screen of each reader terminal and each operator terminal so that the corrected portion can be identified.