JP2022138750A - Question-answering system, information processing apparatus, information processing method, and program

Info

Publication number: JP2022138750A
Application number: JP2021038814A
Authority: JP
Inventors: 昭一内藤; Shoichi Naito; 慎也井口; Shinya Iguchi; 晋太郎川村; Shintaro Kawamura; 敦子島田; Atsuko Shimada; 真弓中村; Mayumi Nakamura
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2021-03-11
Filing date: 2021-03-11
Publication date: 2022-09-26

Abstract

PROBLEM TO BE SOLVED: In the prior art, when a question answering system cannot answer a given question, the additional request issued in order to answer that question is unclear, so ambiguity remains even in the additionally input question. SOLUTION: A transmission/reception unit 31 of an information processing apparatus 3 transmits an answer to a question, generated by an answer generation unit 37, to an input device 2 (step S14), and receives from the input device 2 at least one of image information and audio information newly input in response to additional request information included in the response to the question (step S12). SELECTED DRAWING: Figure 5

Description

本発明は、質問応答システム、情報処理装置、情報処理方法及びプログラムに関する。 The present invention relates to a question answering system, an information processing device, an information processing method and a program.

自然言語処理を利用したアプリケーションの一つとして、ユーザの質問に自動で回答を生成する質問応答が存在する。質問応答の最も一般的な設定は、ユーザの質問に対して一回で回答を出力する一問一答の設定である。これに対して、ユーザの質問に曖昧性があった場合に、ユーザと対話的にやりとりをしながら回答を生成する技術が知られている。 As one of applications using natural language processing, there is a question answering system that automatically generates answers to user questions. The most common question answering setting is the one-question-one-answer setting in which the user's question is answered once. On the other hand, there is known a technique of generating an answer while interacting with the user when the user's question is ambiguous.

また、近年では、画像が与えられ、その画像を参照した上で質問に回答するＶＱＡ(Visual Question Answering)と呼ばれる分野が存在する。この分野において、自動応答装置１００は、Ｎ次問い合わせについて、受付部１０が、音声および画像の組み合わせとなっている問い合わせを顧客端末２００から受信した場合、音声認識および画像認識の各アルゴリズムを適用して、音声データが示す顧客の会話内容に該当するテキストに変換する技術が知られている（例えば、特許文献１参照）。 Also, in recent years, there is a field called VQA (Visual Question Answering) in which an image is given and a question is answered with reference to the image. In this field, a technique is known in which, for an N-th inquiry, when the reception unit 10 of an automatic response device 100 receives from a customer terminal 200 an inquiry that combines voice and image, the device applies speech recognition and image recognition algorithms and converts the inquiry into text corresponding to the content of the customer's conversation indicated by the voice data (see, for example, Patent Document 1).

しかしながら、従来の技術では、所定の質問に対して回答を応答できない場合に、応答するために与える追加要求が不明確であるために、追加で入力される質問においても曖昧性が残ってしまうという課題があった。 However, in the conventional technology, when a given question cannot be answered, the additional request issued in order to answer it is unclear, so there has been a problem that ambiguity remains even in the additionally input question.

上述した課題を解決するために、請求項１に係る発明は、所定の質問を入力する入力装置と、前記入力装置が送信した前記所定の質問に対する回答を応答する情報処理装置と、有する質問応答システムであって、前記情報処理装置は、前記入力装置が送信した前記所定の質問を表す質問情報を受信する受信手段と、前記所定の質問に対する回答を表す回答情報、又は、前記回答情報を生成するために前記所定の質問に関連付けられた追加要求を表す追加要求情報を、前記入力装置に対して送信する送信手段と、を有し、前記入力装置は、前記情報処理装置が送信した前記追加要求情報を表示手段に表示する表示制御手段と、前記追加要求情報に対して入力された画像情報及び音声情報のうち少なくとも一方の情報を前記情報処理装置に送信する送信手段と、を有する、ことを特徴とする質問応答システムを提供する。 In order to solve the above-described problem, the invention according to claim 1 provides a question answering system having an input device for inputting a predetermined question and an information processing device that responds with an answer to the predetermined question transmitted by the input device. The information processing device has receiving means for receiving question information representing the predetermined question transmitted by the input device, and transmitting means for transmitting, to the input device, either answer information representing an answer to the predetermined question or additional request information representing an additional request associated with the predetermined question in order to generate the answer information. The input device has display control means for displaying, on display means, the additional request information transmitted by the information processing device, and transmitting means for transmitting, to the information processing device, at least one of image information and audio information input in response to the additional request information.

以上説明したように本発明によれば、所定の質問に対する回答を応答できない場合の追加で入力される質問の曖昧性を解消することが可能になるという効果を奏する。 As described above, according to the present invention, it is possible to eliminate the ambiguity of the additionally input question when the answer to the predetermined question cannot be answered.
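The exchange described in claim 1 can be pictured with a minimal Python sketch. It is an illustration only, not the claimed implementation: the class names, the `respond` function, and the rule that an answer is produced only after additional input arrives are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class AnswerInfo:
    """Answer information returned when the question can be answered."""
    text: str


@dataclass
class AdditionalRequestInfo:
    """Additional request displayed to the user when no answer is available yet."""
    prompt: str


@dataclass
class AdditionalInput:
    """Image and/or audio supplied in reply to an additional request."""
    image: Optional[bytes] = None
    audio: Optional[bytes] = None

    def __post_init__(self) -> None:
        # The claim requires at least one of image information and audio information.
        if self.image is None and self.audio is None:
            raise ValueError("at least one of image or audio is required")


Response = Union[AnswerInfo, AdditionalRequestInfo]


def respond(question: str, extra: Optional[AdditionalInput] = None) -> Response:
    """Hypothetical rule: request more input first, answer once it has arrived."""
    if extra is None:
        return AdditionalRequestInfo(
            prompt=f"Please send a photo or a sound recording related to: {question}"
        )
    kinds = [name for name, value in (("image", extra.image), ("audio", extra.audio))
             if value is not None]
    return AnswerInfo(text=f"Answer to '{question}' using {' and '.join(kinds)}")
```

In this sketch the input device would display `AdditionalRequestInfo.prompt` and send back an `AdditionalInput`; a real system would decide between the two response types from the content of the question rather than from the mere presence of extra input.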

質問応答システムの一例を示す全体構成図である。 An overall configuration diagram showing an example of a question answering system.
入力装置及び情報処理装置のハードウエア構成の一例を示す図である。 A diagram showing an example of the hardware configuration of an input device and an information processing device.
質問応答システムの機能構成の一例を示す図である。 A diagram showing an example of the functional configuration of the question answering system.
保守情報管理テーブルの一例を示す概念図である。 A conceptual diagram showing an example of a maintenance information management table.
部品構成管理テーブルの一例を示す概念図である。 A conceptual diagram showing an example of a component configuration management table.
追加情報管理テーブルの一例を示す概念図である。 A conceptual diagram showing an example of an additional information management table.
質問応答処理の一例を示すシーケンス図である。 A sequence diagram showing an example of question answering processing.
質問の入力を受け付ける質問受付画面の画面表示例である。 A screen display example of a question reception screen that accepts input of a question.
回答情報及び追加情報の生成処理の一例を示すフローチャートである。 A flowchart showing an example of processing for generating answer information and additional information.
応答通知の内容を表示する応答通知画面の画面表示例である。 A screen display example of a response notification screen that displays the content of a response notification.
追加情報の入力を受け付ける追加情報受付画面の画面表示例である。 A screen display example of an additional information reception screen that accepts input of additional information.

以下、図面を用いて、発明を実施するための形態について説明する。なお、図面の説明において同一要素には同一符号を付し、重複する部分があれば、その説明を省略する。 Embodiments for carrying out the invention will be described below with reference to the drawings. In the description of the drawings, the same elements are denoted by the same reference numerals, and redundant descriptions are omitted.

〔実施形態〕
図１乃至図９を用いて、本実施形態について説明する。 [Embodiment]
This embodiment will be described with reference to FIGS. 1 to 9.

〔質問応答システム１の全体構成〕
図１は、本実施形態に係る質問応答システムの一例を示す全体構成図である。図１に示す質問応答システム１は、情報処理装置３によって生成された回答を、情報処理装置３に対して質問を与えた入力装置２に応答する一例である。質問応答システム１は、例えば、入力装置２によって入力された文字等のテキストデータに基づいて生成された質問に対して、情報処理装置３が自然言語処理のタスクのひとつである回答生成を行うシステムである。図１に示されているように、質問応答システム１は、質問を入力する利用者が使用する入力装置２、入力装置２と通信し、入力装置２が送信した質問の内容を処理して入力装置２に対して回答を応答する情報処理装置３を有している。また、入力装置２と情報処理装置３は、それぞれ通信ネットワーク１００を介して互いに接続されている。ここで、通信ネットワーク１００は、インターネット、移動体通信網、ＬＡＮ(Local Area Network)等によって構築されている。なお、通信ネットワーク１００には、有線通信だけでなく、３Ｇ(3rd Generation)、４Ｇ(4th Generation)、５Ｇ(5th Generation)、ＷｉＭＡＸ(Worldwide Interoperability for Microwave Access)又はＬＴＥ(Long Term Evolution)等の無線通信によるネットワークが含まれてもよい。 [Overall configuration of question answering system 1]
FIG. 1 is an overall configuration diagram showing an example of a question answering system according to this embodiment. The question answering system 1 shown in FIG. 1 is an example in which an answer generated by an information processing device 3 is returned to the input device 2 that submitted the question to the information processing device 3. The question answering system 1 is, for example, a system in which the information processing device 3 performs answer generation, which is one of the tasks of natural language processing, for a question generated from text data such as characters input through the input device 2. As shown in FIG. 1, the question answering system 1 has an input device 2 used by a user who inputs a question, and an information processing device 3 that communicates with the input device 2, processes the content of the question transmitted by the input device 2, and returns an answer to the input device 2. The input device 2 and the information processing device 3 are connected to each other via a communication network 100. Here, the communication network 100 is constructed from the Internet, a mobile communication network, a LAN (Local Area Network), or the like. The communication network 100 may include not only wired networks but also wireless networks such as 3G (3rd Generation), 4G (4th Generation), 5G (5th Generation), WiMAX (Worldwide Interoperability for Microwave Access), or LTE (Long Term Evolution).

＜入力装置＞
入力装置２は、一般的なＯＳが搭載された情報処理装置（コンピュータシステム）によって実現される。入力装置２は、入力されたテキスト情報を、通信ネットワーク１００を介して情報処理装置３に送信する。さらに、入力装置２は、情報処理装置３が送信したテキスト情報を受信し、テキスト情報を音声(音)情報に変換してスピーカを通して外部に出力する。さらに、入力装置２は、受信したテキスト情報を表示手段に表示させるための画面情報に変換して、表示手段に表示させる。また、入力装置２は、マイクを通して得られた人間の発する音声(自然言語)又は機械が発生する音声等の音声(音)情報を入力し、音声(音)情報をテキスト情報に変換して、通信ネットワーク１００を介して情報処理装置３に送信することも可能である。なお、入力装置２は、例えば、スマートフォン、タブレット端末、ＰＤＡ(Personal Digital Assistant)、ウエアラブルＰＣ（サングラス型、腕時計型等）の通信機能を有する通信端末であってもよい。さらに、入力装置２は、一般的なＰＣ(Personal Computer)であってもよい。つまり、入力装置２は、ブラウザソフトウエア等のソフトウエアを動作させることが可能な端末が用いられてもよい。 <Input device>
The input device 2 is realized by an information processing device (computer system) on which a general OS is installed. The input device 2 transmits the input text information to the information processing device 3 via the communication network 100 . Further, the input device 2 receives the text information transmitted by the information processing device 3, converts the text information into voice (sound) information, and outputs the voice information to the outside through a speaker. Further, the input device 2 converts the received text information into screen information to be displayed on the display means, and causes the display means to display the screen information. Also, the input device 2 inputs speech (sound) information such as human speech (natural language) or machine-generated speech obtained through a microphone, converts the speech (sound) information into text information, It is also possible to transmit to the information processing device 3 via the communication network 100 . Note that the input device 2 may be, for example, a communication terminal having a communication function such as a smart phone, a tablet terminal, a PDA (Personal Digital Assistant), or a wearable PC (sunglass type, wristwatch type, etc.). Furthermore, the input device 2 may be a general PC (Personal Computer). That is, the input device 2 may be a terminal capable of operating software such as browser software.

＜情報処理装置＞
情報処理装置３は、一般的なＯＳが搭載されサーバ機能を備えた情報処理装置（コンピュータシステム）によって実現される。情報処理装置３は、入力装置２と通信ネットワーク１００を介して通信し、入力装置２が送信した質問に係るデータを、機械学習(以下、機械学習の一例である深層学習という用語を用いる)による処理を行い、質問に係る回答(応答)情報及び追加の質問を受け付けるための追加要求情報を生成して、入力装置２に送信する。なお、情報処理装置３は、上述した知識源３００を生成可能な構成を有するものであれば、一般的なＰＣであってもよい。 <Information processing device>
The information processing device 3 is realized by an information processing device (computer system) on which a general OS is installed and which has a server function. The information processing device 3 communicates with the input device 2 via the communication network 100, processes the data relating to the question transmitted by the input device 2 by machine learning (hereinafter, the term "deep learning", which is one example of machine learning, is used), generates answer (response) information for the question and additional request information for accepting an additional question, and transmits them to the input device 2. Note that the information processing device 3 may be a general PC as long as it has a configuration capable of generating the knowledge source 300 described above.

●用語について●
本実施形態において用いられる「質問応答」（以下、単に質問応答と表記する）とは、入力された質問を、深層学習を用いて読み解き、欲しい情報をピンポイントで質問をした利用者(質問者)に提供することである。 ●Terms●
The “question answering” used in the present embodiment (hereinafter simply referred to as question answering) means reading and understanding an input question using deep learning and providing exactly the desired information to the user (questioner) who asked the question.

〔ハードウエア構成〕
＜入力装置及び情報処理装置のハードウエア構成＞
続いて、実施形態に係る各装置のハードウエア構成について説明する。図２は、本実施形態に係る入力装置及び情報処理装置のハードウエア構成の一例を示す図である。入力装置２は、図２に示されているようにＣＰＵ２０１、ＲＯＭ２０２、ＲＡＭ２０３、ＨＤ２０４、ＨＤＤコントローラ２０５、ディスプレイ２０６、外部機器接続Ｉ／Ｆ２０８、ネットワークＩ／Ｆ２０９、キーボード２１１、ポインティングデバイス２１２、ＤＶＤ－ＲＷ(Digital Versatile Disk-Rewritable)ドライブ２１４、メディアＩ／Ｆ２１６、マイク２１８、スピーカ２１９、音入出力Ｉ／Ｆ２１７、カメラ２２０、撮像素子Ｉ／Ｆ２２１及びバスライン２１０を含むハードウエア資源を備えている。 [Hardware configuration]
<Hardware Configuration of Input Device and Information Processing Device>
Next, the hardware configuration of each device according to the embodiment will be described. FIG. 2 is a diagram showing an example of the hardware configuration of the input device and the information processing device according to this embodiment. As shown in FIG. 2, the input device 2 has hardware resources including a CPU 201, a ROM 202, a RAM 203, an HD 204, an HDD controller 205, a display 206, an external device connection I/F 208, a network I/F 209, a keyboard 211, a pointing device 212, a DVD-RW (Digital Versatile Disk-Rewritable) drive 214, a media I/F 216, a microphone 218, a speaker 219, a sound input/output I/F 217, a camera 220, an imaging element I/F 221, and a bus line 210.

これらのうち、ＣＰＵ２０１は、入力装置２全体の動作を制御する。ＲＯＭ２０２は、ＣＰＵ２０１の駆動に用いられるプログラムを記憶する。ＲＡＭ２０３は、ＣＰＵ２０１のワークエリアとして使用される。ＨＤ２０４は、プログラム等の各種データを記憶する。ＨＤＤコントローラ２０５は、ＣＰＵ２０１の制御にしたがってＨＤ２０４に対する各種データの読出し又は書込みを制御する。ディスプレイ２０６は、カーソル、メニュー、ウィンドウ、文字、テンキー、実行キー又は画像等の各種情報を表示する。ディスプレイ２０６は、表示手段の一例である。外部機器接続Ｉ／Ｆ２０８は、各種の外部機器を接続するためのインターフェイスである。この場合の外部機器は、例えば、ＵＳＢメモリ又はＵＳＢ機器である。バスライン２１０は、ＣＰＵ２０１等の各構成要素を電気的に接続するためのアドレスバス又はデータバス等である。 Among these, the CPU 201 controls the operation of the entire input device 2 . The ROM 202 stores programs used to drive the CPU 201 . A RAM 203 is used as a work area for the CPU 201 . The HD 204 stores various data such as programs. The HDD controller 205 controls reading or writing of various data to/from the HD 204 under the control of the CPU 201 . A display 206 displays various information such as cursors, menus, windows, characters, numeric keys, execution keys, and images. The display 206 is an example of display means. The external device connection I/F 208 is an interface for connecting various external devices. The external device in this case is, for example, a USB memory or a USB device. A bus line 210 is an address bus, a data bus, or the like for electrically connecting components such as the CPU 201 .

また、ネットワークＩ／Ｆ２０９は、通信ネットワーク１００を利用してデータ通信をするためのインターフェイスである。キーボード２１１は、文字、数値、各種指示等の入力のための複数のキーを備えた入力手段の一種である。ポインティングデバイス２１２は、各種指示の選択もしくは実行、処理対象の選択、又はカーソルの移動等を行う入力手段の一種である。なお、入力手段は、キーボード２１１及びポインティングデバイス２１２のみならず、タッチパネル又は音声入力装置等であってもよい。ＤＶＤ－ＲＷドライブ２１４は、着脱可能な記録媒体の一例としてのＤＶＤ－ＲＷ２１３に対する各種データの読出し又は書込み(記憶)を制御する。なお、ＤＶＤ－ＲＷに限らず、ＤＶＤ－Ｒ又はＢｌｕ-ｒａｙ(登録商標) Ｄｉｓｃ(ブルーレイディスク)等であってもよい。メディアＩ／Ｆ２１６は、フラッシュメモリ等の記録メディア２１５に対するデータの読出し又は書込みを制御する。マイク２１８は、音声や周囲の音(音信号)を入力する音声入力手段の一種である。スピーカ２１９は、入力された音信号を変換して得られた出力用音信号を出力する音声出力手段の一種である。音入出力Ｉ／Ｆ２１７は、ＣＰＵ２０１の制御に従ってマイク２１８及びスピーカ２１９との間で音信号の入出力を処理する回路である。カメラ２２０は、ＣＰＵ２０１の制御に従って被写体を撮像して画像データを得る内蔵型の撮像手段の一種である。撮像素子Ｉ／Ｆ２２１は、カメラ２２０の駆動を制御する回路である。 A network I/F 209 is an interface for data communication using the communication network 100 . The keyboard 211 is a type of input means having a plurality of keys for inputting characters, numerical values, various instructions, and the like. The pointing device 212 is a type of input means for selecting or executing various instructions, selecting a processing target, moving a cursor, or the like. Input means may be not only the keyboard 211 and the pointing device 212 but also a touch panel, a voice input device, or the like. A DVD-RW drive 214 controls reading or writing (storage) of various data to a DVD-RW 213 as an example of a removable recording medium. It should be noted that not only DVD-RW but also DVD-R or Blu-ray (registered trademark) Disc (Blu-ray Disc) may be used. A media I/F 216 controls reading or writing of data to a recording medium 215 such as a flash memory. The microphone 218 is a kind of voice input means for inputting voice and ambient sounds (sound signals). The speaker 219 is a kind of audio output means for outputting an output sound signal obtained by converting an input sound signal. A sound input/output I/F 217 is a circuit for processing input/output of sound signals between the microphone 218 and the speaker 219 under the control of the CPU 201 . 
The camera 220 is a type of built-in image capturing means that captures an image of a subject under the control of the CPU 201 and obtains image data. The imaging device I/F 221 is a circuit that controls driving of the camera 220 .

情報処理装置３は、コンピュータによって構築されており、図２に示されているようにＣＰＵ３０１、ＲＯＭ３０２、ＲＡＭ３０３、ＨＤ(Hard Disk)３０４、ＨＤＤコントローラ３０５、ディスプレイ３０６、外部機器接続Ｉ／Ｆ３０８、ネットワークＩ／Ｆ３０９キーボード３１１、ポインティングデバイス３１２、メディアＩ／Ｆ３１６及びバスライン３１０を含むハードウエア資源を備えている。 The information processing device 3 is constructed by a computer and, as shown in FIG. 2, has hardware resources including a CPU 301, a ROM 302, a RAM 303, an HD (Hard Disk) 304, an HDD controller 305, a display 306, an external device connection I/F 308, a network I/F 309, a keyboard 311, a pointing device 312, a media I/F 316, and a bus line 310.

これらのうち、ＣＰＵ３０１－ポインティングデバイス３１２は、入力装置２のＣＰＵ２０１－ポインティングデバイス２１２までの各ハードウエア資源と同様の構成を有するため、詳細の説明を省略する。メディアＩ／Ｆ３１６は、フラッシュメモリ等の記録メディア３１５に対するデータの読出し又は書込み(記憶)を制御する。ここで、情報処理装置３が一般的なＰＣである場合、入力装置２が備えるＤＶＤ－ＲＷドライブ２１４、マイク２１８、スピーカ２１９、音入出力Ｉ／Ｆ２１７、カメラ２２０、及び撮像素子Ｉ／Ｆ２２１に相当する各種ハードウエア資源を備えていてもよい。 Of these, the CPU 301 through the pointing device 312 have the same configuration as the corresponding hardware resources of the input device 2, from the CPU 201 through the pointing device 212, so a detailed description is omitted. The media I/F 316 controls reading or writing (storage) of data to a recording medium 315 such as a flash memory. Here, when the information processing device 3 is a general PC, it may have hardware resources corresponding to the DVD-RW drive 214, the microphone 218, the speaker 219, the sound input/output I/F 217, the camera 220, and the imaging element I/F 221 of the input device 2.

なお、図２に示されたコンピュータは一例であって、ＨＵＤ(Head Up Display)装置、産業機械、ネットワーク家電、携帯電話、スマートフォン、タブレット端末、ゲーム機、ＰＤＡ(Personal Digital Assistant)等であってもよい。 The computer shown in FIG. 2 is an example, and may instead be a HUD (Head Up Display) device, an industrial machine, a network appliance, a mobile phone, a smartphone, a tablet terminal, a game machine, a PDA (Personal Digital Assistant), or the like.

さらに、上記各プログラムは、インストール可能な形式又は実行可能な形式のファイルで、コンピュータで読み取り可能な記録媒体に記録、又はネットワークを介してダウンロードを行い流通させるようにしてもよい。記録媒体の例として、ＣＤ－Ｒ(Compact Disc Recordable)、ＤＶＤ、Ｂｌｕ-ｒａｙ(登録商標) Ｄｉｓｃ、ＳＤカード、ＵＳＢメモリ等が挙げられる。また、記録媒体は、プログラム製品(Program Product)として、国内又は国外へ提供されることができる。例えば、情報処理装置３は、本発明に係るプログラムが実行されることで本発明に係る情報処理方法を実現する。 Furthermore, each of the above programs may be recorded in a computer-readable recording medium in an installable format or executable format file, or may be downloaded via a network and distributed. Examples of recording media include CD-R (Compact Disc Recordable), DVD, Blu-ray (registered trademark) Disc, SD card, and USB memory. Also, the recording medium can be provided domestically or internationally as a program product. For example, the information processing device 3 implements the information processing method according to the present invention by executing the program according to the present invention.

〔質問応答システムの機能構成〕
次に、図３及び図４を用いて、実施形態に係る各装置の機能構成について説明する。図３は、本実施形態に係る質問応答システムの機能構成の一例を示す図である。 [Functional configuration of question answering system]
Next, the functional configuration of each device according to the embodiment will be described with reference to FIGS. 3 and 4. FIG. 3 is a diagram showing an example of the functional configuration of the question answering system according to this embodiment.

＜入力装置の機能構成＞
図３に示されているように、入力装置２は、送受信部２１、操作受付部２２、入出力部２３、表示制御部２４、判断部２５、変換生成部２７及び記憶読出部２９を有する。これら各部は、図２に示された各ハードウエア資源のいずれかが、ＲＯＭ２０２及びＨＤ２０４のうち少なくとも一方からＲＡＭ２０３上に展開された入力装置２用のプログラムに従ったＣＰＵ２０１からの命令によって動作することで実現される機能又は手段である。また、入力装置２は、図２に示されているＲＯＭ２０２及びＨＤ２０４のうち少なくとも一方によって構築される記憶部２０００を有している。また、記憶部２０００には、入力装置２が実行する入力処理プログラムが記憶されている。さらに、記憶部２０００には、情報処理装置３と通信するための通信アプリが記憶、管理されている。 <Functional configuration of input device>
As shown in FIG. 3, the input device 2 has a transmission/reception unit 21, an operation reception unit 22, an input/output unit 23, a display control unit 24, a determination unit 25, a conversion generation unit 27, and a memory reading unit 29. Each of these units is a function or means realized when one of the hardware resources shown in FIG. 2 operates according to instructions from the CPU 201 in accordance with the program for the input device 2 loaded onto the RAM 203 from at least one of the ROM 202 and the HD 204. The input device 2 also has a storage unit 2000 constructed by at least one of the ROM 202 and the HD 204 shown in FIG. 2. The storage unit 2000 stores an input processing program executed by the input device 2. Further, the storage unit 2000 stores and manages a communication application for communicating with the information processing device 3.

<<入力装置の各機能構成>>
次に、入力装置２の各機能構成について詳細に説明する。図３に示されている入力装置２の送受信部２１は、主に、図２に示されている外部機器接続Ｉ／Ｆ２０８及びネットワークＩ／Ｆ２０９に対するＣＰＵ２０１の処理によって実現され、通信ネットワーク１００を介して、情報処理装置３と各種データ(又は情報)の送受信を行う。本実施形態において、送受信部２１は、送信手段及び受信手段の一例として機能する。 <<Each functional configuration of the input device>>
Next, each functional configuration of the input device 2 will be described in detail. The transmission/reception unit 21 of the input device 2 shown in FIG. 3 is mainly realized by the processing of the CPU 201 for the external device connection I/F 208 and the network I/F 209 shown in FIG. 2, and transmits and receives various data (or information) to and from the information processing device 3 via the communication network 100. In this embodiment, the transmission/reception unit 21 functions as an example of transmitting means and receiving means.

操作受付部２２は、主に、図２に示されているキーボード２１１及びポインティングデバイス２１２が受け付けた各種操作により生成された信号をＣＰＵ２０１が処理することによって実現され、利用者から各種の操作、及び選択の入力を受け付ける。また、キーボード２１１及びポインティングデバイス２１２だけでなく、他の入力手段として操作ボタン(押下、又はタップ可能なＵＩ(User Interface)を持つもの)等が用いられてもよい。本実施形態において、操作受付部２２は、操作受付手段の一例として機能する。 The operation reception unit 22 is mainly realized by the CPU 201 processing signals generated by the various operations received by the keyboard 211 and the pointing device 212 shown in FIG. 2, and accepts various operations and selection inputs from the user. In addition to the keyboard 211 and the pointing device 212, operation buttons (those having a UI (User Interface) that can be pressed or tapped) or the like may be used as other input means. In this embodiment, the operation reception unit 22 functions as an example of operation receiving means.

入出力部２３は、主に、図２に示されているマイク２１８、スピーカ２１９及び音入出力Ｉ／Ｆ２１７が入出力した音(音声)に係る音信号、並びに、カメラ２２０及び撮像素子Ｉ／Ｆ２２１が入出力した画像(動画、静止画)に係る画像信号をＣＰＵ２０１が処理することによって実現される。入出力部２３は、入力装置２を利用する利用者により発話された音声又は機械により発せられた音の入力を行う。さらに入出力部２３は、主に、図２に示されているスピーカ２１９及び音入出力Ｉ／Ｆ２１７に対するＣＰＵ２０１の処理によって実現され、情報処理装置３が送信した応答に係る情報を音声信号に変換し、スピーカ２１９を介して音声を出力する。本実施形態において、入出力部２３は、入出力手段の一例として機能する。さらに、本実施形態において、上述した操作受付部２２及び入出力部２３は、入力手段の一例として機能する。 The input/output unit 23 is mainly realized by the CPU 201 processing sound signals relating to sound (voice) input and output by the microphone 218, the speaker 219, and the sound input/output I/F 217 shown in FIG. 2, and image signals relating to images (moving images, still images) input and output by the camera 220 and the imaging element I/F 221. The input/output unit 23 inputs voice uttered by the user of the input device 2 or sound emitted by a machine. Further, the input/output unit 23 is mainly realized by the processing of the CPU 201 for the speaker 219 and the sound input/output I/F 217 shown in FIG. 2, converts information relating to the response transmitted by the information processing device 3 into an audio signal, and outputs audio through the speaker 219. In the present embodiment, the input/output unit 23 functions as an example of input/output means. Furthermore, in the present embodiment, the operation reception unit 22 and the input/output unit 23 described above function as an example of input means.

表示制御部２４は、主に、図２に示されているディスプレイ２０６に対するＣＰＵ２０１の処理によって実現され、ディスプレイ２０６に各種画像、文字、コード情報等を表示させるための制御を行う。本実施形態において、表示制御部２４は、表示制御手段の一例として機能する。なお、本実施形態において、表示制御部２４が表示する各種情報は、入力装置のディスプレイ２０６に限らず、入力装置２と通信可能な他の装置の表示部等に表示されてもよい。 The display control unit 24 is mainly realized by the processing of the CPU 201 for the display 206 shown in FIG. 2, and performs control for displaying various images, characters, code information, and the like on the display 206. In the present embodiment, the display control unit 24 functions as an example of display control means. In the present embodiment, the various information displayed by the display control unit 24 may be displayed not only on the display 206 of the input device but also on a display unit or the like of another device that can communicate with the input device 2.

判断部２５は、主に、図に示されているＣＰＵ２０１の処理によって実現され、入力装置２における各種判断を行う。本実施形態において、判断部２５は、判断手段の一例として機能する。 The determination unit 25 is mainly realized by processing of the CPU 201 shown in the drawing, and performs various determinations in the input device 2 . In this embodiment, the determination unit 25 functions as an example of determination means.

変換生成部２７は、主に、図に示されているＣＰＵ２０１の処理によって実現され、入出力部２３で入力された自然言語等による音情報をテキスト(文字)情報に変換し、質問文を生成する。本実施形態において、変換生成部２７は、変換生成手段の一例として機能する。 The conversion generation unit 27 is mainly realized by the processing of the CPU 201 shown in the figure, converts sound information in natural language or the like input through the input/output unit 23 into text (character) information, and generates a question sentence. In this embodiment, the conversion generation unit 27 functions as an example of conversion generation means.

記憶読出部２９は、主に、図２に示されているＲＯＭ２０２及びＨＤ２０４のうち少なくとも一つに対するＣＰＵ２０１の処理によって実現され、記憶部２０００に各種データ(又は情報)を記憶したり、記憶部２０００から各種データ(又は情報)を読み出したりする。本実施形態において、記憶読出部２９は、記憶読出手段の一例として機能する。 The memory reading unit 29 is mainly realized by the processing of the CPU 201 for at least one of the ROM 202 and the HD 204 shown in FIG. 2, stores various data (or information) in the storage unit 2000, and reads various data (or information) from the storage unit 2000. In the present embodiment, the memory reading unit 29 functions as an example of memory reading means.

＜情報処理装置の機能構成＞
図３に示されているように、情報処理装置３は、送受信部３１、取得部３２、解析算出部３３、判断部３５、回答生成部３７及び記憶読出部３９を有する。これら各機能部は、図３に示された各ハードウエア資源のいずれかが、ＲＯＭ３０２及びＨＤ３０４うち少なくとも一方からＲＡＭ３０３に展開された情報処理装置３用のプログラムに従ったＣＰＵ３０１からの命令により動作することで実現される機能又は手段である。また、情報処理装置３は、図２に示されているＲＯＭ３０２及びＨＤ３０４のうち少なくとも一方によって構築される記憶部３０００を有している。また、記憶部３０００には、情報処理装置３が実行する情報処理プログラムが記憶されている。さらに、記憶部３０００には、入力装置２と通信するための通信アプリが記憶、管理されている。 <Functional Configuration of Information Processing Device>
As shown in FIG. 3, the information processing device 3 has a transmission/reception unit 31, an acquisition unit 32, an analysis calculation unit 33, a determination unit 35, an answer generation unit 37, and a memory reading unit 39. Each of these functional units is a function or means realized when one of the hardware resources shown in FIG. 3 operates according to instructions from the CPU 301 in accordance with the program for the information processing device 3 loaded onto the RAM 303 from at least one of the ROM 302 and the HD 304. The information processing device 3 also has a storage unit 3000 constructed by at least one of the ROM 302 and the HD 304 shown in FIG. 2. The storage unit 3000 stores an information processing program executed by the information processing device 3. Furthermore, the storage unit 3000 stores and manages a communication application for communicating with the input device 2.

●保守情報管理テーブル●
図４Ａは、保守情報管理テーブルの一例を示す概念図である。記憶部３０００には、図４Ａに示されているような保守情報管理テーブルによって構成された保守情報管理ＤＢ３００１が構築されている。保守情報管理テーブルでは、事象識別情報ごとに、発生した事象及び対処方法の各項目が関連付けられて記憶、管理されている。これらのうち、事象識別情報は、発生した事象を識別するための識別情報を表し、例えば、MID001、MID002等で与えられ、管理される。 ●Maintenance information management table●
FIG. 4A is a conceptual diagram showing an example of a maintenance information management table. A maintenance information management DB 3001 configured by a maintenance information management table as shown in FIG. 4A is constructed in the storage unit 3000 . In the maintenance information management table, each item of an event that has occurred and a coping method are associated with each event identification information, and stored and managed. Of these, the event identification information represents identification information for identifying an event that has occurred, and is given and managed as MID001, MID002, etc., for example.

発生した事象は、保守対象の管理対象装置においてどのような事象が発生したかを予め抽出したものである。ここで、管理対象装置とは、ＭＦＰ(Multifunction Peripheral/Product/Printer)、工作機械等を一例とする、定期的な保守(メンテナンス)を必要とする装置、機械の総称をいう。発生した事象としては、例えば、「部品Ａが破損している」、「部品Ｃで異音がする」、「部品Ｃが変形している」、「部品Ｄで異音がする等の内容で与えられ、管理される。なお、これらの発生した事象は、例えば、管理対象装置を保守する保守担当者によって管理対象装置を管理するコールセンター等から予め入手され、入手した情報に基づいて保守情報管理テーブルに入力された情報である。 The event that occurred is a pre-extracted event that occurred in the managed device to be maintained. Here, the managed device is a general term for devices and machines that require regular maintenance, such as MFP (Multifunction Peripheral/Product/Printer) and machine tools. Events that have occurred include, for example, "part A is damaged," "part C makes an abnormal noise," "part C is deformed," and "part D makes an abnormal noise." These events are obtained in advance from a call center or the like that manages the managed device by a maintenance person who maintains the managed device, and the maintenance information is managed based on the obtained information. This is the information entered into the table.

対処方法は、保守対象の管理対象装置においてどのような対処を行ったかを抽出したものである。対処方法は、例えば、「ユニットＥを交換した」、「取付調整をした」、「部品Ｃを交換した」、「部品Ｄを交換した」等が与えられ、管理される。 The countermeasure method is an extraction of what kind of countermeasure was taken in the managed device to be maintained. For example, "unit E was replaced", "mounting adjustment was made", "part C was replaced", "part D was replaced", etc. are given and managed.
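As a rough illustration of the maintenance information management table, the Python sketch below models each row as an event identifier, an event, and a countermeasure. The row contents reuse the example strings from the description, but the pairing of events with countermeasures, the identifiers MID003 and MID004, and the helper names `index_by_id` and `find_events` are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MaintenanceRecord:
    event_id: str  # event identification information, e.g. "MID001"
    event: str     # the event that occurred
    action: str    # the countermeasure that was taken


# Illustrative rows; which countermeasure belongs to which event is assumed.
RECORDS: List[MaintenanceRecord] = [
    MaintenanceRecord("MID001", "part A is damaged", "unit E was replaced"),
    MaintenanceRecord("MID002", "part C makes an abnormal noise", "mounting adjustment was made"),
    MaintenanceRecord("MID003", "part C is deformed", "part C was replaced"),
    MaintenanceRecord("MID004", "part D makes an abnormal noise", "part D was replaced"),
]


def index_by_id(records: List[MaintenanceRecord]) -> Dict[str, MaintenanceRecord]:
    """Build a lookup keyed by event identification information."""
    return {record.event_id: record for record in records}


def find_events(records: List[MaintenanceRecord], keyword: str) -> List[MaintenanceRecord]:
    """Return the rows whose event text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [record for record in records if needle in record.event.lower()]
```

A plain substring match stands in for the deep-learning-based analysis that the description attributes to the information processing device 3; it is only meant to show how the rows might be queried.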

●部品情報管理テーブル●
図４Ｂは、部品情報管理テーブルの一例を示す概念図である。記憶部３０００には、図４Ｂに示されているような部品情報管理テーブルによって構成された部品情報管理ＤＢ３００２が構築されている。この部品情報管理テーブルでは、構成識別情報ごとに、第１階層及び第２階層が関連付けられて記憶、管理されている。これらのうち、構成識別情報は、管理対象装置を構成する部品、ユニット等の各階層別の構成の対象を識別するための識別情報を表し、例えば、SID001、SID002等で与えられ、管理される。 ●Component information management table●
FIG. 4B is a conceptual diagram showing an example of a parts information management table. In the storage unit 3000, a parts information management DB 3002 configured by a parts information management table as shown in FIG. 4B is constructed. In this parts information management table, a first level and a second level are stored and managed in association with each piece of configuration identification information. Of these, the configuration identification information is identification information for identifying the configuration targets at each level, such as the parts and units that make up the managed device, and is given and managed as, for example, SID001, SID002, and so on.

第１階層は、管理対象装置を構成する部品、ユニット等が写真、サービスマニュアル等の文書内画像、ＣＡＤデータで表される場合に最上位に位置するもので、例えば、「ユニットＥ」等で与えられ、管理される。第２階層は、第１階層のユニットＥを構成する一つ下位の階層に位置するもので、例えば、「部品Ａ」、「部品Ｄ」等で与えられ、管理される。なお、製品構成管理テーブルの構成要素は、第２階層までではなく、第１階層以下であってもよいし、第３階層以上あってもよい。 The first level is the highest level when the parts, units, and the like that make up the managed device are represented by photographs, images in documents such as service manuals, or CAD data, and is given and managed as, for example, "unit E". The second level is located one level below and makes up unit E of the first level, and is given and managed as, for example, "part A" and "part D". Note that the components of the product configuration management table need not extend exactly to the second level; the table may have only the first level, or may have three or more levels.
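The two-level structure described above can be sketched in Python as rows of (configuration identifier, first level, second level). This is only an assumed representation: the row assignment of SID001 and SID002 and the helper names `children_by_unit` and `parent_unit` are introduced here for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (configuration identification information, first level, second level)
Row = Tuple[str, str, str]

# Illustrative rows; which identifier maps to which part is assumed.
ROWS: List[Row] = [
    ("SID001", "unit E", "part A"),
    ("SID002", "unit E", "part D"),
]


def children_by_unit(rows: List[Row]) -> Dict[str, List[str]]:
    """Group second-level parts under their first-level unit."""
    tree: Dict[str, List[str]] = defaultdict(list)
    for _sid, first, second in rows:
        tree[first].append(second)
    return dict(tree)


def parent_unit(rows: List[Row], part: str) -> Optional[str]:
    """Return the first-level unit that contains the given part, if any."""
    for _sid, first, second in rows:
        if second == part:
            return first
    return None
```

A table with three or more levels would need a longer tuple or a recursive structure; the flat rows here cover only the two-level example.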

●追加情報管理テーブル●
図４Ｃは、追加情報管理テーブルの一例を示す概念図である。記憶部３０００には、図４Ｃに示されているような追加情報管理テーブルによって構成された追加情報管理ＤＢ３００３が構築されている。この追加情報管理テーブルでは、事象識別情報ごとに、発生事象画像情報及び発生事象音声情報が関連付けられて記憶、管理されている。 ●Additional information management table●
FIG. 4C is a conceptual diagram showing an example of an additional information management table. In the storage unit 3000, an additional information management DB 3003 configured by an additional information management table as shown in FIG. 4C is constructed. In this additional information management table, event image information and event sound information are stored and managed in association with each event identification information.

これらのうち、発生事象画像情報は、事象が発生した箇所又は周辺の画像(静止画及び動画のうち少なくとも一方)を表す情報であり、入力装置２が送信した「.jpg」、「.gif」、「.png」、「.tif」、「.bmp」、「.mp4」、「.avi」、「.mov」等のファイルフォーマットで表される各種画像(静止画、動画)データが用いられる。 Of these, the event image information is information representing an image (at least one of a still image and a moving image) of the location where the event occurred or its surroundings. Various image (still image or moving image) data transmitted by the input device 2 in file formats such as ".jpg", ".gif", ".png", ".tif", ".bmp", ".mp4", ".avi", and ".mov" are used.

発生事象音声情報は、事象が発生した箇所又は周辺の音声(音)を表す情報であり、入力装置２が送信した「.wav」、「.mp3」、「.dct」等のファイルフォーマットで表される各種音声(音)データが用いられる。 The event sound information is information representing audio (sound) at or around the location where the event occurred. Various audio (sound) data transmitted by the input device 2 in file formats such as ".wav", ".mp3", and ".dct" are used.
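The file formats listed above suggest one simple way a receiving side might decide whether an uploaded file is event image information or event sound information. The sketch below classifies by file extension only; the function name `classify_modality` and the "unknown" fallback are assumptions, and the specification does not state how the modality is actually determined.

```python
from pathlib import Path

# Extensions named in the description; the lists end with "etc." there, so they are not exhaustive.
IMAGE_EXTENSIONS = {".jpg", ".gif", ".png", ".tif", ".bmp", ".mp4", ".avi", ".mov"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".dct"}


def classify_modality(filename: str) -> str:
    """Return "image", "audio", or "unknown" based on the file extension alone."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"   # still images and moving images are both treated as image information
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return "unknown"
```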

なお、上述した保守情報管理テーブル（保守情報管理ＤＢ３００１）、部品情報管理テーブル（部品情報管理ＤＢ３００２）、及び追加情報管理テーブル（追加情報管理ＤＢ３００３）は、テーブルデータとしてではなく、記憶部３０００の各所定領域でそれぞれ管理されるデータであってもよい。 Note that the above-described maintenance information management table (maintenance information management DB 3001), parts information management table (parts information management DB 3002), and additional information management table (additional information management DB 3003) are stored in storage unit 3000, not as table data. The data may be data managed in respective predetermined areas.

<<情報処理装置の各機能構成>>
次に、情報処理装置３の各機能構成について詳細に説明する。図３に示されている情報処理装置３の送受信部３１は、主に、図２に示されている外部機器接続Ｉ／Ｆ３０８及びネットワークＩ／Ｆ３０９に対するＣＰＵ３０１の処理によって実現され、通信ネットワーク１００を介して、入力装置２との間で各種データ(又は情報)の送受信を行う。また送受信部３１は、利用者からの質問を自然文で受信して情報処理装置３に入力する。インターフェイスとしてはチャットボットやWebブラウザ上におけるテキスト入力ボックスなどがあるが特に形態は問わない。さらに、送受信部３１は、利用者により入力された質問に対する回答に加え、発生した事象の要因となった画像情報も併せて入力装置２に対して送信する。なお、利用者からの質問には、テキスト情報等の自然文のほかに、機械が出力する音声による質問が含まれていてもよい。本実施形態において、送受信部３１は、送信手段及び受信手段の一例として機能するの一例として機能する。 <<Each functional configuration of the information processing device>>
Next, each functional configuration of the information processing device 3 will be described in detail. The transmission/reception unit 31 of the information processing device 3 shown in FIG. 3 is implemented mainly by processing of the CPU 301 on the external device connection I/F 308 and the network I/F 309 shown in FIG. 2, and transmits and receives various data (or information) to and from the input device 2 via the communication network 100. The transmission/reception unit 31 also receives a question from the user as a natural-language sentence and inputs it to the information processing device 3. The interface may be, for example, a chatbot or a text input box in a web browser, but its form is not limited. Further, in addition to the answer to the question input by the user, the transmission/reception unit 31 transmits to the input device 2 the image information relating to the cause of the event that occurred. The question from the user may include, in addition to natural-language sentences such as text information, a question given as machine-output speech. In this embodiment, the transmission/reception unit 31 functions as an example of transmitting means and receiving means.

取得部３２は、主に、図２に示されているＣＰＵ３０１の処理によって実現され、入力装置２が送信した所定の質問に係る質問情報、追加要求情報に対する追加情報を取得する。本実施形態において、取得部３２は、取得手段の一例として機能する。 The acquisition unit 32 is mainly implemented by the processing of the CPU 301 shown in FIG. 2, and acquires additional information for additional request information and question information relating to a predetermined question transmitted by the input device 2 . In the present embodiment, the acquisition unit 32 functions as an example of acquisition means.

解析算出部３３は、主に、図２に示されているＣＰＵ３０１の処理によって実現され、入力された質問文を解析し、構造化された構造化情報を抽出する。例えば、解析算出部３３は、管理対象装置の部品名、ユニット名等の固有表現及び関係(箇所－動作)を、述語項解析器、深層学習等を用いて解析し、構造化情報を抽出することが可能である。また、解析算出部３３は、クエリを生成するクエリ生成部としての機能も担う。ここでクエリとは、例えば、所定のデータベース(ＤＢ)からデータを抽出、操作するなど、各種処理を行うための命令をいう。さらにクエリは、「問合せ」と訳されることもある。本実施形態において、解析算出部３３は、解析算出手段の一例として機能する。 The analysis calculation unit 33 is implemented mainly by the processing of the CPU 301 shown in FIG. 2, analyzes the input question sentence, and extracts structured information. For example, the analysis calculation unit 33 can analyze named entities such as part names and unit names of the device to be managed, as well as relations (location-action), using a predicate-argument structure analyzer, deep learning, or the like, and extract structured information. The analysis calculation unit 33 also functions as a query generation unit that generates queries. Here, a query is an instruction for performing various kinds of processing, such as extracting data from or manipulating data in a predetermined database (DB). The term is also sometimes translated as "inquiry". In the present embodiment, the analysis calculation unit 33 functions as an example of analysis calculation means.

判断部３５は、主に、図２に示されているＣＰＵ３０１の処理によって実現され、情報処理装置３における各種判断を行う。また、判断部３５は、所定の質問に対する回答候補が複数存在する場合に、入力装置２に対してマルチモーダルの情報を要求する、すなわち回答候補の内容が曖昧であると判断する機能を担う。本実施形態において、判断部３５は、判断手段の一例として機能する。 The determination unit 35 is mainly implemented by the processing of the CPU 301 shown in FIG. 2 and performs various determinations in the information processing device 3 . Further, the judgment unit 35 has a function of requesting multimodal information from the input device 2 when there are multiple answer candidates for a predetermined question, that is, judging that the contents of the answer candidates are ambiguous. In this embodiment, the determination unit 35 functions as an example of determination means.

回答生成部３７は、主に、図２に示されているＣＰＵ３０１の処理によって実現され、入力された質問の内容に対応する回答情報、又は質問に対する回答を絞り込むために必要な追加要求情報を生成する。本実施形態において、回答生成部３７は、生成手段の一例として機能する。 The answer generator 37 is mainly realized by the processing of the CPU 301 shown in FIG. 2, and generates answer information corresponding to the content of the input question or additional request information necessary to narrow down the answers to the question. do. In this embodiment, the answer generator 37 functions as an example of generating means.

記憶読出部３９は、主に、図２に示されているＲＯＭ３０２及びＨＤ３０４のうち少なくとも一つに対するＣＰＵ３０１の処理によって実現され、記憶部３０００に各種データ(又は情報)を記憶したり、記憶部３０００から各種データ(又は情報)を読み出したりする。本実施形態において、記憶読出部３９は、記憶読出手段の一例として機能する。 The memory reading unit 39 is implemented mainly by processing of the CPU 301 on at least one of the ROM 302 and the HD 304 shown in FIG. 2, and stores various data (or information) in the storage unit 3000 and reads various data (or information) from the storage unit 3000. In the present embodiment, the memory reading unit 39 functions as an example of memory reading means.

○マルチモーダル情報○
本実施形態で用いられるマルチモーダル情報について説明する。本実施形態で用いられるマルチモーダルの一形態は、自然言語データ群より抽出された情報、並びに、画像情報及び音声情報が深層学習によって導かれ、互いに関連付けられる状態である。すなわち、自然言語データ群より抽出された情報をテキスト情報と考えると、自然言語データ群より抽出された情報以外の情報として、例えば、画像情報及び音声情報のうち少なくとも一方の情報が用いられることになる。 ○Multimodal information○
Multimodal information used in this embodiment will be described. One form of multimodality used in this embodiment is a state in which information extracted from a natural language data group, image information, and audio information are derived by deep learning and associated with one another. That is, if the information extracted from the natural language data group is regarded as text information, then at least one of image information and audio information, for example, is used as information other than the information extracted from the natural language data group.

また、本実施形態では、テキスト情報を含む自然言語データ群を自然言語情報の一例として扱い、部品情報、画像情報及び音声情報を含むデータ群を自然言語情報以外の非言語情報の一例として扱う。自然言語データ群は、例えば、質問応答システム１で保守、管理される管理対象装置の提供元であるメーカーが保有する自然言語に係るデータである。自然言語情報の一例としての自然言語データ群は、設計仕様書、マニュアル、保守レポート、及び部品情報を含む。一方、非言語情報の一例としてのデータ群は、言語情報以外のデータ群であり、自然言語データ群以外のデータ群が含まれる。自然言語データ群以外のデータ群のうち、例えば、CADデータ、3DCADデータ(製品の機構・構成情報)、部品表等が構造化データであり、CAD画像、ドキュメント内画像等が非構造化データである。 In addition, in this embodiment, a natural language data group including text information is treated as an example of natural language information, and a data group including parts information, image information, and audio information is treated as an example of non-linguistic information other than natural language information. The natural language data group is, for example, natural-language data held by the manufacturer that supplies the managed device maintained and managed by the question answering system 1. The natural language data group, as an example of natural language information, includes design specifications, manuals, maintenance reports, and parts information. The data group given as an example of non-linguistic information, on the other hand, is a data group other than linguistic information and includes data groups other than the natural language data group. Among these, for example, CAD data, 3D CAD data (product mechanism and configuration information), bills of materials, and the like are structured data, while CAD images, images in documents, and the like are unstructured data.

なお、本実施形態では、以降、入力装置２を利用する利用者又は管理対象装置の保守を行う担当者を、「利用者又は保守担当者」と記す。 In the present embodiment, a user who uses the input device 2 or a person in charge of maintenance of the managed device is hereinafter referred to as a "user or maintenance person".

〔実施形態の処理又は動作〕
次に、図５乃至図９を用いて、本実施形態に係る質問応答システムにおける動作及び各処理を説明する。 [Processing or operation of the embodiment]
Next, operations and processes in the question answering system according to this embodiment will be described with reference to FIGS. 5 to 9.

＜質問入力処理＞
図５は、質問応答処理の一例を示すシーケンス図である。まず、利用者又は保守担当者は、入力装置２を用いて、情報処理装置と通信するための通信アプリを起動する。その後、図５に示されているように、入力装置２の入出力部２３は、操作受付部２２と協働して、利用者又は保守担当者により入力された管理対象装置に対する所定の質問を受け付けて入力する（ステップＳ１１）。このとき、利用者又は保守担当者は、管理対象装置で発生した事象の対処法を探索するために、例えば、「△△△□□□という異音への対処方法を教えてほしい」といった発話によって、入力装置２に対して質問を入力する。これにより、入力装置２の入出力部２３は、利用者又は保守担当者による管理対象装置に対する具体的な質問を受け付ける。なお、変換生成部２７は、入出力部２３で入力された情報が自然言語等による音情報であれば、音情報をテキスト(文字)情報に変換し、質問文を生成する。さらに、入力装置２の操作受付部２２は、利用者又は保守担当者によるテキスト入力等によって、入力装置２に対して質問を入力してもよい。その場合、入力装置２の操作受付部２２は、利用者又は保守担当者のテキスト入力による管理対象装置に対する具体的な質問を直接受け付けることができる。 <Question input processing>
FIG. 5 is a sequence diagram showing an example of question answering processing. First, the user or maintenance person uses the input device 2 to start a communication application for communicating with the information processing device. Then, as shown in FIG. 5, the input/output unit 23 of the input device 2, in cooperation with the operation reception unit 22, accepts and inputs a predetermined question about the managed device entered by the user or maintenance person (step S11). At this time, to find a way to deal with an event that has occurred in the managed device, the user or maintenance person inputs a question to the input device 2 by an utterance such as "Please tell me how to deal with the abnormal noise △△△□□□". The input/output unit 23 of the input device 2 thereby accepts a specific question about the managed device from the user or maintenance person. If the information input through the input/output unit 23 is sound information such as natural-language speech, the conversion generation unit 27 converts the sound information into text (character) information and generates a question sentence. Alternatively, the operation reception unit 22 of the input device 2 may accept a question entered into the input device 2 as text by the user or maintenance person. In that case, the operation reception unit 22 can directly accept the specific question about the managed device as text input.

●画面表示例●
図６は、質問の入力を受け付ける質問受付画面の画面表示例である。図６に示されているように、入力装置２の表示制御部２４は、利用者又は保守担当者による質問の準備が整うと、ディスプレイ２０６に以下の内容を含む質問受付画面２１０１を表示させる。 ●Screen display example●
FIG. 6 is a screen display example of a question acceptance screen for accepting input of a question. As shown in FIG. 6, the display control unit 24 of the input device 2 causes the display 206 to display a question acceptance screen 2101 including the following contents when preparations for questions by the user or maintenance personnel are complete.

質問受付画面２１０１には、「質問受付画面」のタイトル、利用者宛に質問の入力を要求する所定のメッセージと質問内容入力欄２１１１、及び「送信」ボタン２１１２が表示される。質問内容入力欄２１１１は、例えば、テキスト情報を入力可能な入力欄であり、利用者又は保守担当者により入力されたテキスト情報が表示される。テキスト情報が入力された後、利用者又は保守担当者によって「送信」ボタンが操作(押下又はタップ等)されることにより、質問内容入力欄２１１１に入力されたテキスト情報が情報処理装置３に送信され、他の画面に遷移することができる。なお、質問受付画面２１０１では、上述したように、音声(音)による質問内容が受け付けられてもよい。その場合は、質問受付画面２１０１に、利用者又は保守担当者がマイクとして用いるマイクボタン、質問を発話した後に操作する停止ボタン等が配置され、それぞれの操作が受け付けられるＵＩ(User Interface)が提供されるようにしてもよい。 On the question reception screen 2101, a title of "question reception screen", a predetermined message requesting the user to input a question, a question content input column 2111, and a "send" button 2112 are displayed. The question content entry field 2111 is an entry field into which text information can be entered, for example, and displays the text information entered by the user or the person in charge of maintenance. After the text information is input, the text information input in the question content input field 2111 is transmitted to the information processing device 3 by operating (pressing, tapping, etc.) the "Send" button by the user or maintenance staff. and you can transition to another screen. As described above, the question acceptance screen 2101 may accept questions by voice (sound). In that case, the question reception screen 2101 is provided with a microphone button used as a microphone by the user or the maintenance staff, a stop button operated after the question is uttered, etc., and a UI (User Interface) is provided to receive each operation. may be made.

＜回答情報及び追加要求情報の生成処理＞
図５に戻り、質問入力を受け付けた入力装置２の送受信部２１は、情報処理装置３に対して、質問要求を送信する（ステップＳ１２）。これにより、情報処理装置３の送受信部３１は、入力装置２が送信した質問要求を受信する。このとき、質問要求には、ステップＳ１１で受け付けた質問内容を表すテキスト情報等が含まれる。 <Generation processing of response information and additional request information>
Returning to FIG. 5, the transmitter/receiver 21 of the input device 2 that has received the question input transmits a question request to the information processing device 3 (step S12). Thereby, the transmitting/receiving unit 31 of the information processing device 3 receives the question request transmitted by the input device 2 . At this time, the question request includes text information or the like representing the content of the question accepted in step S11.

質問要求を受信した情報処理装置３は、回答情報及び追加要求情報の生成処理を行う（ステップＳ１３）。具体的には、回答生成部３７は、後述する各種データベース（ＤＢ）を参照して、入力された質問の内容に対応する回答情報、又は質問に対する回答を絞り込むために必要な追加要求情報を生成する。 The information processing device 3 that has received the question request performs a process of generating answer information and additional request information (step S13). Specifically, the answer generation unit 37 refers to various databases (DB) described later and generates either answer information corresponding to the content of the input question or additional request information needed to narrow down the answer to the question.

＜回答情報及び追加要求情報の生成の詳細＞
図７は、回答情報及び追加情報の生成処理の一例を示すフローチャートである。図７に示されているように、まず、情報処理装置３の取得部３２は、受信した質問要求に含まれる質問情報(質問内容を表すテキスト情報等)を取得する（ステップＳ１３－１）。質問情報は、具体的には「△△△□□□という異音への対処方法を教えてほしい」という内容が含まれるテキスト情報(例えば、テキストメール等の情報)である。 <Details of generating response information and additional request information>
FIG. 7 is a flow chart showing an example of processing for generating answer information and additional information. As shown in FIG. 7, first, the acquisition unit 32 of the information processing device 3 acquires question information (text information representing question content, etc.) included in the received question request (step S13-1). Specifically, the question information is text information (for example, information such as text mail) including the content of "I want you to tell me how to deal with the abnormal noise △△△□□□".

続いて、解析算出部３３は、取得した質問情報をベクトルに変換し、質問に対するクエリを生成する（ステップＳ１３－２）。 Subsequently, the analysis calculation unit 33 converts the acquired question information into a vector and generates a query for the question (step S13-2).

ベクトルへの変換、及び質問に対するクエリの生成後、回答生成部３７は、利用者又は保守担当者により入力された所定の質問に対する回答候補郡を生成する（ステップＳ１３－３）。回答候補郡の生成では、解析算出部３３は、取得した質問情報の内容に基づいて記憶読出部３９と協働して、発生した事象に含まれる「異音」を検索キーとして保守情報管理ＤＢ３００１（図４Ａ参照）を検索することにより、対応する“部品Ｃ”及び“対処方法”：“取付調整”、並びに、“部品Ｄ”及び“対処方法”：“交換”の二つの情報の組合せを読み出す。 After the conversion into a vector and the generation of the query for the question, the answer generation unit 37 generates an answer candidate group for the predetermined question input by the user or maintenance person (step S13-3). In generating the answer candidate group, the analysis calculation unit 33, in cooperation with the memory reading unit 39 and based on the content of the acquired question information, searches the maintenance information management DB 3001 (see FIG. 4A) using "abnormal noise" contained in the event that occurred as a search key. It thereby reads out two corresponding combinations of information: "Part C" with the countermeasure "mounting adjustment", and "Part D" with the countermeasure "replacement".
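The candidate lookup in step S13-3 can be sketched as a keyword search over maintenance records. This is a simplification under stated assumptions: it uses plain substring matching rather than the vector-based query of step S13-2, the table rows other than the two combinations named above are invented for illustration, and `find_candidates` is a hypothetical helper, not a function of the described system.

```python
# Hypothetical stand-in for the maintenance information management table (FIG. 4A).
MAINTENANCE_TABLE = [
    {"event": "abnormal noise", "part": "Part C", "action": "mounting adjustment"},
    {"event": "abnormal noise", "part": "Part D", "action": "replacement"},
    {"event": "paper jam", "part": "Part A", "action": "cleaning"},  # invented row
]


def find_candidates(question: str) -> list[tuple[str, str]]:
    """Return (part, action) pairs whose registered event appears in the question text."""
    text = question.lower()
    return [
        (row["part"], row["action"])
        for row in MAINTENANCE_TABLE
        if row["event"] in text
    ]
```

With the example question about an abnormal noise, this lookup yields two candidates, which is exactly the situation in which the answer is not uniquely determined.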

続いて、判断部３５は、回答生成部３７によって生成された回答候補郡における回答が、所定の質問に対して一意に決定するか否かを判断する（ステップＳ１３－４）。判断部３５によって回答候補群で生成した回答が一意に決定すると判断された場合（ステップＳ１３－４；ＹＥＳ）に、回答生成部３７は、生成した回答候補群の内容に基づいて、質問に対する回答情報(回答文)を生成してこのフローを抜ける（ステップＳ１３－５）。つまり、ステップＳ１３－５では、回答生成部３７は、予め登録された所定の事象、所定の事象に対して対処を行った部品に係る部品情報、及び追加情報を関連付けて、回答情報(回答文)を生成する。 Subsequently, the determination unit 35 determines whether the answer in the answer candidate group generated by the answer generation unit 37 is uniquely determined for the predetermined question (step S13-4). If the determination unit 35 determines that the answer is uniquely determined (step S13-4; YES), the answer generation unit 37 generates answer information (an answer sentence) for the question based on the content of the generated answer candidate group and exits this flow (step S13-5). In other words, in step S13-5, the answer generation unit 37 generates the answer information (answer sentence) by associating the pre-registered predetermined event, the parts information on the part for which the predetermined event was dealt with, and the additional information.

他方、判断部３５によって回答候補群で生成した回答が一意に決定しないと判断された場合（ステップＳ１３－４；ＮＯ）、判断部３５は、マルチモーダル情報を取得済みか否かを判断する（ステップＳ１３－６）。マルチモーダル情報を取得済みであると判断した場合（ステップＳ１３－６；ＹＥＳ）、解析算出部３３は、取得済みのマルチモーダル情報をベクトルに変換後、類似度による絞込みを行い、上述したステップＳ１３－５の処理に遷移する（ステップＳ１３－７）。具体的には、解析算出部３３は、発生した事象及び追加情報との間の類似度を算出して、絞込みを行う。 On the other hand, if the determination unit 35 determines that the answer in the answer candidate group is not uniquely determined (step S13-4; NO), the determination unit 35 determines whether multimodal information has already been acquired (step S13-6). If it determines that multimodal information has been acquired (step S13-6; YES), the analysis calculation unit 33 converts the acquired multimodal information into a vector, narrows down the candidates by similarity, and proceeds to the processing of step S13-5 described above (step S13-7). Specifically, the analysis calculation unit 33 calculates the similarity between the event that occurred and the additional information and narrows down the candidates accordingly.

ここで、類似度による絞込みの一例として、テキストと画像のペアの類似度を算出する手法がある。例えば、ViLBERT、UINTERなどから獲得したベクトルのコサイン類似度により算出することができる。ViLBERT、UINTERともに、テキストと画像を同時に入力し、その両者の特徴を抽出し、ベクトルで表現する手法である。これらの手法は、ベクトル間のコサイン類似度を算出することで、テキスト及び画像の組合せの類似度の高さを計算することを想定している。なお、これらの手法はすでに論文等で公開されている公知の手法である。 Here, as an example of narrowing down by similarity, there is a method of calculating the similarity of a text-image pair. For example, it can be calculated by cosine similarity of vectors obtained from ViLBERT, UINTER, and the like. Both ViLBERT and UINTER are methods of inputting text and images at the same time, extracting the features of both, and expressing them as vectors. These methods are supposed to calculate the degree of similarity between text and image combinations by calculating the cosine similarity between vectors. Note that these methods are known methods that have already been published in papers and the like.
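The similarity-based narrowing can be illustrated with plain cosine similarity over vectors. In the sketch below the vectors are small hand-written lists standing in for embeddings that a text-image model would produce; no such model is called here, and the function names `cosine_similarity` and `narrow_down` are assumptions introduced for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def narrow_down(query_vec: list[float], candidates: dict[str, list[float]]) -> str:
    """Return the candidate name whose vector is most similar to the query vector."""
    return max(candidates, key=lambda name: cosine_similarity(query_vec, candidates[name]))
```

For example, with hypothetical candidate vectors `{"Part C": [1.0, 0.0], "Part D": [0.0, 1.0]}` and a query vector close to the first axis, `narrow_down` selects "Part C".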

ステップＳ１３－６の処理において、マルチモーダル情報を取得済みでないと判断した場合（ステップＳ１３－６；ＮＯ）、回答生成部３７は、追加要求情報を生成してこのフローを抜ける（ステップＳ１３－８）。ステップＳ１３－８の例では、回答生成部３７は、二つの回答源となる情報を生成しており、且つ、マルチモーダル情報を取得していない。そのため、回答生成部３７は、上述したステップＳ１３－８の処理を実行してこのフローを抜ける。このときに生成される追加要求情報の内容は、例えば、「異音が発生している付近の詳細な画像や音声はありますか？」といった内容になり、利用者又は保守担当者に対して、所定の質問に対する回答の参考になるような画像情報及び音声情報のうち少なくとも一方の情報を提供することを促す内容となる。 If it is determined in step S13-6 that multimodal information has not been acquired (step S13-6; NO), the answer generation unit 37 generates additional request information and exits this flow (step S13-8). In the example of step S13-8, the answer generation unit 37 has generated two pieces of information that can serve as answer sources and has not acquired multimodal information. The answer generation unit 37 therefore executes the processing of step S13-8 described above and exits this flow. The additional request information generated at this time reads, for example, "Are there any detailed images or sounds from the area where the abnormal noise is occurring?", and prompts the user or maintenance person to provide at least one of image information and audio information that can serve as a reference for answering the predetermined question.
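The branching of steps S13-4 through S13-8 can be summarized in a short sketch. It assumes that the candidate list is non-empty and that a narrowing function is supplied by the caller; the name `generate_response`, the dictionary layout of the result, and the `narrow` callback are hypothetical and serve only to show the control flow.

```python
ADDITIONAL_REQUEST = (
    "Are there any detailed images or sounds from the area "
    "where the abnormal noise is occurring?"
)


def generate_response(candidates, multimodal_info=None, narrow=None):
    """Return either an answer or an additional request, following steps S13-4 to S13-8."""
    if not candidates:
        # Not covered by the described flow; treated as an error in this sketch.
        raise ValueError("at least one answer candidate is assumed")
    if len(candidates) == 1:                       # S13-4: answer uniquely determined
        return {"type": "answer", "answer": candidates[0]}            # S13-5
    if multimodal_info is not None:                # S13-6: multimodal information acquired
        if narrow is None:
            raise ValueError("a narrowing function is required")
        best = narrow(candidates, multimodal_info)                    # S13-7
        return {"type": "answer", "answer": best}                     # S13-5
    return {"type": "additional_request", "message": ADDITIONAL_REQUEST}  # S13-8
```

With two candidates and no image or audio yet, the function returns the additional request; once multimodal information is passed in, the same call returns a single answer.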

＜回答情報及び追加要求情報の表示処理＞
図７に戻り、送受信部３１は、入力装置２に対して、所定の質問に対する回答を表す回答情報、又は、回答情報を生成するために所定の質問に関連付けられた追加要求を表す追加要求情報を送信する。具体的には、送受信部３１は、入力装置２に対して、回答生成部３７で生成した質問応答を送信する（ステップＳ１４）。これにより、入力装置２の送受信部２１は、情報処理装置３が送信した質問応答を受信する。このとき、質問応答には、質問に対する回答情報としての回答文、又は、質問に対する回答を絞り込むために必要な追加の内容の入力を要求するための追加要求情報が含まれる。つまり、情報処理装置３は、所定の質問に対する回答を生成できない場合に、入力装置２に対して追加要求情報を送信する。この追加要求情報が送信されたときの入力装置２における画面については、後述する画面表示例にて詳細に説明する。 <Display processing of response information and additional request information>
Returning to FIG. 7, the transmission/reception unit 31 transmits to the input device 2 either answer information representing an answer to the predetermined question, or additional request information representing an additional request associated with the predetermined question in order to generate the answer information. Specifically, the transmission/reception unit 31 transmits the question and answer generated by the answer generation unit 37 to the input device 2 (step S14). The transmission/reception unit 21 of the input device 2 thereby receives the question and answer transmitted by the information processing device 3. The question and answer contains either an answer sentence as answer information for the question, or additional request information requesting input of the additional content needed to narrow down the answer to the question. In other words, when the information processing device 3 cannot generate an answer to the predetermined question, it transmits additional request information to the input device 2. The screen shown on the input device 2 when this additional request information is transmitted is described in detail in a screen display example below.

質問応答を受信後、入力装置２の記憶読出部２９は、受信した質問応答を記憶部２０００の所定領域に記憶する（ステップＳ１５）。 After receiving the question and answer, the storage/reading unit 29 of the input device 2 stores the received question and answer in a predetermined area of the storage unit 2000 (step S15).

その後、表示制御部２４は、記憶読出部２９と協働して記憶した回答を示す回答情報又は追加要求情報を記憶部２０００から読み出し、ディスプレイ２０６に表示する（ステップＳ１６）。 After that, the display control unit 24 reads out the answer information indicating the answer stored in cooperation with the memory reading unit 29 or the additional request information from the storage unit 2000, and displays it on the display 206 (step S16).

ディスプレイ２０６に質問応答が表示された入力装置２は、追加要求情報を満たす情報の入力を行う（ステップＳ１７）。但し、ステップＳ１６で、利用者又は保守担当者が質問応答を確認し、満足のいく質問応答である場合は、ステップＳ１７の処理は行われなくてもよい。ステップＳ１７の処理については、後述する画面表示例にて詳細に説明する。 The input device 2 with the question and answer displayed on the display 206 inputs information satisfying the additional request information (step S17). However, in step S16, the user or maintenance personnel confirms the question and answer, and if the question and answer is satisfactory, the process of step S17 may not be performed. The process of step S17 will be described in detail with a screen display example to be described later.

ステップＳ１７で追加要求情報を満たす情報の入力が行われた後、判断部２５は、「送信」ボタンが操作されたか否かを判断する（ステップＳ１８）。「送信」ボタンが操作されたと判断された場合（ステップＳ１８；ＹＥＳ）、判断部２５は、ステップＳ１２の処理に移行し、送受信部２１は、新たな質問要求を情報処理装置３に対して送信し、回答が一意に決定するまで処理を繰り返す。つまり、情報処理装置３の送受信部３１は、新たな質問要求を入力装置２から受信する。このときに受信する新たな質問要求は、ステップＳ１７において追加要求情報に対して新たに入力された画像情報及び音声情報のうち少なくとも一方の情報で示される質問情報である。すなわち、２回目以降のステップＳ１２の処理では、送受信部３１は、入力装置２が送信した所定の質問に係る所定の事象に含まれる画像情報及び音声情報のうち少なくとも一方の情報を、追加要求情報に対する追加情報として受信する。 After information satisfying the additional request information has been input in step S17, the determination unit 25 determines whether the "Send" button has been operated (step S18). If it is determined that the "Send" button has been operated (step S18; YES), the determination unit 25 proceeds to the processing of step S12, the transmission/reception unit 21 transmits a new question request to the information processing device 3, and the processing is repeated until the answer is uniquely determined. That is, the transmission/reception unit 31 of the information processing device 3 receives a new question request from the input device 2. The new question request received at this time is question information represented by at least one of the image information and the audio information newly input in step S17 in response to the additional request information. In other words, in the second and subsequent executions of step S12, the transmission/reception unit 31 receives at least one of the image information and the audio information included in the predetermined event relating to the predetermined question transmitted by the input device 2, as additional information for the additional request information.

他方、「送信」ボタンが操作されないと判断された場合（ステップＳ１８；ＮＯ）、判断部２５は、このシーケンス処理を終了する。このとき、判断部２５は、所定期間若しくは所定回数ステップＳ１８の判断処理を繰り返し、利用者又は保守担当者による操作を待つようにしてから、このシーケンスを抜けるようにしてもよい。 On the other hand, if it is determined that the "send" button has not been operated (step S18; NO), the determining section 25 terminates this sequence process. At this time, the determination unit 25 may repeat the determination process of step S18 for a predetermined period of time or a predetermined number of times, wait for an operation by the user or the person in charge of maintenance, and then exit this sequence.
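Seen from the input device, steps S12 through S18 form a loop that repeats until an answer is uniquely determined or the user stops sending. The sketch below models that loop with two caller-supplied callbacks; `ask_server`, `collect_additional`, and the `max_rounds` limit are all assumptions introduced here, and the response dictionaries follow a hypothetical layout rather than any format defined in the specification.

```python
def run_dialogue(question, ask_server, collect_additional, max_rounds=5):
    """Repeat question requests until an answer arrives or the user stops sending.

    ask_server(payload) models steps S12 to S14 and returns a dict with a "type" key.
    collect_additional(message) models step S17 and returns new input, or None when
    the "Send" button is not operated (step S18; NO).
    """
    payload = question
    for _ in range(max_rounds):            # max_rounds is a safety limit assumed by this sketch
        response = ask_server(payload)
        if response["type"] == "answer":   # answer uniquely determined
            return response["answer"]
        payload = collect_additional(response["message"])
        if payload is None:                # user did not send additional information
            return None
    return None
```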

本実施形態に係る質問応答システムでは、例えば、上述したステップＳ１２及びＳ１４の処理が実行される場合、入力装置２と情報処理装置３との間に他の装置等が存在してもよい。つまり、入力装置２と情報処理装置３との間で送受信される各情報(データ)は、一度他の装置を介して送受信されるような構成であってもよい。上述した構成及び処理方法は、入力装置２と情報処理装置３との間の他の処理ステップにおいても適用可能である。 In the question answering system according to this embodiment, for example, when the processes of steps S12 and S14 described above are executed, another device or the like may exist between the input device 2 and the information processing device 3 . That is, each piece of information (data) transmitted/received between the input device 2 and the information processing device 3 may be transmitted/received once via another device. The configuration and processing method described above are also applicable to other processing steps between the input device 2 and the information processing device 3 .

なお、上述したシーケンス図及びフローチャートは、一実施形態を説明するための一例にすぎない。そこで、上述した実施形態に限定されるものではなく、他のシーケンス図及びフローチャートにより処理等、当業者が想到することができる範囲内で変更することが可能である。 It should be noted that the above-described sequence diagrams and flowcharts are merely examples for describing one embodiment. Therefore, the present invention is not limited to the above-described embodiment, and can be modified within the range that a person skilled in the art can conceive of, such as processing using other sequence diagrams and flowcharts.

●画面表示例●
図８は、応答通知の内容を表示する応答通知画面の画面表示例である。図８では、利用者又は保守担当者が送信した所定の質問に対して、所望の回答情報が一意に決定した場合に、発生した事象の要因となった画像情報に係る画面が、入力装置２の表示制御部２４によって表示される。図８に示されているように、表示制御部２４は、情報処理装置３が送信した質問応答を受信すると、ディスプレイ２０６に以下の内容を含む応答通知画面２１０２を表示させる。 ●Screen display example●
FIG. 8 is a screen display example of a response notification screen displaying the contents of the response notification. FIG. 8 shows a case in which the desired answer information has been uniquely determined for the predetermined question sent by the user or maintenance person, and the display control unit 24 of the input device 2 displays a screen relating to the image information that was the cause of the event that occurred. As shown in FIG. 8, when receiving the question and answer transmitted by the information processing device 3, the display control unit 24 causes the display 206 to display a response notification screen 2102 including the following contents.

応答通知画面２１０２には、「応答通知画面」のタイトル、及び利用者又は保守担当者宛に質問に対する回答としての応答メッセージが表示される。このときに回答生成部３７によって生成される質問応答には、例えば、事象がどこの箇所で発生したかを説明する文、発生した事象の内容及び対処方法、並びに、対処した部品の画像情報２１１３が含まれる。入力装置２は、上述した質問応答に含まれる回答情報を情報処理装置３から受信し、ディスプレイ２０６に表示させる。これにより、情報処理装置３は、入力装置２を介し利用者又は保守担当者に対して、質問した内容に対して応答すべき回答情報の詳細を提示することが可能になる。つまり、情報処理装置３は、所定の質問が利用者又は保守担当者により入力された場合に、画像情報といった具体的な視覚情報を含めて応答することが可能になる。そのため、質問応答システム１は、利用者又は保守担当者に対して、修理作業等の効率を上げることも可能になる。 The response notification screen 2102 displays a title of "response notification screen" and a response message addressed to the user or maintenance personnel as an answer to the question. The question and answer generated by the answer generator 37 at this time includes, for example, a sentence explaining where the event occurred, details of the event that occurred, how to deal with it, and image information 2113 of the part dealt with. is included. The input device 2 receives the answer information included in the above question and answer from the information processing device 3 and displays it on the display 206 . As a result, the information processing device 3 can present the details of the answer information to be answered in response to the content of the question to the user or the person in charge of maintenance via the input device 2 . In other words, the information processing device 3 can respond including specific visual information such as image information when a predetermined question is input by the user or maintenance personnel. Therefore, the question answering system 1 can also improve the efficiency of repair work and the like for users or maintenance personnel.

また、応答通知画面２１０２には、「確認」ボタン２１１４が含まれる。これにより、利用者又は保守担当者は、応答通知画面２１０２に表示された内容を確認後、「確認」ボタン２１１４を操作（押下、タップ等）操作することで、他の画面に遷移することができる。 The response notification screen 2102 also includes a "Confirm" button 2114. After checking the contents displayed on the response notification screen 2102, the user or maintenance person can move to another screen by operating (pressing, tapping, etc.) the "Confirm" button 2114.

なお、図８に示した応答通知画面２１０２は、情報処理装置３が送信したメールを受信した画面例であるが、入力装置２の表示制御部２４は、入力装置２が情報処理装置３に対して何らかの情報を取りに行って得られた画面を表示するようにしてもよい。つまり、メール受信画面ではなく、ブラウザ等を利用して得た画面情報をディスプレイ２０６に表示させるような形態でもよい。 Note that the response notification screen 2102 shown in FIG. It is also possible to display a screen obtained by retrieving some information. In other words, screen information obtained using a browser or the like may be displayed on the display 206 instead of the mail reception screen.

●画面表示例●
図９は、追加情報の入力を受け付ける追加情報受付画面の画面表示例である。図９では、利用者又は保守担当者が送信した所定の質問に対して、所望の回答情報が得られなかった場合の入力装置２に表示される画面例である。図９に示されているように、表示制御部２４は、情報処理装置３が送信した質問応答を受信すると、ディスプレイ２０６に以下の内容を含む追加情報受付画面２１０３を表示させる。 ●Screen display example●
FIG. 9 is a screen display example of an additional information acceptance screen for accepting input of additional information. FIG. 9 shows an example of a screen displayed on the input device 2 when desired answer information is not obtained in response to a predetermined question sent by the user or the person in charge of maintenance. As shown in FIG. 9, when receiving the question and answer transmitted by the information processing device 3, the display control unit 24 causes the display 206 to display an additional information reception screen 2103 including the following contents.

追加情報受付画面２１０３には、「追加情報受付画面」のタイトル、追加情報の提供を依頼するメッセージ、＜追加の質問＞としてのメッセージ、「画像」選択ボタン２１１５、「音声」選択ボタン２１１６、画像表示領域２１１７、マイクボタン２１１８、停止ボタン２１１９、「送信」ボタン２１２０及び「キャンセル」ボタン２１２１がそれぞれ表示される。つまり、表示制御部２４は、画像情報を与えるための画像情報受付手段及び音声情報を与えるための音声情報受付手段のうち少なくとも一方の手段をディスプレイ２０６に表示する。 On the additional information reception screen 2103, a title of "additional information reception screen", a message requesting provision of additional information, a message as <additional question>, an "image" selection button 2115, a "voice" selection button 2116, an image A display area 2117, a microphone button 2118, a stop button 2119, a "send" button 2120 and a "cancel" button 2121 are displayed respectively. That is, the display control unit 24 displays on the display 206 at least one of the image information receiving means for providing image information and the audio information receiving means for providing audio information.

The "image" selection button 2115 is a button for inputting image information, which is one type of additional information. When the "image" selection button 2115 is operated, the CPU 201 of the input device 2 may activate the camera 220 and the image sensor I/F 221. This allows the user or the person in charge of maintenance to photograph the vicinity of the location where the event is occurring and temporarily store the result as image data, as the desired image information, in a predetermined area of the input device 2. Furthermore, when the "image" selection button 2115 is operated, the CPU 201 may control the display 206 to display the image display area 2117 as a pop-up screen on the additional information reception screen 2103. This allows the user or the person to be managed to temporarily store the desired image data in a predetermined area of the input device 2. In this way, the user or the person in charge of maintenance can have the desired image information transmitted (uploaded) to the information processing device 3.

The "voice" selection button 2116 is a button for inputting audio information, which is another type of additional information. When the "voice" selection button 2116 is operated, the CPU 201 of the input device 2 controls the microphone 218 and turns on the "microphone" button 2118. The user or the person in charge of maintenance can then use this "microphone" button 2118 as a microphone. At this stage, the user or the person in charge of maintenance speaks the desired voice (sound) information toward the "microphone" button 2118 and, after speaking, operates the "stop" button 2119, whereby the spoken content is temporarily stored as audio data in a predetermined area of the input device 2. This allows the user or the person in charge of maintenance to have the desired audio information transmitted (uploaded) to the information processing device 3.

After the above steps, the user or the person in charge of maintenance operates the "send" button 2120 to have at least one of the image information and the audio information transmitted to the information processing device 3, and can then transition to another screen.
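The send operation described above can be illustrated with a minimal sketch. It assumes a hypothetical `AdditionalInfo` structure and `build_upload` function, neither of which appears in this publication, and shows only the condition that at least one of the image information and the audio information must be present before transmission.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdditionalInfo:
    """Additional information held temporarily on the input device (hypothetical structure)."""
    image: Optional[bytes] = None  # data captured after the "image" selection button is operated
    audio: Optional[bytes] = None  # data recorded between the microphone and stop buttons


def build_upload(info: AdditionalInfo) -> dict:
    """Assemble the payload sent when the "send" button is operated.

    Raises ValueError if neither image nor audio information is available,
    because at least one of the two is required.
    """
    payload = {}
    if info.image:
        payload["image"] = info.image
    if info.audio:
        payload["audio"] = info.audio
    if not payload:
        raise ValueError("at least one of image or audio information is required")
    return payload
```

In this sketch an empty entry is treated the same as a missing one, so a recording that was started and stopped without capturing any data would not be sent on its own.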

With the configuration described above, the information processing device 3 can obtain, from the user or the person in charge of maintenance, detailed information associated with the content of a question concerning a predetermined event. That is, the information processing device 3 can resolve the ambiguity of a question provided by the user or the person in charge of maintenance using the input device 2 at an earlier stage. In other words, when a predetermined question is input by the user or the person in charge of maintenance, the information processing device 3 eliminates as far as possible the chance of giving an ambiguous response, and can ultimately provide the response that the user or the person in charge of maintenance is seeking quickly. The question answering system 1 can therefore also ensure, for the user or the person in charge of maintenance, accuracy when deep learning is used.

Furthermore, in the present embodiment, assume for example a scene in which a newly hired customer engineer (hereinafter referred to as a CE) visits a customer. In this situation, the CE is trying to identify the cause of a failure, which is one of the events occurring in an image forming apparatus such as an MFP installed at the customer's site. When the CE asks the question answering system 1 about the cause of the failure, the question answering system 1 answers while resolving the ambiguity of the question. In such a case, information such as what kind of abnormal noise accompanies the failure and where the noise is coming from is important for identifying the problem, but is often difficult to put into words. In the present embodiment, using the multimodal information as it is in such cases makes it possible to resolve the ambiguity of the question efficiently.

Note that, in the present embodiment, the user and the person in charge of maintenance are collectively referred to as the user. In addition to the person in charge of maintenance, the user also includes a service person who manages the various services provided by the managed device, a repair person who repairs the managed device, and the like.

[Main effects of the embodiment]
As described above, according to the present embodiment, the transmitting/receiving unit 31 of the information processing device 3 transmits the question response generated by the answer generation unit 37 to the input device 2 (step S14), and receives from the input device 2 at least one of image information and audio information newly input in response to the additional request information included in the question response (step S12). This has the effect of making it possible to resolve the ambiguity of an additionally input question when an answer to the predetermined question cannot be given.
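The exchange summarized above can be sketched as follows. This is only an illustration under stated assumptions: the function `respond`, the message dictionaries, and the stand-in `toy_generator` are hypothetical and are not taken from this publication, and the actual behavior of the answer generation unit 37 is not reproduced here.

```python
from typing import Callable, Optional


def respond(question: str,
            generate_answer: Callable[[str, dict], Optional[str]],
            additional: Optional[dict] = None) -> dict:
    """Build one question response: answer information when an answer can be
    generated, otherwise additional request information."""
    additional = additional or {}
    answer = generate_answer(question, additional)
    if answer is not None:
        return {"type": "answer", "text": answer}
    # No answer could be generated, so ask for image and/or audio information.
    return {"type": "additional_request", "accepts": ["image", "audio"]}


def toy_generator(question: str, additional: dict) -> Optional[str]:
    """Hypothetical stand-in for answer generation: it can only answer
    once audio information has been supplied."""
    if "audio" in additional:
        return "placeholder answer"
    return None
```

With this sketch, a first call such as `respond("Why is there a noise?", toy_generator)` yields an additional request, and a second call that passes `{"audio": ...}` as `additional` yields answer information.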

[Supplement to the embodiment]
The functions of the embodiment described above can be realized by a computer-executable program written in a legacy programming language or an object-oriented programming language, such as assembler, C, C++, C#, or Java (registered trademark), and the program for executing the functions of each embodiment can be distributed through a telecommunication line.

The program for executing the functions of the embodiment can also be stored and distributed on a recording medium such as a ROM, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM (Erasable Programmable Read-Only Memory), a flash memory, a flexible disk, a CD (Compact Disc)-ROM, a CD-RW (Re-Writable), a DVD-ROM, a DVD-RAM, a DVD-RW, a Blu-ray disc, an SD card, or an MO (Magneto-Optical disc).

Furthermore, some or all of the functions of the embodiment can be implemented on a programmable device (PD) such as an ASIC (Application Specific Integrated Circuit), a DSP (digital signal processor), an FPGA (Field Programmable Gate Array), an SOC (System on a chip), or a GPU (Graphics Processing Unit). They can be distributed on a recording medium as circuit configuration data (bitstream data) to be downloaded to the PD in order to realize the functions of each embodiment on the PD, or as data written in HDL (Hardware Description Language), VHDL (Very High Speed Integrated Circuits Hardware Description Language), Verilog-HDL, or the like for generating the circuit configuration data.

Furthermore, the texts and various tables obtained in the embodiment described above may be generated by the learning effect of machine learning, and the tables need not be used if the data of the associated items is classified by machine learning. Here, machine learning is a technology for making a computer acquire human-like learning ability, in which the computer autonomously generates the algorithms needed for judgments such as data identification from training data taken in beforehand, and applies them to new data to make predictions. The learning method for machine learning may be any of supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning, or may be a combination of these methods; the learning method is not limited.
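As a generic illustration of the definition above (a rule derived from training data and then applied to new data), the following minimal sketch uses a nearest-centroid classifier as one simple supervised learning method. It is not the method of this publication; the function names and the toy data in the usage note are assumptions made for the example.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Learn one centroid (mean vector) per label from the training data."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        y: tuple(sum(col) / len(xs) for col in zip(*xs))
        for y, xs in grouped.items()
    }


def predict(centroids, x):
    """Apply the learned rule to new data: pick the label of the closest centroid."""
    return min(centroids, key=lambda y: dist(centroids[y], x))
```

For example, after fitting on the points `(0, 0)` and `(0, 1)` labeled `"a"` and `(5, 5)` and `(6, 5)` labeled `"b"`, the point `(1, 1)` is assigned to `"a"` because it lies closer to that label's centroid.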

Furthermore, the input device exemplified in the embodiment may have each function and each means of the information processing device and function as an input response device. In that case, the input device 2 used in the present embodiment includes each functional unit of the information processing device 3.

The question answering system, information processing device, information processing method, and program according to one embodiment of the present invention have been described above. However, the present invention is not limited to the embodiment described above and can be modified within the range that a person skilled in the art can conceive, for example by adding, changing, or deleting embodiments. Any such aspect is included in the scope of the present invention as long as it exhibits the actions and effects of the present invention.

1 Question answering system
2 Input device
3 Information processing device
21 Transmission/reception unit (an example of receiving means, an example of transmitting means)
22 Operation reception unit (an example of operation reception means, an example of input means)
2114 Image button (an example of image information receiving means)
2115 Voice button (an example of audio information receiving means)
23 Input/output unit (an example of input/output means)
31 Transmission/reception unit (an example of receiving means, an example of transmitting means)
32 Acquisition unit (an example of acquisition means)
33 Analysis calculation unit (an example of analysis calculation means)
35 Determination unit (an example of determination means)
37 Answer generation unit (an example of generation means)

JP 2019-117568 A

Claims

A question answering system comprising an input device for inputting a predetermined question and an information processing device that responds with an answer to the predetermined question transmitted by the input device, wherein
the information processing device has:
receiving means for receiving question information representing the predetermined question transmitted by the input device; and
transmitting means for transmitting, to the input device, answer information representing an answer to the predetermined question, or additional request information representing an additional request associated with the predetermined question for generating the answer information, and
the input device has:
display control means for displaying, on display means, the additional request information transmitted by the information processing device; and
transmitting means for transmitting, to the information processing device, at least one of image information and audio information input in response to the additional request information.

The question answering system according to claim 1, wherein
the transmitting means of the information processing device
transmits the additional request information to the input device when an answer to the predetermined question cannot be generated.

The question answering system according to claim 1 or 2, wherein
the receiving means
receives, as additional information for the additional request information, at least one of image information and audio information included in a predetermined event related to the predetermined question transmitted by the input device.

The question answering system according to claim 3, wherein
the information processing device further has
generating means for generating the response information by associating the pre-registered predetermined event, part information related to the part that has dealt with the predetermined event, and the additional information.

The question answering system according to claim 4, wherein
the information processing device further has
calculation means for calculating a similarity between the event and the additional information.

The question answering system according to any one of claims 1 to 5, wherein
the display control means
displays, on the display means, at least one of image information receiving means for providing the image information and audio information receiving means for providing the audio information.

The question answering system according to any one of claims 1 to 5, wherein
the display control means
displays, on the display means, image information that caused the event when an answer to the predetermined question is uniquely determined.

An information processing device that responds with an answer to a predetermined question transmitted by an input device for inputting the predetermined question, the information processing device having:
receiving means for receiving question information representing the predetermined question transmitted by the input device; and
transmitting means for transmitting, to the input device, answer information representing an answer to the predetermined question, or additional request information representing an additional request associated with the predetermined question for generating the answer information, wherein
the receiving means
receives at least one of image information and audio information input in response to the additional request information, and
the transmitting means
transmits, to the input device, answer information representing an answer to the predetermined question generated based on at least one of the image information and the audio information received by the receiving means.

An information processing method executed by an information processing device that responds with an answer to a predetermined question transmitted by an input device for inputting the predetermined question, the method having:
a receiving step of receiving question information representing the predetermined question transmitted by the input device; and
a transmitting step of transmitting, to the input device, answer information representing an answer to the predetermined question, or additional request information representing an additional request associated with the predetermined question for generating the answer information, wherein
the receiving step
includes a process of receiving at least one of image information and audio information input in response to the additional request information, and
the transmitting step
includes a process of transmitting, to the input device, answer information representing an answer to the predetermined question generated based on at least one of the image information and the audio information received in the receiving step.

A program for causing a computer to function as each means according to claim 8.