JP7152454B2

JP7152454B2 - Information processing device, information processing method, information processing program, and information processing system

Info

Publication number: JP7152454B2
Application number: JP2020163922A
Authority: JP
Inventors: 孝裕森田; 亮張; 英毅表
Original assignee: SoftBank Corp
Current assignee: SoftBank Corp
Priority date: 2020-09-29
Filing date: 2020-09-29
Publication date: 2022-10-12
Anticipated expiration: 2040-09-29
Also published as: JP2022056109A

Description

The present invention relates to an information processing device, an information processing method, an information processing program, and an information processing system.

In recent years, technology for remote communication has been developed increasingly actively. In remote communication, participants communicate by exchanging video, audio, and the like with a partner at a remote location.

JP 2019-61594 A

However, with the above conventional technology, it may be harder for a communicator to accurately convey complicated operation content to a listener than in face-to-face communication. There is therefore a demand for technology that can support the conveyance of complicated operation content in remote communication.

An information processing device according to an embodiment includes: a generation unit that generates, based on information about a listener who is a participant in remote communication, auxiliary information that supplements operation content conveyed to the listener from a communicator who is a participant in the remote communication, the operation content being for an operation target operated by the listener; and an output control unit that performs control so that the auxiliary information generated by the generation unit is output to a terminal device of the listener.

FIG. 1 is a diagram illustrating an example of information processing according to a comparative example.
FIG. 2 is a diagram illustrating an example of information processing according to the embodiment.
FIG. 3 is a diagram illustrating a configuration example of an information processing system according to the embodiment.
FIG. 4 is a diagram illustrating a configuration example of an information processing device according to the embodiment.
FIG. 5 is a diagram illustrating an example of a participant information storage unit according to the embodiment.
FIG. 6 is a diagram illustrating an example of information processing according to a first modification.
FIG. 7 is a diagram illustrating an example of information processing according to a second modification.
FIG. 8 is a diagram illustrating an example of information processing according to a third modification.
FIG. 9 is a diagram illustrating an example of information processing according to a fourth modification.
FIG. 10 is a hardware configuration diagram illustrating an example of a computer that implements the functions of the information processing device.

Hereinafter, modes for implementing the information processing device, information processing method, information processing program, and information processing system according to the present application (hereinafter referred to as "embodiments") will be described in detail with reference to the drawings. Note that these embodiments do not limit the information processing device, information processing method, information processing program, and information processing system according to the present application. In each of the following embodiments, the same parts are denoted by the same reference numerals, and overlapping descriptions are omitted.

(Embodiment)
[1. Introduction]
In recent years, an increasing number of companies have introduced telework, for example as a countermeasure against the novel coronavirus disease (COVID-19). With the introduction of telework, opportunities to use remote communication such as video conferences and web conferences are also increasing. For example, there are more and more occasions on which a communicator uses remote communication to convey to a listener the operation content for an operation target that the listener operates.

In general, it is said that complicated operation content is harder to convey in remote communication than face to face. For example, when conveying operation content by remote communication, the communicator must explain the operation content for the operation target to the listener without having the actual operation target at hand, so conveying complicated operation content can be difficult. In addition, since the listener must understand the operation content conveyed by the communicator from verbal information alone, it may be difficult for the listener to understand complicated operation content.

Against this background, technologies for helping convey complicated operation content in remote communication have been developed in recent years. For example, a technology is known that reduces mistakes and speeds up communication by presenting a three-dimensional hologram of a remote communicator to a listener on site (reference URL: https://dynamics.microsoft.com/en-us/mixed-reality/overview/). However, when conveying complicated operation content, there is still room for improvement in providing an online environment that can match or rival face-to-face communication, which is rich in visual, auditory, and tactile information. Specifically, remote communication needs mechanisms that are easier to use and can accommodate a variety of cases. Desired examples include a mechanism that optimizes the conveyed content according to the listener's attributes, a mechanism that provides content matching the listener's intention in real time, a mechanism that provides content at a timing suited to the situation at the listener's site, and a mechanism that provides precise operation content.

Here, an example of information processing according to a comparative example will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of information processing according to a comparative example. In FIG. 1, a participant U11 who is the communicator (hereinafter also referred to as "communicator U11") uses remote communication to convey, to participants U21 and U22 who are listeners (hereinafter also referred to as "listeners U21 and U22"), the operation content for a device A that the listeners operate.

Listeners U21 and U22 wear terminal devices 10-21 and 10-22, respectively, which are head-mounted displays, and can see an image G1 of communicator U11 through the terminal devices 10-21 and 10-22. Listeners U21 and U22 can also hear the voice of communicator U11 through speakers mounted on the terminal devices 10-21 and 10-22, respectively.

Of the two listeners, listener U21 is an experienced person who has operated device A before. Listener U22 is an inexperienced person who has never operated device A and who also has difficulty hearing sounds (is hard of hearing). As shown in FIG. 1, in the comparative example, the same information is conveyed to listeners U21 and U22 regardless of their attributes. For this reason, listener U22, who is hard of hearing, may find the voice of communicator U11 difficult to hear, so that conveying the operation content is itself difficult. In addition, because unfamiliar technical terms appear frequently, listener U22, who is inexperienced, may find it difficult to understand the conveyed operation content.

Therefore, based on attribute information of a listener who is a participant in remote communication, the present invention performs control so that auxiliary information is output to the listener's terminal device, the auxiliary information supplementing the operation content that is conveyed to the listener from a communicator who is a participant in the remote communication and that concerns an operation target operated by the listener. The present invention can thus provide, in addition to the operation content conveyed by the communicator, auxiliary information suited to the listener's attributes, which helps convey complicated operation content in remote communication. Accordingly, the present invention makes it possible to support the conveyance of complicated operation content in remote communication.

[2. Example of information processing]
Next, an example of information processing according to the embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram illustrating an example of information processing according to the embodiment. The information processing shown in FIG. 2 is implemented by an information processing system 1 according to the embodiment. The information processing system 1 includes a terminal device 10 and an information processing device 100. The following description covers, as the information processing according to the embodiment, information processing that the terminal device 10 and the information processing device 100 perform in cooperation. In the present embodiment, the information processing device 100 executes the information processing program according to the embodiment and cooperates with the terminal device 10 to perform the information processing according to the embodiment.

Before describing FIG. 2, the configuration of the information processing system 1 according to the embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating a configuration example of the information processing system 1 according to the embodiment. As shown in FIG. 3, the information processing system 1 includes the terminal device 10 and the information processing device 100. The terminal device 10 and the information processing device 100 are communicably connected, by wire or wirelessly, via a predetermined network N. Note that the information processing system 1 shown in FIG. 3 may include any number of terminal devices 10 and any number of information processing devices 100.

The terminal device 10 is an information processing device used by a participant in remote communication (hereinafter also referred to as a "participant"). The terminal device 10 is, for example, a smartphone, a tablet terminal, a notebook PC (Personal Computer), a desktop PC, a mobile phone, a PDA (Personal Digital Assistant), or a head-mounted display. In the present embodiment, the listener's terminal device 10 is assumed to be a head-mounted display, and the communicator's terminal device 10 is assumed to be a notebook PC.

An application for using the remote communication system (hereinafter also referred to as a "remote communication application") is installed on the terminal device 10. The terminal device 10 displays on its screen various images (for example, toolbars and icons) for operating the remote communication application. Devices that function as a camera, a microphone, a speaker, and the like are mounted on or connected to the terminal device 10. The terminal device 10 transmits and receives the video and audio input from these devices between a plurality of sites.

Various sensors that detect the physical state of a participant are also mounted on or connected to the terminal device 10. For example, sensors such as the camera and microphone described above are connected to the terminal device 10. Using these sensors, the terminal device 10 detects sensor information indicating the physical state of the participant. For example, as one example of sensor information, the terminal device 10 captures an image of the participant with the camera. Upon detecting sensor information, the terminal device 10 transmits the detected sensor information to the information processing device 100.

The terminal device 10 also receives auxiliary information from the information processing device 100. Upon receiving the auxiliary information, the terminal device 10 outputs it. For example, when the auxiliary information is an image, the terminal device 10 displays the auxiliary information on the screen. When the auxiliary information is audio, the terminal device 10 outputs the auxiliary information from the speaker.
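As a rough illustration of the output behaviour described above, the following Python sketch selects the screen or the speaker depending on the type of the auxiliary information. The `AuxiliaryInfo` class, the `kind` labels, and the `output_channel` function are hypothetical names introduced only for this example; the specification does not define a data format.

```python
from dataclasses import dataclass


@dataclass
class AuxiliaryInfo:
    kind: str      # "image" or "audio" (labels assumed for this sketch)
    payload: str   # placeholder for the actual image or audio data


def output_channel(info: AuxiliaryInfo) -> str:
    """Return the output device a terminal would use for this auxiliary information."""
    if info.kind == "image":
        return "screen"
    if info.kind == "audio":
        return "speaker"
    raise ValueError(f"unsupported auxiliary information kind: {info.kind}")
```

A real terminal would render or play the payload instead of returning a label; the label keeps the sketch easy to check.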

Note that, as shown in FIG. 3, the terminal devices 10 may be distinguished as terminal devices 10-1 to 10-N (N is a natural number) according to the participants who use them. For example, the terminal device 10-21 is the terminal device 10 used by participant U21 shown in FIG. 2, and the terminal device 10-22 is the terminal device 10 used by participant U22 shown in FIG. 2. In the following, the terminal devices 10-1 to 10-N (N is a natural number) are referred to simply as the terminal device 10 when they need not be distinguished.

The information processing device 100 acquires information about the listener from the terminal device 10. Specifically, the information processing device 100 acquires the listener's sensor information from the terminal device 10. Having acquired the listener's sensor information, the information processing device 100 determines, based on that sensor information, whether the listener is having difficulty with an operation related to the operation content. When the information processing device 100 determines that the listener is having such difficulty, it generates, based on the listener's attribute information, auxiliary information that supplements the operation content conveyed from the communicator to the listener for the operation target operated by the listener. The information processing device 100 then performs control so that the generated auxiliary information is output to the listener's terminal device 10. Specifically, the information processing device 100 transmits the generated auxiliary information to the listener's terminal device 10.
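The sequence just described (acquire sensor information, determine whether the listener is having difficulty, generate auxiliary information, send it to the listener's terminal) can be sketched as a single control-flow function. This is a minimal sketch, not the patented implementation: the function name `handle_listener_input` and the three callables passed in are assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def handle_listener_input(
    listener_id: str,
    sensor_info: str,
    has_difficulty: Callable[[str], bool],
    generate: Callable[[str], List[str]],
    send: Callable[[str, List[str]], None],
) -> bool:
    """Run one pass of the flow; return True if auxiliary information was sent."""
    # Determination step: do nothing unless the listener appears to be struggling.
    if not has_difficulty(sensor_info):
        return False
    # Generation step: build auxiliary information for this listener.
    auxiliary = generate(listener_id)
    # Output control step: deliver it to the listener's terminal device.
    send(listener_id, auxiliary)
    return True


# Usage with stand-in callables (all hypothetical).
sent: List[Tuple[str, List[str]]] = []
handled = handle_listener_input(
    "U22",
    "わからない単語が多い",
    lambda text: "わからない" in text,
    lambda pid: ["subtitles"],
    lambda pid, aux: sent.append((pid, aux)),
)
```

Passing the determination, generation, and sending steps in as callables keeps the flow separate from how each step is realized.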

The description now returns to FIG. 2. In FIG. 2, as in FIG. 1, communicator U11 uses remote communication to convey to listeners U21 and U22 the operation content for device A, which the listeners operate. The example shown in FIG. 2 differs from FIG. 1 in that the information processing device 100 outputs, based on the attribute information of listener U22, auxiliary information that supplements the operation content for device A to the terminal device 10-22 of listener U22.

Although not illustrated, suppose first that listener U22 says, "There are many words I don't understand, and the voice is hard to hear." At this time, the microphone mounted on the terminal device 10-22 worn by listener U22 detects the voice of listener U22. Upon detecting the voice of listener U22, the terminal device 10-22 transmits the detected voice data to the information processing device 100.

The information processing device 100 acquires the voice data of listener U22 from the terminal device 10-22. Having acquired the voice data, the information processing device 100 determines, based on that voice data, whether listener U22 is having difficulty with an operation related to the operation content. In FIG. 2, the information processing device 100 determines that listener U22 is having such difficulty.

When the information processing device 100 determines that listener U22 is having difficulty with an operation related to the operation content, it generates, based on the attribute information indicating that listener U22 is hard of hearing, an image G21 as an example of auxiliary information. The image G21 contains subtitles that render as text the operation content conveyed by communicator U11. The information processing device 100 then performs control so that the generated image G21 is displayed on the terminal device 10-22 of listener U22.

As a result, even when listener U22, who is hard of hearing, has difficulty hearing the voice of communicator U11, the information processing device 100 enables listener U22 to perceive the operation content visually through the image G21, which contains subtitles as auxiliary information. The information processing device 100 can therefore help convey the operation content to listener U22, who is hard of hearing.

Further, when the information processing device 100 determines that listener U22 is having difficulty with an operation related to the operation content, it generates, based on the attribute information indicating that listener U22 is inexperienced, an image G22 as another example of auxiliary information. The image G22 contains subtitles that give, as text, the meanings of technical terms appearing in the operation content conveyed by communicator U11. The information processing device 100 then performs control so that the generated image G22 is displayed on the terminal device 10-22 of listener U22.

As a result, even when technical terms unfamiliar to listener U22, who is inexperienced, appear frequently, the information processing device 100 allows listener U22 to check the meaning of each technical term as it appears, through the image G22 containing subtitles as auxiliary information. The information processing device 100 can therefore help listener U22, who is inexperienced, understand the operation content.
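The attribute-dependent behaviour in the FIG. 2 example can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `generate_auxiliary_info`, the attribute strings, the dictionary layout, and the glossary entry `term-X` are hypothetical, and the specification does not say how technical terms are detected (simple substring matching is used here).

```python
from typing import Dict, List, Set


def generate_auxiliary_info(
    attributes: Set[str], utterance: str, glossary: Dict[str, str]
) -> List[dict]:
    """Build subtitle entries according to the listener's attributes."""
    aux: List[dict] = []
    if "hard of hearing" in attributes:
        # Subtitle carrying the communicator's words as text (cf. image G21).
        aux.append({"kind": "transcript", "text": utterance})
    if "inexperienced" in attributes:
        # Subtitle explaining technical terms found in the utterance (cf. image G22).
        found = {term: meaning for term, meaning in glossary.items() if term in utterance}
        if found:
            aux.append({"kind": "glossary", "terms": found})
    return aux


# Placeholder term and meaning; not taken from the specification.
glossary = {"term-X": "meaning of term-X"}
result = generate_auxiliary_info(
    {"inexperienced", "hard of hearing"}, "Set term-X to ON.", glossary
)
```

For a listener whose only attribute is "experienced", the function returns an empty list, which mirrors the case in which no auxiliary information is needed.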

[3. Configuration of the information processing device]
Next, the configuration of the information processing device 100 according to the embodiment will be described with reference to FIG. 4. FIG. 4 is a diagram illustrating a configuration example of the information processing device 100 according to the embodiment. As shown in FIG. 4, the information processing device 100 has a communication unit 110, a storage unit 120, and a control unit 130. The information processing device 100 may also have an input unit (for example, a keyboard or a mouse) that receives various operations from an administrator or the like of the information processing device 100, and a display unit (for example, a liquid crystal display) for displaying various information.

(Communication unit 110)
The communication unit 110 is implemented by, for example, a NIC (Network Interface Card). The communication unit 110 is connected to the network by wire or wirelessly and transmits and receives information to and from, for example, the terminal device 10.

(Storage unit 120)
The storage unit 120 is implemented by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. As shown in FIG. 4, the storage unit 120 has a participant information storage unit 121 and a sensor information storage unit 122.

(Participant information storage unit 121)
The participant information storage unit 121 stores various types of information regarding the attributes of participants. Specifically, the participant information storage unit 121 stores identification information that can identify a participant (also referred to as a participant ID) in association with the attribute information of that participant. An example of the participant information storage unit 121 will be described with reference to FIG. 5. FIG. 5 is a diagram illustrating an example of the participant information storage unit according to the embodiment.

In the example shown in FIG. 5, the participant information storage unit 121 stores "experienced", which is attribute information indicating that participant U21 has experience operating device A, in association with the participant ID "U21" of participant U21. The participant information storage unit 121 also stores "inexperienced", which is attribute information indicating that participant U22 has no experience operating device A, and "hard of hearing", which is attribute information indicating that participant U22 has difficulty hearing sounds, in association with the participant ID "U22" of participant U22.
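The association in FIG. 5 between participant IDs and attribute information can be pictured as a simple mapping. The sketch below uses a Python dictionary as a stand-in for the participant information storage unit 121; the variable name `PARTICIPANT_ATTRIBUTES`, the function `get_attributes`, and the choice of an in-memory dictionary are assumptions for illustration, since the specification does not prescribe a storage format.

```python
from typing import Dict, Set

# Participant ID -> set of attribute labels, mirroring the FIG. 5 example.
PARTICIPANT_ATTRIBUTES: Dict[str, Set[str]] = {
    "U21": {"experienced"},
    "U22": {"inexperienced", "hard of hearing"},
}


def get_attributes(participant_id: str) -> Set[str]:
    """Return a copy of the attributes stored for a participant (empty if unknown)."""
    return set(PARTICIPANT_ATTRIBUTES.get(participant_id, ()))
```

Returning a copy prevents callers from accidentally modifying the stored attributes.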

(Sensor information storage unit 122)
The sensor information storage unit 122 stores various types of sensor information indicating the physical states of participants. Specifically, the sensor information storage unit 122 stores identification information that can identify a participant (also referred to as a participant ID) in association with the sensor information of that participant.

In the example shown in FIG. 2, the sensor information storage unit 122 stores the image data and voice data of participant U11 in association with the participant ID "U11" of participant U11. The sensor information storage unit 122 stores the image data and voice data of participant U21 in association with the participant ID "U21" of participant U21. The sensor information storage unit 122 also stores the image data and voice data of participant U22 in association with the participant ID "U22" of participant U22.

(Control unit 130)
Returning to the description of FIG. 4, the control unit 130 is a controller and is implemented by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs (corresponding to an example of the information processing program) stored in a storage device inside the information processing device 100, using the RAM as a work area. The control unit 130 may also be implemented by, for example, an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

As shown in FIG. 4, the control unit 130 has an acquisition unit 131, a determination unit 132, a generation unit 133, and an output control unit 134, and implements or executes the information processing operations described below. Note that the internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 4 and may be any other configuration that performs the information processing described later.

(Acquisition unit 131)
The acquisition unit 131 acquires information about participants from the terminal device 10. Specifically, the acquisition unit 131 acquires information about the listener from the terminal device 10. The acquisition unit 131 also acquires the listener's sensor information from the terminal device 10. For example, the acquisition unit 131 acquires image data and voice data of the listener from the terminal device 10. Having acquired the listener's sensor information, the acquisition unit 131 stores it in the sensor information storage unit 122.

The acquisition unit 131 also acquires information about the communicator from the terminal device 10. Specifically, the acquisition unit 131 acquires the communicator's sensor information from the terminal device 10. For example, the acquisition unit 131 acquires image data and voice data of the communicator from the terminal device 10. Having acquired the communicator's sensor information, the acquisition unit 131 stores it in the sensor information storage unit 122.

(Determination unit 132)
The determination unit 132 determines, based on information about the listener, whether the listener is having difficulty with an operation related to the operation content. Specifically, the determination unit 132 makes this determination based on the listener's sensor information acquired by the acquisition unit 131.

例えば、判定部１３２は、取得部１３１によって取得された聞き手の音声データに基づいて、聞き手が操作内容に関する操作に困難をきたしているか否かを判定する。例えば、判定部１３２は、公知の音声認識技術を用いて、取得部１３１によって取得された聞き手の音声データを文字列に変換する。図２に示す例では、判定部１３２は、聞き手Ｕ２２の音声データを文字列である「わからない単語が多いし、声が聞き取りづらいです。」に変換する。 For example, the determination unit 132 determines whether or not the listener is having difficulty performing an operation related to the operation content, based on the listener's voice data acquired by the acquisition unit 131 . For example, the determination unit 132 converts the listener's voice data acquired by the acquisition unit 131 into a character string using a known voice recognition technology. In the example shown in FIG. 2, the determination unit 132 converts the speech data of the listener U22 into a character string "There are many words I do not understand, and it is difficult to hear the voice."

続いて、判定部１３２は、変換した文字列を形態素解析して、単語に分解する。続いて、判定部１３２は、分解された単語の中に特定の単語が含まれるか否かを判定する。例えば、判定部１３２は、聞き手が操作内容に関する操作に困難をきたしている場合に発せられやすい単語リスト（以下、単語リストともいう）に含まれる単語が分解された単語の中に含まれるか否かを判定する。例えば、単語リストには、「わからない」、「困った」、「聞き取りづらい」などの単語が含まれているとする。図２に示す例では、判定部１３２は、分解された単語の中に単語リストに含まれる単語である「わからない」と「聞き取りづらい」が含まれると判定する。 Subsequently, the determination unit 132 performs morphological analysis on the converted character string and breaks it down into words. The determination unit 132 then determines whether the decomposed words include a specific word. For example, the determination unit 132 determines whether the decomposed words include any word from a list of words that a listener is likely to utter when having difficulty with an operation related to the operation content (hereinafter also referred to as the word list). Suppose, for example, that the word list includes words such as "I don't understand", "I'm having trouble", and "hard to hear". In the example shown in FIG. 2, the determination unit 132 determines that the decomposed words include "I don't understand" and "hard to hear", both of which are in the word list.

続いて、判定部１３２は、分解された単語の中に特定の単語が含まれると判定した場合、聞き手が操作内容に関する操作に困難をきたしていると判定する。一方、判定部１３２は、分解された単語の中に特定の単語が含まれないと判定した場合、聞き手が操作内容に関する操作に困難をきたしていないと判定する。図２に示す例では、判定部１３２は、分解された単語の中に単語リストに含まれる単語である「わからない」と「聞き取りづらい」が含まれると判定したので、聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定する。 When the determination unit 132 determines that the decomposed words include a specific word, it determines that the listener is having difficulty with the operation related to the operation content. Conversely, when it determines that the decomposed words include no such word, it determines that the listener is not having difficulty with the operation. In the example shown in FIG. 2, because the decomposed words include "I don't understand" and "hard to hear", both of which are in the word list, the determination unit 132 determines that the listener U22 is having difficulty with the operation related to the operation content.
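The word-list check described above can be sketched as follows. This is a minimal illustration, not the implementation of the determination unit 132: the names `DIFFICULTY_WORDS`, `matched_words`, and `is_having_difficulty` are hypothetical, the word list holds only the three examples given in the description, and the token list is written by hand at the granularity the description assumes rather than produced by a real speech recognizer and morphological analyzer.

```python
# Hypothetical word list; the description gives only these three examples.
DIFFICULTY_WORDS = {"わからない", "困った", "聞き取りづらい"}


def matched_words(tokens, word_list=DIFFICULTY_WORDS):
    """Return the tokens that appear in the word list, in utterance order."""
    return [token for token in tokens if token in word_list]


def is_having_difficulty(tokens, word_list=DIFFICULTY_WORDS):
    """Judge difficulty as 'at least one token is in the word list'."""
    return len(matched_words(tokens, word_list)) > 0


# Assumed output of speech recognition plus morphological analysis for the
# utterance of listener U22 in FIG. 2 (hand-written for illustration).
tokens_u22 = ["わからない", "単語", "が", "多い", "し", "、",
              "声", "が", "聞き取りづらい", "です", "。"]

print(matched_words(tokens_u22))
print(is_having_difficulty(tokens_u22))
```

A real morphological analyzer may split phrases such as 「聞き取りづらい」 into smaller units, so a practical word list would need to match the analyzer's segmentation.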

（生成部１３３）
生成部１３３は、遠隔コミュニケーションの参加者である聞き手の情報に基づいて、遠隔コミュニケーションの参加者である伝え手から聞き手に対して伝達される操作内容であって、聞き手が操作する操作対象に対する操作内容を補う補助情報を生成する。具体的には、生成部１３３は、判定部１３２によって聞き手が操作に困難をきたしていると判定された場合に、補助情報を生成する。例えば、生成部１３３は、聞き手の情報の一例として、聞き手の属性に関する属性情報に基づいて、補助情報を生成する。 (Generating unit 133)
Based on information about a listener who is a participant in remote communication, the generation unit 133 generates auxiliary information that supplements operation content conveyed to the listener by a transmitter who is also a participant in the remote communication, the operation content concerning an operation target operated by the listener. Specifically, the generation unit 133 generates the auxiliary information when the determination unit 132 determines that the listener is having difficulty with the operation. For example, the generation unit 133 generates the auxiliary information based on attribute information about the listener's attributes, as one example of the listener information.

図２に示す例では、生成部１３３は、判定部１３２によって聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定された場合、参加者情報記憶部１２１を参照して、聞き手Ｕ２２の属性情報を取得する。例えば、生成部１３３は、聞き手Ｕ２２が難聴であるという属性情報を取得する。続いて、生成部１３３は、聞き手Ｕ２２が難聴であるという属性情報を取得すると、聞き手Ｕ２２が難聴であるという属性情報に基づいて、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容を文字にした字幕を含む画像Ｇ２１を生成する。例えば、生成部１３３は、センサ情報記憶部１２２を参照して、伝え手Ｕ１１の音声データを取得する。続いて、生成部１３３は、公知の音声認識技術を用いて、伝え手Ｕ１１の音声データを文字列に変換する。続いて、生成部１３３は、変換した文字列に基づいて、伝え手Ｕ１１によって伝達される操作内容を文字にした字幕を含む画像Ｇ２１を生成する。 In the example shown in FIG. 2, when the determination unit 132 determines that the listener U22 is having difficulty with the operation related to the operation content, the generation unit 133 refers to the participant information storage unit 121 and acquires attribute information of the listener U22. For example, the generation unit 133 acquires attribute information indicating that the listener U22 is hard of hearing. Based on that attribute information, the generation unit 133 generates, as an example of the auxiliary information, an image G21 containing captions that put the operation content conveyed by the transmitter U11 into text. For example, the generation unit 133 refers to the sensor information storage unit 122 and acquires the voice data of the transmitter U11. Next, the generation unit 133 converts the voice data of the transmitter U11 into a character string using a known speech recognition technique. The generation unit 133 then generates, from the converted character string, the image G21 containing the captions.

また、生成部１３３は、判定部１３２によって聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定された場合、参加者情報記憶部１２１を参照して、聞き手Ｕ２２が未経験者であるという属性情報を取得する。続いて、生成部１３３は、聞き手Ｕ２２が未経験者であるという属性情報を取得すると、聞き手Ｕ２２が未経験者であるという属性情報に基づいて、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容に登場する専門用語の意味を文字にした字幕を含む画像Ｇ２２を生成する。例えば、生成部１３３は、センサ情報記憶部１２２を参照して、伝え手Ｕ１１の音声データを取得する。続いて、生成部１３３は、公知の音声認識技術を用いて、伝え手Ｕ１１の音声データを文字列に変換する。続いて、生成部１３３は、変換した文字列の中から伝え手Ｕ１１によって伝達される操作内容に登場する専門用語を抽出する。続いて、生成部１３３は、専門用語の意味が掲載された辞書等を参照して、抽出した専門用語の意味を文字にした字幕を含む画像Ｇ２２を生成する。 In addition, when the determination unit 132 determines that the listener U22 is having difficulty with the operation related to the operation content, the generation unit 133 refers to the participant information storage unit 121 and acquires attribute information indicating that the listener U22 is inexperienced. Based on that attribute information, the generation unit 133 generates, as an example of the auxiliary information, an image G22 containing captions that spell out the meanings of technical terms appearing in the operation content conveyed by the transmitter U11. For example, the generation unit 133 refers to the sensor information storage unit 122 and acquires the voice data of the transmitter U11. Next, the generation unit 133 converts the voice data of the transmitter U11 into a character string using a known speech recognition technique. The generation unit 133 then extracts from the converted character string the technical terms appearing in the operation content conveyed by the transmitter U11. Finally, the generation unit 133 refers to a dictionary or similar resource listing the meanings of technical terms and generates the image G22 containing captions that spell out the meanings of the extracted terms.
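The attribute-driven selection of auxiliary information in the FIG. 2 example might be sketched as below. Everything here is an assumption made for illustration: the `Listener` class, the attribute labels `hard_of_hearing` and `inexperienced`, the one-entry `GLOSSARY`, and the sample transcript are hypothetical, and term extraction is reduced to a substring match. The sketch returns caption text only; rendering the text into images such as G21 and G22 is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Listener:
    listener_id: str
    attributes: set = field(default_factory=set)


# Hypothetical dictionary of technical terms and their meanings.
GLOSSARY = {"throttle": "the control that sets motor output"}


def generate_auxiliary_info(listener, transcript, glossary=GLOSSARY):
    """Choose caption content from the listener's attribute information."""
    items = []
    if "hard_of_hearing" in listener.attributes:
        # Corresponds to image G21: the spoken operation content as text.
        items.append({"kind": "caption", "text": transcript})
    if "inexperienced" in listener.attributes:
        # Corresponds to image G22: meanings of technical terms that occur.
        lowered = transcript.lower()
        for term, meaning in glossary.items():
            if term in lowered:
                items.append({"kind": "glossary", "text": f"{term}: {meaning}"})
    return items


u22 = Listener("U22", {"hard_of_hearing", "inexperienced"})
aux = generate_auxiliary_info(u22, "Raise the throttle slowly.")
for item in aux:
    print(item["kind"], "->", item["text"])
```

Keeping the attribute checks independent lets a listener with several attributes receive several kinds of auxiliary information at once, as listener U22 does in the example.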

（出力制御部１３４）
出力制御部１３４は、生成部１３３によって生成された補助情報を聞き手の端末装置１０に出力するよう制御する。具体的には、出力制御部１３４は、生成部１３３によって補助情報が生成されると、生成部１３３によって生成された補助情報を聞き手の端末装置１０に送信する。図２に示す例では、出力制御部１３４は、生成部１３３によって画像Ｇ２１および画像Ｇ２２が生成されると、生成された画像Ｇ２１および画像Ｇ２２を聞き手Ｕ２２の端末装置１０－２２に送信する。 (Output control unit 134)
The output control unit 134 performs control so that the auxiliary information generated by the generation unit 133 is output to the listener's terminal device 10. Specifically, when the generation unit 133 generates the auxiliary information, the output control unit 134 transmits it to the listener's terminal device 10. In the example shown in FIG. 2, when the generation unit 133 generates the images G21 and G22, the output control unit 134 transmits them to the terminal device 10-22 of the listener U22.

〔４．変形例〕
上述した実施形態に係る情報処理システム１は、上記実施形態以外にも種々の異なる形態にて実施されてよい。そこで、以下では、情報処理システム１の他の実施形態について説明する。なお、実施形態と同一部分には、同一符号を付して説明を省略する。 [4. Modification]
The information processing system 1 according to the embodiment described above may be implemented in various forms other than that embodiment. Other embodiments of the information processing system 1 are therefore described below. Parts identical to those of the embodiment are given the same reference numerals, and their description is omitted.

〔４－１．第１の変形例〕
まず、図６を用いて、第１の変形例に係る情報処理について説明する。図６は、第１の変形例に係る情報処理の一例を示す図である。図６では、図２と同様に、遠隔コミュニケーションを用いて、伝え手Ｕ１１が聞き手Ｕ２１およびＵ２２に対して、聞き手が操作する装置Ａに対する操作内容を伝達している。ここで、図６に示す例では、情報処理装置１００が、聞き手Ｕ２２が所在する現場の音声情報に基づいて、装置Ａに対する操作内容を補う補助情報を聞き手Ｕ２２の端末装置１０－２２に出力する点が図２と異なる。 [4-1. First modification]
First, information processing according to the first modification is described with reference to FIG. 6. FIG. 6 is a diagram illustrating an example of information processing according to the first modification. In FIG. 6, as in FIG. 2, the transmitter U11 uses remote communication to convey to the listeners U21 and U22 the operation content for device A, which the listeners operate. The example shown in FIG. 6 differs from FIG. 2 in that the information processing device 100 outputs auxiliary information supplementing the operation content for device A to the terminal device 10-22 of the listener U22 based on voice information from the site where the listener U22 is located.

図示は省略するが、まず、聞き手Ｕ２２が、「読めない漢字があるので、ひらがなにしてほしい。あと、文字をもっと大きくしてほしい。」と発言したとする。このとき、聞き手Ｕ２２が装着している端末装置１０－２２に搭載されたマイクは、聞き手Ｕ２２の発言の音声を検出する。続いて、端末装置１０－２２は、聞き手Ｕ２２の発言の音声を検出すると、検出した音声データを情報処理装置１００に送信する。 Although not illustrated, suppose first that the listener U22 says, "There are kanji I can't read, so I want you to write them in hiragana. Also, I want you to make the characters larger." The microphone mounted on the terminal device 10-22 worn by the listener U22 detects this utterance. On detecting the utterance, the terminal device 10-22 transmits the detected voice data to the information processing device 100.

取得部１３１は、端末装置１０－２２から聞き手Ｕ２２の音声データを取得する。判定部１３２は、公知の音声認識技術を用いて、取得部１３１によって取得された聞き手Ｕ２２の音声データを文字列である「読めない漢字があるので、ひらがなにしてほしい。あと、文字をもっと大きくしてほしい。」に変換する。 The acquisition unit 131 acquires the voice data of the listener U22 from the terminal device 10-22. Using a known speech recognition technique, the determination unit 132 converts the voice data of the listener U22 acquired by the acquisition unit 131 into the character string "There are kanji I can't read, so I want you to write them in hiragana. Also, I want you to make the characters larger."

続いて、判定部１３２は、変換した文字列を形態素解析して、単語に分解する。続いて、判定部１３２は、上述した単語リストに含まれる単語が分解された単語の中に含まれるか否かを判定する。図６に示す例では、判定部１３２は、分解された単語の中に単語リストに含まれる単語である「読めない」と「してほしい」が含まれると判定する。 Subsequently, the determination unit 132 morphologically analyzes the converted character string and breaks it down into words. Subsequently, the determination unit 132 determines whether or not the words included in the word list described above are included in the decomposed words. In the example shown in FIG. 6, the determining unit 132 determines that the decomposed words include the words "I can't read" and "I want you to do" that are included in the word list.

続いて、判定部１３２は、分解された単語の中に単語リストに含まれる単語である「読めない」と「してほしい」が含まれると判定したので、聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定する。 Because the determination unit 132 has determined that the decomposed words include "I can't read" and "I want you to do", both of which are in the word list, it determines that the listener U22 is having difficulty with the operation related to the operation content.

図６に示す例では、生成部１３３は、判定部１３２によって聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定された場合、聞き手Ｕ２２の音声データを変換した文字列に含まれる「ひらがなにしてほしい」に基づいて、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容をひらがなで表記した字幕を含む画像Ｇ３１を生成する。 In the example shown in FIG. 6, when the determination unit 132 determines that the listener U22 is having difficulty with the operation related to the operation content, the generation unit 133 generates, as an example of the auxiliary information, an image G31 containing captions that render the operation content conveyed by the transmitter U11 in hiragana, based on the phrase "I want you to write them in hiragana" contained in the character string converted from the voice data of the listener U22.

具体的には、生成部１３３は、聞き手Ｕ２２の音声データを変換した文字列に含まれる「ひらがなにしてほしい」の意味を「ひらがなに変換してほしい」であると解析する。続いて、生成部１３３は、聞き手Ｕ２２の音声データを変換した文字列を意味解析すると、記憶部１２０に記憶された処理リストを参照して、意味解析した結果に対応する処理を実行する。より具体的には、記憶部１２０は、「〇〇してほしい」という文字列と「〇〇」に対応する処理とが対応付けられた処理リストを記憶する。例えば、記憶部１２０は、「ひらがなに変換してほしい」という文字列と、ひらがなに変換する処理とが対応付けられた処理リストを記憶する。生成部１３３は、意味解析した結果である「ひらがなに変換してほしい」という文字列に対応する処理として、伝え手Ｕ１１の音声データをひらがなに変換する処理を実行する。例えば、生成部１３３は、公知の音声認識技術を用いて、伝え手Ｕ１１の音声データを文字列に変換する。続いて、生成部１３３は、変換した文字列をひらがなに変換する。続いて、生成部１３３は、伝え手Ｕ１１によって伝達される操作内容をひらがなで表記した字幕を含む画像Ｇ３１を生成する。 Specifically, the generation unit 133 interprets the phrase "I want you to write them in hiragana", contained in the character string converted from the voice data of the listener U22, as meaning "I want you to convert to hiragana". Having semantically analyzed the character string, the generation unit 133 refers to the process list stored in the storage unit 120 and executes the process corresponding to the result of the semantic analysis. More specifically, the storage unit 120 stores a process list in which a character string of the form "I want you to do XX" is associated with the process corresponding to "XX". For example, the storage unit 120 stores a process list in which the character string "I want you to convert to hiragana" is associated with a process for converting to hiragana. As the process corresponding to the character string "I want you to convert to hiragana" obtained by the semantic analysis, the generation unit 133 executes a process of converting the voice data of the transmitter U11 into hiragana. For example, the generation unit 133 converts the voice data of the transmitter U11 into a character string using a known speech recognition technique, converts that character string into hiragana, and then generates the image G31 containing captions that render the operation content conveyed by the transmitter U11 in hiragana.

また、生成部１３３は、判定部１３２によって聞き手Ｕ２２が操作内容に関する操作に困難をきたしていると判定された場合、聞き手Ｕ２２の音声データを変換した文字列に含まれる「文字をもっと大きくしてほしい」に基づいて、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容を示す文字の大きさが拡大された字幕を含む画像Ｇ３１を生成する。 In addition, when the determination unit 132 determines that the listener U22 is having difficulty with the operation related to the operation content, the generation unit 133 generates, as an example of the auxiliary information, an image G31 containing captions in which the characters showing the operation content conveyed by the transmitter U11 are enlarged, based on the phrase "I want you to make the characters larger" contained in the character string converted from the voice data of the listener U22.

具体的には、生成部１３３は、聞き手Ｕ２２の音声データを変換した文字列に含まれる「文字をもっと大きくしてほしい」の意味を「文字のサイズを拡大してほしい」であると解析する。生成部１３３は、聞き手Ｕ２２の音声データを変換した文字列を意味解析すると、記憶部１２０に記憶された処理リストを参照して、意味解析した結果に基づく処理を実行する。例えば、記憶部１２０は、「文字のサイズを拡大してほしい」という文字列と、文字のサイズを拡大する処理とが対応付けられた処理リストを記憶する。生成部１３３は、意味解析した結果である「文字のサイズを拡大してほしい」という文字列に対応する処理として、伝え手Ｕ１１によって伝達される操作内容を示す文字のサイズを拡大する処理を実行する。例えば、生成部１３３は、伝え手Ｕ１１の音声データを変換した文字列に基づいて、伝え手Ｕ１１によって伝達される操作内容を示す文字のサイズを拡大した字幕を含む画像Ｇ３１を生成する。 Specifically, the generation unit 133 interprets the phrase "I want you to make the characters larger", contained in the character string converted from the voice data of the listener U22, as meaning "I want you to enlarge the character size". Having semantically analyzed the character string, the generation unit 133 refers to the process list stored in the storage unit 120 and executes the process based on the result of the semantic analysis. For example, the storage unit 120 stores a process list in which the character string "I want you to enlarge the character size" is associated with a process for enlarging the character size. As the process corresponding to the character string "I want you to enlarge the character size" obtained by the semantic analysis, the generation unit 133 executes a process of enlarging the characters showing the operation content conveyed by the transmitter U11. For example, based on the character string converted from the voice data of the transmitter U11, the generation unit 133 generates the image G31 containing captions in which the characters showing the operation content conveyed by the transmitter U11 are enlarged.
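The process list of the first modification, which associates a normalized request string with a process, can be sketched as a lookup table of functions. This is an illustrative sketch under stated assumptions: the names `Caption`, `PROCESS_LIST`, and `apply_requests` are hypothetical, the 1.5x enlargement factor is arbitrary, and the hiragana handler only records the requested script, because actual kanji-to-hiragana conversion needs a reading dictionary that is outside the scope of the sketch. The semantic analysis that yields the normalized strings is assumed to have been done already.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Caption:
    text: str
    script: str = "mixed"     # "mixed" (kanji and kana) or "hiragana"
    font_scale: float = 1.0   # 1.0 is the default caption size


def to_hiragana(caption):
    # Placeholder: marks the caption for hiragana rendering only.
    return replace(caption, script="hiragana")


def enlarge(caption):
    # Assumed enlargement factor; the description does not specify one.
    return replace(caption, font_scale=caption.font_scale * 1.5)


# Process list: normalized request string -> process to execute.
PROCESS_LIST = {
    "ひらがなに変換してほしい": to_hiragana,
    "文字のサイズを拡大してほしい": enlarge,
}


def apply_requests(caption, normalized_requests):
    """Apply every process whose request string is in the process list."""
    for request in normalized_requests:
        handler = PROCESS_LIST.get(request)
        if handler is not None:   # requests without an entry are ignored
            caption = handler(caption)
    return caption


requests_u22 = ["ひらがなに変換してほしい", "文字のサイズを拡大してほしい"]
caption_g31 = apply_requests(Caption("装置Ａの電源を入れる"), requests_u22)
print(caption_g31)
```

Because both requests are applied to the same caption, the result corresponds to the single image G31 that is both written in hiragana and enlarged.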

出力制御部１３４は、生成部１３３によって画像Ｇ３１が生成されると、生成された画像Ｇ３１を聞き手Ｕ２２の端末装置１０－２２に送信する。 When the image G31 is generated by the generation unit 133, the output control unit 134 transmits the generated image G31 to the terminal device 10-22 of the listener U22.

図６では、生成部１３３が、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容をひらがなで表記した文字情報を生成する例について説明したが、ひらがな以外で表記した文字情報を生成してもよい。例えば、生成部１３３は、補助情報の一例として、カタカナ、漢字またはローマ字で表記された文字情報を生成してもよい。 FIG. 6 describes an example in which the generation unit 133 generates, as an example of the auxiliary information, character information that renders the operation content conveyed by the transmitter U11 in hiragana, but the generation unit 133 may instead generate character information in a script other than hiragana. For example, the generation unit 133 may generate character information written in katakana, kanji, or romaji as an example of the auxiliary information.

なお、生成部１３３は、補助情報の一例として、日本語以外の他の言語に翻訳された文字情報を生成してもよい。例えば、生成部１３３は、聞き手の母国語が英語であるという属性情報に基づいて、伝え手によって伝達される操作内容を英語に翻訳した字幕を生成してもよい。 Note that the generation unit 133 may generate character information translated into a language other than Japanese as an example of the auxiliary information. For example, the generation unit 133 may generate subtitles by translating the details of the operation transmitted by the sender into English based on the attribute information indicating that the native language of the listener is English.

また、図６では、生成部１３３が、補助情報の一例として、伝え手Ｕ１１によって伝達される操作内容を示す文字の大きさが拡大された文字情報を生成する例について説明したが、これに限られない。例えば、生成部１３３は、補助情報の一例として、文字の大きさが縮小された文字情報を生成してもよい。 In addition, FIG. 6 describes an example in which the generation unit 133 generates, as an example of the auxiliary information, character information in which the characters showing the operation content conveyed by the transmitter U11 are enlarged, but the auxiliary information is not limited to this. For example, the generation unit 133 may generate character information in which the character size is reduced as an example of the auxiliary information.

〔４－２．第２の変形例〕
次に、図７を用いて、第２の変形例に係る情報処理について説明する。図７は、第２の変形例に係る情報処理の一例を示す図である。図７では、遠隔コミュニケーションを用いて、伝え手Ｕ１１が、同じ現場に所在する聞き手Ｕ２３およびＵ２４に対して、聞き手が操作する装置Ｏ１に対する操作内容を伝達している。 [4-2. Second modification]
Next, information processing according to the second modification is described with reference to FIG. 7. FIG. 7 is a diagram illustrating an example of information processing according to the second modification. In FIG. 7, the transmitter U11 uses remote communication to convey to the listeners U23 and U24, who are located at the same site, the operation content for the device O1, which the listeners operate.

取得部１３１は、聞き手Ｕ２１およびＵ２２が所在する現場の画像情報を取得する。例えば、取得部１３１は、聞き手Ｕ２１およびＵ２２が所在する現場に設置されたカメラＣ１から聞き手Ｕ２１およびＵ２２が所在する現場の画像情報を取得する。 Acquisition unit 131 acquires image information of the site where listeners U21 and U22 are located. For example, the acquiring unit 131 acquires image information of the site where the listeners U21 and U22 are located from the camera C1 installed at the site where the listeners U21 and U22 are located.

判定部１３２は、聞き手Ｕ２１およびＵ２２が所在する現場の画像情報に基づいて、伝え手から聞き手に対して操作内容を伝達するタイミングを判定する。具体的には、判定部１３２は、取得部１３１によって取得された現場の画像情報に基づいて、聞き手Ｕ２１およびＵ２２が操作内容を確認している最中であるか否かを判定する。 The determination unit 132 determines the timing of transmitting the operation content from the transmitter to the listeners based on the image information of the site where the listeners U21 and U22 are located. Specifically, the determination unit 132 determines whether or not the listeners U21 and U22 are checking the operation content based on the image information of the site acquired by the acquisition unit 131 .

例えば、判定部１３２は、公知の物体認識技術を用いて、現場の画像情報に含まれる装置Ｏ１の領域を特定する。また、判定部１３２は、公知の姿勢推定技術を用いて、現場の画像情報に含まれる聞き手Ｕ２１およびＵ２２の人物領域を特定する。続いて、判定部１３２は、特定された聞き手Ｕ２１およびＵ２２のうち少なくともいずれか一方が装置Ｏ１に接触しているか否かを判定する。判定部１３２は、聞き手Ｕ２１およびＵ２２のうち少なくともいずれか一方が装置Ｏ１に接触していると判定した場合、聞き手Ｕ２１およびＵ２２が操作内容を確認している最中であると判定する。続いて、判定部１３２は、聞き手Ｕ２１およびＵ２２が操作内容を確認している最中であると判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングではないと判定する。 For example, the determination unit 132 identifies the area of the device O1 included in the image information of the site using a known object recognition technique. The determination unit 132 also uses a known posture estimation technique to identify the person areas of the listeners U21 and U22 included in the image information of the scene. Next, determination unit 132 determines whether or not at least one of identified listeners U21 and U22 is in contact with device O1. When determining that at least one of the listeners U21 and U22 is in contact with the device O1, the determination unit 132 determines that the listeners U21 and U22 are confirming the operation details. Subsequently, when the determination unit 132 determines that the listeners U21 and U22 are in the process of confirming the operation details, it determines that it is not the time for the transmitter to transmit the operation details to the listeners.

生成部１３３は、判定部１３２によって伝え手から聞き手に対して操作内容を伝達するタイミングではないと判定された場合、次の操作内容を伝達するのを少し待つよう促す画像を生成する。出力制御部１３４は、生成部１３３によって次の操作内容を伝達するのを少し待つよう促す画像が生成されると、生成された画像を伝え手の端末装置１０に表示するよう制御する。 When the determination unit 132 determines that it is not the time for the transmitter to convey operation content to the listeners, the generation unit 133 generates an image prompting the transmitter to wait a little before conveying the next operation content. When the generation unit 133 generates that image, the output control unit 134 performs control so that the image is displayed on the transmitter's terminal device 10.

一方、判定部１３２は、聞き手Ｕ２１およびＵ２２のいずれも装置Ｏ１に接触していないと判定した場合、聞き手Ｕ２１およびＵ２２が操作内容を確認している最中ではないと判定する。続いて、判定部１３２は、聞き手Ｕ２１およびＵ２２が操作内容を確認している最中ではないと判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングであると判定する。 On the other hand, when determining that neither of the listeners U21 and U22 is touching the device O1, the determination unit 132 determines that the listeners U21 and U22 are not currently confirming the operation content. Subsequently, when the determination unit 132 determines that the listeners U21 and U22 are not in the process of confirming the operation details, it determines that it is time for the transmitter to transmit the operation details to the listeners.

生成部１３３は、判定部１３２によって伝え手から聞き手に対して操作内容を伝達するタイミングであると判定された場合、次の操作内容を伝達するのを促す画像を生成してもよい。出力制御部１３４は、生成部１３３によって次の操作内容を伝達するのを促す画像が生成されると、生成された画像を伝え手の端末装置１０に表示するよう制御してもよい。このようにして、出力制御部１３４は、判定部１３２によって判定されたタイミングに関する情報を伝え手の端末装置１０に出力するよう制御する。 When the determination unit 132 determines that it is time to transmit the operation content from the transmitter to the listener, the generation unit 133 may generate an image prompting transmission of the next operation content. When the generation unit 133 generates an image prompting the transmission of the next operation content, the output control unit 134 may control to display the generated image on the terminal device 10 of the transmitter. In this way, the output control unit 134 controls to output the information about the timing determined by the determination unit 132 to the terminal device 10 of the sender.
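The image-based timing decision of the second modification can be sketched as follows. The description does not say how contact between a listener and the device O1 is judged, so this sketch assumes, purely for illustration, that object recognition and pose estimation each yield a 2D bounding box and that contact is approximated by box overlap. The names `Box`, `boxes_touch`, `is_time_to_transmit`, and `prompt_for_transmitter`, and all coordinates, are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def boxes_touch(a, b):
    """True if two axis-aligned boxes overlap or share an edge."""
    return (a.left <= b.right and b.left <= a.right
            and a.top <= b.bottom and b.top <= a.bottom)


def is_time_to_transmit(device_box, person_boxes):
    """It is time to transmit only if no listener is touching the device."""
    return not any(boxes_touch(device_box, person) for person in person_boxes)


def prompt_for_transmitter(device_box, person_boxes):
    """Choose which prompt to show on the transmitter's terminal."""
    if is_time_to_transmit(device_box, person_boxes):
        return "Please convey the next operation."
    return "Please wait a moment before conveying the next operation."


device_o1 = Box(100, 100, 200, 200)
listener_near = Box(180, 120, 260, 300)   # overlaps the device region
listener_far = Box(300, 100, 360, 300)    # does not overlap

print(prompt_for_transmitter(device_o1, [listener_near, listener_far]))
print(prompt_for_transmitter(device_o1, [listener_far]))
```

Overlap of 2D boxes is a coarse proxy for contact; a listener standing in front of the device without touching it would also count, so a practical system would likely need hand keypoints or depth information.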

図７では、判定部１３２が、聞き手Ｕ２１およびＵ２２が所在する現場の画像情報に基づいて、伝え手から聞き手に対して操作内容を伝達するタイミングを判定する例について説明したが、聞き手の情報は画像情報でなくてもよい。具体的には、判定部１３２は、聞き手が所在する現場の音声情報に基づいて、伝え手から聞き手に対して操作内容を伝達するタイミングを判定してもよい。判定部１３２は、現場の音声情報に基づいて、聞き手Ｕ２１およびＵ２２による発話がない状態が所定の時間以上であるか否かを判定する。判定部１３２は、例えば、聞き手Ｕ２１およびＵ２２による発話がない状態が所定の時間以上であると判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングであると判定してもよい。あるいは、判定部１３２は、現場の音声情報に基づいて、聞き手Ｕ２１およびＵ２２による発話に「ＯＫです」といった文言が含まれるか否かを判定する。判定部１３２は、例えば、聞き手Ｕ２１およびＵ２２による発話に「ＯＫです」といった文言が含まれると判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングであると判定してもよい。 FIG. 7 describes an example in which the determination unit 132 determines the timing of transmitting the operation content from the transmitter to the listener based on the image information of the site where the listeners U21 and U22 are located. It does not have to be image information. Specifically, the determination unit 132 may determine the timing of transmitting the operation content from the transmitter to the listener based on the voice information of the site where the listener is located. The determination unit 132 determines whether or not the listeners U21 and U22 have not spoken for a predetermined period of time or longer, based on the on-site audio information. For example, if the determination unit 132 determines that the listeners U21 and U22 have not spoken for a predetermined period of time or longer, the determination unit 132 may determine that it is time for the transmitter to transmit the operation details to the listeners. Alternatively, the determination unit 132 determines whether or not the utterances by the listeners U21 and U22 include words such as "OK" based on the voice information at the site. For example, if it is determined that the utterances by the listeners U21 and U22 include the phrase "OK", the determination unit 132 may determine that it is time for the transmitter to transmit the operation details to the listener.

また、判定部１３２は、聞き手が操作する装置Ｏ１から発生するシグナル情報に基づいて、伝え手から聞き手に対して操作内容を伝達するタイミングを判定してもよい。例えば、判定部１３２は、聞き手が装置Ｏ１を操作している時のみ発生するシグナル情報を装置Ｏ１から取得することができる。続いて、判定部１３２は、取得したシグナル情報に基づいて、聞き手による操作が行われていない状態が所定の時間以上であるか否かを判定する。判定部１３２は、例えば、聞き手による操作が行われていない状態が所定の時間以上であると判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングであると判定してもよい。あるいは、判定部１３２は、取得したシグナル情報に基づいて、聞き手による操作が行われている状態が所定の時間以上であるか否かを判定する。判定部１３２は、例えば、聞き手による操作が行われている状態が所定の時間以上であると判定した場合、伝え手から聞き手に対して操作内容を伝達するタイミングではないと判定してもよい。 Further, the determination unit 132 may determine the timing of transmitting the operation content from the transmitter to the listener based on the signal information generated from the device O1 operated by the listener. For example, the determination unit 132 can acquire signal information from the device O1 that is generated only when the listener is operating the device O1. Next, based on the acquired signal information, the determination unit 132 determines whether or not the listener has not performed any operation for a predetermined time or longer. For example, if the determination unit 132 determines that the listener has not performed an operation for a predetermined period of time or longer, the determination unit 132 may determine that it is time for the transmitter to transmit the operation content to the listener. Alternatively, the determination unit 132 determines whether or not the state in which the listener is performing an operation is longer than or equal to a predetermined time, based on the acquired signal information. For example, if it is determined that the listener is performing an operation for a predetermined period of time or more, the determination unit 132 may determine that it is not the time to transmit the operation content from the transmitter to the listener.
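The signal-based variant can be sketched as an idle-time check. The sketch covers only the first of the two rules above (no operation for at least a predetermined time means it is time to transmit). It assumes that the signal information from the device O1 arrives as timestamps in seconds, one per operation event; the function names and the 5-second threshold are hypothetical.

```python
def idle_seconds(signal_timestamps, now):
    """Seconds since the most recent operation signal from the device."""
    if not signal_timestamps:
        # Assumption: no signal so far is treated as 'never operated'.
        return float("inf")
    return now - max(signal_timestamps)


def is_time_to_transmit_from_signals(signal_timestamps, now, idle_threshold=5.0):
    """True if no operation has occurred for at least idle_threshold seconds."""
    return idle_seconds(signal_timestamps, now) >= idle_threshold


signals = [10.0, 12.5]   # operation signals observed at t = 10.0 s and 12.5 s

print(is_time_to_transmit_from_signals(signals, now=14.0))
print(is_time_to_transmit_from_signals(signals, now=20.0))
```

The same structure applies to the voice-based variant if the timestamps are taken from detected utterances instead of device signals.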

〔４－３．第３の変形例〕
次に、図８を用いて、第３の変形例に係る情報処理について説明する。図８は、第３の変形例に係る情報処理の一例を示す図である。図８では、遠隔コミュニケーションを用いて、伝え手Ｕ１１が、聞き手Ｕ２５に対して、聞き手Ｕ２５が操作するドローンのコントローラＯ２５に対する操作内容を伝達している。 [4-3. Third modification]
Next, information processing according to the third modification is described with reference to FIG. 8. FIG. 8 is a diagram illustrating an example of information processing according to the third modification. In FIG. 8, the transmitter U11 uses remote communication to convey to the listener U25 the operation content for the controller O25 of the drone operated by the listener U25.

図８では、取得部１３１が、聞き手Ｕ２５によるコントローラＯ２５に対する操作を撮像した操作動画を取得する。例えば、取得部１３１は、眼鏡型のヘッドマウントディスプレイである聞き手Ｕ２５の端末装置１０－２５に搭載されたカメラＣ２５から、聞き手Ｕ２５によるコントローラＯ２５に対する操作を撮像した操作動画を取得する。出力制御部１３４は、取得部１３１によって取得された操作動画を伝え手Ｕ１１の端末装置１０の画面に表示するよう制御する。これにより、伝え手側は、取得部１３１によって取得された操作動画を繰り返し再生することができるため、聞き手側の評価をよりしやすくなる。 In FIG. 8, the acquisition unit 131 acquires an operation video capturing the operation of the controller O25 by the listener U25. For example, the acquisition unit 131 acquires the operation video from the camera C25 mounted on the terminal device 10-25 of the listener U25, which is a glasses-type head-mounted display. The output control unit 134 performs control so that the operation video acquired by the acquisition unit 131 is displayed on the screen of the terminal device 10 of the transmitter U11. Because the transmitter can then replay the acquired operation video repeatedly, it becomes easier for the transmitter to evaluate the listener.

また、取得部１３１は、聞き手Ｕ２５の操作に対する伝え手Ｕ１１の評価を示す評価情報を伝え手Ｕ１１の端末装置１０から取得する。出力制御部１３４は、取得部１３１によって取得された評価情報を聞き手Ｕ２５の端末装置１０－２５に出力するよう制御する。 In addition, the acquisition unit 131 acquires evaluation information indicating the evaluation of the communicator U11 with respect to the operation of the listener U25 from the terminal device 10 of the communicator U11. The output control unit 134 controls to output the evaluation information acquired by the acquisition unit 131 to the terminal device 10-25 of the listener U25.

〔４－４．第４の変形例〕
次に、図９を用いて、第４の変形例に係る情報処理について説明する。図９は、第４の変形例に係る情報処理の一例を示す図である。図９では、図８と同様に、遠隔コミュニケーションを用いて、伝え手Ｕ１１が、聞き手Ｕ２６に対して、聞き手Ｕ２６が操作するドローンのコントローラＯ２６に対する操作内容を伝達している。 [4-4. Fourth modification]
Next, information processing according to the fourth modification is described with reference to FIG. 9. FIG. 9 is a diagram illustrating an example of information processing according to the fourth modification. In FIG. 9, as in FIG. 8, the transmitter U11 uses remote communication to convey to the listener U26 the operation content for the controller O26 of the drone operated by the listener U26.

図９では、生成部１３３が、聞き手の情報の一例として、聞き手Ｕ２６がドローンを操作するのに用いられるコントローラＯ２６の信号情報に基づいて、補助情報である見本動画Ｇ４２を生成する。具体的には、取得部１３１は、聞き手Ｕ２６のコントローラＯ２６からコントローラＯ２６の信号情報を取得する。続いて、判定部１３２は、取得部１３１によって取得された信号情報に基づいて、聞き手Ｕ２６が操作内容に関する操作に困難をきたしているか否かを判定する。例えば、判定部１３２は、取得部１３１によって取得された信号情報に基づいて、聞き手Ｕ２６が現在行っている操作を判定する。続いて、判定部１３２は、あらかじめ用意された操作手順に基づいて、聞き手Ｕ２６が現在行っている操作が操作手順と比べてどの程度遅れているかを判定する。例えば、判定部１３２は、聞き手Ｕ２６が現在行っている操作が操作手順と比べて所定の閾値を超えて遅れていると判定した場合には、聞き手Ｕ２６が操作内容に関する操作に困難をきたしていると判定する。生成部１３３は、判定部１３２によって聞き手Ｕ２６が操作内容に関する操作に困難をきたしていると判定された場合、ドローンに対する見本操作を撮像した見本動画Ｇ４２を生成する。出力制御部１３４は、生成部１３３によって生成された見本動画Ｇ４２を聞き手Ｕ２６の端末装置１０－２６に表示するよう制御する。例えば、出力制御部１３４は、聞き手Ｕ２６による実際の操作を撮像した動画Ｇ４１の横に同じ大きさの見本動画Ｇ４２が表示されるように制御する。 In FIG. 9, the generation unit 133 generates a sample moving image G42 as auxiliary information based on signal information of the controller O26 that the listener U26 uses to operate the drone, the signal information being an example of the listener information. Specifically, the acquisition unit 131 acquires the signal information of the controller O26 from the controller O26 of the listener U26. The determination unit 132 then determines, based on the acquired signal information, whether the listener U26 is having difficulty with the operation related to the operation content. For example, the determination unit 132 identifies the operation the listener U26 is currently performing from the acquired signal information, and then determines, against an operation procedure prepared in advance, how far that operation lags behind the procedure. If the determination unit 132 determines that the current operation lags behind the procedure by more than a predetermined threshold, it determines that the listener U26 is having difficulty with the operation related to the operation content.
When the determination unit 132 determines that the listener U26 is having difficulty in performing an operation related to the operation content, the generation unit 133 generates a sample moving image G42 in which a sample operation for the drone is captured. The output control unit 134 controls to display the sample moving image G42 generated by the generation unit 133 on the terminal device 10-26 of the listener U26. For example, the output control unit 134 performs control so that a sample moving image G42 of the same size is displayed next to the moving image G41 in which the actual operation by the listener U26 is captured.
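The delay check described for the fourth modification can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the patent: the `ProcedureStep` type, the representation of controller signal information as a list of recognized operation names, and the concrete timings and threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcedureStep:
    """One step of the operation procedure prepared in advance (hypothetical type)."""
    name: str
    expected_done_at: float  # seconds from the start by which the step should be finished


def current_step_index(observed_ops: List[str], procedure: List[ProcedureStep]) -> int:
    """Return the index of the first procedure step not yet seen in the controller signals."""
    idx = 0
    for op in observed_ops:
        if idx < len(procedure) and op == procedure[idx].name:
            idx += 1
    return idx


def delay_behind_procedure(observed_ops: List[str], procedure: List[ProcedureStep],
                           elapsed: float) -> float:
    """Return how many seconds the listener is behind the procedure (0.0 if on time)."""
    idx = current_step_index(observed_ops, procedure)
    if idx >= len(procedure):
        return 0.0  # every step has already been completed
    return max(0.0, elapsed - procedure[idx].expected_done_at)


def is_having_difficulty(observed_ops: List[str], procedure: List[ProcedureStep],
                         elapsed: float, threshold: float) -> bool:
    """True when the delay exceeds the predetermined threshold."""
    return delay_behind_procedure(observed_ops, procedure, elapsed) > threshold


# Assumed example procedure for a drone; the step names and timings are illustrative only.
procedure = [
    ProcedureStep("power_on", 5.0),
    ProcedureStep("takeoff", 15.0),
    ProcedureStep("hover", 30.0),
]
# Only "power_on" has been observed after 40 s, so "takeoff" is 25 s overdue.
needs_sample_video = is_having_difficulty(["power_on"], procedure, elapsed=40.0, threshold=10.0)
```

In this sketch a `True` result would be the trigger for generating the sample moving image; how the current operation is actually recognized from raw controller signals is left open here, as it is in the text.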

[4-5. Other Modifications]
The generation unit 133 may generate auxiliary information based on image information of the site where the listener is located, as an example of listener information. For example, based on the image information of the site where the listener is located, the generation unit 133 generates, as an example of the auxiliary information, a part image indicating the part of the operation target to be operated, or a direction image indicating the direction in which the operation target is to be operated.

Further, based on the image information of the site where the listener is located, the generation unit 133 may generate, as an example of the auxiliary information, an enlarged image in which a predetermined area of the operation target is enlarged. For example, the generation unit 133 generates an enlarged image in which a precision area of the operation target is enlarged.
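As a rough illustration of generating an enlarged image of a predetermined area, the sketch below crops a rectangle from a frame and scales it up by pixel repetition (nearest-neighbor enlargement). The text does not say how the area is chosen or which scaling method is used, so the function name `enlarge_region`, its parameters, and the nested-list image representation are assumptions made for this example.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; a stand-in for a real image type


def enlarge_region(image: Image, top: int, left: int,
                   height: int, width: int, scale: int) -> Image:
    """Crop the given rectangle and enlarge it by an integer factor."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    # Cut out the predetermined area of the operation target.
    crop = [row[left:left + width] for row in image[top:top + height]]
    enlarged: Image = []
    for row in crop:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide_row = [pixel for pixel in row for _ in range(scale)]
        enlarged.extend(list(wide_row) for _ in range(scale))
    return enlarged


frame = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Enlarge the 2x2 area in the lower-right corner by a factor of 2.
zoomed = enlarge_region(frame, top=1, left=1, height=2, width=2, scale=2)
```

A production system would more likely rely on an image library with interpolation; pixel repetition is used here only because it keeps the example self-contained.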

[5. Effects]
As described above, the information processing apparatus 100 according to the embodiment includes the generation unit 133 and the output control unit 134. Based on information about a listener who is a participant in remote communication, the generation unit 133 generates auxiliary information that supplements operation content, the operation content being conveyed from a transmitter who is a participant in the remote communication to the listener and relating to an operation target operated by the listener. The output control unit 134 performs control to output the auxiliary information generated by the generation unit 133 to the terminal device 10 of the listener. In addition, the generation unit 133 generates the auxiliary information based on, as the information about the listener, attribute information relating to the listener's attributes, audio information of the site where the listener is located, image information of the site where the listener is located, or signal information of a control device that the listener uses to operate the operation target.

As a result, the information processing apparatus 100 can provide auxiliary information suited to the listener in addition to the operation content conveyed by the transmitter, and can thereby assist the conveyance of complicated operation content in remote communication. The present invention can therefore support the conveyance of complicated operation content in remote communication.

In addition, the generation unit 133 generates, as auxiliary information, a part image indicating the part of the operation target to be operated, or a direction image indicating the direction in which the operation target is to be operated. The output control unit 134 displays at least one of the part image and the direction image on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can provide, in addition to the operation content conveyed by the transmitter, a part image indicating the part of the operation target to be operated or a direction image indicating the direction of operation, and can thereby assist the conveyance of complicated operation content in remote communication.

In addition, the generation unit 133 generates, as auxiliary information, an enlarged image in which a predetermined area of the operation target is enlarged. The output control unit 134 displays the enlarged image on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can, for example, enlarge and display a precision area, and can thereby assist the conveyance of operation content concerning the precision area in remote communication.

In addition, the generation unit 133 generates, as auxiliary information, character information indicating the operation content or character information explaining the meaning of terms related to the operation content. The output control unit 134 displays the character information on the screen of the terminal device 10 of the listener.

As a result, even when, for example, a hearing-impaired listener has difficulty hearing the transmitter's voice, the information processing apparatus 100 enables the listener to perceive the operation content visually through the character information serving as auxiliary information. The information processing apparatus 100 can therefore assist the conveyance of operation content to, for example, a hearing-impaired listener. Likewise, even when technical terms unfamiliar to an inexperienced listener appear frequently, the information processing apparatus 100 enables the listener to check the meaning of each term as it appears through the character information serving as auxiliary information. The information processing apparatus 100 can therefore help, for example, an inexperienced listener understand the operation content.
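One simple way to produce character information that explains terms is a glossary lookup over the transcribed instruction. The sketch below assumes that the transmitter's speech has already been converted to text and that a glossary is available; the function `explain_terms` and the glossary entries are hypothetical and are not taken from the patent.

```python
from typing import Dict, List


def explain_terms(utterance: str, glossary: Dict[str, str]) -> List[str]:
    """Return one caption line per glossary term that appears in the utterance."""
    lowered = utterance.lower()
    return [
        f"{term}: {meaning}"
        for term, meaning in glossary.items()
        if term.lower() in lowered  # plain substring match, kept simple for the sketch
    ]


# Illustrative glossary for drone operation (assumed content).
glossary = {
    "yaw": "rotation of the drone about its vertical axis",
    "throttle": "the control that raises or lowers the drone",
}
captions = explain_terms("Push the throttle up gently, then adjust the yaw.", glossary)
```

Substring matching will also fire on longer words that merely contain a term, so a real system would probably need word-level or morphological matching, especially for Japanese text.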

In addition, the generation unit 133 generates, as auxiliary information, character information in which the character size is enlarged or reduced. The output control unit 134 displays the character information with the enlarged or reduced character size on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can, for example, enlarge the characters to help a listener with poor eyesight visually perceive the operation content conveyed by the transmitter. The information processing apparatus 100 can also, for example, reduce the characters so that the operation content conveyed by the transmitter is displayed on the screen without obscuring the listener's hands or the operation target. The information processing apparatus 100 can therefore help the listener understand the operation content visually.

In addition, the generation unit 133 generates, as auxiliary information, character information written in hiragana, katakana, kanji, or romaji. The output control unit 134 displays the character information written in hiragana, katakana, kanji, or romaji on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can help the listener understand the operation content conveyed by the transmitter by presenting Japanese in a form of notation suited to the listener.
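Part of this notation switching can be illustrated with the Unicode layout of kana: the hiragana block (U+3041 to U+3096) and the corresponding katakana characters (U+30A1 to U+30F6) are offset by 0x60, so the two scripts can be converted by shifting code points. The sketch below covers only hiragana and katakana; conversion to or from kanji and romaji needs dictionaries or a morphological analyzer and is not shown. The function names and the `script` parameter are assumptions for this example.

```python
KANA_OFFSET = 0x60  # distance between matching hiragana and katakana code points


def to_hiragana(text: str) -> str:
    """Convert katakana characters to hiragana; leave everything else unchanged."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def to_katakana(text: str) -> str:
    """Convert hiragana characters to katakana; leave everything else unchanged."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def render_for_listener(text: str, script: str) -> str:
    """Pick a notation for the listener; unsupported scripts return the text as-is."""
    if script == "hiragana":
        return to_hiragana(text)
    if script == "katakana":
        return to_katakana(text)
    return text


caption = render_for_listener("ドローン", "hiragana")
```

The long-vowel mark "ー" lies outside the converted range, so it passes through unchanged in both directions.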

In addition, the generation unit 133 generates, as auxiliary information, character information translated into a language other than Japanese. The output control unit 134 displays the character information translated into the other language on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can help the listener understand the operation content conveyed by the transmitter by presenting it in a language suited to the listener.

The information processing apparatus 100 further includes the determination unit 132. Based on the information about the listener, the determination unit 132 determines whether the listener is having difficulty performing an operation related to the operation content. The generation unit 133 generates the auxiliary information when the determination unit 132 determines that the listener is having difficulty with the operation.

As a result, the information processing apparatus 100 can generate auxiliary information when the listener is in trouble, and can thereby help the listener understand the operation content conveyed by the transmitter.
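The relationship between the determination, generation, and output control units can be sketched as a small pipeline in which generation is gated by the determination result. The `ControlUnit` class, its callable parameters, and the dictionary used for listener information are hypothetical names chosen for this illustration; the patent defines the units functionally and does not prescribe this structure.

```python
from typing import Callable, Optional


class ControlUnit:
    """Wires determination, generation, and output control together (illustrative only)."""

    def __init__(self,
                 determine: Callable[[dict], bool],
                 generate: Callable[[dict], str],
                 output: Callable[[str], None]) -> None:
        self.determine = determine  # stands in for the determination unit
        self.generate = generate    # stands in for the generation unit
        self.output = output        # stands in for the output control unit

    def process(self, listener_info: dict) -> Optional[str]:
        """Generate and output auxiliary information only if the listener is struggling."""
        if not self.determine(listener_info):
            return None
        auxiliary = self.generate(listener_info)
        self.output(auxiliary)
        return auxiliary


sent_to_listener = []  # stands in for the listener's terminal device
unit = ControlUnit(
    determine=lambda info: info["delay_seconds"] > 10,  # assumed difficulty criterion
    generate=lambda info: "sample video for step " + info["step"],
    output=sent_to_listener.append,
)
first = unit.process({"delay_seconds": 3, "step": "takeoff"})
second = unit.process({"delay_seconds": 25, "step": "takeoff"})
```

Passing the three roles in as callables mirrors the statement elsewhere in the description that the components are functional concepts and may be distributed or integrated in arbitrary units.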

In addition, based on the information about the listener, the determination unit 132 determines the timing at which the transmitter should convey the operation content to the listener. The output control unit 134 performs control to output information about the timing determined by the determination unit 132 to the terminal device 10 of the transmitter.

As a result, the information processing apparatus 100 allows the operation content to be conveyed from the transmitter to the listener at a timing that the listener can readily accept, and can thereby help the listener understand the operation content conveyed by the transmitter.

In addition, the generation unit 133 generates, as auxiliary information, a sample moving image that captures a sample operation on the operation target. The output control unit 134 performs control to display the sample moving image on the screen of the terminal device 10 of the listener.

As a result, the information processing apparatus 100 lets the listener view the sample moving image in addition to the operation content conveyed by the transmitter, and can thereby help the listener understand the operation content conveyed by the transmitter.

The information processing apparatus 100 further includes the acquisition unit 131. The acquisition unit 131 acquires an operation video that captures the listener's operation on the operation target. The output control unit 134 performs control to display the operation video on the screen of the terminal device 10 of the transmitter.

As a result, the information processing apparatus 100 can give the transmitter feedback on how well the operation content that the transmitter conveyed has reached the listener.

In addition, the acquisition unit 131 acquires evaluation information indicating the transmitter's evaluation of the listener's operation. The output control unit 134 performs control to output the evaluation information to the terminal device 10 of the listener.

As a result, the information processing apparatus 100 can feed back to the listener the transmitter's evaluation of how correctly the listener has understood the operation content.

[6. Hardware Configuration]
The information processing apparatus 100 according to the embodiment described above is implemented by, for example, a computer 1000 configured as shown in FIG. 10. FIG. 10 is a hardware configuration diagram showing an example of a computer that implements the functions of the information processing apparatus 100. The computer 1000 includes a CPU 1100, a RAM 1200, a ROM 1300, an HDD 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

The CPU 1100 operates based on programs stored in the ROM 1300 or the HDD 1400 and controls each unit. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 starts up, programs that depend on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via a predetermined communication network and sends it to the CPU 1100, and transmits data generated by the CPU 1100 to other devices via the predetermined communication network.

The CPU 1100 controls output devices such as a display and a printer, and input devices such as a keyboard and a mouse, via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600. The CPU 1100 also outputs generated data to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in a recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the information processing apparatus 100 according to the embodiment, the CPU 1100 of the computer 1000 implements the functions of the control unit 130 by executing programs loaded onto the RAM 1200. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them; as another example, it may obtain these programs from another device via a predetermined communication network.

Some of the embodiments of the present application have been described above in detail with reference to the drawings, but these are examples. The present invention can be carried out in other forms, including the aspects described in the disclosure of the invention, with various modifications and improvements made based on the knowledge of those skilled in the art.

[7. Others]
Of the processes described in the above embodiments and modifications, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified. For example, the various kinds of information shown in each drawing are not limited to the illustrated information.

Each component of each illustrated device is a functional concept and does not necessarily need to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The above-described embodiments and modifications can be combined as appropriate to the extent that the processing contents do not contradict one another.

The term "unit" (section, module, unit) used above can be read as "means", "circuit", or the like. For example, the generation unit can be read as generation means or a generation circuit.

1 information processing system
10 terminal device
100 information processing device
110 communication unit
120 storage unit
130 control unit
131 acquisition unit
132 determination unit
133 generation unit
134 output control unit

Claims

An information processing device comprising:
a generation unit that, when it is determined that a listener who is a participant in remote communication is having difficulty performing an operation related to operation content, generates, based on information about the listener, auxiliary information that supplements the operation content, the operation content being conveyed from a transmitter who is a participant in the remote communication to the listener and relating to an operation target operated by the listener; and
an output control unit that performs control to output the auxiliary information generated by the generation unit to a terminal device of the listener.

The information processing device according to claim 1, wherein
the generation unit generates the auxiliary information based on, as the information about the listener, attribute information relating to attributes of the listener, audio information of the site where the listener is located, image information of the site where the listener is located, or signal information of a control device used by the listener to operate the operation target.

The information processing device according to claim 1 or 2, wherein
the generation unit generates, as the auxiliary information, a part image indicating an operation part of the operation target or a direction image indicating an operation direction with respect to the operation target, and
the output control unit displays at least one of the part image and the direction image on the screen of the terminal device of the listener.

The information processing device according to any one of claims 1 to 3, wherein
the generation unit generates, as the auxiliary information, an enlarged image obtained by enlarging a predetermined area of the operation target, and
the output control unit displays the enlarged image on the screen of the terminal device of the listener.

The information processing device according to any one of claims 1 to 4, wherein
the generation unit generates, as the auxiliary information, character information indicating the operation content or character information explaining the meaning of terms related to the operation content, and
the output control unit displays the character information on the screen of the terminal device of the listener.

The information processing device according to claim 5, wherein
the generation unit generates, as the auxiliary information, the character information with the character size enlarged or reduced, and
the output control unit displays the character information with the character size enlarged or reduced on the screen of the terminal device of the listener.

The information processing device according to claim 5 or 6, wherein
the generation unit generates, as the auxiliary information, the character information written in hiragana, katakana, kanji, or romaji, and
the output control unit displays the character information written in the hiragana, the katakana, the kanji, or the romaji on the screen of the terminal device of the listener.

The information processing device according to any one of claims 5 to 7, wherein
the generation unit generates, as the auxiliary information, the character information translated into another language other than Japanese, and
the output control unit displays the character information translated into the other language on the screen of the terminal device of the listener.

The information processing device according to any one of claims 1 to 8, further comprising
a determination unit that determines, based on the information about the listener, whether the listener is having difficulty with the operation related to the operation content, wherein
the generation unit generates the auxiliary information when the determination unit determines that the listener is having difficulty with the operation.

The information processing device according to any one of claims 1 to 9, further comprising
a determination unit that determines, based on the information about the listener, a timing for conveying the operation content from the transmitter to the listener, wherein
the output control unit performs control to output information about the timing determined by the determination unit to the terminal device of the transmitter.

The information processing device according to any one of claims 1 to 10, wherein
the generation unit generates, as the auxiliary information, a sample video that captures a sample operation on the operation target, and
the output control unit performs control to display the sample video on the screen of the terminal device of the listener.

The information processing device according to any one of claims 1 to 11, further comprising
an acquisition unit that acquires an operation video that captures the listener's operation on the operation target, wherein
the output control unit performs control to display the operation video on the screen of the terminal device of the transmitter.

The information processing device according to claim 12, wherein
the acquisition unit acquires evaluation information indicating the transmitter's evaluation of the listener's operation, and
the output control unit performs control to output the evaluation information to the terminal device of the listener.

An information processing method executed by a computer, the method comprising:
a generation step of generating, when it is determined that a listener who is a participant in remote communication is having difficulty performing an operation related to operation content, auxiliary information that supplements the operation content based on information about the listener, the operation content being conveyed from a transmitter who is a participant in the remote communication to the listener and relating to an operation target operated by the listener; and
an output control step of performing control to output the auxiliary information generated in the generation step to a terminal device of the listener.

An information processing program that causes a computer to execute:
a generation procedure of generating, when it is determined that a listener who is a participant in remote communication is having difficulty performing an operation related to operation content, auxiliary information that supplements the operation content based on information about the listener, the operation content being conveyed from a transmitter who is a participant in the remote communication to the listener and relating to an operation target operated by the listener; and
an output control procedure of performing control to output the auxiliary information generated in the generation procedure to a terminal device of the listener.

An information processing system including an information processing device and a terminal device of a participant in remote communication, wherein
the information processing device comprises:
a generation unit that, when it is determined that a listener who is a participant in the remote communication is having difficulty performing an operation related to operation content, generates, based on information about the listener, auxiliary information that supplements the operation content, the operation content being conveyed from a transmitter who is a participant in the remote communication to the listener and relating to an operation target operated by the listener; and
an output control unit that performs control to output the auxiliary information generated by the generation unit to the terminal device of the listener, and
the terminal device of the listener who is the participant outputs the auxiliary information generated by the generation unit.