WO2022137547A1 - Communication assistance system - Google Patents

Communication assistance system

Info

Publication number
WO2022137547A1
Authority
WO
WIPO (PCT)
Prior art keywords
participant
user
output
intimacy
support system
Prior art date
Application number
PCT/JP2020/048863
Other languages
French (fr)
Japanese (ja)
Inventor
契 宇都木
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2020/048863
Publication of WO2022137547A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • The present invention relates to a communication support system.
  • Remote conferences and remote collaborative creation activities have the problem that communication does not proceed smoothly because it is difficult to perceive the real reactions of the other party compared to activities in a real environment. Therefore, in remote communication involving a large number of people, it is important to provide appropriate feedback on the reactions of participants according to the situation and thereby support remote communication.
  • Patent Document 1 is known as background art of the present invention.
  • In Patent Document 1, speech information is extracted from voice data collected from participants in a plurality of meetings, activity data of the participants in the meeting is generated, and the dialogue status of the participants is visualized and displayed using the size of circles, the thickness of lines, and the like. In addition, by acquiring the voices of multiple participants during the meeting and displaying the ever-changing conversation situation in real time, the disclosed technique can induce more active discussion while the situation is observed.
  • With the configuration of Patent Document 1, the audio-related problems of remote communication involving many people, such as remote conferences and remote collaborative creation activities, can be solved, but the problem that nonverbal communication through visual information such as facial expressions is lacking has not been solved. Therefore, in remote communication, it is necessary to grasp the reactions of each participant more appropriately.
  • The communication support system of the present invention is a communication support system that supports remote communication performed among a plurality of participants, and includes an input reception unit that receives input from the participants participating in the communication support system, an intimacy estimation unit that estimates the intimacy between a first participant and other participants among the plurality of participants based on information about the first participant and the other participants, an output priority calculation unit that calculates an output priority determining the priority of output content to be output from the other participants to the first participant based on the intimacy estimated by the intimacy estimation unit, and an output unit that outputs the output content to the first participant based on the output priority calculated by the output priority calculation unit.
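  • The following is a minimal, non-normative Python sketch of this pipeline (input reception, intimacy estimation, output priority calculation, output); all class and function names, and the simple priority formula, are illustrative assumptions rather than part of the claimed system.

```python
from dataclasses import dataclass

@dataclass
class ParticipantInput:
    user_id: str          # sender of the comment or reaction
    text: str             # text comment (may be empty)
    emotion_score: float  # 0.0 (negative) .. 1.0 (positive)

def estimate_intimacy(first: str, other: str, coefficients: dict) -> float:
    # Smaller value = closer relationship, mirroring the intimacy coefficient of FIG. 3A.
    return coefficients.get((first, other), 10.0)

def output_priority(intimacy: float) -> float:
    # Higher priority for smaller (closer) intimacy coefficients.
    return 1.0 / (1.0 + intimacy)

def render_for(first: str, inputs: list, coefficients: dict) -> list:
    # Sort the other participants' output content by priority before showing it to `first`.
    ranked = sorted(
        inputs,
        key=lambda i: output_priority(estimate_intimacy(first, i.user_id, coefficients)),
        reverse=True,
    )
    return [f"{i.user_id}: {i.text}" for i in ranked]

# Example: comments from closer users (smaller coefficient) are listed first for user "A".
coefficients = {("A", "B"): 1.0, ("A", "C"): 6.0}
inputs = [ParticipantInput("C", "nice", 0.8), ParticipantInput("B", "agree", 0.9)]
print(render_for("A", inputs, coefficients))
```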
  • The figures referred to below include the configuration of users, devices, and the network in the remote communication system, the configuration of the general information devices used in FIG. 1, the user information stored on the server, a modified example of FIG. 8, an analysis example of a user's face image, an example of data showing the determination values of that analysis, the overall processing flowchart of the first embodiment, and the feedback data periodically sent to the server.
  • FIG. 1 shows the configuration of users, devices, and the network in the remote communication system according to the first embodiment of the present invention.
  • The server 1 is connected to the presenting user 3 and the listening users 6a to 6f via a general communication network and analyzes the content of the remote communication between these users.
  • The presenting user 3 gives a presentation to the listening users 6a to 6f through the information terminal (PC) 2.
  • Each of the listening users 6a to 6f watches the presentation of the presenting user 3 via an information terminal (PC) 2 or an information terminal (pad) 7.
  • The listening users 6a to 6f are divided into a first group 4 and a second group 5. This grouping may be, for example, a group for each company, a group for each department within an organization, or a group for each age bracket, and is set freely according to the closeness of the relationships among the users participating in the remote communication.
  • Within each group there are conversations shared only within the group; the specific content of the conversations, impressions of the presentation content of the presenting user 3, and expressions of emotion about the presentation are shared by text messages and the like.
  • Conversely, specific conversation content is in principle not transmitted outside each group. Instead, between the first group 4 and the second group 5, only output content concerning emotional expression, such as emoticon icons (nonverbal reactions) representing the reactions of the listening users 6a to 6f to the presentation of the presenting user 3, is shared.
  • FIG. 2 shows the configuration of the general information devices used in FIG. 1.
  • the server 1 is composed of a CPU, a main memory for storing program data, and an external storage (device) such as a memory card.
  • The CPU is provided with an input reception unit that receives input from participants participating in the communication support system, an intimacy estimation unit that estimates the intimacy between users, an output priority calculation unit that calculates the output priority based on the intimacy between users, an index calculation unit that calculates indexes related to communication between users, and an output unit that determines and outputs the output content based on the output priority and the indexes related to communication. Details will be described later.
  • The presenting user 3 and the listening users 6a to 6f each use the information terminal 2 (PC) or the information terminal 7 (pad), which is connected to the general communication network and equipped with output devices such as a display and input devices such as a camera, a microphone, a keyboard, a mouse, and a touch panel connected by an external bus such as USB, to view or transmit audio and video.
  • FIG. 3 shows the user information stored on the server.
  • FIG. 3A is a data table of user information, and FIG. 3B is a data table relating to the history referenced in FIG. 3A.
  • In FIG. 3A, profile information about the users who use the remote communication system is managed.
  • The user information shown in FIG. 3 is an example of an inter-individual distance table from the viewpoint of user A.
  • The items in the data table of FIG. 3A include the group to which each user belongs, the relationship between a certain user A and each user (id is an identification number and is not used in numerical calculation), the history of text comments by comment number (for details, see FIG. 3B), and the intimacy coefficient between user A and each other user.
  • This intimacy coefficient is a value indicating the inter-individual distance between users, and the intimacy changes depending on each user's settings, input content, facial expressions, and so on, as described later.
  • The intimacy coefficient shown in FIG. 3A indicates that the smaller the value, the higher the intimacy with the target partner. Further, as shown in FIG. 1, this user information is stored in the server 1 and is common to all users, regardless of whether they are the presenter or the audience.
  • The intimacy between users can be judged from the content of the text comment history in FIG. 3B. For example, the number of history entries between certain users (the number of conversations) and the way messages are conveyed to the other party in text comments (determined by the frequency of use of specific positive words) can be used as judgment material.
  • the intimacy between users can be determined by grasping the relationship between users and the emotions of the other user based on the information set by the screens of FIGS. 4 to 7 described below.
  • the intimacy coefficient set according to the intimacy between the users determined in this way is used in the server 1 to determine the output content for each user, and affects the content output to the user.
  • FIG. 4 is an example of a screen for managing relationships between users.
  • Terminal 2 (7) outputs a screen for managing information about each user.
  • the user of the terminal 2 (7) can select, set, and edit the relationship with each user on the output screen according to the classification (classification of colleagues, customers, friends, acquaintances, etc.) and the degree of familiarity.
  • the familiarity value that can be set for each user is an element for calculating the intimacy coefficient (FIG. 3 (a)).
  • The familiarity value shown in FIG. 4 indicates that the larger the value, the higher the intimacy. Even if a person is not actually close, if the other party is an important person (a VIP, for example), setting the familiarity value high can affect the intimacy coefficient (importance coefficient).
  • FIG. 5 is an example of a screen for selecting information for the transmission target user.
  • Terminal 2 (7) outputs a screen for setting transmission information.
  • As transmission information, for example in FIG. 5, the settings for "colleagues" allow voice, text, anonymous keywords (specific keywords are shared without the sender's name being displayed to the target user), facial expression information, emotion classification data (an icon corresponding to one's facial expression), and anonymous emotions (only emotions are shared, without one's name being displayed to the target user) to be sent, while facial video is not shared.
  • In this way, extremely private information such as facial footage can be withheld.
  • The relationship between users is determined by this transmission information setting, and a value according to the relationship between users is added to the intimacy in the calculation performed by the intimacy estimation unit.
  • Information that is not set as sharable is not sent to the server 1, so the confidentiality of the user information is ensured.
  • FIG. 6 is an example of a screen for setting the feedback amount for each user group.
  • The terminal 2 (7) outputs a screen for setting feedback weighting for each group to which users belong, so that highly important opinions and reactions are incorporated and unimportant opinions and reactions are reduced as much as possible.
  • The display weighting value is an input value that determines how prominently the reactions of each group are displayed. A group for which this value is set high is displayed larger on the user's own information terminal. For example, on the setting screen of FIG. 6, the display weight of the customers in the finance group is 0.9 (the maximum value is 1), so their comments and reactions are displayed prominently to the user of this terminal and have a large presence on its screen. By contrast, the display weighting of the group of the company's subordinates is set to 0.3, so their comments and reactions are displayed small on this terminal and have a small presence on the screen. In this way, the relationship between users is determined by the display weighting, and the intimacy estimation unit adds a value corresponding to this relationship to the intimacy.
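  • As an illustration of how such display weighting could drive on-screen presence, the following hedged sketch maps a group's weight to a font size; the group names, the 0.9 and 0.3 values, and the size formula are only examples.

```python
# Hypothetical mapping of each group's display weight (FIG. 6) to an on-screen font size.
GROUP_WEIGHTS = {"finance customer": 0.9, "subordinates": 0.3}  # illustrative values

def font_size_for(group: str, base_px: int = 12, max_extra_px: int = 12) -> int:
    # A weight of 1.0 roughly doubles the base size; 0.0 leaves it unchanged.
    weight = GROUP_WEIGHTS.get(group, 0.5)
    return base_px + round(weight * max_extra_px)

print(font_size_for("finance customer"))  # larger presence on screen
print(font_size_for("subordinates"))      # smaller presence on screen
```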
  • FIG. 7 is an example of a screen for setting the amount of feedback regarding emotional expression.
  • Terminal 2 (7) outputs a screen for setting the aggregation of the emotions of the other user.
  • This is a screen for specifying what kinds of reactions are considered important. For example, as shown in FIG. 7, to aggregate positive emotions, positive items can be selected in the emotion group.
  • Display weighting values can be set for each of facial expressions, voice, language, and inputs. The larger a display weighting value, the greater its influence on the calculated intimacy.
  • The display weighting values can be customized for each group or user, and each user's emotions can be parameterized from the facial expressions judged from that user's face image and from the input made through the UI on that user's terminal.
  • the emotional parameterization is performed by the index calculation unit calculating an index related to communication.
  • The familiarity value set on the screen of FIG. 4 indicating the relationship with each user, the type of transmission information set on the screen of FIG. 5, and the display weighting values set on the screens of FIGS. 6 and 7 are used in the calculation of the intimacy of each user performed by the intimacy estimation unit in the server 1 (FIG. 2). That is, in the server 1, the intimacy estimation unit uses this setting information to estimate an intimacy that reflects the relationships between the users.
  • FIG. 8 is an example of a model screen of the communication support system according to the first embodiment.
  • the presentation screen is output and displayed on the terminal 2 (7). Further, on the lower side of the presentation screen, the feedback time transition graph 12 and the video 13 of another user participating are projected.
  • The feedback time transition graph 12 shows the reactions (emotions) of each user to the presentation as parameters, and the graph scrolls to the left as time passes. The larger the value (that is, the more positive reactions to the presentation), the higher the graph rises.
  • The quantified feedback in the feedback time transition graph 12 allows the presenter to see at a glance how engaged the audience currently is, so the presenter can proceed with the conference while sensing a realistic reaction. More specifically, the index calculation unit calculates an index according to each user's reaction based on the emotion display weighting values described with reference to FIG. 7, and the output unit determines the display content of the feedback time transition graph 12 using these index values as parameters, so that each user's reaction is parameterized and reflected in the graph.
  • The video 13 of another user is displayed as one of the following: a real face image or video captured by a camera, a facial expression reproduction model (a CG or virtual model that reproduces a facial expression based on the movement of the real person's facial expression), or a facial expression reproduction icon (an icon whose expression resembles the movement of the person's facial expression). Which display is used is determined by the server 1 for each user, based on the output priority calculated by the output priority calculation unit from the intimacy between the users. For example, users with higher intimacy and thus higher output priority are displayed with representations closer to the real image or video, while the lower the output priority, the more abstract the display, so that both a sense of intimacy and anonymity can be achieved.
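  • A hedged sketch of this priority-to-representation switch is shown below; the numeric thresholds are assumptions, since the text only states that lower priority leads to a more abstract display.

```python
def choose_representation(priority: float) -> str:
    # Thresholds are illustrative assumptions; the patent only states that lower
    # priority yields a more abstract (more anonymous) display.
    if priority >= 0.7:
        return "camera video"               # real face image or video
    elif priority >= 0.4:
        return "facial expression model"    # CG model reproducing the expression
    return "facial expression icon"         # most abstract, most anonymous

for p in (0.9, 0.5, 0.1):
    print(p, "->", choose_representation(p))
```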
  • A word cloud 18 that aggregates and shows trending words in the conversation and a text field 19 that shows the chat shared during the presentation are also displayed.
  • The word cloud 18 aggregates the comments of the users participating in the conference and displays trending words in a larger size, making it possible to visualize what kinds of keywords the users watching the presentation are using as a whole.
  • Although the comments of participating users are displayed in the text field 19, some users' comments may not be displayed because of the transmission information settings (FIG. 5) described above.
  • A video 16 showing one's own facial expression and transmission data 17 indicating the transmission information are displayed below the word cloud 18. In addition, a voice ON/OFF button 14 and a video ON/OFF button 15 are provided for switching one's own voice and facial expression on and off.
  • FIG. 9 is a modified example of FIG. 8.
  • In FIG. 9, the word cloud 18 and the text field 19 of FIG. 8 are replaced with a feedback two-axis graph 20.
  • In the feedback two-axis graph 20, a plot 21 representing the importance of each group is displayed based on the definitions set for the two axes.
  • For example, as shown in FIG. 9, the two axes can be defined as "concentration" (how many listening users are listening without looking away, and how often they nod to the speaker) and "positiveness" (input of specific keywords expressing a favorable impression of the speaker's presentation content, the duration of smiles read from the listening users' facial expressions, and so on).
  • This graph shows the result of aggregating, for each user, the indexes related to communication between users calculated by the index calculation unit, and its display content is determined by the server 1 based on the output priority calculated by the output priority calculation unit from the intimacy between users.
  • Although the model screen examples of FIGS. 8 and 9 are drawn assuming a PC terminal, a pad terminal may also be used.
  • FIG. 10 is an analysis example of a user's face image, and FIG. 11 is an example of data showing the determination values of the analysis of FIG. 10. FIG. 11A is an example of the feature point data transmitted as "facial expression information", and FIG. 11B is an example of the data transmitted as "emotion classification data".
  • By analyzing the user's facial expression from the camera-captured image 22 in the index calculation unit of the server 1, the emotion corresponding to the facial expression can be determined and an index related to communication can be calculated. Based on the index calculated in this way, the output unit generates a facial expression reproduction model or a facial expression reproduction icon.
  • The analysis method creates a data table by extracting a set number of points (feature points) characteristic of facial expressions and digitizing their x and y coordinates. As a result, as shown in FIG. 11A, coordinate values are extracted for each feature point in the feature point extraction image 23, and the facial expression reproduction model 24 of FIG. 10 is generated from them.
  • The facial expression reproduction model 24 may be varied according to the user's emotions so that the user's emotions can be understood in more detail.
  • In FIG. 11A only four feature points are shown, but in practice the determination uses 30 or more feature points.
  • the index calculation unit can calculate an index related to communication from the feature point extraction image 23 using a machine learning discriminator, and can determine the facial expression icon 25 based on the calculation result. As shown in FIG. 11B, the facial expression icon 25 is determined by calculating a determination value as an index related to communication for each facial expression item such as a smile or a nod.
  • the machine learning discriminator is, for example, SVM (Support vector machine), NN (Neural Network), or the like.
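  • As one concrete but purely illustrative possibility, the sketch below trains a scikit-learn SVM on flattened feature point coordinates; the training samples are synthetic, scikit-learn is an assumed tool, and the real system would use 30 or more feature points per face.

```python
# A minimal sketch of the kind of discriminator mentioned above (SVM), assuming
# scikit-learn is available; the feature vectors and labels below are synthetic.
from sklearn.svm import SVC
import numpy as np

# Each sample: flattened (x, y) coordinates of facial feature points (here only 4 points).
X = np.array([
    [0.2, 0.5, 0.8, 0.5, 0.5, 0.7, 0.5, 0.9],   # mouth corners raised -> "smile"
    [0.2, 0.5, 0.8, 0.5, 0.5, 0.6, 0.5, 0.6],   # flat mouth -> "neutral"
])
y = ["smile", "neutral"]

clf = SVC(kernel="linear").fit(X, y)
sample = np.array([[0.2, 0.5, 0.8, 0.5, 0.5, 0.7, 0.5, 0.88]])
print(clf.predict(sample))            # predicted facial expression item
print(clf.decision_function(sample))  # signed score, usable as a determination value
```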
  • The determination value (index) calculated for each facial expression item when the facial expression icon 25 is determined may also be used in the calculation of each user's intimacy performed by the intimacy estimation unit. For example, among the display weighting values for the detailed items set on the screen of FIG. 7, whether the weighting value related to facial expressions is adopted in the intimacy calculation can be decided based on this determination value.
  • The facial expression reproduction model 24 and the facial expression icon 25 created as described above are images that reproduce each user's facial expression based on the indexes related to communication between users. They are transmitted from the server 1 to each user's terminal and displayed, for example, as the video 13 on the model screens shown in FIGS. 8 and 9, thereby being output to each user. Note that FIG. 10 shows output based on facial expression analysis; the system similarly has a mechanism for feeding back voice input by applying it to a machine learning classifier or the like, based on recognition of laughter and voice volume.
  • Cultural differences can be absorbed by setting conversion rules for each nationality of the participating users in this recognition mechanism. For example, the differences can be smoothed out by setting different gesture and smile thresholds for each country and reflecting the emotions in the parameters accordingly.
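  • A minimal sketch of such per-nationality conversion rules is given below; the country codes and threshold values are invented for the example.

```python
# Illustrative per-nationality conversion rules; the thresholds are invented for the example.
SMILE_THRESHOLDS = {"JP": 0.35, "US": 0.55, "default": 0.45}

def is_smiling(raw_smile_score: float, nationality: str) -> bool:
    # The same raw score can map to different emotion parameters per country.
    return raw_smile_score >= SMILE_THRESHOLDS.get(nationality, SMILE_THRESHOLDS["default"])

print(is_smiling(0.4, "JP"), is_smiling(0.4, "US"))  # True False
```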
  • FIG. 12 is an overall processing flowchart according to the first embodiment.
  • The personal client operations are the processing performed on the PC or pad terminal.
  • The login information entered in step S2 is also shared with the server (described later for step S104).
  • Steps S3 to S12 are flows (loop processing) that are repeated at regular time intervals, and the loop processing is started in step S3.
  • In step S4, the voice information obtained by recording with the microphone of the terminal at that time is acquired.
  • In step S5, the voice information of the user using the terminal is processed by converting the microphone recording into text and estimating emotion information from the recording and the text.
  • In step S6, the video information obtained by recording with the camera of the terminal at that time is acquired.
  • In step S7, the facial features of the user are recognized from the camera image information acquired in step S6, and information related to emotions is estimated according to the degree of those features.
  • In step S8, the input information to be transmitted to the server is assembled based on the information acquired and processed in steps S4 to S7.
  • In step S9, the input information is sent to the server, where it is received by the input reception unit of the server.
  • This transmission is asynchronous communication in which the data sent to a buffer is read and processed at the timing when the audio and video information is read.
  • The transmission of step S9 and the reception of step S10 are performed periodically each time the loop processing of steps S3 to S12 is executed.
  • In step S11, the terminal outputs images (face images, facial expression reproduction models, facial expression icons) based on the information received from the server.
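  • The client-side loop of steps S3 to S12 could be organized roughly as in the following sketch; the mic, camera, analyzer, server, and terminal objects are assumed stubs whose method names do not come from the patent.

```python
import time

def client_loop(mic, camera, analyzer, server, terminal, interval_s: float = 1.0):
    """Sketch of the client-side loop (steps S3 to S12); the collaborating objects are assumed stubs."""
    while terminal.is_logged_in():                       # S3/S12: periodic loop
        audio = mic.record()                             # S4: microphone recording
        text = analyzer.speech_to_text(audio)            # S5: convert recording to text
        audio_emotion = analyzer.emotion_from_audio(audio, text)
        frame = camera.capture()                         # S6: camera recording
        face_emotion = analyzer.emotion_from_face(frame) # S7: facial features -> emotion estimate
        payload = {"text": text,                         # S8: assemble input information
                   "audio_emotion": audio_emotion,
                   "face_emotion": face_emotion}
        server.send_async(payload)                       # S9: asynchronous transmission
        update = server.receive_async()                  # S10: receive output content
        terminal.render(update)                          # S11: show face image / model / icon
        time.sleep(interval_s)
```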
  • Server 1 is started in step S101.
  • In step S102, the personal information database (DB) is read.
  • Steps S103 to S112 are flows (loop processing) that are repeated at regular time intervals, and the loop processing is started in step S103.
  • In step S104, the login information from step S2 of the individual client operation is received and any new login is accepted.
  • In step S105, a loop process is started for a given user A among the logged-in users.
  • In step S106, the information periodically transmitted from the terminal in step S9 is received.
  • Here, the input reception unit accepts the input from the terminal.
  • In step S107, a loop process is started in which the inter-individual distances between user A and the multiple other users are confirmed, the output content is determined, and the result is transmitted to user A's terminal.
  • In step S108, the inter-individual distance between user A and each other user is calculated and confirmed.
  • Here, the index calculation unit calculates the index, the intimacy estimation unit calculates the intimacy, the output priority calculation unit calculates the output priority based on the calculated intimacy, and the output unit determines the output content for the terminal based on the calculated output priority.
  • The output content determined here includes the facial expression reproduction model and facial expression reproduction icon of each user displayed as the video 13 in FIGS. 8 and 9, and the display content of the word cloud 18 and the text field 19.
  • In step S109, the information of the output content determined in step S108 is transmitted to user A's terminal. This transmission is asynchronous communication, as in step S9.
  • In step S110, the loop process in which the inter-individual distance between user A and each other user is confirmed, the output content is determined, and the result is transmitted to user A's terminal is terminated.
  • In this way, the output content to be output to user A is determined for each other user in accordance with the output priority based on the inter-individual distance (intimacy) between user A and that user, and the information of the output content is transmitted from the server 1 to user A's terminal.
  • In step S111, the loop process targeting user A is terminated.
  • By repeating the loop processing of steps S105 to S111 for each user, the output content for each user's terminal is determined, and the information of the output content is transmitted from the server 1 to the terminal of each user.
  • In step S112, the loop processing of steps S103 to S112 is terminated.
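  • For comparison, a hedged sketch of one pass of the server-side loop (steps S103 to S112) is shown below; every helper method name on the server object is an assumption.

```python
def server_loop_iteration(server, users):
    """One pass of the server-side loop (steps S103 to S112); all helper names are assumptions."""
    server.accept_new_logins()                                  # S104: accept new logins
    for user_a in users:                                        # S105: loop over logged-in users
        server.receive_feedback(user_a)                         # S106: input reception unit
        for user_x in users:                                    # S107: loop over the other users
            if user_x == user_a:
                continue
            distance = server.inter_individual_distance(user_a, user_x)  # S108: distance / intimacy
            priority = server.output_priority(distance)                  # S108: output priority
            content = server.build_output(user_x, priority)              # model / icon / word cloud
            server.send_async(user_a, content)                  # S109: asynchronous transmission
        # S110, S111: the inner loops end here; S112 closes the outer pass
```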
  • FIG. 13 is an example of feedback data periodically transmitted to the server according to the first embodiment.
  • The feedback data shown in FIG. 13 is the information transmitted in step S9 of FIG. 12. Each item is explained below.
  • In the voice information disclosure flag column, a flag indicating whether or not to disclose user A's voice information to each of the 12 conference participants other than user A is shown, for example [10010...]. In this way it is determined to whom information about oneself is disclosed; it is not disclosed to anyone whose corresponding flag is 0.
  • In the voice information data column, the file type is recorded. Like the voice information disclosure flag column, the text information disclosure flag column shows flags indicating whether or not user A discloses text information to the 12 users other than user A. In the text information data column, the text content entered by user A is recorded.
  • The face image disclosure flag column works in the same way as the voice information disclosure flag column and the text information disclosure flag column. Face image data is recorded in the face image data column.
  • The facial expression information disclosure flag column works in the same way as the face image, voice information, and text information disclosure flag columns.
  • In the facial expression information data column, a list of coordinate values is recorded. These values are used to construct the facial expression reproduction model described above (see FIG. 10).
  • The emotion information disclosure flag column works in the same way as the facial expression information, face image, voice information, and text information disclosure flag columns.
  • In the emotion information data column, the determination values for the facial expression icon (see FIG. 10) analyzed from the face image are recorded.
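  • The record of FIG. 13 could be represented roughly as in the following sketch, where the disclosure flags gate what each receiver is allowed to see; all field and function names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FeedbackRecord:
    """Rough shape of the per-user feedback record of FIG. 13 (field names are assumptions)."""
    sender: str
    voice_flags: list            # e.g. [1, 0, 0, 1, 0, ...], one flag per other participant
    voice_data: bytes = b""
    text_flags: list = field(default_factory=list)
    text_data: str = ""
    emotion_flags: list = field(default_factory=list)
    emotion_values: dict = field(default_factory=dict)  # determination values per expression item

def visible_text(record: FeedbackRecord, receiver_index: int) -> Optional[str]:
    # The server forwards a field only to receivers whose disclosure flag is 1.
    if receiver_index < len(record.text_flags) and record.text_flags[receiver_index] == 1:
        return record.text_data
    return None

record = FeedbackRecord(sender="A", voice_flags=[1, 0, 0, 1, 0],
                        text_flags=[1, 0, 1], text_data="Nice point!")
print(visible_text(record, 0), visible_text(record, 1))  # "Nice point!" None
```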
  • the above feedback data is sent to the server, and the intimacy estimation unit determines the intimacy coefficient, which is information regarding the distance between each user and the user A.
  • In the index calculation unit, the above information from each user is aggregated using the display weighting values set on the emotion aggregation setting screen (FIG. 7), and a purpose-specific emotion aggregate value is calculated for each user.
  • This purpose-specific emotion aggregate value is transmitted from the server 1 to each user's terminal and is used there for graph display on the screen, for example as the plot group representing the degree of concentration in the feedback two-axis graph 20 of FIG. 9. In addition, its correlation with the receiving user is calculated and used for calculating the sympathy value in the inter-individual distance (described later with reference to FIG. 14).
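  • A minimal sketch of such weighted aggregation is shown below; the channel names, weights, and normalization are illustrative assumptions.

```python
# Illustrative weighted aggregation of one user's emotion signals, in the spirit of FIG. 7;
# the weight and signal names are assumptions.
DISPLAY_WEIGHTS = {"facial_expression": 0.8, "voice": 0.5, "language": 0.6, "input": 0.3}

def emotion_aggregate(signals: dict) -> float:
    # signals maps each channel to a 0..1 positivity score for the current interval.
    total_weight = sum(DISPLAY_WEIGHTS.values())
    return sum(DISPLAY_WEIGHTS[k] * signals.get(k, 0.0) for k in DISPLAY_WEIGHTS) / total_weight

print(emotion_aggregate({"facial_expression": 0.9, "voice": 0.4, "language": 0.7, "input": 1.0}))
```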
  • FIG. 14 is a flowchart showing the process of calculating the inter-individual distance according to the first embodiment. The flowchart will be described below.
  • The inter-individual distance table created in FIG. 14 is used in the calculation and confirmation of the inter-individual distance in step S108 of FIG. 12, and is created independently of the flow of FIG. 12.
  • In step S20, user A (the screen viewer) is selected.
  • In step S21, a loop process is started for determining the inter-individual distance to each viewer (user X) participating in the conference.
  • In step S22, it is confirmed whether target person information exists. If this target person information does not exist in user A's information, it is determined that there is no relationship.
  • In step S23, the data of the classification group of the target person is acquired.
  • In step S24, the importance information is input for each group acquired in step S23.
  • the importance information referred to here is the display weighting (FIG. 6) in the feedback weighting setting.
  • In step S25, the numerical value of "familiarity" preset for each user on the management screen of the terminal (FIG. 4) is input.
  • In step S26, the intimacy in conversation is calculated from the number of times user A has talked with each user. The values input and calculated in steps S24 to S26 are used in the calculation by which the intimacy estimation unit determines the inter-individual distance (intimacy).
  • In step S27, the correlation with user A's emotion data regarding user X is calculated, and the sympathy value between user A and user X is obtained.
  • The feedback data of FIG. 13 is also used to calculate this sympathy value. If the positive correlation exceeds a certain threshold, the sympathy value is added in the calculation of the inter-individual distance.
  • In step S28, the output priority calculation unit determines the inter-individual distance between user A and user X based on the flow of steps S22 to S27.
  • In step S29, the loop processing of steps S21 to S29 is terminated.
  • In step S30, the data table of user A's inter-individual distances is completed. This inter-individual distance data table is created from the viewpoint of each user, and the output is determined based on the aggregated data tables.
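  • The following sketch combines the inputs of steps S22 to S28 into a single inter-individual distance value; the weights and the combination rule are assumptions, since FIG. 14 only specifies which inputs feed the calculation.

```python
def inter_individual_distance(user_a: str, user_x: str, db: dict) -> float:
    """Sketch of steps S22 to S28; the weights and the combination rule are assumptions."""
    info = db.get(user_x)
    if info is None:                                    # S22: no target person information
        return float("inf")                             # treated as "no relationship"
    closeness = 0.0
    closeness += info["group_importance"]               # S24: display weighting of the group (FIG. 6)
    closeness += info["familiarity"]                    # S25: familiarity set on FIG. 4
    closeness += 0.1 * info["conversation_count"]       # S26: intimacy from conversation history
    if info["empathy_correlation"] > 0.5:               # S27: sympathy value above a threshold
        closeness += info["empathy_correlation"]
    # Smaller distance = closer relationship, matching the intimacy coefficient of FIG. 3A.
    return 1.0 / (1.0 + closeness)

db = {"B": {"group_importance": 0.9, "familiarity": 0.8,
            "conversation_count": 12, "empathy_correlation": 0.7}}
print(inter_individual_distance("A", "B", db))
```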
  • As described above, the intimacy between users is estimated from the relationships between users and the state of their communication, and based on the estimation result, the output (comments, utterances) from other users to a given user can be prioritized. For example, by displaying the comments of participants with high intimacy explicitly or at a larger size, comments of high interest to the participant can be output preferentially, so that remote communication with a sense of presence can be realized. The system also calculates indexes related to communication from a user to other users (for example, agreement, disagreement, understanding) and, based on these and the output priority, can output a graph expressing the index values or images based on the indexes (for example, the user's face image, a facial expression reproduction model, or a facial expression reproduction icon).
  • FIG. 15 is an example of a model screen of the communication support system according to the second embodiment.
  • The feature of the communication support system of the second embodiment is that it is intended for exhibitions in VR (Virtual Reality), that is, remote communication performed in a VR space.
  • the points common to the first embodiment will be omitted, and the differences will be mainly described.
  • In the VR space image 27 output by the terminal 2 (7), a first exhibit 28, the facial expression reproduction icons 29 of the users who are looking at the first exhibit, and a second exhibit 30 are displayed.
  • each user (avatar) in the VR space can grasp the direction, position, and distance of the line of sight in the virtual space three-dimensionally on the screen.
  • the screen of the facial expression information 32 of the user who is watching can also be used to visualize the atmosphere of the user who is watching the same object. This makes it possible to understand the facial expressions and atmospheres of the users participating in the VR space.
  • FIG. 16 is feedback data periodically transmitted to the server according to the second embodiment.
  • A column for avatar position operation information and a column for avatar angle operation information are added to the data table described with reference to FIG. 13.
  • In the avatar position operation information column, the position in the VR space is given in XYZ coordinates.
  • In the avatar angle operation information column, numerical values describing the line-of-sight direction in which the avatar is looking at an object in the VR space are given. That is, the avatar's operation information is sent together with the feedback data.
  • FIG. 17 shows the calculation of the inter-individual distance according to the second embodiment.
  • The difference from the calculation of the inter-individual distance of the first embodiment shown in FIG. 14 lies in step S28A and step S29A.
  • In step S28A, if user X is looking at the same object as user A, a value of intimacy corresponding to the relationship between these users is added in the calculation of the inter-individual distance between user A and user X.
  • In step S29A, if user X is within a certain distance of user A in the VR space, a value of intimacy corresponding to the relationship between these users is likewise added in the calculation of the inter-individual distance between user A and user X. This completes the inter-individual distance table in the VR space (step S32A).
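  • A hedged sketch of the additions of steps S28A and S29A is given below; the distance threshold and the bonus magnitude are assumptions.

```python
import math

def vr_closeness_bonus(avatar_a: dict, avatar_x: dict,
                       relation_value: float, near_distance: float = 3.0) -> float:
    """Sketch of steps S28A/S29A: add an intimacy value when two avatars look at the
    same object or stand close together in VR. The distance threshold is an assumption."""
    bonus = 0.0
    if avatar_a["target_object"] == avatar_x["target_object"]:     # S28A: same exhibit
        bonus += relation_value
    dist = math.dist(avatar_a["position"], avatar_x["position"])   # S29A: proximity in VR space
    if dist <= near_distance:
        bonus += relation_value
    return bonus

a = {"target_object": "exhibit_1", "position": (0.0, 0.0, 0.0)}
x = {"target_object": "exhibit_1", "position": (1.0, 0.0, 2.0)}
print(vr_closeness_bonus(a, x, relation_value=0.5))  # 1.0: same object and nearby
```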
  • FIG. 18 is a configuration example of a model of the communication support system according to the third embodiment.
  • the point in the third embodiment is remote communication via the avatar robot. Similar to the description of the second embodiment, the points common to the first embodiment will be omitted, and the differences will be mainly described.
  • the avatar robot 37 is equipped with an omnidirectional camera 34 and is operated at the presentation site 36 set up in the real space.
  • The omnidirectional image 33 captured by the avatar robot 37 with the omnidirectional camera 34 shows the on-site presenters 35 and 39 and others present around the avatar robot 37.
  • the avatar robot 37 transmits and projects the captured omnidirectional image 33 to each user 6aB to 6dB.
  • Each user 6aB to 6dB can view the omnidirectional image 33 by an output terminal such as a PC, a pad, a smartphone, or a VR headset.
  • Each of the users 6aB to 6dB is watching the omnidirectional video 33, but users 6bB, 6cB, and 6dB are watching the same object (object video 33b) in the omnidirectional video 33, while user 6aB is watching a different object (object video 33a) in the omnidirectional video 33. Therefore, in FIG. 18, user 6aB does not take part in the feedback that users 6bB to 6dB give to the site 36.
  • The server creates facial expression reproduction models, facial expression icons, and the like from the appearance and reactions of users 6bB to 6dB watching the omnidirectional video 33, and feeds them back to the site 36.
  • The feedback content is output on the 360-degree four-sided display 40 installed on the avatar robot 37, on the screen 38 facing the on-site presenter 39 and on the screen 41 facing the on-site presenter 35, respectively.
  • The feedback content is, for example, the screen 42, which shows, for example, the impression of user 6dB regarding the content of the presentation by the on-site presenter 35.
  • FIG. 19 shows the content viewed by a user and is an example of the user viewing screen model according to the third embodiment. As in the description of FIG. 18, points in common with the first and second embodiments are omitted and the differences are mainly described.
  • The difference from the first embodiment (see FIGS. 8 and 9) and the second embodiment (see FIG. 15) is that the user's viewpoint image 46 from the avatar robot is used for the screen of the terminal 2 (7).
  • The user's viewpoint image 46 displays the presenter A 39, user B's avatar 44, a facial expression reproduction icon 45 representing user B's feedback content, user C's avatar 48, and a face image icon 47 representing user C's feedback content.
  • The user's viewpoint image 46 is thus accompanied by the avatars of other users (avatars 44 and 48 in FIG. 19).
  • When another user's viewing angle moves outside a certain range of the user's own viewing angle, that user's icon disappears from the screen.
  • FIG. 20 is an example of a model screen of the avatar robot according to the third embodiment.
  • FIG. 20 shows the state of the avatar robot as seen from the presenter A.
  • On this display, the emotion icon 45, which is the feedback content of user B, the face image icon 47, which is the feedback content of user C, and the facial expression information 49, which is the feedback content of user A, are shown.
  • FIG. 21 shows the feedback data transmitted periodically according to the third embodiment.
  • As shown in FIG. 21, the difference from the first and second embodiments is that viewpoint robot selection information and line-of-sight angle information are included in the items.
  • In the viewpoint robot selection information column, the identification number of the robot that the user has selected from the multiple robots at the site is recorded.
  • In the line-of-sight angle information column, information about the line-of-sight angle at which the user is currently viewing from the robot with the identification number selected in the viewpoint robot selection information column is recorded.
  • FIG. 22 shows the calculation of the inter-individual distance according to the third embodiment.
  • As shown in FIG. 22, the difference from the first embodiment (see FIG. 14) and the second embodiment (see FIG. 17) lies in step S28B and step S29B.
  • In step S28B, when user X is viewing through the same robot as user A among the multiple avatar robots, that is, when the avatar robot that outputs the video viewed by user X is the same as the avatar robot visually recognized by user A, a value of intimacy corresponding to the relationship between these users is added.
  • In step S29B, following step S28B, if the line-of-sight direction of user X is close to that of user A, that is, if user X's line-of-sight direction is within a predetermined error range of user A's line-of-sight direction, a value of intimacy corresponding to the relationship between these users is added.
  • The inter-individual distance table is completed by the flow of steps S20B to S32B including these steps.
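  • Similarly, the additions of steps S28B and S29B could look roughly like the following sketch; the angle tolerance and bonus magnitude are assumptions.

```python
def robot_closeness_bonus(view_a: dict, view_x: dict,
                          relation_value: float, angle_tolerance_deg: float = 15.0) -> float:
    """Sketch of steps S28B/S29B: add an intimacy value when two users view through the same
    avatar robot and their line-of-sight angles are close. The tolerance is an assumption."""
    bonus = 0.0
    if view_a["robot_id"] == view_x["robot_id"]:                  # S28B: same avatar robot
        bonus += relation_value
        angle_diff = abs(view_a["gaze_deg"] - view_x["gaze_deg"]) % 360
        angle_diff = min(angle_diff, 360 - angle_diff)
        if angle_diff <= angle_tolerance_deg:                     # S29B: gaze within tolerance
            bonus += relation_value
    return bonus

print(robot_closeness_bonus({"robot_id": 1, "gaze_deg": 90},
                            {"robot_id": 1, "gaze_deg": 100}, relation_value=0.5))  # 1.0
```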
  • As described above, the server 1 has an input reception unit that receives input from the participants participating in the communication support system, an intimacy estimation unit that estimates the intimacy between a first participant and the other participants among the plurality of participants based on information about the first participant and the other participants, an output priority calculation unit that calculates an output priority determining the priority of the output content to be output from the other participants to the first participant based on the intimacy estimated by the intimacy estimation unit, and an output unit that outputs the output content to the first participant based on the output priority calculated by the output priority calculation unit.
  • the server 1 has an index calculation unit that calculates an index related to communication between the first participant and other participants.
  • the output unit determines the output content based on the output priority calculated by the output priority calculation unit and the index calculated by the index calculation unit. By doing so, it is possible to give feedback according to the communication situation between the participants to the presenter who is making a presentation to other participants.
  • If the output priority is higher than a predetermined value, the output unit outputs as-is the face image of the other participant that was captured by a camera, for example, and received as input by the input reception unit; if the output priority is lower than the predetermined value, it outputs, for example, a facial expression reproduction model or a facial expression reproduction icon as an image based on the index related to communication.
  • The intimacy estimation unit can estimate the intimacy based on the relationship from the first participant to the other participants set on the screen of FIG. 4, the type of transmission information from the first participant to the other participants set on the screen of FIG. 5, and the amount of feedback set from the first participant to the groups to which the other participants belong on the screen of FIG. 6. By doing so, the intimacy can be estimated in consideration of the relationships between users.
  • the image based on the index related to communication includes a facial expression reproduction model 24 which is a model image that reproduces the facial expressions of the participants.
  • the output unit can create a facial expression reproduction model 24 from a camera-photographed image 22 of each participant's face. By doing so, it is possible to create a sense of intimacy while ensuring anonymity for a person who has a certain sense of intimacy.
  • the image based on the index related to communication includes the facial expression reproduction icon 25, which is an icon that reproduces the facial expression of the participant.
  • the output unit can create the facial expression reproduction icon 25 from the camera-captured image 22 of each participant's face. By doing so, it is possible to give feedback by expressing emotions that ensure anonymity to a person who is not acquainted.
  • When at least one of the following holds: the avatar of another participant is looking at the same target as the avatar of the first participant, or the avatar of another participant is within a certain distance of the avatar of the first participant, the intimacy estimation unit adds, in the calculation of the intimacy between the first participant and that other participant, a value according to the relationship between the first participant and the other participant. By doing so, the intimacy can be estimated from the users' avatars in the VR space.
  • Likewise, when at least one of the following holds: the avatar robot that outputs the video viewed by another participant is, among the multiple avatar robots, the same as the avatar robot visually recognized by the first participant, or the viewing direction of the other participant is within a predetermined error of the viewing direction of the first participant, the intimacy estimation unit adds a value corresponding to the relationship between the first participant and the other participant. By doing so, even when avatar robots are used, the inter-individual distance between users can be calculated and the intimacy between users can be estimated.
  • the present invention is not limited to the above embodiment, and various modifications and other configurations can be combined within a range that does not deviate from the gist thereof. Further, the present invention is not limited to the one including all the configurations described in the above-described embodiment, and includes the one in which a part of the configurations is deleted.
  • 29 ... Facial expression reproduction icon of an avatar (user) looking at the first exhibit, 30 ... Second exhibit, 31 ... Feedback information, 32 ... Facial expression information of a user looking at the first exhibit, 33 ... Omnidirectional image of the avatar robot, 33a, 33b ... Objects, 34 ... Omnidirectional camera, 35 ... On-site presenter B shown in the omnidirectional image of the avatar robot, 36 ... Site, 37 ... Avatar (operated) robot, 38 ... Screen in front of on-site presenter A, 39 ... On-site presenter A, 40 ... 360-degree four-sided display, 41 ... Screen in front of on-site presenter B, 42 ... Screen showing the facial expression feedback of the viewing users, 44 ... Avatar of user B, 45 ... Facial expression reproduction icon of user B, 46 ... User's viewpoint image (avatar robot), 47 ... Face image icon of user C, 48 ... Avatar of user C, 49 ... Facial expression reproduction model of user A

Abstract

This communication assistance system is for providing assistance in remote communication to be performed by a plurality of participants. The communication assistance system has: an input reception unit that receives input from the participants who participate in the communication assistance system; an intimacy level estimation unit that estimates an intimacy level between a first participant and another participant among the plurality of participants on the basis of information relating to the first participant and the other participant; an output priority calculation unit that, on the basis of the intimacy level estimated by the intimacy level estimation unit, calculates an output priority for determining the priority of an output content to be outputted from the other participant to the first participant; and an output unit that outputs the output content to the first participant on the basis of the output priority calculated by the output priority calculation unit.

Description

Communication support system
The present invention relates to a communication support system.
As a countermeasure against the novel coronavirus disease (COVID-19), in order to overcome the problem of restricted face-to-face communication, remote conferences and remote collaborative creation activities using digital technology have expanded among individuals and companies.
Against this background, remote conferences and remote collaborative creation activities have the problem that communication often does not proceed smoothly because it is difficult to perceive the real reactions of the other party compared to activities in a real environment. Therefore, in remote communication involving a large number of people, it is important to provide appropriate feedback on the reactions of participants according to the situation and thereby support remote communication.
The following Patent Document 1 is known as background art of the present invention. In Patent Document 1, speech information is extracted from voice data collected from participants in a plurality of meetings, activity data of the participants in the meeting is generated, and the dialogue status of the participants is visualized and displayed using the size of circles, the thickness of lines, and the like. In addition, by acquiring the voices of multiple participants during the meeting and displaying the ever-changing conversation situation in real time, the disclosed technique can induce more active discussion while the situation is observed.
Japanese Unexamined Patent Application Publication No. 2008-262046
With the configuration of Patent Document 1, the audio-related problems of remote communication involving many people, such as remote conferences and remote collaborative creation activities, can be solved, but the problem that nonverbal communication through visual information such as facial expressions is lacking has not been solved. Therefore, in remote communication, it is necessary to grasp the reactions of each participant more appropriately.
Regarding the above points, the prior art includes, as a method for promoting remote communication, methods for visualizing the communication situation and providing feedback using an interface that outputs the comments and video of each participant. However, in remote communication in which many people participate, the comments and videos of numerous participants are displayed as they are, without regard to their relationship to each participant, which creates a situation in which information important to each participant is neither appropriately visualized nor fed back. Such a situation leads to a lack of presence for each participant and hinders smooth communication.
Furthermore, in remote communication in which many people participate, one's own comments and video are displayed directly regardless of whether the relationships between participants are close or distant, so it is necessary to ensure the anonymity of each participant.
In view of the above, it is an object of the present invention to provide a communication support system capable of realizing remote communication that achieves both an improved sense of presence and ensured anonymity.
The communication support system of the present invention is a communication support system that supports remote communication performed among a plurality of participants, and includes: an input reception unit that receives input from the participants participating in the communication support system; an intimacy estimation unit that estimates the intimacy between a first participant and another participant among the plurality of participants based on information about the first participant and the other participant; an output priority calculation unit that calculates an output priority determining the priority of output content to be output from the other participant to the first participant, based on the intimacy estimated by the intimacy estimation unit; and an output unit that outputs the output content to the first participant based on the output priority calculated by the output priority calculation unit.
According to the present invention, it is possible to provide a communication support system capable of realizing remote communication that achieves both an improved sense of presence and anonymity.
FIG. 1 shows the configuration of users, devices, and the network in the remote communication system according to the first embodiment of the present invention. FIG. 2 shows the configuration of the general information devices used in FIG. 1. FIG. 3 shows the user information stored on the server. FIG. 4 is an example of a screen for checking the relationships between users. FIG. 5 is an example of a screen for selecting information for the transmission target user. FIG. 6 is an example of a screen for setting the feedback amount for each group a user belongs to. FIG. 7 is an example of a screen for setting the feedback amount for emotional expression. FIG. 8 is an example of a model screen of the communication support system according to the first embodiment. FIG. 9 is a modified example of FIG. 8. FIG. 10 is an analysis example of a user's face image. FIG. 11 is an example of data showing the determination values of the analysis of FIG. 10. FIG. 12 is the overall processing flowchart according to the first embodiment. FIG. 13 shows the feedback data periodically transmitted to the server. FIG. 14 shows the calculation of the inter-individual distance according to the first embodiment. FIG. 15 is an example of a model screen of the communication support system according to the second embodiment. FIG. 16 shows the feedback data periodically transmitted to the server according to the second embodiment. FIG. 17 shows the calculation of the inter-individual distance according to the second embodiment. FIG. 18 is a configuration example of a model of the communication support system according to the third embodiment. FIG. 19 is a model example of the user viewing screen according to the third embodiment. FIG. 20 is an example of a model screen of the avatar robot according to the third embodiment. FIG. 21 shows the feedback data periodically transmitted to the server according to the third embodiment. FIG. 22 shows the calculation of the inter-individual distance according to the third embodiment.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. The following description and drawings are examples for explaining the present invention and are omitted and simplified as appropriate for clarity of explanation. The present invention can also be implemented in various other forms. Unless otherwise specified, each component may be singular or plural.
The position, size, shape, range, and the like of each component shown in the drawings may not represent the actual position, size, shape, range, and the like, in order to facilitate understanding of the invention. Therefore, the present invention is not necessarily limited to the positions, sizes, shapes, ranges, and the like disclosed in the drawings.
(第1の実施形態およびコミュニケーション支援システムの構成)
 図1は、本発明の第1の実施形態に係る、遠隔コミュニケーションシステムにおけるユーザ・機器・ネットワークの構成である。
(Structure of the first embodiment and the communication support system)
FIG. 1 is a configuration of a user, a device, and a network in a remote communication system according to the first embodiment of the present invention.
 サーバ1は、一般通信網のネット回線を介して発表ユーザ3と聴講ユーザ6a~6fと接続され、これらのユーザ同士の遠隔コミュニケーションの内容を分析している。発表ユーザ3は、情報端末(PC)2を通じて、聴講ユーザ6a~6fに対して発表している。 The server 1 is connected to the presenting user 3 and the listening users 6a to 6f via the net line of the general communication network, and analyzes the contents of the remote communication between these users. The presentation user 3 makes a presentation to the listening users 6a to 6f through the information terminal (PC) 2.
 各聴講ユーザ6a~6fは、発表ユーザ3の発表を情報端末(PC)2や、情報端末(パッド)7を介して視聴している。聴講ユーザ6a~6fは、第1のグループ4と第2のグループ5とに分かれているが、このグループ分けは、例えば、企業ごとのグループ、組織内の部署ごとのグループ、年齢別のグループ等、遠隔コミュニケーションに参加するユーザの集団の関係性の近さなどによって自由に設定される。 Each of the listening users 6a to 6f watches the presentation of the presentation user 3 through the information terminal (PC) 2 or the information terminal (pad) 7. The listening users 6a to 6f are divided into a first group 4 and a second group 5, and this grouping can be set freely according to, for example, groups by company, groups by department within an organization, groups by age, or the closeness of the relationships among the users participating in the remote communication.
 各グループ内では、グループ内だけに共有される会話があり、テキストメッセージなどで会話の具体的な内容や、発表ユーザ3の発表内容の感想、発表に対する感情表現の内容が共有される。逆に、それぞれのグループ外には具体的な会話内容は原則として伝達されることはなく、その代わり、第1のグループ4と第2のグループ5との間では発表ユーザ3の発表に対する各聴講ユーザ6a~6fの反応を表す顔文字アイコン(非言語反応)など、感情表現についての出力内容だけが共有される。 Within each group, there are conversations shared only within that group, and the specific content of the conversations, impressions of the presentation by the presentation user 3, and emotional expressions toward the presentation are shared via text messages and the like. Conversely, specific conversation content is in principle not transmitted outside each group; instead, only output related to emotional expression, such as emoticon icons (non-verbal reactions) representing the reactions of the listening users 6a to 6f to the presentation of the presentation user 3, is shared between the first group 4 and the second group 5.
 こうしたグループ分けにより、グループごとの秘匿性と親近感を作り出すとともに、自グループ以外のグループあるいは聴講ユーザ6a~6fそれぞれがどのような雰囲気なのかを、共有できる。 By such grouping, it is possible to create confidentiality and familiarity for each group, and to share the atmosphere of each group other than the own group or the listening users 6a to 6f.
 図2は、図1で使用される一般情報機器の構成である。 FIG. 2 is a configuration of a general information device used in FIG.
 サーバ1は、CPU、プログラムデータが入る主メモリ、メモリカードなどの外部記憶(装置)によって構成されている。CPUには、コミュニケーション支援システムに参加する参加者からの入力を受け付ける入力受付部、ユーザ同士の親密度を推定する親密度推定部、ユーザ同士の親密度に基づいて出力優先度を算出する出力優先度算出部、ユーザ同士のコミュニケーションに関する指標を算出する指標算出部、出力優先度やコミュニケーションに関する指標に基づいて出力内容を決定し出力する出力部、が備えられている。詳細は後述する。 The server 1 is composed of a CPU, a main memory that holds program data, and external storage (devices) such as memory cards. The CPU is provided with an input reception unit that receives input from the participants in the communication support system, an intimacy estimation unit that estimates the intimacy between users, an output priority calculation unit that calculates output priorities based on the intimacy between users, an index calculation unit that calculates indices related to communication between users, and an output unit that determines and outputs output content based on the output priorities and the communication-related indices. Details will be described later.
 発表ユーザ3または聴講ユーザ6a~6f(図1)は、USBなどの外部バスで一般通信網と接続されている、ディスプレイ、カメラ、マイク、入力機器(キーボードやマウス、タッチパネル)などの出力機器(情報端末2(PC)または情報端末7(パッド))を使用して、音声映像を視聴もしくは発信している。 The presentation user 3 or the listening users 6a to 6f (FIG. 1) view or transmit audio and video using equipment (the information terminal 2 (PC) or the information terminal 7 (pad)) that is connected to the general communication network and is equipped, via external buses such as USB, with a display, a camera, a microphone, and input devices (keyboard, mouse, touch panel).
 図3は、サーバに保管されるユーザ情報である。図3(a)はユーザ情報のデータテーブル、図3(b)は、図3(a)の履歴に関するデータテーブルである。 FIG. 3 is user information stored in the server. FIG. 3A is a data table of user information, and FIG. 3B is a data table related to the history of FIG. 3A.
 サーバ1(図1)の内部では、図3(a)に示すように、遠隔コミュニケーションシステムを利用するユーザについてのプロフィール情報が管理されている。図3に示すこのユーザ情報は、ユーザAからの視点の個人間距離テーブルの例である。図3(a)のデータテーブルの項目には、所属グループ、あるユーザAと各ユーザとの関係性(dは識別番号であり数値計算には無関係)、コメント番号によるテキストコメントの履歴(詳細は図3(b))、ユーザAと他のユーザとの親密度係数が記載されている。この親密度係数は、ユーザ同士の個人間距離を示す値であり、後述する各ユーザの設定、入力内容、表情などにより、親密度が変化する。 Inside the server 1 (FIG. 1), as shown in FIG. 3(a), profile information about the users of the remote communication system is managed. The user information shown in FIG. 3 is an example of an inter-individual distance table from the viewpoint of user A. The items of the data table in FIG. 3(a) are the affiliated group, the relationship between user A and each other user (d is an identification number and is unrelated to the numerical calculation), the history of text comments by comment number (details in FIG. 3(b)), and the intimacy coefficient between user A and the other user. This intimacy coefficient is a value indicating the inter-individual distance between users, and the intimacy changes depending on each user's settings, input content, facial expressions, and so on, as described later.
 なお、図3(a)に示す親密度係数は、値が小さいほど、対象相手との親密度が高いことを表している。また、このユーザ情報は、図1のように発表者や聴講者関係なく、ユーザ全員に共通するものとして、サーバ1に保存されている。 The intimacy coefficient shown in FIG. 3A indicates that the smaller the value, the higher the intimacy with the target partner. Further, as shown in FIG. 1, this user information is stored in the server 1 as being common to all users regardless of the presenter or the audience.
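 Purely as an illustrative sketch (not part of the original disclosure), the table of FIG. 3(a) could be held on the server as a simple per-pair record, for example in Python as below; the field names and example values are assumptions chosen for the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class RelationRecord:
    """One row of the inter-individual distance table of FIG. 3(a), seen from user A."""
    user_id: str                 # the other user, e.g. "B"
    group: str                   # affiliated group of the other user
    relation_id: int             # identification number d (not used in the numerical calculation)
    comment_ids: list = field(default_factory=list)   # text comment history, by comment number (FIG. 3(b))
    intimacy_coeff: float = 1.0  # smaller value = closer relationship

# The whole table for user A can then simply be a dict keyed by the other user's id.
distance_table_for_A = {
    "B": RelationRecord(user_id="B", group="group 1", relation_id=1,
                        comment_ids=[3, 7], intimacy_coeff=0.4),
}
```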
 図3(b)のテキストコメントの履歴の内容から、ユーザ同士の親密度を判断できる。例えば、あるユーザ同士の履歴数(会話の回数)、テキストコメント内での相手へのメッセージの伝え方(特定のポジティブなワードの使用頻度などで判断)、が判断材料になる。それ以外にも、以下で説明する図4~図7の画面によりそれぞれ設定される情報に基づき、ユーザ同士の関係性や相手ユーザの感情を把握することで、ユーザ同士の親密度を判断できる。こうして判断されたユーザ同士の親密度に応じて設定される親密度係数は、サーバ1において各ユーザに対する出力内容の決定に用いられ、ユーザに出力される内容に影響する。 The intimacy between users can be judged from the contents of the history of text comments in FIG. 3 (b). For example, the number of histories between certain users (number of conversations) and how to convey a message to the other party in a text comment (determined by the frequency of use of a specific positive word) can be used as judgment materials. In addition to that, the intimacy between users can be determined by grasping the relationship between users and the emotions of the other user based on the information set by the screens of FIGS. 4 to 7 described below. The intimacy coefficient set according to the intimacy between the users determined in this way is used in the server 1 to determine the output content for each user, and affects the content output to the user.
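 As a minimal sketch of the idea described above (not the claimed implementation), the comment history of FIG. 3(b) could be scored as follows; the keyword list, the weighting, and the function name are assumptions for illustration only.

```python
POSITIVE_WORDS = {"thanks", "great", "agree", "nice"}   # assumed keyword list, purely illustrative

def intimacy_from_history(comments):
    """Score a pair of users from their text-comment history (FIG. 3(b)):
    more exchanged comments and a higher share of positive wording -> higher score."""
    if not comments:
        return 0.0
    n_comments = len(comments)
    n_positive = sum(any(w in c.lower() for w in POSITIVE_WORDS) for c in comments)
    return n_comments + 2.0 * (n_positive / n_comments)

# Example: two short, friendly exchanges yield a higher score than an empty history.
print(intimacy_from_history(["Thanks for the link", "Great point, I agree"]))
```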
 図4は、ユーザ間の関係性を管理する画面の例である。 FIG. 4 is an example of a screen for managing relationships between users.
 端末2(7)は、各ユーザについての情報を管理する画面を出力している。この端末2(7)のユーザは、各ユーザとの関係性を、区分(同僚、顧客、友人、知人等の区分)や親しさの度合いなどにより、出力画面で選択して設定および編集できる。 Terminal 2 (7) outputs a screen for managing information about each user. The user of the terminal 2 (7) can select, set, and edit the relationship with each user on the output screen according to the classification (classification of colleagues, customers, friends, acquaintances, etc.) and the degree of familiarity.
 各ユーザに設定できる親しさの値は、親密度係数(図3(a))を算出する要素になる。図4に示す親しさの値は、大きいとその分親密度が高いことを表している。なお、親しい場合でなくても、相手が重要な人物(VIPなど)であれば、親しさの値を高い値に設定することで、親密度係数(重要度係数)に影響を及ぼすことができる。このように対象ユーザとの関係性を設定することで、親密度推定部によって個人間距離(=親密度)が推定され、一方のユーザの入力情報に対して他方のユーザに出力される内容が決まる。 The familiarity value that can be set for each user is one element used to calculate the intimacy coefficient (FIG. 3(a)). A larger familiarity value in FIG. 4 represents a correspondingly higher intimacy. Even when the other party is not a close acquaintance, if the other party is an important person (such as a VIP), setting the familiarity value high can influence the intimacy coefficient (importance coefficient). By setting the relationship with the target user in this way, the intimacy estimation unit estimates the inter-individual distance (that is, the intimacy), which determines what is output to the other user in response to one user's input information.
 図5は、送信対象ユーザへの情報を選択する画面の例である。 FIG. 5 is an example of a screen for selecting information for the transmission target user.
 端末2(7)は、送信情報を設定する画面を出力している。例えば、図5では「同僚」に対しては、音声、テキスト、匿名キーワード(自分の名前を対象ユーザに表示せずに使用された特定のキーワードが共有される)、表情情報、感情分類データ(自分の表情に関するアイコン)、匿名感情(自分の名前を対象ユーザに表示せずに感情だけが共有される)を送信できるように設定されており、顔映像については共有されない設定である。顔映像などのきわめてプライベートな情報については、友人の項目でのみ設定されているように、親しい間柄でのみ共有または表示するように選択することができる。この送信情報の設定によってユーザ同士の関係性が決まり、親密度推定部においての計算でユーザ同士の関係性に応じた値が親密度に加算される。 The terminal 2 (7) outputs a screen for setting the transmission information. For example, in FIG. 5, for "colleagues" the settings allow sending voice, text, anonymous keywords (specific keywords used are shared without showing one's name to the target user), facial expression information, emotion classification data (icons related to one's facial expression), and anonymous emotions (only the emotion is shared without showing one's name to the target user), while face video is not shared. Extremely private information such as face video can be chosen to be shared or displayed only with close contacts, as it is set only under the "friend" item. The relationships between users are determined by these transmission settings, and in the calculation performed by the intimacy estimation unit, a value corresponding to the relationship between the users is added to the intimacy.
 なお、どの対象にもチェックがない状態であれば、サーバ1に各情報が送られないため、ユーザ情報の完全な秘匿性が確保される。 If there is no check in any of the targets, each information is not sent to the server 1, so that the complete confidentiality of the user information is ensured.
 図6は、ユーザ所属グループごとのフィードバック量を設定する画面の例である。 FIG. 6 is an example of a screen for setting the feedback amount for each user group.
 端末2(7)は、ユーザが所属するグループに対して、重要度の高い意見や反応を取り入れ、重要でない意見や反応をなるべく少なくするために、フィードバック重みづけの設定をする画面を出力している。 The terminal 2 (7) outputs a screen for setting feedback weights for the groups to which users belong, in order to take in highly important opinions and reactions and to reduce unimportant opinions and reactions as much as possible.
 この表示重みづけの値は、どのグループの反応をどの程度表示するかを決める入力値である。そのため、この値が大きく設定されたグループは、自分の情報端末においての表示が大きくなる。例えば、図6の設定画面で、顧客で金融のグループにいる人の表示重みづけは0.9(最大値が1)であるため、この端末のユーザに対してコメントや反応が大きく表示され、端末の画面上での存在が大きくなる。しかし、自社の部下のグループは、表示重みづけが0.3に設定されているため、この端末でのコメントや反応は小さく表示され、画面上での存在が小さくなる。このように、表示重みづけによってユーザ同士の関係性が決まり、親密度推定部ではこの関係性に応じた値が親密度に加算される。 The value of this display weighting is an input value that determines how much the reaction of which group is displayed. Therefore, a group in which this value is set to a large value will have a large display on its own information terminal. For example, on the setting screen of FIG. 6, since the display weight of a customer who is in a financial group is 0.9 (maximum value is 1), comments and reactions are greatly displayed to the user of this terminal. The presence on the screen of the terminal becomes large. However, since the display weighting of the group of subordinates of the company is set to 0.3, the comments and reactions on this terminal are displayed small, and the presence on the screen becomes small. In this way, the relationship between users is determined by the display weighting, and the intimacy estimation unit adds a value corresponding to this relationship to the intimacy.
 図7は、感情表現に関するフィードバック量を設定する画面の例である。 FIG. 7 is an example of a screen for setting the amount of feedback regarding emotional expression.
 端末2(7)は、相手ユーザの感情を集計する設定を行う画面を出力している。これは、どのような反応を重視するか入力設定する画面であり、例えば、図7のように、ポジティブな感情を集計したいときには、感情グループでポジティブの項目を選択することで、ポジティブな感情に関する詳細項目として、表情、音声、言語、入力のそれぞれごとに、表示の重みづけ値を設定することができる。この表示重みづけ値は、大きければ大きいほど親密度の算出結果に大きく影響する。 The terminal 2 (7) outputs a screen for configuring how the emotions of other users are aggregated. This is a screen for specifying which kinds of reactions should be emphasized. For example, as in FIG. 7, when positive emotions are to be aggregated, selecting the "positive" item in the emotion group allows display weighting values to be set for each detailed item related to positive emotions, namely facial expression, voice, language, and input. The larger a display weighting value is, the more strongly it influences the intimacy calculation result.
 例えば、データ[入力],分類[ボタン:いいね]であれば、1.1という値が設定されている。この項目は、他の項目よりも大きいため、「いいねボタン」が「ポジティブ」の集計結果に大きく影響する。逆に、データ[音声],分類[声質:上機嫌]であれば、0.2という値に設定されているため、「ポジティブ」の集計結果への影響が少ない。このように、個々のグループやユーザごとに表示重みづけ値のカスタマイズができ、各ユーザの顔画像から判断した表情や、各ユーザの端末におけるUIからの入力により、各ユーザの感情をパラメータ化することができる。なお、感情のパラメータ化は、指標算出部がコミュニケーションに関する指標を算出することによって行われる。 For example, for data [input] and classification [button: like], a value of 1.1 is set. Because this item is larger than the others, the "like" button strongly influences the aggregated "positive" result. Conversely, for data [voice] and classification [voice quality: cheerful], the value is set to 0.2, so it has little influence on the aggregated "positive" result. In this way, the display weighting values can be customized for each group and each user, and each user's emotions can be parameterized from the facial expression judged from the user's face image and from the input via the UI of the user's terminal. The parameterization of emotions is performed by the index calculation unit calculating indices related to communication.
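 As an illustrative sketch of this weighted aggregation (not part of the disclosure), the configured weights of FIG. 7 could be combined with observed signals as follows; the keys and numeric values are assumed examples.

```python
# Display weighting values as they might be set on the screen of FIG. 7
# (the keys and the numbers are assumed example values, not the patented ones).
POSITIVE_WEIGHTS = {
    ("input", "button:like"): 1.1,
    ("voice", "tone:cheerful"): 0.2,
    ("expression", "smile"): 0.8,
    ("text", "positive word"): 0.6,
}

def positive_score(signals):
    """Aggregate observed signal strengths (0..1 each) into one 'positive' value
    using the user-configured display weights."""
    return sum(POSITIVE_WEIGHTS.get(key, 0.0) * value for key, value in signals.items())

# A pressed 'like' button and a clear smile dominate a weak cheerful-voice cue.
print(positive_score({("input", "button:like"): 1.0,
                      ("expression", "smile"): 0.9,
                      ("voice", "tone:cheerful"): 0.3}))
```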
 図4の画面で設定されたユーザごとの関係性を表す親しさの値や、図5の画面で設定された送信情報の種類や、図6、図7の画面でそれぞれ設定された表示重みづけの値は、サーバ1(図2)において、親密度推定部が行う各ユーザの親密度の計算に用いられる。すなわち、サーバ1において親密度推定部は、これらの設定情報を用いて、ユーザ同士の関係性に応じた親密度を推定する。 The familiarity values representing the per-user relationships set on the screen of FIG. 4, the types of transmission information set on the screen of FIG. 5, and the display weighting values set on the screens of FIGS. 6 and 7 are used in the server 1 (FIG. 2) for the calculation of each user's intimacy performed by the intimacy estimation unit. That is, in the server 1, the intimacy estimation unit uses these settings to estimate the intimacy according to the relationships between users.
 図8は、第1の実施形態に係る、コミュニケーション支援システムのモデル画面の例である。 FIG. 8 is an example of a model screen of the communication support system according to the first embodiment.
 端末2(7)には、プレゼンテーション画面が出力表示されている。また、プレゼンテーション画面の下側には、フィードバック時間遷移グラフ12と参加している他ユーザの映像13が映し出されている。 The presentation screen is output and displayed on the terminal 2 (7). Further, on the lower side of the presentation screen, the feedback time transition graph 12 and the video 13 of another user participating are projected.
 フィードバック時間遷移グラフ12は、プレゼンに対する各ユーザの反応(感情)がパラメータ化されて表されており、時間経過とともにグラフが左側に流れている。このグラフでは、数値が大きくなる(プレゼンに対して良い反応が多い)と、グラフが上昇する。このフィードバック時間遷移グラフ12による数値化されたフィードバックにより、現在の聴講者の盛り上がり具合を発表者は視覚で理解でき、これによりリアルな反応を感じながら会議を進めることができる。より具体的には、図7で説明した感情の表示重みづけの値を基に、指標算出部が各ユーザの反応に応じた指標を算出し、この指標の値をパラメータとして、出力部がフィードバック時間遷移グラフ12の表示内容を決定することによって、各ユーザの反応がパラメータ化されてフィードバック時間遷移グラフ12に反映される。 The feedback time transition graph 12 shows the reaction (emotion) of each user to the presentation as parameters, and the graph flows to the left with the passage of time. In this graph, the larger the number (there are many good reactions to the presentation), the higher the graph. The quantified feedback from the feedback time transition graph 12 allows the presenter to visually understand the current excitement of the audience, which allows the presenter to proceed with the conference while feeling a realistic reaction. More specifically, the index calculation unit calculates an index according to the reaction of each user based on the value of the emotion display weighting described with reference to FIG. 7, and the output unit feeds back using the value of this index as a parameter. By determining the display content of the time transition graph 12, the reaction of each user is parameterized and reflected in the feedback time transition graph 12.
 他ユーザの顔の映像13は、カメラで撮影した顔の実映像または実画像、表情再現モデル(実在の人物の表情の動きを基に表情を再現するCGやバーチャルモデル)、表情再現アイコン(実在の人物の表情の動きを基にそれに近い表情のアイコンが表記される)のいずれかが表示される。いずれの表示とするかは、サーバ1により、各ユーザ同士の親密度に基づいて出力優先度算出部が算出する出力優先度を基に、ユーザごとに決定される。例えば、親密度が高くて出力優先度が高いユーザほど実映像や実画像に近いものを表示し、出力優先度が低くなるほど抽象度が高くなるように表示を変化させることで、親近感と匿名性を両立させた表示とすることができる。 The face image 13 of another user is displayed as one of the following: actual video or an actual image of the face captured by a camera, a facial expression reproduction model (a CG or virtual model that reproduces a facial expression based on the movement of the real person's facial expression), or a facial expression reproduction icon (an icon with an expression close to that of the real person, based on the movement of the person's facial expression). Which form is displayed is decided by the server 1 for each user, based on the output priority calculated by the output priority calculation unit from the intimacy between the users. For example, by displaying something closer to the actual video or image for users with higher intimacy and higher output priority, and making the display more abstract as the output priority decreases, a display that balances familiarity and anonymity can be achieved.
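 A minimal sketch of such a priority-dependent choice of display form is shown below (not the claimed logic); the thresholds and return labels are illustrative assumptions.

```python
def choose_representation(output_priority):
    """Pick a display form for another user's face, becoming more abstract
    as the output priority drops (the thresholds are illustrative)."""
    if output_priority >= 0.8:
        return "face video"          # real footage, for close / high-priority users
    if output_priority >= 0.4:
        return "expression model"    # CG model reproducing the facial movement
    return "expression icon"         # anonymous emoticon-style icon

print(choose_representation(0.9), choose_representation(0.5), choose_representation(0.1))
```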
 プレゼンテーション画面の右側には、会話内のトレンド単語を集計して表記するWord cloud18や、プレゼンテーション中に共有されるチャットを表記するテキスト欄19が表示されている。Word cloud18は、会議に参加しているユーザのコメントを集計することで、トレンドとなっている単語を大きく表示することができ、プレゼンを視聴しているユーザが、全体としてどのようなキーワードを使っているかを視覚化できている。テキスト欄19には参加者であるユーザのコメントが表示されるが、前述した送信情報の設定(図5)によって、表示されないユーザもいる。 On the right side of the presentation screen, a Word cloud 18 that aggregates and displays trending words in the conversations and a text field 19 that displays the chat shared during the presentation are shown. By aggregating the comments of the users participating in the conference, the Word cloud 18 can display trending words in a larger size, visualizing what kind of keywords the users watching the presentation are using as a whole. Comments from participating users are displayed in the text field 19, but some users' comments are not displayed because of the transmission information settings described above (FIG. 5).
 Word cloud18の下側には、自分の表情を表示する映像16と送信情報を表記する送信データ17が表示されている。さらに、自分の音声や表情のON/OFFを設定する音声ON/OFFボタン14、映像ON/OFFボタン15が備えられている。 Below the Word cloud 18, a video 16 displaying one's facial expression and a transmission data 17 indicating transmission information are displayed. Further, a voice ON / OFF button 14 and a video ON / OFF button 15 for setting ON / OFF of one's own voice and facial expression are provided.
 図9は、図8の変形例である。 FIG. 9 is a modified example of FIG.
 図9は、図8のWord cloud18や、テキスト欄19が、フィードバック2軸グラフ20に置きかわっている。フィードバック2軸グラフ20の中には、2軸グラフに設定された定義に基づいて、グループ重要度を表すプロット21が表示されている。 In FIG. 9, the Word cloud 18 and the text field 19 in FIG. 8 are replaced with the feedback 2-axis graph 20. In the feedback 2-axis graph 20, a plot 21 representing the group importance is displayed based on the definition set in the 2-axis graph.
 例えば図9に示すように、「集中度」(どれだけの聴講ユーザがよそ見せずに話を聞いているか、話者にたいしてのうなずきの回数が多いかどうか)と「ポジティブ度」(話者のプレゼン内容に対しての好感のある特定のキーワードの入力、聴講ユーザの表情から読み取れる笑顔の時間など)の2軸グラフがある。このグラフは、指標算出部により算出されたユーザ同士のコミュニケーションに関する指標をユーザごとに集計した結果を示しており、出力優先度算出部によってユーザ同士の親密度に基づき算出される出力優先度を基に、サーバ1においてその表示内容が決定される。これにより、発表ユーザに対して、タイムリーにプレゼンを視聴するユーザの傾聴状況、あるいは心理状況を伝えることができる。この2軸の項目は自由に設定することができ、例えば「ネガティブ度」「チャット数(時間当たり)」などの設定でも、同様に発表ユーザに会議の臨場感を伝えることができる。 For example, as shown in FIG. 9, there is a two-axis graph of "concentration" (how many listening users are listening without looking away, and whether they nod frequently toward the speaker) and "positivity" (input of specific keywords expressing a favorable impression of the speaker's presentation, the amount of time the listening users are smiling as read from their facial expressions, and so on). This graph shows the result of aggregating, for each user, the indices related to communication between users calculated by the index calculation unit, and its display content is determined in the server 1 based on the output priority calculated by the output priority calculation unit from the intimacy between users. This allows the listening status or psychological state of the users watching the presentation to be conveyed to the presenting user in a timely manner. The two axes can be set freely; settings such as "negativity" and "number of chats (per hour)" can likewise convey the atmosphere of the conference to the presenting user.
 また、グループ重要度を表すプロット21に色の設定をすることで、3軸のグラフにすることもできる。例えば、「ポジティブ度」「集中度」以外に「チャット数」を加えた場合、「チャット数」の多さに応じて、色を明るい色に変えたりすることで、3つの要素を視覚表現できる。このように、視覚化されたフィードバックにより、全体分布や重要分布がわかりやすくなる。 In addition, by assigning colors to the plot 21 representing the group importance, the graph can be turned into a three-axis graph. For example, when "number of chats" is added to "positivity" and "concentration", the three elements can be expressed visually by changing the color to a brighter one according to the number of chats. Visualized feedback in this way makes the overall distribution and the important distribution easier to understand.
 さらに、このフィードバックグラフのプロットに、知り合いや重要者のデータのアイコンをプロットすることで、さらに理解しやすくすることができる。また、どれかの項目が大きくなったら歓声が聞こえるような設定もできる。 Furthermore, by plotting the data icons of acquaintances and important people on the plot of this feedback graph, it can be made easier to understand. You can also set it so that you can hear cheers when any of the items get bigger.
 なお、図8および9のモデル画面例はPCの端末を想定して描かれているが、パッドによる端末でもよい。 Although the model screen examples in FIGS. 8 and 9 are drawn assuming a PC terminal, a pad terminal may also be used.
 図10は、ユーザの顔画像の解析例である。また、図11は、図10の解析の判定値を示したデータの例である。なお、図11(a)は「表情情報」として送信対象となる特徴点データの例であり、図11(b)は、「感情分類データ」として送信対象となるデータの例である。 FIG. 10 is an analysis example of the user's face image. Further, FIG. 11 is an example of data showing the determination value of the analysis of FIG. Note that FIG. 11A is an example of feature point data to be transmitted as “facial expression information”, and FIG. 11B is an example of data to be transmitted as “emotion classification data”.
 カメラ撮影画像22により、サーバ1の指標算出部にてユーザの表情を分析することで、その表情に対応するユーザの感情を判断し、コミュニケーションに関する指標を算出することができる。こうして算出された指標を基に、出力部で表情再現モデルまたは表情再現アイコンを生成する。分析方法は、特徴点抽出画像23に示すように、表情の特徴となるポイント(特徴点)を設定されている数だけx座標とy座標で数値化して抽出し、データテーブルを作成する。これにより、図11(a)に示すように、特徴点抽出画像23における特徴点ごとの座標値が抽出される。そして、各特徴点の座標値を顔の特定部位にそれぞれ対応付けたモデル画像を生成することにより、図10の表情再現モデル24が完成する。こうして完成した表情再現モデル24からユーザの表情を判別することで、ユーザの感情を判断することができる。このとき、ユーザの感情に応じて表情再現モデル24に変化を加えることで、ユーザの感情をより一層分かりやすく表現した表情再現モデル24を生成してもよい。なお、図11(a)では4つの特徴点だけで表記されているが、実際には30以上の特徴点によって判断される。 By analyzing the facial expression of the user in the index calculation unit of the server 1 using the camera captured image 22, it is possible to determine the emotion of the user corresponding to the facial expression and calculate the index related to communication. Based on the index calculated in this way, the output unit generates a facial expression reproduction model or a facial expression reproduction icon. As shown in the feature point extraction image 23, the analysis method creates a data table by digitizing and extracting a set number of points (feature points) that are characteristic of facial expressions in x-coordinates and y-coordinates. As a result, as shown in FIG. 11A, the coordinate values for each feature point in the feature point extraction image 23 are extracted. Then, the facial expression reproduction model 24 of FIG. 10 is completed by generating a model image in which the coordinate values of the feature points are associated with the specific parts of the face. By discriminating the user's facial expression from the facial expression reproduction model 24 completed in this way, the user's emotion can be determined. At this time, the facial expression reproduction model 24 may be generated by changing the facial expression reproduction model 24 according to the user's emotions to further understand the user's emotions. In FIG. 11A, only four feature points are shown, but in reality, it is determined by 30 or more feature points.
 また、指標算出部は、機械学習判別機を利用して特徴点抽出画像23からコミュニケーションに関する指標を算出し、その算出結果に基づいて、表情アイコン25を決定できる。これは図11(b)に示すように、笑顔や頷き具合などの表情項目ごとに、コミュニケーションに関する指標としての判定値を算出することにより、表情アイコン25が決まる。機械学習判別機は、例えばSVM(Support vector machine)や、NN(Neural Network)などである。 Further, the index calculation unit can calculate an index related to communication from the feature point extraction image 23 using a machine learning discriminator, and can determine the facial expression icon 25 based on the calculation result. As shown in FIG. 11B, the facial expression icon 25 is determined by calculating a determination value as an index related to communication for each facial expression item such as a smile or a nod. The machine learning discriminator is, for example, SVM (Support vector machine), NN (Neural Network), or the like.
 なお、表情アイコン25を決定する際に算出した表情項目ごとの判定値(指標)は、親密度推定部が行う各ユーザの親密度の計算において利用してもよい。例えば、図7の画面で設定された詳細項目ごとの表示の重み付け値のうち、表情に関する重み付け値については、この判定値に基づいて親密度の計算における採用の可否を決定することができる。 The determination value (index) for each facial expression item calculated when the facial expression icon 25 is determined may be used in the calculation of the intimacy of each user performed by the intimacy estimation unit. For example, among the weighted values of the display for each detailed item set on the screen of FIG. 7, the weighted value related to the facial expression can be determined whether or not to be adopted in the calculation of intimacy based on this determination value.
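 As a hedged sketch of such a discriminator (not the claimed implementation), the feature points could be classified with scikit-learn, assuming that library is available; the training data below is random and purely hypothetical, standing in for real labelled feature-point sets.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training set: each row is the flattened (x, y) coordinates of four
# face feature points, labelled with an expression item such as "smile" or "nod".
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 8))
y_train = np.array(["smile", "nod", "neutral", "surprise"] * 10)

clf = SVC(probability=True).fit(X_train, y_train)

def judgment_values(feature_points):
    """Return a judgment value per expression item for one face (cf. FIG. 11(b))."""
    probs = clf.predict_proba(np.asarray(feature_points).reshape(1, -1))[0]
    return dict(zip(clf.classes_, probs))

print(judgment_values(rng.normal(size=8)))
```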
 上記のようにして作成された表情再現モデル24や表情アイコン25は、ユーザ同士のコミュニケーションに関する指標に基づいて各ユーザの表情を再現した画像である。これらはサーバ1から各ユーザの端末へと送信され、例えば図8、9に示したモデル画面において映像13として表示されることで、各ユーザに対する出力が行われる。なお、図10は、表情の分析による出力だが、同様に音声からの入力を基にして笑い声の認識、声の声量などで、音を機械学習の分類機などにかけて感情パラメータを生成することでフィードバックする機構も備えている。 The facial expression reproduction model 24 and the facial expression icon 25 created as described above are images that reproduce each user's facial expression based on the indices related to communication between users. These are transmitted from the server 1 to each user's terminal and displayed, for example, as the image 13 on the model screens shown in FIGS. 8 and 9, thereby producing the output for each user. Although FIG. 10 shows output based on facial expression analysis, the system also has a mechanism for feedback based on voice input, in which sounds are passed to a machine-learning classifier or the like to generate emotion parameters from the recognition of laughter, the volume of the voice, and so on.
 また、認識機構において参加ユーザの国籍ごとに変換ルールを設定することで、文化差異を吸収することもできる。例えば、国ごとに異なるジェスチャや笑顔の閾値を設定し、感情をパラメータに反映することで差異をなくすことができる。 In addition, cultural differences can be absorbed by setting conversion rules for each nationality of participating users in the recognition mechanism. For example, it is possible to eliminate the difference by setting different gesture and smile thresholds for each country and reflecting emotions in the parameters.
 図12は、第1の実施形態に係る、全体処理フローチャートである。 FIG. 12 is an overall processing flowchart according to the first embodiment.
 まず、端末上での個人クライアント動作についてのフローを説明する。個人クライアント動作は、PCやパッドの端末により操作された内容である。ステップS1で端末の情報設定をする。ステップS2で端末にログインをする。ステップS2でログインした情報は、サーバにも共有される(ステップS104で後述)。 First, the flow of personal client operation on the terminal will be explained. The personal client operation is the content operated by the terminal of the PC or the pad. Set the terminal information in step S1. Log in to the terminal in step S2. The information logged in in step S2 is also shared with the server (described later in step S104).
 ステップS3~ステップS12は、ある一定時間ごとの繰り返すフロー(ループ処理)であり、ステップS3でループ処理を開始する。ステップS4で、端末を使用しているユーザの当該時刻におけるマイク録音操作によって得られる音声情報を取得する。ステップS5で、マイクの録音情報をテキストに変換し、録音情報とテキストから感情情報を推定することで、端末を使用しているユーザの音声情報を処理している。ステップS6で、端末を使用しているユーザが当該時刻においてカメラを録画操作することで得られる映像情報を取得する。ステップS7で、ステップS6で取得したカメラの画像情報から、端末を使用しているユーザの映像情報で得られる顔の特徴を認識し、その特徴の度合いによって感情に関する情報を推定することで処理する。ステップS8で、ステップS4~ステップS7で取得・処理された情報を基に、サーバに送信する入力情報を取得する。 Steps S3 to S12 are flows (loop processing) that are repeated at regular time intervals, and the loop processing is started in step S3. In step S4, the voice information obtained by the microphone recording operation at the time of the user using the terminal is acquired. In step S5, the voice information of the user using the terminal is processed by converting the recorded information of the microphone into text and estimating the emotional information from the recorded information and the text. In step S6, the user using the terminal acquires the video information obtained by recording the camera at the time. In step S7, the facial features obtained from the video information of the user using the terminal are recognized from the image information of the camera acquired in step S6, and the information related to emotions is estimated according to the degree of the features. .. In step S8, the input information to be transmitted to the server is acquired based on the information acquired / processed in steps S4 to S7.
 ステップS9でサーバに入力情報を送信する。送信される情報は、サーバの入力受付部で入力が受け付けられる。この送信は音声・映像情報を読み取るタイミングで、バッファに送信されているデータを読み込み、処理対象としている非同期の通信である。ステップS10でサーバから配信された情報を受信する。このステップS9の送信またはステップS10の受信は、ステップS3~ステップS12のループ処理のたびに定期的に送信・受信される。ステップS11で、サーバから受信した情報を基に、端末で画像(顔映像、表情再現モデル、表情アイコン)を出力する。ステップS13でログアウトによって、個人クライアント動作のフローは終了する。 Send the input information to the server in step S9. The information to be transmitted is input by the input reception unit of the server. This transmission is an asynchronous communication in which the data transmitted to the buffer is read at the timing of reading the audio / video information and is processed. Receive the information delivered from the server in step S10. The transmission of step S9 or the reception of step S10 is periodically transmitted / received each time the loop processing of steps S3 to S12 is performed. In step S11, the terminal outputs an image (face image, facial expression reproduction model, facial expression icon) based on the information received from the server. By logging out in step S13, the flow of individual client operation ends.
 続いて、サーバ動作についてのフローを説明する。ステップS101でサーバ1を起動する。ステップS102で個人情報データベース(DB)の読み取りを行う。 Next, the flow of server operation will be explained. Server 1 is started in step S101. In step S102, the personal information database (DB) is read.
 ステップS103~ステップS112は、ある一定時間ごとの繰り返すフロー(ループ処理)であり、ステップS103でループ処理を開始する。ステップS104で、個人クライアント動作において、ステップS2のログインの情報を受けて、新規ログイン受付をする。ステップS105で、ログイン中のユーザのうち、あるユーザAを対象にループ処理を開始する。 Steps S103 to S112 are flows (loop processing) that are repeated at regular time intervals, and the loop processing is started in step S103. In step S104, in the operation of the individual client, the login information of step S2 is received and a new login is accepted. In step S105, the loop process is started for a certain user A among the logged-in users.
 ステップS106で、ステップS9において端末から定期送信される情報を受信する。具体的には、入力受付部が端末の入力を受け付けている。ステップS107で、ユーザAとその他の複数のユーザとの個人間距離を確認して出力内容を決定し、ユーザAの端末へ送信するループ処理を開始する。ステップS108でユーザAとある他のユーザとの個人間距離を計算・確認する。具体的には、ステップS107とステップS108のフローで、指標算出部による指標の算出を行い、親密度推定部による親密度の算出を行い、算出した親密度に基づいて出力優先度算出部が出力優先度を算出し、算出した出力優先度に基づき、出力部が端末での出力内容を決定する。ここで決定される出力内容には、図8、9の映像13として表示される各ユーザの表情再現モデルや表示再現アイコン、Word cloud18やテキスト欄19における表示内容などが含まれる。ステップS109では、ステップS108で決定した出力内容の情報をユーザAの端末に送信する。この送信は、ステップS9と同様の非同期通信である。 In step S106, the information periodically transmitted from the terminal in step S9 is received. Specifically, the input reception unit accepts the input from the terminal. In step S107, a loop is started in which the inter-individual distance between user A and each of the other users is checked, the output content is determined, and the result is transmitted to user A's terminal. In step S108, the inter-individual distance between user A and one other user is calculated and checked. Specifically, in the flow of steps S107 and S108, the index calculation unit calculates the indices, the intimacy estimation unit calculates the intimacy, the output priority calculation unit calculates the output priority based on the calculated intimacy, and the output unit determines the output content at the terminal based on the calculated output priority. The output content determined here includes each user's facial expression reproduction model or facial expression reproduction icon displayed as the image 13 in FIGS. 8 and 9, and the content displayed in the Word cloud 18 and the text field 19. In step S109, the information on the output content determined in step S108 is transmitted to user A's terminal. This transmission is asynchronous communication similar to step S9.
 ステップS110でユーザAとある他のユーザとの個人間距離を確認して出力内容を決定し、ユーザAの端末へ送信するループ処理を終了する。このステップS107~S110のループ処理がユーザAを除いた各ユーザについてそれぞれ実行されることにより、ユーザAと他の各ユーザとの個人間距離(親密度)に応じた出力優先度に従って、各ユーザからユーザAに対して出力する出力内容が決定され、その出力内容の情報がサーバ1からユーザAの端末へと送信される。 In step S110, the loop in which the inter-individual distance between user A and one other user is checked, the output content is determined, and the result is transmitted to user A's terminal is ended. By executing the loop of steps S107 to S110 for each user other than user A, the output content to be output from each user to user A is determined according to the output priority corresponding to the inter-individual distance (intimacy) between user A and each other user, and the information on that output content is transmitted from the server 1 to user A's terminal.
 ステップS111でユーザAを対象とするループ処理を終了する。このステップS105~S111のループ処理が各ユーザについてそれぞれ実行されることにより、各ユーザの端末における出力内容が決定され、その出力内容の情報がサーバ1から各ユーザの端末へと送信される。ステップS112でステップS103~ステップS112のループ処理を終了する。 In step S111, the loop process targeting user A is terminated. By executing the loop processing of steps S105 to S111 for each user, the output content in the terminal of each user is determined, and the information of the output content is transmitted from the server 1 to the terminal of each user. In step S112, the loop processing of steps S103 to S112 is terminated.
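 An illustrative sketch of these nested loops is given below (not the claimed implementation); all helper callables, the priority formula, and the data layout are assumptions standing in for the server-side units.

```python
def server_tick(logged_in_users, latest_inputs,
                estimate_intimacy, decide_output, send_to_terminal):
    """One pass of the loop of steps S103-S112. estimate_intimacy is assumed to
    return the intimacy coefficient (smaller = closer), as in FIG. 3(a)."""
    for user_a in logged_in_users:                      # loop S105-S111
        outputs = []
        for user_x in logged_in_users:                  # loop S107-S110
            if user_x == user_a:
                continue
            coeff = estimate_intimacy(user_a, user_x)           # intimacy estimation (S108)
            priority = 1.0 / (1.0 + coeff)                      # smaller coefficient -> higher priority
            content = decide_output(latest_inputs[user_x], priority)  # output unit
            outputs.append((user_x, priority, content))
        send_to_terminal(user_a, outputs)               # asynchronous transmission (S109)

# Example with trivial stand-ins:
server_tick(["A", "B"], {"A": "hi", "B": "nice"},
            estimate_intimacy=lambda a, x: 0.5,
            decide_output=lambda inp, p: inp if p > 0.5 else "icon",
            send_to_terminal=lambda user, outs: print(user, outs))
```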
 図13は、第1の実施形態に係る、定期的にサーバに送信されるフィードバックデータの例である。 FIG. 13 is an example of feedback data periodically transmitted to the server according to the first embodiment.
 図13に示すフィードバックデータは、図12のステップS9で送信される情報である。それぞれの項目について説明する。音声情報開示フラグの欄には、[10010…]というように会議に参加しているユーザA以外の12人分のユーザに対して、ユーザAの音声情報を開示するかどうかのフラグが示されている。これにより、だれに対して自分に関する情報を公開してよいか判別している。対応フラグが0であるグループには開示がされない。 The feedback data shown in FIG. 13 is the information transmitted in step S9 of FIG. 12. Each item is explained below. The voice information disclosure flag column shows, in a form such as [10010…], flags indicating whether user A's voice information is to be disclosed to each of the 12 users other than user A participating in the conference. This determines to whom information about oneself may be disclosed. No disclosure is made to parties whose corresponding flag is 0.
 音声情報データの欄には、ファイルの種類が記録される。テキスト情報開示フラグの欄には、音声情報開示フラグの欄と同様に、ユーザA以外の12人分のユーザに対して、ユーザAが音声情報を開示するかどうかのフラグが示されている。テキスト情報データの欄は、ユーザAが入力したテキスト内容が記録されている。顔画像開示フラグの欄は、音声情報開示フラグの欄、テキスト情報開示フラグの欄と同様である。顔画像データの欄は、顔画像データが記録されている。表情情報開示フラグの欄は、顔画像開示フラグの欄、音声情報開示フラグの欄、テキスト情報開示フラグの欄と同様である。 The file type is recorded in the voice information data column. Similar to the voice information disclosure flag field, the text information disclosure flag column shows a flag as to whether or not the user A discloses voice information to 12 users other than the user A. In the text information data field, the text content input by the user A is recorded. The face image disclosure flag column is the same as the voice information disclosure flag column and the text information disclosure flag column. Face image data is recorded in the face image data column. The facial expression information disclosure flag column is the same as the facial image disclosure flag column, the voice information disclosure flag column, and the text information disclosure flag column.
 表情情報データの欄は、座標値リストが記録されている。これは、前述した表情再現モデルの構築(図10参照)に用いられる値である。感情情報開示フラグの欄は、表情情報開示フラグの欄、顔画像開示フラグの欄、音声情報開示フラグの欄、テキスト情報開示フラグの欄と同様である。感情情報データの欄は、顔映像から分析した表情アイコン(図10参照)についての判定値が記録されている。 A coordinate value list is recorded in the facial expression information data column. This is a value used for constructing the above-mentioned facial expression reproduction model (see FIG. 10). The emotional information disclosure flag column is the same as the facial expression information disclosure flag column, the face image disclosure flag column, the voice information disclosure flag column, and the text information disclosure flag column. In the emotion information data column, the determination value for the facial expression icon (see FIG. 10) analyzed from the facial image is recorded.
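 As an illustrative sketch of how such per-recipient disclosure flags might gate the data (not the claimed implementation), the field names and flag strings below are assumptions modelled on FIG. 13.

```python
def visible_fields(feedback, recipient_index):
    """Keep only the fields of one sender's feedback whose disclosure flag is '1'
    for the given recipient (cf. the flag strings such as [10010...] in FIG. 13)."""
    kept = {}
    for name, payload in feedback["data"].items():
        flags = feedback["flags"][name]        # one character per recipient
        if flags[recipient_index] == "1":
            kept[name] = payload
    return kept

# Example: user A's voice is forwarded only to recipients whose flag is 1.
feedback_a = {
    "flags": {"voice": "100101011101", "text": "111111111111"},
    "data":  {"voice": "voice.wav",    "text": "nice point"},
}
print(visible_fields(feedback_a, recipient_index=1))   # -> {'text': 'nice point'}
```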
 上記のフィードバックデータがサーバに送られ、親密度推定部によって各ユーザとユーザAとの個人間距離に関する情報である親密度係数が決定される。また、それぞれのユーザの上記の情報は、指標算出部において、感情集計設定画面(図7)で設定された表示重みづけ値で合計されて、各ユーザの目的別感情集計値が算出される。この目的別感情集計値は、サーバ1から各ユーザの端末へと送信され、各端末において図9のフィードバック2軸グラフ20のような画面上のグラフ表示に用いられ、例えば目的設定として、ポジティブ度、集中度としてプロット集団で表現される。また、受信表示側ユーザとの相関を計算して、個人間距離における共感値の計算に用いられる(図14で後述)。 The above feedback data is sent to the server, and the intimacy estimation unit determines the intimacy coefficient, which is information on the inter-individual distance between each user and user A. In addition, the above information of each user is summed in the index calculation unit with the display weighting values set on the emotion aggregation setting screen (FIG. 7), so that a purpose-specific emotion aggregate value is calculated for each user. This purpose-specific emotion aggregate value is transmitted from the server 1 to each user's terminal and used there for an on-screen graph display such as the feedback two-axis graph 20 of FIG. 9, for example expressed as a group of plots for the configured purposes such as positivity and concentration. It is also used, by calculating the correlation with the receiving user, for the calculation of the empathy value in the inter-individual distance (described later with FIG. 14).
 図14は、第1の実施形態に係る、個人間距離の計算の処理を示すフローチャートである。以下、フローチャートを説明する。 FIG. 14 is a flowchart showing the process of calculating the inter-individual distance according to the first embodiment. The flowchart will be described below.
 図14で作成される個人間距離テーブルは、図12のステップS108での個人間距離の計算・確認で使用され、図12のフローとは別に独立して作成される。まず、ステップS20で、ユーザA(画面閲覧者)を選択する。ステップS21で、会議に参加している各視聴者(ユーザX)との個人間距離に関する決定を行うループ処理を開始する。ステップS22では、対象者情報があるどうかの確認をする。この対象者情報がユーザAの情報に無ければ関係性がないと判定される。ステップS23では、対象者の分類グループのデータを取得する。 The inter-individual distance table created in FIG. 14 is used in the calculation / confirmation of the inter-individual distance in step S108 of FIG. 12, and is created independently of the flow of FIG. First, in step S20, user A (screen viewer) is selected. In step S21, a loop process for determining the distance between individuals with each viewer (user X) participating in the conference is started. In step S22, it is confirmed whether or not there is target person information. If this target person information does not exist in the information of user A, it is determined that there is no relationship. In step S23, the data of the classification group of the target person is acquired.
 ステップS24では、ステップS23で取得したグループごとに重要度情報を入力する。ここでいう重要度情報とは、フィードバックの重みづけの設定での表示重みづけ(図6)のことである。ステップS25では、端末の管理画面からそれぞれのユーザに対して予め設定された「親しさ」の数値を入力する(図4)。ステップS26では、ユーザAがそれぞれのユーザと会話した履歴の回数から、会話での親密さを算出する。ステップS24からステップS26のフローで入力された数値および算出された値は、親密度推定部が個人間距離(親密度)を決める計算で使用される。 In step S24, the importance information is input for each group acquired in step S23. The importance information referred to here is the display weighting (FIG. 6) in the feedback weighting setting. In step S25, a numerical value of "friendliness" preset for each user is input from the management screen of the terminal (FIG. 4). In step S26, the intimacy in the conversation is calculated from the number of times the user A has talked with each user. The numerical value and the calculated value input in the flow from step S24 to step S26 are used in the calculation in which the intimacy estimation unit determines the inter-individual distance (intimacy).
 ステップS27で、ユーザXに関するユーザAの感情データとの相関を計算して、ユーザAとユーザXの共感値を算出する。この共感値の算出には、図13のフィードバックデータも用いられる。正の相関がある一定の閾値より高い場合には、共感値が個人間距離の計算に加算される。 In step S27, the correlation with the emotion data of the user A regarding the user X is calculated, and the sympathy value between the user A and the user X is calculated. The feedback data of FIG. 13 is also used to calculate this sympathy value. If the positive correlation is higher than a certain threshold, the empathy value is added to the calculation of the inter-individual distance.
 ステップS28で、出力優先度算出部により、ステップS22~ステップS27のフローに基づいてユーザAとユーザXとの個人間距離を決定する。ステップS29でステップS21~ステップS29のループ処理を終了する。このステップS21~S29のループ処理がユーザA以外の各ユーザについてそれぞれ実行されることにより、ステップS30でユーザAの個人間距離についてのデータテーブルが完成する。この個人間距離データテーブルは、あるユーザAから各ユーザに対して作成され、集計されたデータテーブルをもとに出力が決まる。 In step S28, the output priority calculation unit determines the inter-individual distance between the user A and the user X based on the flow of steps S22 to S27. In step S29, the loop processing of steps S21 to S29 is terminated. By executing the loop processing of steps S21 to S29 for each user other than the user A, the data table regarding the inter-individual distance of the user A is completed in the step S30. This inter-personal distance data table is created for each user from a certain user A, and the output is determined based on the aggregated data table.
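 Purely as a sketch of how steps S24 to S28 might be combined (not the claimed formula), the inputs could be merged as below; the weighting constants, the empathy threshold, and the final mapping to a distance are assumptions for illustration.

```python
import numpy as np

def personal_distance(group_weight, familiarity, n_conversations,
                      emotions_a, emotions_x, empathy_threshold=0.5):
    """Combine the inputs of steps S24-S27 into one inter-individual distance for
    the pair (A, X). The combination rule and constants are illustrative only."""
    closeness = group_weight + 0.1 * familiarity + 0.05 * n_conversations
    # Step S27: add an empathy term only when the two emotion histories
    # correlate positively above the threshold.
    a, x = np.asarray(emotions_a, float), np.asarray(emotions_x, float)
    if len(a) > 1 and a.std() > 0 and x.std() > 0:
        corr = float(np.corrcoef(a, x)[0, 1])
        if corr > empathy_threshold:
            closeness += corr
    # Smaller distance = closer pair, matching the intimacy coefficient of FIG. 3(a).
    return 1.0 / (1.0 + closeness)

print(personal_distance(0.9, 4, 12, [0.2, 0.8, 0.6], [0.1, 0.9, 0.5]))
```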
 以上のように個人間距離に関するデータテーブルを作成することによって、ユーザ同士の関係性やコミュニケーションの様子から、ユーザ同士の親密度を推定し、その推定結果に基づいて、ユーザから他のユーザへの出力(コメント、発話)の優先度付けを行うことができる。例えば、親密度が高い参加者にはコメントを明示したり大きく出力したりすることで、当該参加者にとって関心度が高いコメントを優先的に出力できるため、臨場感のある遠隔コミュニケーションを実現することができる。また、ユーザから他のユーザへのコミュニケーションに関する指標(たとえば賛成、反対、理解)も算出し、これと出力優先度に基づいて、指標の値を表現したグラフを出力したり、指標に基づく画像や映像(たとえばユーザの顔の映像、表情再現モデル、表情再現アイコンなど)を出力したりすることができる。 By creating the data table on inter-individual distance as described above, the intimacy between users can be estimated from their relationships and the state of their communication, and based on that estimate, the output from one user to another (comments, utterances) can be prioritized. For example, by explicitly showing or displaying more prominently the comments of participants with high intimacy, comments of high interest to a given participant can be output preferentially, realizing remote communication with a sense of presence. Indices related to communication from a user to other users (for example, agreement, disagreement, understanding) are also calculated, and based on these and the output priority, a graph expressing the index values can be output, and images or video based on the indices (for example, video of the user's face, a facial expression reproduction model, or a facial expression reproduction icon) can be output.
(第2の実施形態)
 図15は、第2の実施形態に係る、コミュニケーション支援システムのモデル画面の例である。
(Second embodiment)
FIG. 15 is an example of a model screen of the communication support system according to the second embodiment.
 第2の実施形態のコミュニケーション支援システムでのポイントは、VR(Virtual Reality)上での展示会を対象にしたもので、VR空間で行われる遠隔コミュニケーションである。第1の実施形態と共通する点は省略し、異なる点を中心に説明していく。 The point of the communication support system of the second embodiment is for exhibitions on VR (Virtual Reality), and is remote communication performed in VR space. The points common to the first embodiment will be omitted, and the differences will be mainly described.
 端末2(7)が出力するVR空間の映像27には、第1の展示物28、第1の展示物を見ているユーザの表情再現アイコン29、第2の展示物30が映し出されている。VR空間の映像27に示す通り、VR空間内で各ユーザ(アバタ)が仮想空間において視線の方向、位置、対象物との距離、を画面上で立体的に把握することができる。 The VR space image 27 output by the terminal 2 (7) shows a first exhibit 28, a facial expression reproduction icon 29 of a user looking at the first exhibit, and a second exhibit 30. As shown in the VR space image 27, each user (avatar) can grasp three-dimensionally on the screen the gaze direction, position, and distance to objects of the users in the virtual space.
 さらに、出力画面のユーザの視線対象を判定する第1の展示物への反応フィードバック情報31により、前述した数値化されたフィードバック(2軸グラフ)20のグループ重要度21、また第1の展示物を見ているユーザの表情情報32の画面も併せて、同じ対象物を見ているユーザの雰囲気を視覚化できる。これにより、そのVR空間の場に参加しているユーザの表情や雰囲気を理解することができる。 Furthermore, the reaction feedback information 31 for the first exhibit, which is identified from the gaze targets of the users on the output screen, together with the group importance 21 of the quantified feedback (two-axis graph) 20 described above and the screen of facial expression information 32 of the users looking at the first exhibit, makes it possible to visualize the atmosphere of the users who are looking at the same object. This makes it possible to understand the facial expressions and atmosphere of the users participating in that VR space.
 例えば、知り合いのユーザがある展示物に対して音声でコメントを言うと、同じ展示物を見ている他のユーザはリアルタイムでその知り合いユーザの音声を聞くことができる。これは、コメント欄に表示される知り合いのユーザが文章でコメントした場合も同様で、そのまま文章として表示される。しかし、同じ展示を見ている一般ユーザ(関係性がない)の意見は単語レベルでWord cloud18で表示される。 For example, if an acquaintance user makes a voice comment on an exhibit, other users who are looking at the same exhibit can hear the acquaintance user's voice in real time. This also applies when an acquaintance user displayed in the comment field makes a comment in a sentence, and the user is displayed as a sentence as it is. However, the opinions of general users (not related) who are viewing the same exhibition are displayed in Word cloud 18 at the word level.
 図16は、第2の実施形態に係る、定期的にサーバに送信されるフィードバックデータである。 FIG. 16 is feedback data periodically transmitted to the server according to the second embodiment.
 図13で説明したデータテーブルに、アバタ位置操作情報の欄、アバタ角度操作情報の欄が追加されている。アバタ位置操作情報の欄には、XYZ座標によるVR空間での位置が示されている。アバタ角度操作情報の欄には、アバタがVR空間内で対象物を見ている視線方向に関する数値が示されている。つまりフィードバックデータには、アバタの操作情報が一緒に送られる。 A column for avatar position operation information and a column for avatar angle operation information have been added to the data table described with reference to FIG. In the column of avatar position operation information, the position in the VR space by the XYZ coordinates is shown. In the column of the avatar angle operation information, the numerical value regarding the line-of-sight direction in which the avatar is looking at the object in the VR space is shown. That is, the operation information of the avatar is sent together with the feedback data.
 図17は、第2の実施形態に係る、個人間距離の計算である。 FIG. 17 is a calculation of the inter-individual distance according to the second embodiment.
 図14に示した第1の実施形態の個人間距離の計算と異なる点は、ステップS28AとステップS29Aである。ステップS28Aでは、ユーザXがユーザAと同一の対象を見ているなら、ユーザAとユーザXとの個人間距離の計算において、これらのユーザ間の関係性に応じた親密度の値を加算する。ステップS29Aでは、VR内でユーザXがユーザAから一定の距離の範囲内にいれば、ユーザAとユーザXとの個人間距離の計算において、これらのユーザ間の関係性に応じた親密度の値を加算する。これによりVR空間での個人間距離テーブルが完成する(ステップS32A)。 The differences from the calculation of the inter-individual distance in the first embodiment shown in FIG. 14 are steps S28A and S29A. In step S28A, if user X is looking at the same object as user A, an intimacy value corresponding to the relationship between these users is added in the calculation of the inter-individual distance between user A and user X. In step S29A, if user X is within a certain distance of user A in the VR space, an intimacy value corresponding to the relationship between these users is likewise added in the calculation of the inter-individual distance between user A and user X. This completes the inter-individual distance table in the VR space (step S32A).
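 A minimal sketch of these two VR-specific additions is shown below (not the claimed computation); the bonus values, the radius, and the dictionary layout are assumptions for illustration.

```python
import math

def vr_closeness_bonus(viewer_a, viewer_x,
                       same_target_bonus=0.3, nearby_bonus=0.2, radius=5.0):
    """Extra closeness added in the VR variant (steps S28A / S29A)."""
    bonus = 0.0
    if viewer_a["gaze_target"] == viewer_x["gaze_target"]:        # looking at the same exhibit
        bonus += same_target_bonus
    if math.dist(viewer_a["position"], viewer_x["position"]) <= radius:  # avatars close in XYZ space
        bonus += nearby_bonus
    return bonus

print(vr_closeness_bonus({"gaze_target": "exhibit 1", "position": (0, 0, 0)},
                         {"gaze_target": "exhibit 1", "position": (1, 2, 0)}))
```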
(第3の実施形態)
 図18は、第3の実施形態に係る、コミュニケーション支援システムのモデルの構成例である。
(Third embodiment)
FIG. 18 is a configuration example of a model of the communication support system according to the third embodiment.
 第3の実施形態でのポイントは、アバタロボットを経由した遠隔コミュニケーションである。第2の実施形態の説明と同様に、第1の実施形態と共通する点は省略し、異なる点を中心に説明していく。 The point in the third embodiment is remote communication via the avatar robot. Similar to the description of the second embodiment, the points common to the first embodiment will be omitted, and the differences will be mainly described.
 アバタロボット37は、全周囲カメラ34を備えており、実空間上に設営された発表現場36において稼働される。アバタロボット37が全周囲カメラ34を用いて撮影した全周囲映像33は、アバタロボット37の周囲に存在する現場発表者35、現場発表者39等を映し出したものである。 The avatar robot 37 is equipped with an omnidirectional camera 34 and is operated at the presentation site 36 set up in the real space. The omnidirectional image 33 taken by the avatar robot 37 using the omnidirectional camera 34 is a projection of the on-site presenter 35, the on-site presenter 39, and the like existing around the avatar robot 37.
 アバタロボット37は、撮影した全周囲映像33を各ユーザ6aB~6dBに送信して映し出す。各ユーザ6aB~6dBは、PC、パッド、スマートフォン、VR用ヘッドセットなどの出力端末によって全周囲映像33を見ることができる。 The avatar robot 37 transmits and projects the captured omnidirectional image 33 to each user 6aB to 6dB. Each user 6aB to 6dB can view the omnidirectional image 33 by an output terminal such as a PC, a pad, a smartphone, or a VR headset.
 各ユーザ6aB~6dBが全周囲映像33を視聴しているが、ユーザ6bBとユーザ6cBとユーザ6dBは全周囲映像33の中の同じ対象(対象物映像33b)を見ていて、ユーザ6aBは全周囲映像33の中の別の対象(対象物映像33a)を見ている。よって図18では、ユーザ6aBは、ユーザ6bB~ユーザ6dBの現場36へのフィードバックには加わらない。 Each of the users 6aB to 6dB is viewing the omnidirectional video 33, but users 6bB, 6cB, and 6dB are looking at the same target in the omnidirectional video 33 (object video 33b), whereas user 6aB is looking at a different target in the omnidirectional video 33 (object video 33a). Therefore, in FIG. 18, user 6aB does not take part in the feedback from users 6bB to 6dB to the site 36.
 ユーザ6bB~ユーザ6dBが全周囲映像33を視聴している様子や反応から、表情再現モデルや表情アイコンなどを作成して現場36に対してサーバがフィードバックする。フィードバック内容の出力は、アバタロボット37に設置された360度4面ディスプレイ40において、現場発表者39の前の画面38や、現場発表者35の前の画面41、にフィードバックとしてそれぞれ送信される。フィードバック内容は、例えば画面42である。その内容は具体的には、例えば現場発表者35の発表内容に対するユーザ6dBの感想である。 The server creates a facial expression reproduction model, a facial expression icon, etc. from the appearance and reaction of the user 6bB to the user 6dB watching the omnidirectional image 33, and feeds back to the site 36. The output of the feedback content is transmitted as feedback to the screen 38 in front of the on-site presenter 39 and the screen 41 in front of the on-site presenter 35 on the 360-degree four-sided display 40 installed in the avatar robot 37, respectively. The feedback content is, for example, screen 42. Specifically, the content is, for example, the impression of the user 6 dB on the content of the presentation by the on-site presenter 35.
 図19は、第3の実施形態に係る、ユーザ視聴画面のモデル例である、ユーザの視聴内容である。図18の説明と同様に、第1の実施形態および第2の実施形態と共通する点は省略し、異なる点を中心に説明していく。 FIG. 19 is a user viewing content, which is an example of a user viewing screen model according to the third embodiment. Similar to the description of FIG. 18, the points common to the first embodiment and the second embodiment will be omitted, and the differences will be mainly described.
 第1の実施形態(図8、図9参照)および第2の実施形態(図15参照)との違いは、アバタロボットにおけるユーザの視点映像46を端末2(7)の画面に使用していることである。ユーザの視点映像46には、発表者A39、ユーザBのアバタ44、ユーザBのフィードバック内容を表す表情再現アイコン45、ユーザCのアバタ48、ユーザCのフィードバック内容を表す顔画像アイコン47、が表示されている。これは、ユーザが発表者Aに視線を合わせている時、他のユーザも同様に視線を発表者Aに合わせていると、ユーザの視点映像46に他ユーザのアバタ(図19ではアバタ44、48)が登場するようになっている。逆に、ユーザの一定の視聴角度から外れた視聴角度に他のユーザの視聴角度が移動すると、ユーザの画面から他ユーザのアイコンが消える。 The difference from the first embodiment (see FIGS. 8 and 9) and the second embodiment (see FIG. 15) is that the user's viewpoint video 46 from the avatar robot is used for the screen of the terminal 2 (7). The user's viewpoint video 46 displays presenter A 39, the avatar 44 of user B, a facial expression reproduction icon 45 representing user B's feedback, the avatar 48 of user C, and a face image icon 47 representing user C's feedback. When a user is directing their gaze at presenter A and another user is likewise directing their gaze at presenter A, the other user's avatar (avatars 44 and 48 in FIG. 19) appears in the user's viewpoint video 46. Conversely, when the other user's viewing angle moves outside a certain range around the user's viewing angle, the other user's icon disappears from the user's screen.
 図20は、第3の実施形態に係るアバタロボットのモデル画面の例である。 FIG. 20 is an example of a model screen of the avatar robot according to the third embodiment.
 図20は、発表者Aから見たアバタロボットの様子である。アバタロボットの360度4面ディスプレイ40には、ユーザBのフィードバック内容である感情アイコン45、ユーザCのフィードバック内容である顔画像アイコン47、ユーザAのフィードバック内容である表情情報49、が映し出されている。現場にはアバタロボットだけがいる状態と仮定すると、これを見ることで、発表者Aは自分のプレゼンテーション内容に対しての反応を視覚的に理解することができる。 FIG. 20 shows the avatar robot as seen from presenter A. The 360-degree four-sided display 40 of the avatar robot shows an emotion icon 45 as user B's feedback, a face image icon 47 as user C's feedback, and facial expression information 49 as user A's feedback. Assuming that only the avatar robot is at the site, presenter A can visually understand the reactions to the content of his or her presentation by looking at this display.
 図21は、第3の実施形態に係る、定期的に送信されるフィードバックデータである。 FIG. 21 is feedback data transmitted periodically according to the third embodiment.
 図21は、第1の実施形態および第2の実施形態との違いは、視点ロボット選択情報、視線角度情報、が項目にあることである。視点ロボット選択情報の欄は、ユーザが現場にいる複数のロボットのうち、1つのロボットを選択することで、ロボットの識別番号が記録される。また、視線角度情報は視点ロボット選択情報の欄で選んだ識別番号のロボットからユーザが現在見ている視線の角度についての情報が記録される。 FIG. 21 shows that the difference between the first embodiment and the second embodiment is that the viewpoint robot selection information and the line-of-sight angle information are included in the items. In the viewpoint robot selection information column, the identification number of the robot is recorded by selecting one robot from the plurality of robots in the field by the user. Further, as the line-of-sight angle information, information about the line-of-sight angle currently viewed by the user from the robot having the identification number selected in the viewpoint robot selection information field is recorded.
 図22は、第3の実施形態に係る、個人間距離の計算である。 FIG. 22 is a calculation of the inter-individual distance according to the third embodiment.
 図22は、第1の実施形態(図14参照)および第2の実施形態(図17参照)との違いは、ステップS28BとステップS29Bである。ステップS28Bでは、ユーザXがユーザAと複数あるアバタロボットのうち同一ロボットのカメラを見ていると、すなわち、ユーザXが視聴する映像を出力するアバタロボットとユーザAが視認しているアバタロボットとが同一のときに、ユーザAとユーザXとの個人間距離の計算において、これらのユーザ間の関係性に応じた親密度の値を加算する。 In FIG. 22, the differences from the first embodiment (see FIG. 14) and the second embodiment (see FIG. 17) are steps S28B and S29B. In step S28B, when user X is looking at the camera of the same robot as user A among the plurality of avatar robots, that is, when the avatar robot outputting the video viewed by user X is the same as the avatar robot that user A is viewing, an intimacy value corresponding to the relationship between these users is added in the calculation of the inter-individual distance between user A and user X.
 ステップ29Bではステップ28Bを踏まえて、ユーザXの視線方向がユーザAの視線方向と近いのであれば、すなわち、ユーザXの視線方向がユーザAの視線方向と所定の誤差範囲内のときに、ユーザAとユーザXとの個人間距離の計算において、これらのユーザ間の関係性に応じた親密度の値を加算する。これを含んだステップS20B~ステップ32Bのフローにより、個人間距離テーブルが完成される。 In step S29B, building on step S28B, if the gaze direction of user X is close to that of user A, that is, when the gaze direction of user X is within a predetermined error range of the gaze direction of user A, an intimacy value corresponding to the relationship between these users is added in the calculation of the inter-individual distance between user A and user X. Through the flow of steps S20B to S32B including this, the inter-individual distance table is completed.
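 As an illustrative sketch of these robot-specific additions (not the claimed computation), the tolerance, bonus values, and dictionary layout below are assumptions for the example.

```python
def robot_closeness_bonus(view_a, view_x, angle_tolerance_deg=30.0,
                          same_robot_bonus=0.3, same_gaze_bonus=0.2):
    """Extra closeness for the avatar-robot variant (steps S28B / S29B)."""
    bonus = 0.0
    if view_a["robot_id"] == view_x["robot_id"]:          # same robot's camera selected
        bonus += same_robot_bonus
        diff = abs(view_a["gaze_angle_deg"] - view_x["gaze_angle_deg"]) % 360.0
        diff = min(diff, 360.0 - diff)                    # wrap-around angular difference
        if diff <= angle_tolerance_deg:                   # looking in roughly the same direction
            bonus += same_gaze_bonus
    return bonus

print(robot_closeness_bonus({"robot_id": 1, "gaze_angle_deg": 350.0},
                            {"robot_id": 1, "gaze_angle_deg": 10.0}))   # -> 0.5
```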
 以上説明した本発明の第1~第3の実施形態によれば、以下の作用効果を奏する。 According to the first to third embodiments of the present invention described above, the following effects are exhibited.
 (1)複数の参加者の間で行われる遠隔コミュニケーションを支援するコミュニケーション支援システムにおいて、サーバ1は、コミュニケーション支援システムに参加する参加者からの入力を受け付ける入力受付部と、複数の参加者のうち、第1の参加者と他の参加者とに関する情報に基づいて、第1の参加者と他の参加者との親密度を推定する、親密度推定部と、親密度推定部が推定した親密度に基づいて、他の参加者から第1の参加者に対して出力する出力内容の優先度を決める出力優先度を算出する、出力優先度算出部と、出力優先度算出部が算出した出力優先度に基づいて、その出力内容を第1の参加者に対して出力する出力部と、を有する。このようにしたことで、臨場感の向上と匿名性の確保とを両立させた遠隔コミュニケーションを実現できるコミュニケーション支援システムを提供できる。 (1) In a communication support system that supports remote communication performed among a plurality of participants, the server 1 has an input reception unit that receives input from the participants participating in the communication support system; an intimacy estimation unit that estimates, based on information about a first participant and other participants among the plurality of participants, the intimacy between the first participant and the other participants; an output priority calculation unit that calculates, based on the intimacy estimated by the intimacy estimation unit, an output priority that determines the priority of output content to be output from the other participants to the first participant; and an output unit that outputs that output content to the first participant based on the output priority calculated by the output priority calculation unit. This makes it possible to provide a communication support system that realizes remote communication achieving both improved presence and assured anonymity.
(2)コミュニケーション支援システムにおいて、サーバ1は、第1の参加者と他の参加者についてのコミュニケーションに関する指標を算出する指標算出部を有する。出力部は、出力優先度算出部が算出した出力優先度と、指標算出部が算出した指標と、に基づいて出力内容を決定する。このようにしたことで、他の参加者への発表を行っている発表者に対して、参加者同士のコミュニケーション状況に応じたフィードバックを行うことができる。 (2) In the communication support system, the server 1 has an index calculation unit that calculates an index related to communication between the first participant and the other participants. The output unit determines the output content based on the output priority calculated by the output priority calculation unit and the index calculated by the index calculation unit. This makes it possible to give a presenter who is presenting to the other participants feedback that reflects the state of communication among the participants.
(3)出力部は、第1の参加者に対して、出力優先度算出部が算出した出力優先度が所定よりも高い場合は、入力受付部で受け付けた入力として、たとえばカメラで撮影された他の参加者の顔の画像をそのまま出力し、所定よりも低い場合は、コミュニケーションに関する指標に基づく画像として、たとえば表情再現モデルや表情再現アイコンを出力する。このようにしたことで、重要度の高いフィードバックを逃さず、かつ匿名性が担保されたフィードバックもできる。 (3) When the output priority calculated by the output priority calculation unit is higher than a predetermined value, the output unit outputs to the first participant the input received by the input reception unit as it is, for example a camera-captured image of another participant's face; when the output priority is lower than the predetermined value, it outputs an image based on the communication index, for example a facial expression reproduction model or a facial expression reproduction icon. This makes it possible to avoid missing highly important feedback while also providing feedback with guaranteed anonymity.
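A minimal sketch of the selection logic in effect (3) follows; the threshold value and the is_acquainted flag used to pick between the model and the icon are assumptions made for illustration.

```python
PRIORITY_THRESHOLD = 0.5  # assumed "predetermined" output priority

def select_output(priority, camera_image, expression_model, expression_icon, is_acquainted=True):
    """Return the raw camera image for high-priority counterparts, otherwise an
    anonymized representation based on the communication index (model or icon)."""
    if priority >= PRIORITY_THRESHOLD:
        return camera_image                      # output the received input as-is
    # below the threshold: anonymity-preserving representation
    return expression_model if is_acquainted else expression_icon
```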
(4)親密度推定部は、図4の画面により第1の参加者から他の参加者に対して設定される関係性と、図5の画面により第1の参加者から他の参加者に対して設定される送信情報の種類と、図6の画面により第1の参加者から他の参加者の所属グループに対して設定されるフィードバック量と、に基づいて親密度を推定することができる。このようにしたことで、ユーザ同士の関係性に配慮した親密度を推定できる。 (4) The intimacy estimation unit can estimate the intimacy based on the relationship set by the first participant for the other participants on the screen of FIG. 4, the type of transmission information set by the first participant for the other participants on the screen of FIG. 5, and the amount of feedback set by the first participant for the groups to which the other participants belong on the screen of FIG. 6. This makes it possible to estimate an intimacy that takes the relationships between users into account.
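One way effect (4) could be realized is sketched below; the score tables and the simple additive combination of the three settings are illustrative assumptions rather than the method defined in the publication.

```python
# Assumed scores for the FIG. 4 relationship setting and the FIG. 5 transmission-information setting
RELATIONSHIP_SCORE = {"friend": 3.0, "colleague": 2.0, "acquaintance": 1.0, "unknown": 0.0}
SEND_INFO_SCORE = {"video": 2.0, "expression_model": 1.0, "icon_only": 0.5}

def estimate_intimacy(relationship, send_info_type, group_feedback_amount):
    """Intimacy grows with a closer relationship (FIG. 4 setting), richer transmission
    information (FIG. 5 setting), and a larger per-group feedback amount (FIG. 6 setting)."""
    return (RELATIONSHIP_SCORE.get(relationship, 0.0)
            + SEND_INFO_SCORE.get(send_info_type, 0.0)
            + group_feedback_amount)
```

For example, estimate_intimacy("colleague", "expression_model", 1.5) would return 4.5 under these assumed tables.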
(5)コミュニケーションに関する指標に基づく画像は、参加者の表情を再現したモデル画像である表情再現モデル24を含む。出力部は、それぞれの参加者の顔を撮影したカメラ撮影画像22から、表情再現モデル24を作成することができる。このようにしたことで、一定の親近感がある相手に対して、匿名性を確保しつつ親近感も生み出すことができる。 (5) The image based on the communication index includes the facial expression reproduction model 24, which is a model image that reproduces a participant's facial expression. The output unit can create the facial expression reproduction model 24 from the camera-captured image 22 of each participant's face. This makes it possible to convey a sense of familiarity to counterparts with whom the user already has some degree of closeness while still ensuring anonymity.
(6)コミュニケーションに関する指標に基づく画像は、参加者の表情を再現したアイコンである表情再現アイコン25を含む。出力部は、それぞれの参加者の顔を撮影したカメラ撮影画像22から、表情再現アイコン25を作成することができる。このようにしたことで、知り合いでない相手に対して、匿名性を確保した感情表現によるフィードバックができる。 (6) The image based on the communication index includes the facial expression reproduction icon 25, which is an icon that reproduces a participant's facial expression. The output unit can create the facial expression reproduction icon 25 from the camera-captured image 22 of each participant's face. This makes it possible to give emotional feedback to unacquainted counterparts while ensuring anonymity.
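The following sketch outlines how effects (5) and (6) might be produced from a camera-captured image via feature points; detect_landmarks, classify_emotion, and drive_avatar_model are hypothetical stand-ins for whichever landmark detector, emotion classifier, and avatar renderer an actual implementation would use.

```python
# Assumed mapping from a classified emotion to a facial expression reproduction icon (25)
EMOTION_ICONS = {"happy": "🙂", "surprised": "😮", "neutral": "😐", "confused": "😕"}

def build_expression_outputs(camera_image, detect_landmarks, classify_emotion, drive_avatar_model):
    """From a camera-captured image (22), extract feature points (cf. 23), drive a
    facial expression reproduction model (24), and pick a facial expression
    reproduction icon (25)."""
    landmarks = detect_landmarks(camera_image)       # feature point extraction
    model_frame = drive_avatar_model(landmarks)      # facial expression reproduction model
    emotion = classify_emotion(landmarks)
    icon = EMOTION_ICONS.get(emotion, EMOTION_ICONS["neutral"])
    return model_frame, icon
```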
(7)VR空間を用いて遠隔コミュニケーションを行う場面において、親密度推定部は、他の参加者のアバタが第1の参加者のアバタと同じ対象を見ている時、あるいは、他の参加者のアバタが第1の参加者のアバタから一定の距離の範囲内にいる時、の少なくとも一方が満たされる場合に、第1の参加者と当該他の参加者との親密度の計算において、第1の参加者と他の参加者との関係性に応じた値を加算する。このようにしたことで、VR空間上でのユーザのアバタ同士から親密度を推定することができる。 (7) In a scene where remote communication is performed using a VR space, the intimacy estimation unit adds, in the calculation of the intimacy between the first participant and another participant, a value corresponding to the relationship between the first participant and that other participant when at least one of the following is satisfied: the other participant's avatar is looking at the same object as the first participant's avatar, or the other participant's avatar is within a certain distance of the first participant's avatar. This makes it possible to estimate intimacy from the users' avatars in the VR space.
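A minimal sketch of the VR-space condition in effect (7), assuming avatars are simple dictionaries with a position and a gaze_target; the distance limit is an illustrative value, not one given in the publication.

```python
import math

DISTANCE_LIMIT = 3.0  # assumed "certain distance" in VR-space units

def vr_intimacy_increment(first_avatar, other_avatar, relationship_value):
    """Return the value to add to the intimacy when the other avatar shares the first
    avatar's gaze target or is within the distance limit; otherwise add nothing."""
    same_target = (other_avatar["gaze_target"] is not None
                   and other_avatar["gaze_target"] == first_avatar["gaze_target"])
    distance = math.dist(other_avatar["position"], first_avatar["position"])
    if same_target or distance <= DISTANCE_LIMIT:
        return relationship_value  # value according to the relationship between the two users
    return 0.0
```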
(8)アバタロボットを用いて遠隔コミュニケーションを行う場面において、親密度推定部は、複数のアバタロボットのうち、他の参加者が視聴する映像を出力するアバタロボットと第1の参加者の視認しているアバタロボットとが同一の時、あるいは、他の参加者の視聴方向が第1の参加者の視聴方向と所定の誤差の範囲内の時、の少なくとも一方が満たされる場合に、第1の参加者と他の参加者との親密度の計算において、第1の参加者と他の参加者との関係性に応じた値を加算する。このようにしたことで、アバタロボットを利用した場合でもユーザ同士の個人間距離を算出し、ユーザ同士の親密度を推定することができる。 (8) In a scene where remote communication is performed using avatar robots, the intimacy estimation unit adds, in the calculation of the intimacy between the first participant and another participant, a value corresponding to the relationship between the first participant and that other participant when at least one of the following is satisfied: among the plurality of avatar robots, the avatar robot that outputs the video viewed by the other participant is the same as the avatar robot visually recognized by the first participant, or the other participant's viewing direction is within a predetermined error range of the first participant's viewing direction. This makes it possible to calculate the inter-individual distance between users and estimate their intimacy even when avatar robots are used.
 なお、本発明は上記の実施形態に限定されるものではなく、その要旨を逸脱しない範囲内で様々な変形や他の構成を組み合わせることができる。また本発明は、上記の実施形態で説明した全ての構成を備えるものに限定されず、その構成の一部を削除したものも含まれる。 The present invention is not limited to the above embodiments, and various modifications and other configurations can be combined without departing from the gist of the invention. Furthermore, the present invention is not limited to configurations including all of the elements described in the above embodiments, and also includes configurations from which some of those elements have been removed.
1…サーバ
2…情報端末(PC)
3…発表ユーザ
4…第1のグループ
5…第2のグループ
6a~6c…聴講ユーザ
7…情報端末(パッド)
12…フィードバック時間遷移グラフ
13…他ユーザの映像
14…音声ON/OFFボタン
15…映像ON/OFFボタン
16…自分の表情を表示する映像
17…送信データ
18…Word cloud
19…テキスト欄
20…フィードバック2軸グラフ
21…グループ重要度を表すプロット
22…カメラ撮影画像
23…特徴点抽出画像
24…表情再現モデル
25…表情再現アイコン
26…第1展示物を見ているアバタ
27…VR空間の映像
28…第1展示物
29…第1展示物を見ているアバタ(ユーザ)の表情再現アイコン
30…第2展示物
31…フィードバック情報
32…第1展示物を見ているユーザの表情情報
33…アバタロボットの全周囲映像
33a,33b…アバタロボットの全周囲映像の中に映る対象物
34…全周囲カメラ
35…現場発表者B
36…現場
37…アバタ(稼働)ロボット
38…現場発表者Aの前の画面
39…現場発表者A
40…360度4面ディスプレイ
41…現場発表者Bの前の画面
42…視聴ユーザの表情フィードバックが映る画面
44…ユーザBのアバタ
45…ユーザBの表情再現アイコン
46…ユーザの視点映像(アバタロボット)
47…ユーザCの顔画像アイコン
48…ユーザCのアバタ
49…ユーザAの表情再現モデル
1 ... Server
2 ... Information terminal (PC)
3 ... Presenting user
4 ... First group
5 ... Second group
6a-6c ... Listening users
7 ... Information terminal (pad)
12 ... Feedback time transition graph
13 ... Video of other users
14 ... Audio ON/OFF button
15 ... Video ON/OFF button
16 ... Video displaying the user's own facial expression
17 ... Transmission data
18 ... Word cloud
19 ... Text field
20 ... Two-axis feedback graph
21 ... Plot showing group importance
22 ... Camera-captured image
23 ... Feature point extraction image
24 ... Facial expression reproduction model
25 ... Facial expression reproduction icon
26 ... Avatar looking at the first exhibit
27 ... Video of the VR space
28 ... First exhibit
29 ... Facial expression reproduction icon of the avatar (user) looking at the first exhibit
30 ... Second exhibit
31 ... Feedback information
32 ... Facial expression information of the user looking at the first exhibit
33 ... Omnidirectional video of the avatar robot
33a, 33b ... Objects shown in the omnidirectional video of the avatar robot
34 ... Omnidirectional camera
35 ... On-site presenter B
36 ... Site
37 ... Avatar (operating) robot
38 ... Screen in front of on-site presenter A
39 ... On-site presenter A
40 ... 360-degree four-sided display
41 ... Screen in front of on-site presenter B
42 ... Screen showing facial expression feedback from viewing users
44 ... Avatar of user B
45 ... Facial expression reproduction icon of user B
46 ... User's viewpoint video (avatar robot)
47 ... Face image icon of user C
48 ... Avatar of user C
49 ... Facial expression reproduction model of user A

Claims (8)

  1.  複数の参加者の間で行われる遠隔コミュニケーションを支援するコミュニケーション支援システムであって、
     前記コミュニケーション支援システムに参加する前記参加者からの入力を受け付ける入力受付部と、
     前記複数の参加者のうち、第1の参加者と他の参加者とに関する情報に基づいて、前記第1の参加者と前記他の参加者との親密度を推定する、親密度推定部と、
     前記親密度推定部が推定した前記親密度に基づいて、前記他の参加者から前記第1の参加者に対して出力する出力内容の優先度を決める出力優先度を算出する、出力優先度算出部と、
     前記出力優先度算出部が算出した前記出力優先度に基づいて、前記出力内容を前記第1の参加者に対して出力する出力部と、を有する
     コミュニケーション支援システム。
    A communication support system that supports remote communication performed among a plurality of participants, the communication support system comprising:
    an input reception unit that receives input from the participants participating in the communication support system;
    an intimacy estimation unit that estimates the intimacy between a first participant and other participants among the plurality of participants, based on information about the first participant and the other participants;
    an output priority calculation unit that calculates an output priority that determines the priority of output content output from the other participants to the first participant, based on the intimacy estimated by the intimacy estimation unit; and
    an output unit that outputs the output content to the first participant based on the output priority calculated by the output priority calculation unit.
  2.  請求項1に記載のコミュニケーション支援システムであって、
     前記第1の参加者と前記他の参加者についてのコミュニケーションに関する指標を算出する指標算出部を有し、
     前記出力部は、前記出力優先度算出部が算出した前記出力優先度と、前記指標算出部が算出した前記指標と、に基づいて、前記出力内容を決定する
     コミュニケーション支援システム。
    The communication support system according to claim 1, further comprising
    an index calculation unit that calculates an index related to communication between the first participant and the other participants,
    wherein the output unit determines the output content based on the output priority calculated by the output priority calculation unit and the index calculated by the index calculation unit.
  3.  請求項2に記載のコミュニケーション支援システムであって、
     前記出力部は、前記第1の参加者に対して、前記出力優先度算出部が算出した前記出力優先度が所定よりも高い場合は、前記入力受付部で受け付けた前記入力を出力し、所定よりも低い場合は、前記指標に基づく画像を出力する
     コミュニケーション支援システム。
    The communication support system according to claim 2,
    wherein, for the first participant, the output unit outputs the input received by the input reception unit when the output priority calculated by the output priority calculation unit is higher than a predetermined value, and outputs an image based on the index when the output priority is lower than the predetermined value.
  4.  請求項1に記載のコミュニケーション支援システムであって、
     前記親密度推定部は、前記第1の参加者から前記他の参加者に対して設定される関係性と、前記第1の参加者から前記他の参加者に対して設定される送信情報の種類と、前記第1の参加者から前記他の参加者の所属グループに対して設定されるフィードバック量と、に基づいて前記親密度を推定する
     コミュニケーション支援システム。
    The communication support system according to claim 1,
    wherein the intimacy estimation unit estimates the intimacy based on a relationship set by the first participant for the other participants, a type of transmission information set by the first participant for the other participants, and an amount of feedback set by the first participant for a group to which the other participants belong.
  5.  請求項3に記載のコミュニケーション支援システムであって、
     前記指標に基づく画像は、前記参加者の表情を再現したモデル画像である表情再現モデルを含み、
     前記出力部は、それぞれの前記参加者の顔を撮影した画像から、前記表情再現モデルを作成して出力する
     コミュニケーション支援システム。
    The communication support system according to claim 3,
    wherein the image based on the index includes a facial expression reproduction model, which is a model image that reproduces a facial expression of the participant, and
    the output unit creates and outputs the facial expression reproduction model from an image of each participant's face.
  6.  請求項2に記載のコミュニケーション支援システムであって、
     前記指標に基づく画像は、前記参加者の表情を再現したアイコンである表情再現アイコンを含み、
     前記出力部は、それぞれの前記参加者の顔を撮影した画像から、前記表情再現アイコンを作成して出力する
     コミュニケーション支援システム。
    The communication support system according to claim 2,
    wherein the image based on the index includes a facial expression reproduction icon, which is an icon that reproduces a facial expression of the participant, and
    the output unit creates and outputs the facial expression reproduction icon from an image of each participant's face.
  7.  請求項1に記載のコミュニケーション支援システムであって、
     VR空間を用いて前記遠隔コミュニケーションを行う場面において、
     前記親密度推定部は、前記他の参加者のアバタが前記第1の参加者のアバタと同じ対象を見ている時、あるいは、前記他の参加者のアバタが前記第1の参加者のアバタから一定の距離の範囲内にいる時、の少なくとも一方が満たされる場合に、前記第1の参加者と前記他の参加者との前記親密度の計算において、前記第1の参加者と前記他の参加者との関係性に応じた値を加算する
     コミュニケーション支援システム。
    The communication support system according to claim 1,
    wherein, in a scene where the remote communication is performed using a VR space,
    the intimacy estimation unit adds, in the calculation of the intimacy between the first participant and the other participant, a value corresponding to the relationship between the first participant and the other participant when at least one of the following is satisfied: the avatar of the other participant is looking at the same object as the avatar of the first participant, or the avatar of the other participant is within a certain distance of the avatar of the first participant.
  8.  請求項1に記載のコミュニケーション支援システムであって、
     アバタロボットを用いて前記遠隔コミュニケーションを行う場面において、
     前記親密度推定部は、複数の前記アバタロボットのうち、前記他の参加者が視聴する映像を出力するアバタロボットと前記第1の参加者が視認しているアバタロボットとが同一の時、あるいは、前記他の参加者の視線方向が前記第1の参加者の視線方向と所定の誤差範囲内の時、の少なくとも一方が満たされる場合に、前記第1の参加者と前記他の参加者との前記親密度の計算において、前記第1の参加者と前記他の参加者との関係性に応じた値を加算する
     コミュニケーション支援システム。
    The communication support system according to claim 1,
    wherein, in a scene where the remote communication is performed using avatar robots,
    the intimacy estimation unit adds, in the calculation of the intimacy between the first participant and the other participant, a value corresponding to the relationship between the first participant and the other participant when at least one of the following is satisfied: among the plurality of avatar robots, the avatar robot that outputs the video viewed by the other participant is the same as the avatar robot visually recognized by the first participant, or the line-of-sight direction of the other participant is within a predetermined error range of the line-of-sight direction of the first participant.
PCT/JP2020/048863 2020-12-25 2020-12-25 Communication assistance system WO2022137547A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/048863 WO2022137547A1 (en) 2020-12-25 2020-12-25 Communication assistance system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/048863 WO2022137547A1 (en) 2020-12-25 2020-12-25 Communication assistance system

Publications (1)

Publication Number Publication Date
WO2022137547A1 true WO2022137547A1 (en) 2022-06-30

Family

ID=82158011

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/048863 WO2022137547A1 (en) 2020-12-25 2020-12-25 Communication assistance system

Country Status (1)

Country Link
WO (1) WO2022137547A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024009623A1 (en) * 2022-07-04 2024-01-11 パナソニックIpマネジメント株式会社 Evaluation system, evaluation device, and evaluation method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004258819A (en) * 2003-02-24 2004-09-16 Fujitsu Ltd Electronic communication support server
JP2005235142A (en) * 2004-01-21 2005-09-02 Nomura Research Institute Ltd System and program for measuring degree of intimacy between users
JP2006050370A (en) * 2004-08-06 2006-02-16 Sony Corp Information processing apparatus and method, recording medium, and program
JP2008107895A (en) * 2006-10-23 2008-05-08 Nomura Research Institute Ltd Virtual space providing server, virtual space providing system, and computer program
JP2009124606A (en) * 2007-11-16 2009-06-04 Sony Corp Information processing device, information processing method, program, and information sharing system
JP2017229060A (en) * 2016-06-22 2017-12-28 富士ゼロックス株式会社 Methods, programs and devices for representing meeting content
JP2018089227A (en) * 2016-12-06 2018-06-14 株式会社コロプラ Information processing method, device, and program for implementing that information processing method on computer
JP2019219987A (en) * 2018-06-21 2019-12-26 日本電信電話株式会社 Group state estimation device, group state estimation method, and group state estimation program


Similar Documents

Publication Publication Date Title
US11468672B2 (en) Intelligent agents for managing data associated with three-dimensional objects
Kirk et al. Home video communication: mediating'closeness'
US8791977B2 (en) Method and system for presenting metadata during a videoconference
US8243116B2 (en) Method and system for modifying non-verbal behavior for social appropriateness in video conferencing and other computer mediated communications
US20120204120A1 (en) Systems and methods for conducting and replaying virtual meetings
US20120204118A1 (en) Systems and methods for conducting and replaying virtual meetings
JP5959748B2 (en) Video conferencing system that implements an orchestration model
Terken et al. Multimodal support for social dynamics in co-located meetings
US20120204119A1 (en) Systems and methods for conducting and replaying virtual meetings
US9299178B2 (en) Generation of animated gesture responses in a virtual world
TWI795759B (en) Online meeting system
US11651541B2 (en) Integrated input/output (I/O) for a three-dimensional (3D) environment
WO2022137547A1 (en) Communication assistance system
JP2003323628A (en) Device and program for video identifying speaker and method of displaying video identifying speaker
JP6444767B2 (en) Business support device and business support program
Otsuka Multimodal conversation scene analysis for understanding people’s communicative behaviors in face-to-face meetings
JP2021192186A (en) Computer program, information processing device, and meeting holding method
Tanaka et al. Appearance, motion, and embodiment: unpacking avatars by fine‐grained communication analysis
JP6901190B1 (en) Remote dialogue system, remote dialogue method and remote dialogue program
Myodo et al. Issues and solutions to informal communication in working from home using a telepresence robot
WO2012109006A2 (en) Systems and methods for conducting and replaying virtual meetings
Colburn et al. Graphical enhancements for voice only conference calls
RU2218593C2 (en) Method for telecommunications in computer networks
JP7465040B1 (en) Communication visualization system
WO2023074898A1 (en) Terminal, information processing method, program, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20967033

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20967033

Country of ref document: EP

Kind code of ref document: A1