WO2023095531A1 - Information processing device, information processing method, and information processing program - Google Patents
- Publication number
- WO2023095531A1 (PCT/JP2022/040089)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M11/00—Telephonic communication systems specially adapted for combination with other electrical systems
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
Definitions
- The present disclosure relates to an information processing device, an information processing method, and an information processing program.
- A telepresence system is a communication tool that allows users in remote locations to feel as if they are face-to-face.
- Japanese Patent Laid-Open No. 2002-200001 proposes a conversation-support technique that presents text information related to the content of a user's utterance with an appropriate effect according to the voice, facial expression, and actions of the user who is the speaker during telepresence.
- Patent Document 2 proposes a technique of rewriting the video or audio as necessary and presenting it to each user when it is determined that presenting the speaker's video or audio as-is to the other user is undesirable.
- JP 2021-71632 A; JP 2021-21025 A
- The present disclosure proposes an information processing device, an information processing method, and an information processing program capable of controlling the disclosure level of information regarding video, audio, and operations from the standpoints of both the listener and the speaker.
- An information processing apparatus includes an acquisition unit that acquires first sensing information of a first user and second sensing information of a second user, each obtained using a sensing device, and an information processing unit that specifies the disclosure level of information transmitted and received between the first user and the second user based on the first user's situation and the second user's situation corresponding to the first sensing information and the second sensing information.
- FIG. 1 is a diagram showing the configuration of a telepresence system according to this embodiment.
- FIG. 2 is a diagram for explaining non-verbal information.
- FIG. 3 is a diagram showing a setting example of the disclosure level of non-verbal information.
- FIG. 4 is a diagram showing a configuration example of the telepresence device according to this embodiment.
- FIG. 5 is a diagram (1) for explaining a method of specifying the disclosure level of non-verbal information.
- FIG. 6 is a diagram (2) for explaining a method of specifying the disclosure level of non-verbal information.
- FIG. 7 is a flow chart showing a processing procedure of the telepresence device 50 according to the embodiment.
- FIG. 8 is a diagram for explaining an example of disclosure level control.
- FIG. 9 is a diagram illustrating a configuration example of the server 100.
- FIG. 10 is a hardware configuration diagram showing an example of a computer that realizes the functions of the telepresence device and the server.
- 1. Embodiment
- 1-1. Configuration of telepresence system according to embodiment
- 1-2. Definition of non-verbal information
- 1-3. Definition of disclosure level of non-verbal information
- 1-4. Identification of disclosure level
- 1-5-1. Processing for detecting speaker's situation
- 1-5-1-1. Content of communication
- 1-5-1-2. Changes in external environment
- 1-5-1-3. Human condition
- 1-5-1-4. Relationship between speaker and listener
- 1-5-2. Processing for detecting listener's situation
- 1-5-2-1. External environmental factors
- 1-5-2-2. Human internal state
- 1-5-2-3. Human behavioral state
- 1-5-2-4. Emotional expression
- 1-6. Processing procedure of telepresence device according to embodiment
- 1-7.
- FIG. 1 is a diagram showing the configuration of a telepresence system according to an embodiment. As shown in FIG. 1, this telepresence system has telepresence devices 50a and 50b and a server 100. The telepresence devices 50a and 50b and the server 100 are interconnected via a network 5.
- The telepresence device 50a is operated by the user 1A at point A.
- The telepresence device 50b is operated by the user 1B at point B.
- In the following description, the telepresence devices 50a and 50b are collectively referred to as the telepresence device 50 when they need not be distinguished.
- The users 1A and 1B use the telepresence device 50 to hold an online conference or the like.
- In the telepresence system according to the present embodiment, information such as video, audio, and operations is transmitted and received in real time between the telepresence devices 50a and 50b (two-way communication). In this way, the telepresence system provides an interactive environment by exchanging information in real time and making the users 1A and 1B feel as if they are face-to-face.
- In this embodiment, two-way communication is performed between two points, point A and point B, but two-way communication can also be performed among three or more points.
- The server 100 is a device that records log information about the information transmitted and received between the telepresence devices 50 while the user 1A and the user 1B are having an online conference or the like. The server 100 notifies the telepresence device 50 of the log information.
- The server 100 also manages user characteristic information.
- The characteristic information includes information such as the user's name, gender, age, and human relationships.
- The server 100 notifies the telepresence device 50 of the characteristic information.
- When communicating with a partner using the telepresence system shown in FIG. 1, information is exchanged by presenting/sharing content or using voice calls. Information obtained as clues other than content and language is defined as "non-verbal information" in this embodiment.
- FIG. 2 is a diagram for explaining non-verbal information.
- Non-verbal information is classified into three types: video, audio, and operation.
- Video is visual information.
- The video includes information on the user's facial expression, line of sight/blinking, nodding/head shaking, posture, gestures, hairstyle, and clothing.
- Audio is information that can be heard with the ear.
- The audio includes information on the loudness/pitch of the user's voice, speaking speed/speech volume, brightness of the voice, back-channel responses, sighing, hiccups, and coughing.
- Operations are information about user actions in the application software. Operations include cursor movement, key input, and display area information.
- The telepresence device 50 can adjust the type and amount of non-verbal information exchanged with the other party by controlling the camera function and the microphone function.
- The disclosure level of non-verbal information is set in multiple stages.
- FIG. 3 is a diagram showing a setting example of the disclosure level of non-verbal information.
- The importance of the video, audio, and operation items is set as video > audio > operation based on the amount of information each carries, and corresponds to the disclosure level.
- The disclosure level takes a value from 1 to 4, and the higher the value, the greater the variety and amount of non-verbal information disclosed to the other party.
- At disclosure level 4, video (all), audio, and operations are disclosed to the other party.
- At disclosure level 3, video (partial), audio, and operations are disclosed to the other party.
- At disclosure level 2, audio and operations are disclosed to the other party.
- At disclosure level 1, only operations are disclosed to the other party.
- The telepresence device 50 specifies the disclosure level of non-verbal information and transmits the non-verbal information corresponding to that disclosure level to the other party's telepresence device 50.
- Alternatively, the server 100 may be configured to specify the disclosure level.
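The level-to-channel correspondence in FIG. 3 can be sketched as a small lookup table. This is a hedged illustration only; the names `DISCLOSURE_CHANNELS` and `channels_for_level` are assumptions, not from the patent:

```python
# Hypothetical sketch of the FIG. 3 table: each disclosure level (1-4)
# maps to the set of non-verbal information channels disclosed to the
# other party. Channel names are illustrative placeholders.
DISCLOSURE_CHANNELS = {
    4: {"video_all", "audio", "operation"},
    3: {"video_partial", "audio", "operation"},
    2: {"audio", "operation"},
    1: {"operation"},
}

def channels_for_level(level: int) -> set:
    """Return the non-verbal channels to transmit at the given disclosure level."""
    return DISCLOSURE_CHANNELS[level]
```

A higher level is a strict superset of the channels of each lower level, matching the ordering video > audio > operation described above.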
- FIG. 4 is a diagram showing a configuration example of the telepresence device 50 according to this embodiment. Here, the description uses the telepresence device 50a.
- The telepresence device 50a has a communication section 51, an input section 52, an output section 53, a storage section 54, and a control section 55.
- The communication unit 51 is realized by, for example, a NIC (Network Interface Card).
- The communication unit 51 is connected to the network 5 by wire or wirelessly, and transmits and receives information to and from the server 100 and the telepresence device 50b via the network 5.
- The input unit 52 has a camera 52a, a microphone 52b, and various sensors 52c.
- The camera 52a is a device that captures images.
- The microphone 52b is a device that collects sound. Although not illustrated, the input unit 52 may also include input devices such as a keyboard and a mouse.
- The various sensors 52c include a biosensor that measures biometric information, an external environment sensor that measures external environment information, and the like.
- The biometric information corresponds to information such as the user's body temperature, perspiration amount, blood pressure, and heartbeat.
- The external environment information corresponds to information such as the temperature and humidity of the surroundings in which the telepresence device 50a is located.
- There may be more than one camera 52a, microphone 52b, and various sensors 52c. Also, the camera 52a, the microphone 52b, and the various sensors 52c need not be integrated with the telepresence device 50a and may be portable or wearable devices.
- The output unit 53 has a display 53a, a speaker 53b, and an actuator 53c.
- The display 53a is a device that displays video.
- The speaker 53b is a device that outputs sound.
- The actuator 53c is a device that generates vibration, heat, smell, wind, or the like.
- There may be more than one display 53a, speaker 53b, and actuator 53c.
- The storage unit 54 is realized by, for example, a semiconductor memory device such as a RAM (Random Access Memory) or flash memory, or a storage device such as a hard disk or an optical disk.
- The storage unit 54 stores first sensing information 54a, second sensing information 54b, log information 54c, characteristic information 54d, and disclosure level information 54e.
- The first sensing information 54a corresponds to the image information captured by the camera 52a of the telepresence device 50a, the audio information collected by the microphone 52b of the telepresence device 50a, and the biometric information and external environment information measured by the various sensors 52c of the telepresence device 50a.
- The second sensing information 54b corresponds to the image information captured by the camera on the telepresence device 50b side, the audio information collected by the microphone on the telepresence device 50b side, and the biometric information and external environment information measured by the various sensors on the telepresence device 50b side.
- The log information 54c includes information transmitted and received between the telepresence devices 50a and 50b (or other telepresence devices) in the past.
- The information transmitted and received in the past may be the non-verbal information described with reference to FIG. 2, or other information such as usage status, meeting minutes, and chat history.
- The log information 54c is notified from the server 100.
- The characteristic information 54d includes information such as the user's name, gender, age, and human relationships.
- The characteristic information 54d is notified from the server 100.
- The disclosure level information 54e is the disclosure level information set by the information processing section 55b of the control section 55, which will be described later.
- The control unit 55 is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing a program stored inside the telepresence device 50, using a RAM (Random Access Memory) or the like as a work area. The control unit 55 may also be implemented by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
- The control unit 55 has an acquisition unit 55a, an information processing unit 55b, and a communication control unit 55c.
- The acquisition unit 55a acquires image information from the camera 52a during non-face-to-face communication and registers it in the first sensing information 54a.
- The acquisition unit 55a acquires audio information from the microphone 52b and registers it in the first sensing information 54a.
- The acquisition unit 55a acquires biometric information and external environment information from the various sensors 52c and registers them in the first sensing information 54a.
- The acquisition unit 55a acquires the second sensing information 54b from the telepresence device 50b during non-face-to-face communication and registers it in the storage unit 54.
- The acquisition unit 55a acquires the log information 54c and the characteristic information 54d from the server 100.
- The acquisition unit 55a registers the log information 54c and the characteristic information 54d in the storage unit 54.
- The information processing unit 55b specifies the disclosure level used when non-verbal information is notified from the "speaker" to the "listener."
- The information processing unit 55b registers the specified disclosure level in the storage unit 54 as the disclosure level information 54e.
- The information processing section 55b repeatedly executes the process of specifying the disclosure level at predetermined intervals and updates the disclosure level information 54e each time the disclosure level is specified.
- In the following description, the user 1A at point A is the "speaker" and the user 1B at point B is the "listener."
- The information processing section 55b sets the initial value of the disclosure level in the disclosure level information 54e.
- The communication control unit 55c controls the type and amount of non-verbal information transmitted from the telepresence device 50a to the telepresence device 50b according to the disclosure level set in the disclosure level information 54e.
- When the disclosure level is 4, the communication control unit 55c transmits non-verbal information such as video (all), audio, and operations input from the input unit 52 to the telepresence device 50b.
- When the disclosure level is 3, the communication control unit 55c transmits non-verbal information such as video (partial), audio, and operations input from the input unit 52 to the telepresence device 50b.
- As the video (partial), for example, only the face area of the video of the user 1A captured by the camera 52a is transmitted to the telepresence device 50b. What kind of partial video is transmitted to the telepresence device 50b is set in advance.
- When the disclosure level is 2, the communication control unit 55c transmits the audio input from the input unit 52 and the non-verbal information of operations to the telepresence device 50b.
- When the disclosure level is 1, the communication control unit 55c transmits only the non-verbal information of operations input from the input unit 52 to the telepresence device 50b.
- When the communication control unit 55c receives non-verbal information from the telepresence device 50b, it causes the output unit 53 to output the received non-verbal information.
- The non-verbal information transmitted from the telepresence device 50b is of a type and amount controlled based on the disclosure level specified by the telepresence device 50b.
- The information processing unit 55b of the telepresence device 50a described with reference to FIG. 4 detects the situation of the speaker based on the first sensing information 54a and the situation of the listener based on the second sensing information 54b.
- The speaker's situation is defined as "whether or not the speaker wants to know the listener's (partner's) non-verbal information."
- The listener's situation is defined as "whether or not the listener wants to convey his/her own non-verbal information to the speaker (partner)."
- The information processing section 55b specifies the disclosure level based on these detection results.
- FIGS. 5 and 6 are diagrams for explaining a method of specifying the disclosure level of non-verbal information.
- First, FIG. 5 will be described. As shown in FIG. 5, communication situations are classified into four patterns according to the situations of the speaker and the listener.
- When the speaker wants to know the listener's non-verbal information and the listener wants to convey his/her own non-verbal information, the "first pattern" applies.
- When only one of the two conditions holds, the "second pattern" or "third pattern" applies; when neither holds, the "fourth pattern" applies.
- Associating the patterns with disclosure levels yields the correspondence shown in FIG. 6. That is, when the communication situation is the first pattern, the disclosure level is "4". When the communication situation is the second pattern or the third pattern, the disclosure level is "2-3". When the communication situation is the fourth pattern, the disclosure level is "1".
- When the communication situation is the second or third pattern, the disclosure level is set to "2-3 (either 2 or 3)".
- In this embodiment, the disclosure level corresponding to the second and third patterns is assumed to be "2".
- The disclosure levels set for the second and third patterns may be changed as appropriate, but they are larger than the disclosure level of the fourth pattern and smaller than the disclosure level of the first pattern.
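Under the assumption that patterns 2 and 3 share level 2 (as in this embodiment), the mapping from the two situations to a disclosure level could be sketched as follows; the function name is illustrative, not from the patent:

```python
# Hedged sketch of FIGS. 5 and 6: the speaker's and listener's situations
# select one of four patterns, and each pattern maps to a disclosure level.
# Patterns 2 and 3 both use level 2 here, as assumed in the embodiment.
def disclosure_level(speaker_wants_to_know: bool,
                     listener_wants_to_convey: bool) -> int:
    if speaker_wants_to_know and listener_wants_to_convey:
        return 4  # first pattern: full disclosure
    if not speaker_wants_to_know and not listener_wants_to_convey:
        return 1  # fourth pattern: operations only
    return 2      # second or third pattern: intermediate level
```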
- The information processing section 55b executes the following processes based on the first sensing information 54a, the second sensing information 54b, the log information 54c, and the characteristic information 54d. Various detection methods are shown below; the information processing section 55b may use any one of them or a combination of several.
- The information processing section 55b may execute a plurality of detection methods and, when their determination results differ, decide the speaker's situation by a weighted majority rule or the like.
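A weighted majority over several detection methods could look like the following minimal sketch. The weights and the decision threshold are assumptions; the patent does not specify them:

```python
# Hedged sketch of a weighted majority rule: each detector votes True
# ("wants to know the listener's non-verbal state") or False, and votes
# are weighted before summing. Weights here are illustrative.
def weighted_majority(results, weights):
    """Return the weighted majority decision over detector votes."""
    score = sum(w if r else -w for r, w in zip(results, weights))
    return score > 0

# Example: three detectors disagree; the heavier-weighted votes decide.
decision = weighted_majority([True, False, True], [0.5, 1.0, 0.8])
```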
- The information processing section 55b performs voice analysis on the audio information of the first sensing information 54a and determines whether or not the speaker is asking a question or making a proposal.
- The time stamp function of the server 100 is used to register information such as the start and end times of the speech of the speaker and the listener.
- The information processing unit 55b determines that the speaker's situation is "wants to know the listener's non-verbal state" when there is no utterance from the listener for a predetermined time or longer after the speaker asks a question or makes a proposal. On the other hand, when the listener utters something within the predetermined time, the information processing unit 55b determines that the speaker's situation is "does not want to know the listener's non-verbal state."
- Alternatively, the information processing unit 55b may analyze the image information of the first sensing information 54a and determine that the speaker's situation is "wants to know the listener's non-verbal state" when the speaker's facial expression is a predetermined expression (e.g., an embarrassed expression) or when the speaker is making a predetermined gesture (e.g., waving a hand).
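The timing rule above (no listener utterance within a predetermined time of a question or proposal) could be sketched as follows. The timestamps would come from the server 100's time stamp function; all names and the default timeout are illustrative assumptions:

```python
# Hedged sketch of the timing rule: the speaker is judged to "want to know
# the listener's non-verbal state" if the listener has not spoken within
# `timeout` seconds of the speaker's question or proposal.
def speaker_wants_nonverbal(question_time: float,
                            listener_utterance_times: list,
                            timeout: float = 5.0) -> bool:
    """True if no listener utterance occurred within `timeout` of the question."""
    return not any(
        question_time <= t <= question_time + timeout
        for t in listener_utterance_times
    )
```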
- The information processing unit 55b performs voice analysis on the audio information of the second sensing information 54b and determines whether or not a predetermined external environment sound is included.
- When a predetermined external environment sound is included, the information processing unit 55b determines that the speaker's situation is "wants to know the listener's non-verbal state."
- When no predetermined external environment sound is included, the information processing unit 55b determines that the speaker's situation is "does not want to know the listener's non-verbal state."
- The information processing unit 55b may also analyze the image information of the first sensing information 54a, further determine whether the speaker's facial expression is a predetermined expression (e.g., an embarrassed expression), and determine the speaker's situation accordingly.
- The information processing unit 55b performs voice analysis on the audio information of the second sensing information 54b and determines whether or not a predetermined unpleasant sound or voice (coughing, tongue clicking, sighing) is included.
- When the audio information of the second sensing information 54b includes a predetermined sound that makes the speaker feel uncomfortable, the information processing unit 55b determines that the speaker's situation is "wants to know the listener's non-verbal state."
- When no such sound is included, the information processing unit 55b determines that the speaker's situation is "does not want to know the listener's non-verbal state."
- The information processing unit 55b may also analyze the image information of the second sensing information 54b, further determine whether the listener's expression is a predetermined expression (a sleepy face, a bored face), and determine the speaker's situation. Further, the information processing unit 55b may determine, based on the biometric information of the second sensing information 54b, whether changes in pulse and body temperature indicate the listener's sleepiness or boredom, and determine the speaker's situation accordingly.
- Based on the log information 54c, the information processing unit 55b counts the number of past communications between the speaker and the listener. When the number of communications is less than a predetermined number, it determines that the speaker's situation is "wants to know the listener's non-verbal state." On the other hand, when the number of communications is equal to or greater than the predetermined number, the information processing section 55b determines that the speaker's situation is "does not want to know the listener's non-verbal state."
- The information processing unit 55b may also refer to the human relationship between the speaker and the listener in the characteristic information 54d and determine, depending on that relationship, that the speaker's situation is "does not want to know the listener's non-verbal state."
- The information processing section 55b executes the following processes based on the first sensing information 54a, the second sensing information 54b, the log information 54c, and the characteristic information 54d. Various detection methods are shown below; the information processing section 55b may use any one of them or a combination of several.
- The information processing section 55b may execute a plurality of detection methods and, when their determination results differ, decide the listener's situation by a weighted majority rule or the like.
- In this embodiment, the information processing unit 55b of the telepresence device 50a on the speaker side detects the situation of the listener, but the telepresence device 50b on the listener side may instead detect the situation of the listener and notify the information processing section 55b of it.
- The information processing section 55b analyzes the image information and audio information of the second sensing information 54b and determines whether or not a predetermined event has occurred.
- The information processing unit 55b determines that a predetermined event has occurred when no listener appears in the image information, or when the audio information includes the sound of an intercom, the ringing of a telephone, or the sound of rain.
- When a predetermined event has occurred, the information processing section 55b determines that the listener's situation is "does not want to convey his/her own non-verbal information to the speaker." On the other hand, when no predetermined event has occurred, the information processing section 55b determines that the listener's situation is "wants to convey his/her own non-verbal information to the speaker."
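A minimal sketch of this event check, assuming upstream image/audio analysis has already produced a presence flag and a set of detected sound labels (the labels and names are illustrative assumptions):

```python
# Hedged sketch: the listener's situation is "does not want to convey
# non-verbal information" when the listener is absent from the video or a
# predetermined sound (intercom, phone ring, rain) is detected.
EVENT_SOUNDS = {"intercom", "phone_ring", "rain"}

def listener_wants_to_convey(listener_present: bool, detected_sounds: set) -> bool:
    """True unless a predetermined event has occurred."""
    event_occurred = (not listener_present) or bool(detected_sounds & EVENT_SOUNDS)
    return not event_occurred
```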
- The information processing unit 55b performs facial expression analysis on the image information of the second sensing information 54b and determines whether or not the internal state of the listener is in a predetermined state (just woken up, having slept badly, in poor physical condition, unshaven, without make-up).
- The information processing section 55b may also determine whether or not the internal state of the listener is in a predetermined state using the biometric information (body temperature and heartbeat information) of the second sensing information 54b.
- When the listener's internal state is in the predetermined state, the information processing unit 55b determines that the listener's situation is "does not want to convey his/her own non-verbal information to the speaker." On the other hand, when the listener's internal state is not in the predetermined state, the information processing unit 55b determines that the listener's situation is "wants to convey his/her own non-verbal information to the speaker."
- The information processing unit 55b analyzes the second sensing information 54b and, if the listener is performing a predetermined action (surfing the Internet, playing a smartphone game, taking care of a child), determines that the listener's situation is "does not want to convey his/her own non-verbal information to the speaker." If the listener is not performing a predetermined action, it determines that the listener's situation is "wants to convey his/her own non-verbal information to the speaker."
- The information processing unit 55b analyzes the image information of the second sensing information 54b and determines whether the listener is properly listening based on the direction of the listener's face and how steadily the line of sight is held. The information processing section 55b determines that the listener is properly listening when the direction of the face or the position of the line of sight stays in a certain direction for a predetermined time.
- The information processing unit 55b performs language analysis on the audio information of the second sensing information 54b using speech analysis technology and, when the listener's utterance corresponds to a question, determines that the listener wants to ask the speaker a question.
- Likewise, when the listener's utterance corresponds to an expression of gratitude, the information processing unit 55b determines that the listener wants to express gratitude.
- The information processing unit 55b may also perform image analysis on the image information of the second sensing information 54b, count the number of nods, and determine that the listener wants to express gratitude when the number of nods reaches a predetermined number or more.
- The information processing unit 55b performs image analysis on the image information of the second sensing information 54b and determines the listener's overall emotional state (whether or not the listener is listening, has a question, wants to express gratitude, and so on) based on the listener's facial expression (e.g., an expression indicating a desire to speak), gestures, and the like.
- When the listener is listening to the talk, has a question, or wants to express gratitude, the information processing unit 55b determines that the listener's situation is "wants to convey his/her own non-verbal information to the speaker." Otherwise, it determines that the listener's situation is "does not want to convey his/her own non-verbal information to the speaker."
- FIG. 7 is a flow chart showing the processing procedure of the telepresence device 50 according to this embodiment.
- The acquisition unit 55a of the telepresence device 50a starts acquiring the first sensing information 54a from the input unit 52 and starts acquiring the second sensing information 54b from the telepresence device 50b (step S101).
- The acquisition unit 55a of the telepresence device 50 acquires the log information 54c and the characteristic information 54d from the server 100 (step S102).
- The information processing unit 55b of the telepresence device 50 executes user recognition (step S103). For example, in step S103, the information processing unit 55b determines whether or not the target user is present at the point where non-face-to-face communication is performed using the telepresence system.
- The telepresence device 50a starts non-face-to-face communication with the telepresence device 50b (step S104).
- The information processing section 55b sets the disclosure level to an initial value (step S105). For example, the initial value of the disclosure level is set to "1", but it may be changed as appropriate.
- The communication control unit 55c of the telepresence device 50a starts transmitting and receiving non-verbal information according to the disclosure level with the telepresence device 50b (step S106).
- The information processing section 55b identifies the speaker's situation based on the first sensing information 54a (step S107).
- The information processing section 55b identifies the listener's situation based on the second sensing information 54b (step S108).
- The information processing section 55b updates the disclosure level based on the speaker's situation and the listener's situation (step S109).
- if the telepresence device 50a continues the process (step S110, Yes), it returns to step S107. On the other hand, if the telepresence device 50a does not continue the process (step S110, No), it ends the non-face-to-face communication (step S111).
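The steps S101 to S111 above can be sketched as a single control loop. This is an illustrative, non-normative reading of the flowchart; the callables are hypothetical stand-ins for the units of the telepresence device 50a, not an API defined by the disclosure.

```python
def run_telepresence_session(acquire, recognize_user, identify_speaker,
                             identify_listener, update_level,
                             apply_level, should_continue):
    """Illustrative sketch of the FIG. 7 procedure (steps S101-S111);
    every callable is a hypothetical placeholder."""
    acquire()             # S101/S102: start acquiring sensing info, fetch log/characteristics
    recognize_user()      # S103: check that the target user is present
    level = 1             # S104/S105: start communication, disclosure level initial value
    apply_level(level)    # S106: start exchanging non-verbal information at this level
    while True:
        speaker = identify_speaker()              # S107: speaker's situation
        listener = identify_listener()            # S108: listener's situation
        level = update_level(speaker, listener)   # S109: update the disclosure level
        apply_level(level)
        if not should_continue():                 # S110: continue? No -> end
            break
    return level                                  # S111: end non-face-to-face communication
```

A caller would wire in the concrete sensing and communication units; the return value is the disclosure level in effect when the session ends.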
- FIG. 8 is a diagram for explaining an example of disclosure level control.
- FIG. 8 shows changes in the disclosure levels of each other's non-verbal information when persons 1A and 1B at different locations perform non-face-to-face communication using the telepresence devices 50.
- the horizontal axis in FIG. 8 is the axis corresponding to the time from the start to the end of communication.
- Line segment 6a indicates which person was the "speaker", and line segment 6b indicates which person was the "listener".
- Initially, person 1A is the speaker and person 1B is the listener, and the roles switch several times during the communication.
- level 1 is the initial value of the first disclosure level and the second disclosure level. It is assumed that the situation of the speaker and the situation of the listener during communication change as shown in FIG. 8 from time t1 to t6.
- Time t1 will be explained.
- person 1A is the speaker and person 1B is the listener.
- the listener's situation becomes "the listener wants to convey his/her own non-verbal information to the speaker".
- the communication situation shown in FIG. 5 changes from the fourth pattern to the second pattern, and the second disclosure level changes from 1 to 2.
- the telepresence device 50b transmits non-verbal information based on the second disclosure level 2 to the telepresence device 50a.
- Time t2 will be explained.
- person 1B is the speaker and person 1A is the listener.
- the speaker's situation becomes "the speaker wants to know the listener's non-verbal information".
- the communication situation shown in FIG. 5 changes from the second pattern to the first pattern, and the first disclosure level changes from 1 to 4.
- the telepresence device 50a transmits non-verbal information based on the first disclosure level 4 to the telepresence device 50b.
- Time t3 will be explained.
- person 1B is the speaker and person 1A is the listener.
- the listener's situation becomes "the listener does not want to convey his/her own non-verbal information to the speaker (partner)".
- the communication situation shown in FIG. 5 changes to the third pattern, and the first disclosure level changes from 4 to 2.
- the telepresence device 50a transmits non-verbal information based on the first disclosure level 2 to the telepresence device 50b.
- Time t4 will be explained.
- person 1B is the speaker and person 1A is the listener.
- the person 1A wants to take a break or eat and drink while the person 1B is talking.
- the listener's situation becomes "the listener does not want to convey his/her own non-verbal information to the speaker (partner)".
- although person 1A does not want person 1B to see him/her stretching or eating and drinking, voice conversation is still possible.
- the communication situation shown in FIG. 5 changes to the third pattern, and the first disclosure level changes from 4 to 2. It is assumed that the first disclosure level was updated to 4 between time t3 and time t4.
- Time t5 will be explained.
- person 1A and person 1B both wanted to take a break, or stopped talking and decided to each work on their own.
- the situations of the persons 1A and 1B are "the speaker does not want to know the non-verbal state of the listener" and "the listener does not want to convey their own non-verbal information to the speaker”.
- the communication situation shown in FIG. 5 changes to the fourth pattern, and the first disclosure level and the second disclosure level change to 1.
- the telepresence device 50a transmits non-verbal information based on the first disclosure level 1 to the telepresence device 50b.
- the telepresence device 50b transmits non-verbal information based on the second disclosure level 1 to the telepresence device 50a.
- Time t6 will be explained.
- person 1A and person 1B understand each other well, and with the end of the remaining time approaching, it is time for the final confirmation or summary.
- the situations of the persons 1A and 1B are "the speaker wants to know the listener's nonverbal state" and "the listener wants to convey his own nonverbal information to the speaker”.
- the communication situation shown in FIG. 5 changes to the first pattern, and the first disclosure level and the second disclosure level change from 1 to 4.
- the telepresence device 50a transmits non-verbal information based on the first disclosure level 4 to the telepresence device 50b.
- the telepresence device 50b transmits non-verbal information based on the second disclosure level 4 to the telepresence device 50a.
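Reading the time course t1 to t6 above together with the four communication patterns of FIG. 5 suggests a simple update rule. The mapping below is an assumption reconstructed from this walk-through (FIG. 5 itself is not reproduced in the text), expressed as a sketch:

```python
def classify_pattern(speaker_wants_to_know, listener_wants_to_convey):
    """Map the two situation flags to the four communication patterns of
    FIG. 5 (inferred from the FIG. 8 walk-through; an assumption)."""
    if speaker_wants_to_know and listener_wants_to_convey:
        return 1
    if not speaker_wants_to_know and listener_wants_to_convey:
        return 2
    if speaker_wants_to_know and not listener_wants_to_convey:
        return 3
    return 4

# Disclosure level assigned to each pattern, as read off the time course
# t1-t6 in FIG. 8 (assumed values).
PATTERN_TO_LEVEL = {1: 4, 2: 2, 3: 2, 4: 1}

def update_disclosure_level(speaker_wants_to_know, listener_wants_to_convey):
    return PATTERN_TO_LEVEL[classify_pattern(speaker_wants_to_know,
                                             listener_wants_to_convey)]
```

For example, at time t1 only the listener's flag is set (second pattern, level 2), while at time t6 both flags are set (first pattern, level 4).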
- as described above, while the first user and the second user are having an online call, the telepresence device 50 acquires the first sensing information of the first user acquired using a sensing device and the second sensing information of the second user acquired using a sensing device. The telepresence device 50 identifies the disclosure level of information transmitted and received between the first user and the second user based on the situation of the first user and the situation of the second user corresponding to the first sensing information and the second sensing information. This makes it possible to specify the disclosure level of non-verbal information relating to images, sounds, and operations from the standpoints of the listener and the speaker.
- the telepresence device 50 controls the type and amount of information transmitted and received between the first user and the second user based on the disclosure level.
- this allows non-verbal information to be transmitted and received with a type and amount of information appropriate to the standpoints of the listener and the speaker.
- as a result, comfortable communication can be realized: neither the speaker nor the listener offends the other, there is no need to interrupt communication midway, stress during non-face-to-face communication is reduced, time during non-face-to-face communication is used effectively, and the degree of understanding of the content of non-face-to-face communication is improved.
- the telepresence device 50 acquires first sensing information and second sensing information acquired using at least one sensing device of a microphone, a camera, and a sensor. This makes it possible to acquire information for specifying the disclosure level.
- based on the first sensing information and the second sensing information, the telepresence device 50 determines whether or not the first user wants to know the information of the second user as the situation of the first user, and determines whether or not the second user wants to convey his/her own information to the first user as the situation of the second user. This makes it possible to determine the situation of the speaker and the situation of the listener for specifying the disclosure level.
- based on the first sensing information and the second sensing information, the telepresence device 50 identifies the communication content between the first user and the second user, and determines the situation of the first user based on the communication content. This makes it possible to specify the situation of the first user (speaker) according to the content of the communication.
- the telepresence device 50 identifies the external environment of the second user based on the second sensing information, and determines the situation of the first user based on the external environment. This makes it possible to specify the situation of the first user (speaker) in response to changes in the external environment of the second user (listener).
- the telepresence device 50 identifies whether or not the second sensing information includes information from which a situation unpleasant to the first user can be confirmed, and determines the situation of the first user based on the identification result. This makes it possible to specify the situation of the first user (speaker) according to the situation of the second user (listener).
- the telepresence device 50 identifies the second user's external environmental factors based on the second sensing information, and determines the second user's situation based on the external environmental factors. As a result, it is possible to specify the situation of the second user (listener) according to external environmental factors of the second user (listener) (parcel service calls, rain, phone calls, etc.).
- based on the second sensing information, the telepresence device 50 identifies the second user's appearance or health condition, and determines the second user's situation based on the identified appearance or health condition. This makes it possible to specify the situation of the second user (listener) according to the internal state of the second user (listener).
- the telepresence device 50 identifies the behavioral state of the second user based on the second sensing information, and determines the second user's situation based on the identified behavioral state. This makes it possible to specify the situation of the second user (listener) according to the behavioral state of the second user (listener).
- the telepresence device 50 identifies the second user's emotional expression based on the second sensing information, and determines the second user's situation based on the identified emotional expression. This makes it possible to specify the situation of the second user (listener) according to the emotional expression of the second user (listener).
- in the embodiment described above, the information processing unit 55b of the telepresence device 50 specifies the disclosure level of non-verbal information based on the first sensing information 54a and the second sensing information 54b, but the present technology is not limited to this.
- the server 100 may acquire the first sensing information 54a and the second sensing information 54b from the telepresence devices 50a and 50b and identify the disclosure level. In this case, the server 100 notifies the telepresence devices 50a and 50b of the specified disclosure level, and the telepresence devices execute data communication of non-verbal information according to that disclosure level.
- FIG. 9 is a diagram showing a configuration example of the server 100. As shown in FIG. 9, the server 100 has a communication unit 110, an input unit 120, an output unit 130, a storage unit 140, and a control unit 150.
- the communication unit 110 is implemented by, for example, a NIC.
- the communication unit 110 is connected to the network 5 by wire or wirelessly, and transmits and receives information to and from the telepresence devices 50 via the network 5.
- the input unit 120 corresponds to input devices such as a keyboard and a mouse.
- the output unit 130 corresponds to a display device such as a display.
- the storage unit 140 is implemented, for example, by a semiconductor memory device such as a RAM or flash memory, or a storage device such as a hard disk or optical disk.
- the storage unit 140 has first sensing information 54a, second sensing information 54b, log information 54c, characteristic information 54d, and disclosure level information 54e.
- descriptions of the first sensing information 54a, the second sensing information 54b, the log information 54c, the characteristic information 54d, and the disclosure level information 54e are the same as those given above.
- the control unit 150 is implemented, for example, by a CPU, MPU, or the like executing a program stored inside the server 100 using a RAM or the like as a work area. The control unit 150 may also be implemented by an integrated circuit such as an ASIC or FPGA.
- the control unit 150 has an acquisition unit 150a, an information processing unit 150b, and a notification unit 150c.
- the acquisition unit 150a acquires the first sensing information 54a from the telepresence device 50a and registers it in the storage unit 140.
- the acquisition unit 150a acquires the second sensing information 54b from the telepresence device 50b and registers it in the storage unit 140.
- the acquisition unit 150a acquires information transmitted and received during communication between the telepresence devices 50 as log information 54c and registers it in the storage unit 140.
- the acquisition unit 150a acquires information input from the input unit 120 or the like, such as the user's name, sex, age, and human relationships, as characteristic information 54d, and registers the information in the storage unit 140.
- based on the first sensing information 54a, the second sensing information 54b, the log information 54c, and the characteristic information 54d, the information processing unit 150b specifies the disclosure level at which the "speaker" notifies the "listener" of non-verbal information.
- the information processing unit 150b registers the identified disclosure level information in the storage unit 140 as the disclosure level information 54e.
- Other processing related to the information processing section 150b is the same as that of the information processing section 55b described above.
- the notification unit 150c notifies the telepresence device 50 of the disclosure level information 54e registered by the information processing unit 150b.
- as described above, the server 100 acquires the first sensing information 54a and the second sensing information 54b from the telepresence devices 50a and 50b, specifies the disclosure level, and notifies the telepresence devices 50a and 50b of the specified disclosure level information.
- this allows data communication of non-verbal information corresponding to the disclosure level to be executed between the telepresence devices 50.
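The server-mediated variant described above can be sketched as follows. This is a minimal illustration under assumed names (Server100, apply_disclosure_level, and the injected function are placeholders, not interfaces defined by the patent): register() plays the role of the acquisition unit 150a, the injected specify_level function that of the information processing unit 150b, and notify_disclosure_level() that of the notification unit 150c.

```python
class Server100:
    """Hypothetical sketch of the server-side variant: the server collects
    sensing information from both telepresence devices, specifies the
    disclosure level, and notifies the devices, which then exchange
    non-verbal information according to that level."""

    def __init__(self, specify_level):
        self.storage = {}                   # plays the role of storage unit 140
        self.specify_level = specify_level  # information processing unit 150b logic

    def register(self, key, info):
        # Acquisition unit 150a: store sensing/log/characteristic information.
        self.storage[key] = info

    def notify_disclosure_level(self, devices):
        # Notification unit 150c: specify the level and push it to the devices.
        level = self.specify_level(self.storage.get("first_sensing"),
                                   self.storage.get("second_sensing"))
        self.storage["disclosure_level"] = level  # disclosure level information 54e
        for device in devices:
            device.apply_disclosure_level(level)
        return level
```

The devices themselves then carry out the non-verbal data communication at the notified level.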
- FIG. 10 is a hardware configuration diagram showing an example of a computer that realizes the functions of the telepresence device 50 and the server 100.
- the computer 1000 has a CPU 1100, a RAM 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600.
- Each part of computer 1000 is connected by bus 1050 .
- the CPU 1100 operates based on programs stored in the ROM 1300 or HDD 1400 and controls each section. For example, the CPU 1100 loads programs stored in the ROM 1300 or HDD 1400 into the RAM 1200 and executes processes corresponding to various programs.
- the ROM 1300 stores a boot program such as BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 is started, and programs dependent on the hardware of the computer 1000.
- the HDD 1400 is a computer-readable recording medium that non-transitorily records programs executed by the CPU 1100 and data used by these programs.
- specifically, the HDD 1400 is a recording medium that records the information processing program according to the present disclosure, which is an example of the program data 1450.
- a communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet).
- the CPU 1100 receives data from another device via the communication interface 1500, and transmits data generated by the CPU 1100 to another device.
- the input/output interface 1600 is an interface for connecting the input/output device 1650 and the computer 1000 .
- the CPU 1100 receives data from input devices such as a keyboard and mouse via the input/output interface 1600 .
- the CPU 1100 also transmits data to an output device such as a display, speaker, or printer via the input/output interface 1600 .
- the input/output interface 1600 may function as a media interface for reading a program or the like recorded on a predetermined recording medium.
- examples of media include optical recording media such as a DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), magneto-optical recording media such as an MO (Magneto-Optical disk), tape media, magnetic recording media, and semiconductor memories.
- the CPU 1100 of the computer 1000 implements the functions of the control unit 55 and the like by executing programs loaded on the RAM 1200 .
- the HDD 1400 stores programs according to the present disclosure and data in the storage unit 54 .
- the CPU 1100 reads the program data 1450 from the HDD 1400 and executes it; as another example, these programs may be obtained from another device via the external network 1550.
- the functions of the server 100 can also be realized by the computer 1000 described with reference to FIG.
- the CPU 1100 of the computer corresponding to the server 100 implements the functions of the control unit 150 and the like by executing programs loaded on the RAM 1200 .
- the HDD 1400 also stores an information processing program according to the present disclosure and data in the storage unit 140 .
- likewise, the CPU 1100 reads the program data 1450 from the HDD 1400 and executes it; as another example, these programs may be obtained from another device via the external network 1550.
- the information processing device includes an acquisition unit that acquires the first sensing information of the first user acquired using a sensing device and the second sensing information of the second user acquired using a sensing device while the first user and the second user are talking online, and an information processing unit that specifies the disclosure level of information transmitted and received between the first user and the second user based on the situation of the first user and the situation of the second user corresponding to the first sensing information and the second sensing information. This makes it possible to specify the disclosure level of non-verbal information relating to images, sounds, and operations from the standpoints of the listener and the speaker.
- the information processing device further comprises a communication control unit that controls the type and amount of information transmitted and received between the first user and the second user based on the disclosure level.
- the information processing device acquires the first sensing information and the second sensing information acquired using at least one sensing device of a microphone, a camera, and a sensor. This makes it possible to acquire information for specifying the disclosure level.
- the first user is a speaker user who speaks to the second user, and the second user is a listener user who listens to the first user. Based on the first sensing information and the second sensing information, the information processing device further executes a process of determining whether or not the first user wants to know the information of the second user as the situation of the first user, and a process of determining whether or not the second user wants to convey his/her own information to the first user as the situation of the second user. This makes it possible to determine the situation of the speaker and the situation of the listener for specifying the disclosure level.
- the information processing device specifies the communication content between the first user and the second user based on the first sensing information and the second sensing information, and determines the situation of the first user based on the communication content. This makes it possible to specify the situation of the first user (speaker) according to the content of the communication.
- the information processing unit identifies the external environment of the second user based on the second sensing information, and determines the situation of the first user based on the external environment. This makes it possible to specify the situation of the first user (speaker) in response to changes in the external environment of the second user (listener).
- the information processing unit identifies whether or not the second sensing information includes information from which a situation unpleasant to the first user can be confirmed, and determines the situation of the first user based on the identification result. This makes it possible to specify the situation of the first user (speaker) according to the situation of the second user (listener).
- the information processing device identifies external environmental factors of the second user based on the second sensing information, and determines the situation of the second user based on the external environmental factors. As a result, it is possible to specify the situation of the second user (listener) according to external environmental factors of the second user (listener) (parcel service calls, rain, phone calls, etc.).
- the information processing device identifies the appearance or health condition of the second user based on the second sensing information, and determines the situation of the second user based on the identified appearance or health condition. This makes it possible to specify the situation of the second user (listener) according to the internal state of the second user (listener).
- the information processing device identifies the behavioral state of the second user based on the second sensing information, and determines the situation of the second user based on the identified behavioral state. This makes it possible to specify the situation of the second user (listener) according to the behavioral state of the second user (listener).
- the information processing device identifies the emotional expression of the second user based on the second sensing information, and determines the situation of the second user based on the identified emotional expression. This makes it possible to specify the situation of the second user (listener) according to the emotional expression of the second user (listener).
- the present technology can also take the following configurations.
- (1) An information processing device comprising: an acquisition unit that acquires first sensing information of a first user acquired using a sensing device and second sensing information of a second user acquired using a sensing device while the first user and the second user are talking online; and an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
- (2) The information processing device according to (1), further comprising a communication control unit that controls the type and amount of information transmitted and received between the first user and the second user based on the disclosure level.
- (3) The information processing device according to (1) or (2), wherein the acquisition unit acquires the first sensing information and the second sensing information acquired using at least one sensing device among a microphone, a camera, and a sensor.
- (4) The information processing device according to any one of (1) to (3), wherein the first user is a speaker user who speaks to the second user, the second user is a listener user who listens to the first user, and the information processing unit further executes a process of determining, based on the first sensing information and the second sensing information, whether or not the first user wants to know the information of the second user as the situation of the first user, and a process of determining, based on the first sensing information and the second sensing information, whether or not the second user wants to convey his/her own information to the first user as the situation of the second user.
- (5) The information processing device according to any one of (1) to (4), wherein the information processing unit specifies the communication content between the first user and the second user based on the first sensing information and the second sensing information, and determines the situation of the first user based on the communication content.
- (6) The information processing device according to any one of (1) to (5), wherein the information processing unit specifies the external environment of the second user based on the second sensing information, and determines the situation of the first user based on the external environment.
- (7) The information processing device according to any one of (1) to (6), wherein the information processing unit identifies whether or not the second sensing information includes information from which a situation unpleasant to the first user can be confirmed, and determines the situation of the first user based on the identification result.
- (8) The information processing device according to any one of (1) to (7), wherein the information processing unit identifies external environmental factors of the second user based on the second sensing information, and determines the situation of the second user based on the external environmental factors.
- (9) The information processing device according to any one of (1) to (8), wherein the information processing unit identifies the appearance or health condition of the second user based on the second sensing information, and determines the situation of the second user based on the identified appearance or health condition.
- (10) The information processing device according to any one of (1) to (9), wherein the information processing unit identifies the behavioral state of the second user based on the second sensing information, and determines the situation of the second user based on the identified behavioral state.
- (11) The information processing device according to any one of (1) to (10), wherein the information processing unit identifies the emotional expression of the second user based on the second sensing information, and determines the situation of the second user based on the identified emotional expression.
- (12) An information processing method comprising: acquiring first sensing information of a first user acquired using a sensing device and second sensing information of a second user acquired using a sensing device while the first user and the second user are talking online; and specifying a disclosure level of information transmitted and received between the first user and the second user based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
- (13) An information processing program for causing a computer to function as: an acquisition unit that acquires first sensing information of a first user acquired using a sensing device and second sensing information of a second user acquired using a sensing device while the first user and the second user are talking online; and an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
1. Embodiment
1-1. Configuration of the telepresence system according to the embodiment
1-2. Definition of non-verbal information
1-3. Definition of disclosure levels of non-verbal information
1-4. Configuration of the telepresence device according to the embodiment
1-5. Identifying the disclosure level
1-5-1. Process of detecting the speaker's situation
1-5-1-1. Content of the communication
1-5-1-2. Changes in the external environment
1-5-1-3. State of the person
1-5-1-4. Relationship between the speaker and the listener
1-5-2. Process of detecting the listener's situation
1-5-2-1. External environmental factors
1-5-2-2. Internal state of the person
1-5-2-3. Behavioral state of the person
1-5-2-4. Emotional expression
1-6. Processing procedure of the telepresence device according to the embodiment
1-7. Example of controlling the disclosure level of non-verbal information
1-8. Effects of the embodiment
2. Other embodiments
3. Hardware configuration
4. Conclusion
[1-1. Configuration of the telepresence system according to the embodiment]
First, an example of the telepresence system according to this embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing the configuration of the telepresence system according to the embodiment. As shown in FIG. 1, this telepresence system has telepresence devices 50a and 50b and a server 100. The telepresence devices 50a and 50b and the server 100 are connected to each other via a network 5.
When communicating with a partner using the telepresence system shown in FIG. 1, information is exchanged by presenting and sharing content and by using voice calls. In this embodiment, information obtained as cues other than such content and language is defined as "non-verbal information".
FIG. 3 is a diagram showing a setting example of disclosure levels of non-verbal information. In principle, the more of the video, audio, and operation items that are disclosed, the higher the disclosure level of the non-verbal information. In this embodiment, based on the amount of information, the importance of the video, audio, and operation items is ranked video > audio > operation, and this ranking is mapped to the disclosure levels. Levels can also be subdivided by disclosing only some of the items within the video.
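As a concrete illustration of this principle: the per-level item sets of FIG. 3 are not reproduced in this text, so the table below is an assumed example that is merely consistent with the stated ordering video > audio > operation.

```python
# Assumed example of disclosure level settings: each higher level
# discloses strictly more non-verbal information items, with video
# treated as the most information-rich item, then audio, then operation.
DISCLOSURE_ITEMS = {
    1: {"operation"},
    2: {"operation", "audio"},
    3: {"operation", "audio", "video_face"},                # only part of the video items
    4: {"operation", "audio", "video_face", "video_full"},  # full disclosure
}

def allowed_items(level):
    """Return the set of non-verbal information items disclosed at a level."""
    return DISCLOSURE_ITEMS[level]
```

Splitting the video item into "video_face" and "video_full" mirrors the remark that levels can be subdivided by disclosing only some items within the video.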
Next, the configuration of the telepresence device 50 shown in FIG. 1 will be described. FIG. 4 is a diagram showing a configuration example of the telepresence device 50 according to this embodiment. Here, the telepresence device 50a is used for the description. As shown in FIG. 4, the telepresence device 50a has a communication unit 51, an input unit 52, an output unit 53, a storage unit 54, and a control unit 55.
The information processing unit 55b of the telepresence device 50a described with reference to FIG. 4 identifies the speaker's situation based on the first sensing information 54a, and detects the listener's situation based on the second sensing information 54b.
An example of the process executed by the information processing unit 55b to detect the speaker's situation (whether or not the speaker wants to know the non-verbal information of the listener (partner)) will be described. The information processing unit 55b executes the following processes based on the first sensing information 54a, the second sensing information 54b, the log information 54c, and the characteristic information 54d. Various detection methods are described below; the information processing unit 55b may use any one of them or combine several of them. When the information processing unit 55b executes multiple detection methods and their determination results differ, it determines the speaker's situation by, for example, a weighted majority vote.
When the speaker has been talking for a certain time or longer during non-face-to-face communication but receives no reaction from the listener, or receives no reply to questions or proposals, the speaker becomes anxious about whether the listener is listening and wants to know the listener's non-verbal information.
When, during non-face-to-face communication, the speaker hears predetermined external environmental sounds from the listener's side, such as the sound of a television, a crying baby, construction noise, or rain, the speaker wants to know non-verbal information including the listener's surrounding environment.
When, during non-face-to-face communication, the speaker hears sounds or voices from the listener's side that are generally felt to be unpleasant, such as coughing, tongue clicking, or sighing, or when the partner's facial expression looks sleepy or bored, the speaker wants to know non-verbal information including the listener's psychological state.
When the speaker and the listener meet for the first time, both want non-verbal information in order to learn what kind of person the other is. On the other hand, when they have communicated many times in the past and understand each other well, or when one has negative feelings such as dislike toward the other, they do not want the other's non-verbal information.
An example of the process executed by the information processing unit 55b to detect the listener's situation (whether or not the listener wants to convey his/her own non-verbal information to the speaker (partner)) will be described. The information processing unit 55b executes the following processes based on the first sensing information 54a, the second sensing information 54b, the log information 54c, and the characteristic information 54d. Various detection methods are described below; the information processing unit 55b may use any one of them or combine several of them. When the information processing unit 55b executes multiple detection methods and their determination results differ, it determines the listener's situation by, for example, a weighted majority vote.
During communication, the listener may temporarily need to leave the spot (the telepresence device 50 or the location) even while the speaker is talking. For example, while the listener is communicating using the telepresence system at his/her own desk at home, events such as a parcel delivery arriving, a phone call coming in, or rain starting to fall may occur. In such a case, the listener must respond to the event. However, if the listener discloses non-verbal information (video) to the speaker at this time, it may offend the partner or interrupt the talk midway. Therefore, when a predetermined event occurs, the listener does not want to convey his/her own non-verbal information.
When the listener's appearance or health condition is a factor, such as having just woken up, having severe bed hair, feeling unwell, being unshaven, or wearing no makeup, the listener does not want to convey his/her own non-verbal information to the partner (speaker) out of a sense that it would be impolite. Alternatively, when the listener has no interest in the speaker's talk in the first place, the listener does not want the partner to notice this, and so does not want to convey his/her own non-verbal information.
During non-face-to-face communication, the listener may communicate with the speaker while doing something else. When the listener is doing something else at the same time as communicating (web surfing, playing a smartphone game, looking after a child, etc.), the listener does not want the partner to see this, and so does not want to convey his/her own non-verbal information.
During non-face-to-face communication, the listener may want to show that he/she is properly listening to the speaker's talk, may have a question, or may want to convey an emotional expression such as gratitude to the partner; in such cases, the listener wants to convey his/her own non-verbal information to the partner.
Next, an example of the processing procedure of the telepresence device 50 shown in FIG. 4 will be described. FIG. 7 is a flowchart showing the processing procedure of the telepresence device 50 according to this embodiment. As shown in FIG. 7, the acquisition unit 55a of the telepresence device 50a starts acquiring the first sensing information 54a from the input unit 52, and starts acquiring the second sensing information 54b from the telepresence device 50b (step S101).
An example of disclosure level control when non-face-to-face communication is performed using the telepresence system according to this embodiment will be described. FIG. 8 is a diagram for explaining an example of disclosure level control. FIG. 8 shows changes in the disclosure levels of each other's non-verbal information when persons 1A and 1B at different locations perform non-face-to-face communication using the telepresence devices 50.
上記のように、本実施形態に係るテレプレゼンス装置50が、第1利用者と第2利用者とがオンラインを介した通話を行っている間に、センシング装置を用いて取得される第1利用者の第1センシング情報と、センシング装置を用いて取得される第2利用者の第2センシング情報とを取得する。テレプレゼンス装置50は、第1センシング情報と第2センシング情報とに対応する第1利用者の状況と第2利用者の状況とを基にして、第1利用者と第2利用者との間で送受信される情報の開示レベルを特定する。これによって、聞き手と話し手との立場から、映像、音声、操作に関する非言語情報の開示レベルを特定することができる。
上記の実施形態で説明したテレプレゼンスシステムに含まれるテレプレゼンス装置50a,50b、サーバ100の処理は一例であり、その他の処理を実行してもよい。
上述してきた各実施形態に係るテレプレゼンス装置50、サーバ100等の情報機器は、たとえば、図10に示すような構成のコンピュータ1000によって実現される。図10は、テレプレゼンス装置50、サーバ100の機能を実現するコンピュータの一例を示すハードウェア構成図である。以下、実施形態に係るテレプレゼンス装置50を例に挙げて説明する。コンピュータ1000は、CPU1100、RAM1200、ROM(Read Only Memory)1300、HDD(Hard Disk Drive)1400、通信インタフェース1500、及び入出力インタフェース1600を有する。コンピュータ1000の各部は、バス1050によって接続される。
情報処理装置は、第1利用者と第2利用者とがオンラインを介した通話を行っている間に、センシング装置を用いて取得される前記第1利用者の第1センシング情報と、センシング装置を用いて取得される前記第2利用者の第2センシング情報とを取得する取得部と、前記第1センシング情報と前記第2センシング情報とに対応する前記第1利用者の状況と前記第2利用者の状況とを基にして、前記第1利用者と第2利用者との間で送受信される情報の開示レベルを特定する情報処理部とを備える。これによって、聞き手と話し手との立場から、映像、音声、操作に関する非言語情報の開示レベルを特定することができる。
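The acquire, infer-situations, specify-disclosure-level pipeline summarized above can be illustrated with a toy sketch. The situation labels, the numeric levels, and the mapping rules here are invented for illustration only and are not taken from the disclosure.

```python
def infer_speaker_situation(sensing):
    """Stand-in for the real analysis of the speaker's sensing data:
    does the speaker want the listener's non-verbal information?"""
    return "wants_info" if sensing.get("attentive") else "neutral"


def infer_listener_situation(sensing):
    """Stand-in for the listener-side analysis: is the listener
    willing to convey his or her non-verbal information?"""
    return "willing" if sensing.get("presentable") else "unwilling"


def disclosure_level(first_sensing, second_sensing):
    """Map the pair of situations to an assumed disclosure level:
    3 = full video and audio, 1 = audio only, 0 = nothing."""
    speaker = infer_speaker_situation(first_sensing)
    listener = infer_listener_situation(second_sensing)
    if speaker == "wants_info" and listener == "willing":
        return 3
    if speaker == "wants_info":
        return 1
    return 0


print(disclosure_level({"attentive": True}, {"presentable": False}))  # 1
```

A communication control unit would then use the returned level to decide which media streams (and how much of each) to forward between the two devices.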
(1)
An information processing apparatus comprising:
an acquisition unit that acquires, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
(2)
The information processing apparatus according to (1), further comprising a communication control unit that controls, based on the disclosure level, the type and amount of information transmitted and received between the first user and the second user.
(3)
The information processing apparatus according to (1) or (2), wherein the acquisition unit acquires the first sensing information and the second sensing information obtained using at least one sensing device among a microphone, a camera, and a sensor.
(4)
The information processing apparatus according to any one of (1) to (3), wherein the first user is a speaker who talks to the second user, and the second user is a listener who listens to the first user, and
the information processing unit further executes a process of determining, based on the first sensing information and the second sensing information, whether the first user wants to know information on the second user as the situation of the first user, and of determining, based on the first sensing information and the second sensing information, whether the second user wants to convey information to the first user as the situation of the second user.
(5)
The information processing apparatus according to any one of (1) to (4), wherein the information processing unit specifies the content of communication between the first user and the second user based on the first sensing information and the second sensing information, and determines the situation of the first user based on the communication content.
(6)
The information processing apparatus according to any one of (1) to (5), wherein the information processing unit specifies the external environment of the second user based on the second sensing information, and determines the situation of the first user based on the external environment.
(7)
The information processing apparatus according to any one of (1) to (6), wherein the information processing unit specifies whether the second sensing information contains information from which a situation unpleasant to the first user can be confirmed, and determines the situation of the first user based on the result of that specification.
(8)
The information processing apparatus according to any one of (1) to (7), wherein the information processing unit specifies an external environmental factor of the second user based on the second sensing information, and determines the situation of the second user based on the external environmental factor.
(9)
The information processing apparatus according to any one of (1) to (8), wherein the information processing unit specifies the appearance or health condition of the second user based on the second sensing information, and determines the situation of the second user based on the specified appearance or health condition.
(10)
The information processing apparatus according to any one of (1) to (9), wherein the information processing unit specifies the behavioral state of the second user based on the second sensing information, and determines the situation of the second user based on the specified behavioral state.
(11)
The information processing apparatus according to any one of (1) to (10), wherein the information processing unit specifies the emotional expression of the second user based on the second sensing information, and determines the situation of the second user based on the specified emotional expression.
(12)
An information processing method comprising:
acquiring, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
specifying a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
(13)
An information processing program for causing a computer to function as:
an acquisition unit that acquires, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
50a, 50b Telepresence device
51, 110 Communication unit
52, 120 Input unit
52a Camera
52b Microphone
52c Various sensors
53, 130 Output unit
53a Display
53b Speaker
53c Actuator
54, 140 Storage unit
54a First sensing information
54b Second sensing information
54c Log information
54d Characteristic information
54e Disclosure level information
55, 150 Control unit
55a, 150a Acquisition unit
55b, 150b Information processing unit
55c Communication control unit
100 Server
150c Notification unit
Claims (13)
- An information processing apparatus comprising:
an acquisition unit that acquires, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
- The information processing apparatus according to claim 1, further comprising a communication control unit that controls, based on the disclosure level, the type and amount of information transmitted and received between the first user and the second user.
- The information processing apparatus according to claim 1, wherein the acquisition unit acquires the first sensing information and the second sensing information obtained using at least one sensing device among a microphone, a camera, and a sensor.
- The information processing apparatus according to claim 1, wherein the first user is a speaker who talks to the second user, and the second user is a listener who listens to the first user, and
the information processing unit further executes a process of determining, based on the first sensing information and the second sensing information, whether the first user wants to know information on the second user as the situation of the first user, and of determining, based on the first sensing information and the second sensing information, whether the second user wants to convey information to the first user as the situation of the second user.
- The information processing apparatus according to claim 4, wherein the information processing unit specifies the content of communication between the first user and the second user based on the first sensing information and the second sensing information, and determines the situation of the first user based on the communication content.
- The information processing apparatus according to claim 4, wherein the information processing unit specifies the external environment of the second user based on the second sensing information, and determines the situation of the first user based on the external environment.
- The information processing apparatus according to claim 4, wherein the information processing unit specifies whether the second sensing information contains information from which a situation unpleasant to the first user can be confirmed, and determines the situation of the first user based on the result of that specification.
- The information processing apparatus according to claim 4, wherein the information processing unit specifies an external environmental factor of the second user based on the second sensing information, and determines the situation of the second user based on the external environmental factor.
- The information processing apparatus according to claim 4, wherein the information processing unit specifies the appearance or health condition of the second user based on the second sensing information, and determines the situation of the second user based on the specified appearance or health condition.
- The information processing apparatus according to claim 3, wherein the information processing unit specifies the behavioral state of the second user based on the second sensing information, and determines the situation of the second user based on the specified behavioral state.
- The information processing apparatus according to claim 3, wherein the information processing unit specifies the emotional expression of the second user based on the second sensing information, and determines the situation of the second user based on the specified emotional expression.
- An information processing method comprising:
acquiring, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
specifying a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
- An information processing program for causing a computer to function as:
an acquisition unit that acquires, while a first user and a second user are conducting an online call, first sensing information on the first user obtained using a sensing device and second sensing information on the second user obtained using a sensing device; and
an information processing unit that specifies a disclosure level of information transmitted and received between the first user and the second user, based on a situation of the first user and a situation of the second user corresponding to the first sensing information and the second sensing information.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023563572A JPWO2023095531A1 (ja) | 2021-11-25 | 2022-10-27 | |
CN202280075969.8A CN118251883A (zh) | 2021-11-25 | 2022-10-27 | 信息处理装置、信息处理方法和信息处理程序 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021191014 | 2021-11-25 | ||
JP2021-191014 | 2021-11-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023095531A1 true WO2023095531A1 (ja) | 2023-06-01 |
Family
ID=86539364
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/040089 WO2023095531A1 (ja) | 2021-11-25 | 2022-10-27 | 情報処理装置、情報処理方法および情報処理プログラム |
Country Status (3)
Country | Link |
---|---|
JP (1) | JPWO2023095531A1 (ja) |
CN (1) | CN118251883A (ja) |
WO (1) | WO2023095531A1 (ja) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018193687A1 (ja) * | 2017-04-18 | 2018-10-25 | ソニー株式会社 | 情報処理装置、情報処理方法、および記録媒体 |
JP2019208167A (ja) * | 2018-05-30 | 2019-12-05 | 公立大学法人首都大学東京 | テレプレゼンスシステム |
JP2020202567A (ja) * | 2017-12-27 | 2020-12-17 | ハイパーコネクト インコーポレイテッド | 映像通話サービスを提供する端末及びサーバ |
JP2021021025A (ja) | 2019-07-29 | 2021-02-18 | 国立大学法人東京農工大学 | 導電性多孔質体の製造方法および熱電変換部材の製造方法 |
JP2021071632A (ja) | 2019-10-31 | 2021-05-06 | ソニー株式会社 | 情報処理装置、情報処理方法、及び、プログラム |
JP2021099538A (ja) * | 2018-03-30 | 2021-07-01 | ソニーグループ株式会社 | 情報処理装置、情報処理方法およびプログラム |
- 2022-10-27: JP application JP2023563572A (publication JPWO2023095531A1, ja), active, pending
- 2022-10-27: CN application CN202280075969.8 (publication CN118251883A, zh), active, pending
- 2022-10-27: PCT application PCT/JP2022/040089 (publication WO2023095531A1, ja), active, application filing
Also Published As
Publication number | Publication date |
---|---|
CN118251883A (zh) | 2024-06-25 |
JPWO2023095531A1 (ja) | 2023-06-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22898315 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2023563572 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202280075969.8 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022898315 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022898315 Country of ref document: EP Effective date: 20240625 |