WO2024185334A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2024185334A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
output
intimacy
background
information processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2024/002446
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
裕美 深谷
秀平 宮崎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Priority to CN202480015970.0A (publication CN120883628A)
Priority to JP2025505116A (publication JPWO2024185334A1)
Priority to EP24766709.0A (publication EP4679862A1)
Publication of WO2024185334A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities; audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/11 Positioning of individual sound objects, e.g. moving airplane, within a sound field

Definitions

  • The present disclosure relates to an information processing device, an information processing method, and a program. More specifically, it relates to an information processing device, an information processing method, and a program that, when multiple user terminals are connected to a network for a conversation or meeting, output background sounds and background images of a real environment, such as a cafe, to each user terminal, so that each user feels as if the conversation is taking place in that real environment.
  • Each user participating in the conversation connects their own user device, such as a PC or smartphone, to a communications network such as the Internet, and images and audio are sent and received between the devices via the communications network.
  • Patent Document 1: International Publication No. WO2019/155735
  • This patent document discloses a configuration in which a virtual image of a conversation partner is displayed on a user terminal, and the orientation and facial expression of the displayed virtual image are changed in the same manner as those of the actual conversation partner.
  • However, this disclosed configuration only controls the display of the users having a conversation and does not control background sounds or background images, so it does not give a user the feeling of being in the same space as the other users in the conversation.
  • The present disclosure has been made in consideration of the above-mentioned problem, and aims to provide an information processing device, an information processing method, and a program that output background sounds and background images of a real environment, such as a cafe, to the user terminal of a user who is having a conversation via a communication network, so that the user feels as if the conversation is taking place in that real environment.
  • A first aspect of the present disclosure is an information processing device comprising: a communication unit that receives a user utterance of a conversation partner via a network; and an output voice control unit that controls the output of the user utterance, wherein the output voice control unit executes voice direction control so that the user utterance is heard as an utterance coming from the user position of the conversation partner relative to a predefined user position.
  • A second aspect of the present disclosure is an information processing method executed in an information processing device, wherein the information processing device includes: a communication unit that receives a user utterance of a conversation partner via a network; and an output voice control unit that controls the output of the user utterance, and wherein the output voice control unit executes voice direction control so that the user utterance is heard as an utterance coming from the user position of the conversation partner relative to a predefined user position.
  • A third aspect of the present disclosure is a program for causing an information processing device to execute information processing, wherein the information processing device includes: a communication unit that receives a user utterance of a conversation partner via a network; and an output voice control unit that controls the output of the user utterance, and wherein the program causes the output voice control unit to execute voice direction control so that the user utterance is heard as an utterance coming from the user position of the conversation partner relative to a predefined user position.
  • the program disclosed herein is, for example, a program that can be provided by a storage medium or a communication medium in a computer-readable format to an information processing device or computer system capable of executing various program codes.
  • a system refers to a logical collective configuration of multiple devices, and is not limited to devices that are located within the same housing.
  • a configuration is realized in which voice direction control is performed so that a user's utterance of a conversation partner via a network is heard as coming from the conversation partner's user position relative to a predefined user position.
  • the device has a communication unit that receives a user utterance from a conversation partner via a network, and an output voice control unit that executes output control of the user utterance.
  • the output voice control unit executes voice direction control and volume control so that the user utterance is heard as an utterance coming from a user position of the conversation partner relative to a predefined user position.
  • The user position of the conversation partner relative to the predefined user position is set either to a predetermined fixed position or according to the degree of intimacy with the conversation partner; the higher the intimacy, the closer the conversation partner's position is set to the user's own position.
  • This configuration makes it possible to execute voice direction control so that a user utterance of a conversation partner received via a network is heard as an utterance coming from the conversation partner's user position relative to a predefined user position. Note that the effects described in this specification are merely examples and are not limiting, and additional effects may also be provided.
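  • As an illustration of the relationship summarized above (higher intimacy places the conversation partner closer, and a closer partner is heard louder), a minimal Python sketch follows. The intimacy range of 0 to 1, the distance limits, the linear mapping, and the inverse-distance volume rule are all assumptions made for this sketch; the disclosure does not specify them, and the function names are hypothetical.

```python
def partner_distance(intimacy, near=0.5, far=3.0):
    """Map an intimacy score to a virtual distance (assumed unit: metres).

    Assumption: intimacy is normalised to [0, 1]; 1 places the partner at
    `near`, 0 places the partner at `far`, with linear interpolation between.
    """
    intimacy = min(max(intimacy, 0.0), 1.0)  # clamp out-of-range scores
    return far - (far - near) * intimacy


def distance_gain(distance, reference=1.0):
    """Assumed inverse-distance volume rule, capped at full volume (1.0)."""
    return min(1.0, reference / max(distance, 1e-6))


if __name__ == "__main__":
    for score in (0.0, 0.5, 1.0):
        d = partner_distance(score)
        print(f"intimacy={score:.1f} distance={d:.2f} gain={distance_gain(d):.2f}")
```

  • Any monotonically decreasing mapping from intimacy to distance would serve the same purpose; the linear form is used here only because it is the simplest to read.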
  • FIG. 1 is a diagram illustrating an overview of a configuration of an information processing system according to the present disclosure and processing executed by the system.
  • FIG. 2 is a diagram illustrating specific examples of background audio data and background image data provided by a background data providing server to a user terminal.
  • FIG. 3 is a diagram illustrating an example of a process for storing background data (background audio data, background image data) in a user terminal.
  • FIG. 4 is a diagram illustrating a configuration example of a user terminal used in the first embodiment.
  • FIG. 5 is a diagram illustrating a specific example of a voice control process executed by an output voice control unit of a user terminal.
  • FIG. 6 is a diagram illustrating an example of a processing sequence for determining background settings and the positions of each user.
  • FIG. 13 is a diagram illustrating a setting that enables each user to hear the speech of other users as if it were coming from the user's set position.
  • 11 is a diagram illustrating an example of a calculation process of an audio output control parameter applied to an output audio control process executed by an output audio control unit.
  • FIG. 11 is a diagram illustrating an example of a calculation process of an audio output control parameter applied to an output audio control process executed by an output audio control unit.
  • FIG. 11 is a diagram illustrating an example of a process for controlling the output of speech of other users b to d, which is executed by an output speech control unit of user terminal a of user a.
  • 13 is a diagram illustrating an example of a process executed by an output voice control unit of user terminal b of user b.
  • FIG. 13 is a diagram illustrating an example of a process executed by an output voice control unit of user terminal b of user b.
  • FIG. 13 is a diagram illustrating an example in which four users a to d are having a conversation using background data of a cafe in which various background sounds are present.
  • 11A and 11B are diagrams illustrating a specific example of an output audio control process executed by an output audio control unit of a user terminal in a setting with a cafe as the background.
  • FIG. 13 is a diagram illustrating an example in which four users a to d are having a conversation using background data of a cafe with many people around.
  • 11A and 11B are diagrams illustrating a specific example of an output audio control process executed by an output audio control unit of a user terminal in a setting of a cafe with many people around in the background.
  • FIG. 11 is a diagram for explaining a specific processing example of the second embodiment.
  • 11A and 11B are diagrams illustrating an example of a display control process for users (avatars or real images corresponding to the users) according to the degree of intimacy between the users.
  • 11A and 11B are diagrams illustrating specific examples of images that an image output unit displays on a display unit in accordance with the display positions of each user that an output image control unit has determined in accordance with intimacy information calculated by an intimacy calculation unit.
  • FIG. 11 is a diagram illustrating a graph showing an example of a control process of a user speech output volume according to the degree of intimacy executed by an output voice control unit of the user terminal a.
  • FIG. 11 is a diagram illustrating an example of an output voice control process corresponding to a specific degree of intimacy between a user a and users b to d.
  • 11 is a diagram illustrating an example of an output voice control process corresponding to a specific degree of intimacy between a user a and users b to d.
  • FIG. 11 is a diagram illustrating a configuration example of a user terminal used in the second embodiment.
  • 11A and 11B are diagrams illustrating a detailed configuration example of an intimacy calculation unit and a specific example of an intimacy calculation process.
  • FIG. 13 is a diagram showing an example of a calculation process of a "user preference-based intimacy" calculated by an intimacy calculation unit.
  • FIG. 13 is a diagram showing an example of a calculation process of a "conversation density-based intimacy" calculated by an intimacy calculation unit.
  • 11A and 11B are diagrams illustrating an example of a process for changing a display mode executed by an output image control unit while a conversation is being held between a plurality of users a to d.
  • 11A and 11B are diagrams illustrating an example of a process of changing a user image (avatar image or real image) in response to a change in intimacy level, which is executed by an output image control unit.
  • 11 is a diagram showing an example in which the user terminals of users who are engaged in a conversation via a network output different background data to each user terminal.
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a conversation between users;
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a conversation between users;
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a conversation between users;
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a conversation between users;
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a conversation between users;
  • 11A and 11B are diagrams illustrating an example of a process for switching background data to be output to a terminal according to a
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 1 is a diagram illustrating an example of a process in which a user a uses a user terminal a to converse with another user via a network.
  • FIG. 2 is a diagram illustrating an example of the hardware configuration of a user terminal and a server.
  • (Example 1) Example of executing voice control based on the user's position
  • 2-1.
  • (Example 2) Example of executing voice output control and display control according to the intimacy between users
  • 3-1. Specific process example of the voice output control and display control process according to the intimacy between users
  • 3-2.
  • (Example 3) Example of using different background data in each user terminal
  • 4-1. (Processing Example 1) A processing example in which the background data set in one's own terminal continues to be output when speaking to another user and having a conversation with that user
  • 4-2. (Processing Example 2) A processing example in which, when speaking to another user and having a conversation with that user, only the background audio data among the background data set in one's own terminal is switched to the background audio set in the user terminal of the conversation partner
  • 4-3. (Processing Example 3) A processing example in which, when speaking to another user and having a conversation with that user, not only the background audio data but also the background image data among the background data set in one's own terminal is switched to the background data set in the user terminal of the conversation partner
  • 4-4. (Processing Example 4) A processing example in which, when speaking to another user and having a conversation with that user, the background data set in one's own terminal is set to be continuously output
  • 4-5. (Processing Example 5) A processing example in which, when speaking to a new user during a conversation between multiple users, the background audio data of the new user's terminal is set to be transmitted to and output from the user terminals of the multiple users in the conversation
  • 4-6. (Processing Example 6) A processing example in which, when speaking to a new user during a conversation between multiple users, the background audio data and background image data of the new user's terminal are set to be transmitted to and output from the user terminals of the multiple users in the conversation
  • 4-7. (Processing Example 7) A processing example in which, when speaking to a new user during a conversation between multiple users, the background audio data of the new user's terminal is set to be transmitted to and output from the user terminals of the multiple users in the conversation
  • 5. An example of a specific processing sequence for outputting background data to a user terminal and having a conversation between users
  • 6. An example of a hardware configuration of a user terminal and a server
  • 7. Summary of the configuration of the present disclosure
  • Figure 1 shows an example of a system for conducting remote conferences, remote meetings, online games, etc., and shows an example of the configuration of an information processing system that allows users to converse with each other via a communication network.
  • Figure 1 shows users a (11a) to d (11d) who are participating in the conversation via the communication network, user terminals a (21a) to d (21d) used by the respective users, a communication management server 50 that provides the communication execution environment, and a background data providing server 70 that provides various background audio data and background image data.
  • The communication management server 50 is, for example, a remote conference management server that provides a remote conference execution environment, or a game server that provides a game execution environment.
  • The background data providing server 70 is a server that provides each user terminal with background audio data and background image data of various locations, such as cafes and conference rooms, in which users a (11a) to d (11d) hold conversations such as conferences.
  • Although the communication management server 50 and the background data providing server 70 are shown as separate servers in the figure, they may be implemented as a single server.
  • User terminals a (21a) to d (21d) are connected to the communication management server 50 via the communication network 30, and the voices and images output from user terminals a (21a) to d (21d) are transmitted and received between these terminals via the communication management server 50.
  • The various background audio data and background image data provided by the background data providing server 70 can be stored in user terminals a (21a) to d (21d) before a conversation is carried out via the communication network.
  • Alternatively, user terminals a (21a) to d (21d) can carry out a conversation over the communication network while acquiring the various background audio data and background image data provided by the background data providing server 70.
  • In this case, the connection between the background data providing server 70 and user terminals a (21a) to d (21d) is maintained while the conversation is carried out over the communication network.
  • the user terminals 21a to 21d are each composed of a communication-enabled information processing device such as a PC, a smartphone, or a tablet terminal. Each of these user terminals 21a to 21d has a microphone and a camera, and voice data such as user utterances and image data such as images of the user's face acquired at the user terminal 21 are transmitted to other user terminals 21 via the communication management server 50.
  • Each of the user terminals 21a to 21d that execute communication processing via the network performs audio output control of each user's conversation sound, as well as audio output control and image output control for outputting background sound and background images of a certain environment, for example.
  • FIG. 2 shows a specific example of background audio data and background image data stored in the storage unit of the background data providing server 70.
  • the storage unit of the background data providing server 70 stores background data (background audio data, background image data) corresponding to the following various backgrounds, for example.
  • Background audio data is various audio data generated within the space that constitutes the background. If the background space contains walls, ceilings, floors, etc., the background audio data is generated taking into account the reverberations from these walls, etc.
  • the audio data will include the sound of a coffee siphon in the cafe space, the voices of people talking in the cafe, etc., and will also include reverberations from the walls of the cafe.
  • the background data providing server 70 generates audio data files for various environments and stores them in a storage unit. For example, it performs an analysis of the impulse response of a real space to generate audio data files that contain audio data corresponding to various real spaces.
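  • The text above says only that the server analyzes the impulse response of a real space when generating background audio. One common way to apply a measured impulse response to a dry recording is discrete convolution, sketched below in Python. Using convolution here is an assumption of this sketch rather than something the disclosure states, and the function name is hypothetical.

```python
def apply_impulse_response(dry, impulse_response):
    """Convolve a dry signal with a room impulse response (direct form).

    Both arguments are lists of float samples at the same sample rate.
    The result has len(dry) + len(impulse_response) - 1 samples, so the
    reverberation tail extends past the end of the dry signal.
    """
    if not dry or not impulse_response:
        return []
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h  # each input sample excites a scaled copy of the response
    return out


if __name__ == "__main__":
    # A single click followed by silence, and a toy two-tap "room".
    print(apply_impulse_response([1.0, 0.0, 0.0], [1.0, 0.5]))
```

  • The direct form is quadratic in signal length; for real recordings an FFT-based convolution would normally be used instead.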
  • the image data stored in the image data file is three-dimensional image data or two-dimensional image data, and is image data that can display images from various viewpoints. Each user can display images from various directions by operating the display unit of the user device (sliding a finger, etc.).
  • The various background audio data and background image data held by the background data providing server 70 can be stored in user terminals a (21a) to d (21d) before a conversation is carried out via a communication network.
  • Figure 3 is a diagram explaining an example of processing when background data (background audio data, background image data) is stored in the user terminal 21.
  • FIG. 3 shows the background data providing server 70 and user terminal a (21a).
  • User terminal a (21a) can access the background data providing server 70, acquire (download) various background data (background audio data, background image data) from it, and store the data in the memory of user terminal a (21a).
  • The various background audio data acquired from the background data providing server 70 is stored in the audio data storage unit, which is part of the memory of user terminal a (21a), and the various background image data acquired from the background data providing server 70 is stored in the image data storage unit.
  • Although FIG. 3 shows an example of background data acquisition processing by user terminal a (21a), the other user terminals b to d can execute similar processing to acquire (download) various background data (background audio data, background image data) from the background data providing server 70 and store it in the memory of each user terminal.
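  • The acquisition step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the server is modelled as an in-memory dictionary rather than a network service, and the names `BackgroundData`, `TerminalStorage`, and `download_backgrounds` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class BackgroundData:
    name: str      # e.g. "cafe" or "conference room"
    audio: bytes   # background audio data
    image: bytes   # background image data


@dataclass
class TerminalStorage:
    """Stand-in for the terminal's audio and image data storage units."""
    audio_store: dict = field(default_factory=dict)
    image_store: dict = field(default_factory=dict)

    def store(self, data: BackgroundData) -> None:
        # Audio and image data go to separate stores, keyed by background name.
        self.audio_store[data.name] = data.audio
        self.image_store[data.name] = data.image


def download_backgrounds(server_catalog, names, storage):
    """Copy the requested backgrounds from the (simulated) server into storage."""
    for name in names:
        audio, image = server_catalog[name]
        storage.store(BackgroundData(name, audio, image))
    return storage


if __name__ == "__main__":
    catalog = {"cafe": (b"cafe-audio", b"cafe-image")}
    print(download_backgrounds(catalog, ["cafe"], TerminalStorage()))
```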
  • (Example 1) Example of executing voice control based on the user's position
  • FIG. 4 shows an example of the configuration of user terminal a (21a) used by user a (11a), who is a participant in the conversation via the communication network described with reference to FIG. 1.
  • The user terminals a (21a) to d (21d) used by users a (11a) to d (11d) all have substantially the same configuration as the example shown in FIG. 4.
  • the user terminals 21a to 21d are composed of information processing devices capable of communication, such as PCs, smartphones, and tablet terminals.
  • User terminal a (21a) has a communication unit 101, a user position determination unit (UI) 102, a user position information storage unit 103, a background data acquisition unit 104, an audio data storage unit 105, an image data storage unit 106, an audio data receiving unit 107, an output audio control unit 108, an audio output unit 109, an output image control unit 110, an image output unit 111, a display unit 112, an audio input unit 113, a camera 114, an image input unit 115, and a data transmission unit 116.
  • the communication unit 101 executes communication processing via a communication network. It executes data transmission and reception processing with other user terminals, the communication management server 50, and the background data providing server 70.
  • a user position determination unit (UI) 102 executes a process of determining the position of each user when a plurality of users are engaged in a conversation via a network.
  • the user position can be determined by utilizing a UI (user interface).
  • For example, as described with reference to FIG. 1, when four users a (11a) to d (11d) are to converse over a network, a process for determining the positions of these four users is executed.
  • This user position determination process is executed to control the direction from which each user's voice can be heard. Specific examples of this user position determination process and voice control process according to the determined user position will be explained later.
  • The user position information storage unit 103 is a memory unit that stores the user position information determined by the user position determination unit (UI) 102. For example, as described with reference to FIG. 1, when four users a (11a) to d (11d) are having a conversation via a network, the position information of these four users is stored.
  • The user position information stored in the user position information storage unit 103 is output to the output voice control unit 108 and the output image control unit 110.
  • the output voice control unit 108 and the output image control unit 110 perform output control of each user's speech (such as control of the direction from which each user's speech can be heard) and output control of user images (avatar images or actual images) according to the user position information stored in the user position information storage unit 103.
  • the background data acquisition unit 104 acquires (downloads) various background data (background audio data, background image data) from the background data providing server 70 via the communication unit 101 .
  • the background voice data constituting the background data acquired by the background data acquisition unit 104 from the background data providing server 70 is stored in the voice data storage unit 105.
  • The background image data is stored in the image data storage unit 106.
  • the background audio data stored in the audio data storage unit 105 is selected and acquired under the control of an output audio control unit 108 , and is output via an audio output unit 109 .
  • the background image data stored in the image data storage unit 106 is selected and acquired under the control of an output image control unit 110 , and is output to a display unit 112 via an image output unit 111 .
  • the voice data receiving unit 107 receives voice data, such as the voices of other users participating in the conversation, via the communication unit 101 and outputs it to the output voice control unit 108.
  • The output audio control unit 108 outputs audio data, such as the voices of other users participating in the conversation, input from the audio data receiving unit 107, together with background audio data selected and acquired from the audio data storage unit 105, via the audio output unit 109 to a speaker, such as headphones worn by the user.
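  • Outputting the other users' voices together with the background audio amounts to mixing several sample streams into one. The Python sketch below shows one simple way this could be done; the sample-wise summation, the background gain value, and the hard clipping to [-1, 1] are assumptions of this sketch, not details given in the disclosure, and the function name is hypothetical.

```python
def mix_with_background(voices, background, background_gain=0.3):
    """Mix voice streams with a background stream, sample by sample.

    voices: list of lists of float samples in [-1, 1], one list per speaker.
    background: list of float samples in [-1, 1].
    Streams may differ in length; shorter streams are treated as silent
    once they run out. The sum is hard-clipped to [-1, 1].
    """
    length = max([len(background)] + [len(v) for v in voices])
    mixed = []
    for n in range(length):
        sample = background_gain * (background[n] if n < len(background) else 0.0)
        for voice in voices:
            if n < len(voice):
                sample += voice[n]
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


if __name__ == "__main__":
    print(mix_with_background([[0.5, 0.5]], [1.0, 1.0, 1.0], background_gain=0.5))
```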
  • the output voice control unit 108 also controls the output of each user's voice, specifically the direction from which each user's voice can be heard and the volume of the voice, according to the position of each user stored in the user position information storage unit 103.
  • the output image control unit 110 executes control to output the background image data selected and acquired from the image data storage unit 106 to the display unit 112 via the image output unit 111 .
  • the output image control unit 110 also controls the display of an avatar image representing each user, a virtual image (character image) representing each user, or an actual image of each user superimposed on a background image, depending on the position of each user stored in the user position information storage unit 103.
  • a virtual image (character image) representing each user is stored in advance in the image data storage unit 106.
  • Alternatively, an image received from each user terminal via the communication unit 101 may be used.
  • When an actual image of each user is displayed, the image received from each user terminal via the communication unit 101 is used.
  • the voice input unit 113 inputs voice data such as the user's speech via a microphone, and transmits the data via the data transmission unit 116 and the communication unit 101 to each device such as each user terminal connected to the network.
  • the image input unit 115 inputs image data such as a facial image of the user captured by the camera 114, and transmits the image data to each device such as each user terminal connected to the network via the data transmission unit 116 and the communication unit 101.
  • the output voice control unit 108 of the user terminal controls the output of each user's voice, specifically, controls the direction from which each user's voice can be heard and the volume of the voice, depending on the position of each user stored in the user position information storage unit 103.
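  • To make the direction control concrete, the Python sketch below derives the direction of a speaker from stored two-dimensional positions and turns it into left and right channel gains. Everything specific here is an assumption of this sketch: the coordinate convention (the listener faces the +y axis, +x is to the listener's right), the constant-power stereo panning law, and the function names. The disclosure does not state which spatialization technique is used.

```python
import math


def azimuth_deg(listener, speaker):
    """Angle of the speaker as seen from the listener, in degrees.

    Assumed convention: the listener faces +y; 0 is straight ahead,
    positive angles are to the listener's right, negative to the left.
    """
    dx = speaker[0] - listener[0]
    dy = speaker[1] - listener[1]
    return math.degrees(math.atan2(dx, dy))


def stereo_gains(azimuth):
    """Constant-power pan returning (left_gain, right_gain).

    Simplification: angles beyond +/-90 degrees (a speaker behind the
    listener) are clamped to the nearest side.
    """
    clamped = max(-90.0, min(90.0, azimuth))
    theta = math.radians((clamped + 90.0) / 2.0)  # maps -90..90 to 0..90 degrees
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    positions = {"a": (0.0, 0.0), "b": (-1.0, 1.0), "c": (0.0, 1.5), "d": (1.0, 1.0)}
    for name in ("b", "c", "d"):
        angle = azimuth_deg(positions["a"], positions[name])
        left, right = stereo_gains(angle)
        print(f"user {name}: azimuth={angle:.1f} left={left:.2f} right={right:.2f}")
```

  • A headphone renderer would more typically use head-related transfer functions than plain panning; panning is used here only because it fits in a few lines.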
  • An example of a process in which the four users a, 11a to d, 11d shown in FIG. 1 hold an online conference will be described.
  • Four users a, 11a to d, 11d hold an online conference in a single conference room as shown in FIG.
  • image data of the conference room is used as the background image data
  • audio data of the conference room is used as the background audio data.
  • the audio data of the conference room is, for example, almost silent audio data or actual audio data of air conditioning noise.
  • the position of each user is determined. Specifically, the positions of four users a, 11a to d, 11d are determined as shown in Figure 5.
  • users a, 11a to d, 11d connect their respective user terminals a, 21a to d, 21d via a communications network, and then consult with each other to determine background settings and the positions of each user.
  • An example of a processing sequence for determining background settings and the positions of each user will be described with reference to FIG.
  • the processing sequence for setting the background and determining the position of each user can be executed, for example, according to steps (S01) to (S04) in FIG. 6.
  • In step S01, user a, 11a proposes the following online conference setting to the other users b to d: "Let's set it up so we're meeting in a conference room."
  • each user operates their own user terminal to perform an operation to select background data of the conference room as background data (background image data, background audio data). For example, a UI for selecting background data is displayed on each user terminal, and the user uses this UI to select background data of the conference room.
  • the output voice control unit 108 of each user terminal is set to input the voice data of the conference room from the voice data storage unit 105 and output the voice data of the conference room via the voice output unit 109 .
  • the output image control unit 110 of each user terminal is set to input image data of the conference room from the image data storage unit 106 and output a background image of the conference room to the display unit 112 via the image output unit 111 .
  • In step S03, user a, 11a proposes the following user positions for the online conference to the other users b to d: "Is it okay for the user positions to be C in front of me (A), B next to me (A), and D in front of B?"
  • This process can be performed by utilizing the user position determination unit (UI) 102 of the user terminal described with reference to FIG.
  • user location data such as that shown on the right side of (S03) in FIG. 6 is displayed on each user terminal, and each user can confirm the user positions proposed by user a, 11a.
  • In step S04, when all users a to d agree with the positions of users a to d in the conference room proposed by user a, 11a, the online conference is started.
  • the position information of users a to d set by user a, 11a using the user position determination unit (UI) 102 of user terminal a, 21a is also transmitted to each of user terminals b to d, and this user position information is stored in the user position information storage unit 103 in each user terminal.
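  • The agreement and distribution steps (S03) and (S04) can be sketched as follows. This is a minimal illustration, not the implementation described in this document: the class name `SetupProposal`, the function `distribute`, the plain dictionaries standing in for the user position information storage unit 103, and the coordinate values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SetupProposal:
    """A proposal by one user: a background setting plus user positions."""
    background: str
    positions: dict[str, tuple[float, float]]
    approvals: set[str] = field(default_factory=set)

    def approve(self, user: str) -> None:
        if user not in self.positions:
            raise ValueError(f"unknown user: {user}")
        self.approvals.add(user)

    def agreed(self) -> bool:
        # Step S04: the conference starts only when every participant agrees.
        return self.approvals == set(self.positions)


def distribute(proposal: SetupProposal, stores: dict[str, dict]) -> None:
    """Copy the agreed positions into each terminal's position store.

    Each dict in `stores` is a hypothetical stand-in for the user position
    information storage unit 103 of one user terminal.
    """
    if not proposal.agreed():
        raise RuntimeError("not all users have agreed yet")
    for store in stores.values():
        store.clear()
        store.update(proposal.positions)


# Hypothetical coordinates: a and b sit side by side, c faces a, d faces b.
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (1.0, 1.0)}
proposal = SetupProposal("conference_room", positions)
for user in positions:
    proposal.approve(user)
stores = {user: {} for user in positions}
distribute(proposal, stores)
```

  • After `distribute` returns, every terminal holds the same position table, which is the precondition for the per-terminal voice and image control described next.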
  • the output voice control unit 108 of each user terminal performs voice direction control and volume control of each user's speech voice according to the user position stored in the user position information storage unit 103.
  • the output image control unit 110 of each user terminal displays an avatar image or a real image of each user superimposed on the background image data of the conference room according to the user position stored in the user position information storage unit 103 . That is, image data such as that shown in FIG. 5 is displayed on the display unit of the user terminal of each of users a to d.
  • each of users a to d can hear the speech of the other users as voice coming from the position of each user set according to the sequence described with reference to FIG.
  • the output voice control unit 108 of the user terminal a, 21a to user terminal d, 21d of each user a to d controls the voice direction and volume of the speech of each user according to the user position stored in the user position information storage unit 103.
  • This output voice control process allows each user to hear the speech of other users as if it were coming from the position set by each user.
  • FIG. 7 shows a specific example in which the user a, 11a recognizes the direction from which the speech of the other users b to d is coming under the control of the user terminal a, 21a used by the user a, 11a.
  • the recognition directions of the speech of the other users b to d by the user a, 11a are as follows:
  • the voice of user b, 11b is heard from the right side of user a, 11a.
  • the voice of user c, 11c is heard from the front of user a, 11a.
  • the voice of user d, 11d is heard from diagonally in front and to the right of user a, 11a.
  • the output voice control unit 108 of the user terminal a, 21a controls the voice direction and volume of the speech of each user according to the user position stored in the user position information storage unit 103.
  • This output voice control process allows the user a, 11a to hear the speech of the other users b to d as if it were coming from the user position shown in FIG.
  • the output voice control process executed by the output voice control unit 108 of the user terminal a, 21a is a voice direction control process of the user's voice according to the user position stored in the user position information storage unit 103, and a volume control process.
  • the output audio control unit 108 pre-calculates and stores audio output control parameters corresponding to various sound source positions around the central position, which is the position of the user listening to the audio, and uses these control parameters to control the output of audio from each position.
  • FIG. 8 shows an example in which the position of the user listening to the sound (i.e., the listening position) is set at the center position (x0, y0, z0) in an xyz three-dimensional space, and various virtual sound source positions (x1, y1, z1) to (xn, yn, zn) are set around it.
  • the audio output channels from the virtual sound source positions (x1, y1, z1) to (xn, yn, zn) are ch1 to chn.
  • the channel-specific audio control parameters include not only control parameters for the direction from which audio can be heard, but also volume control parameters for adjusting the volume of the audio. In other words, it also includes volume control parameters according to the distance between the central position (x0, y0, z0) and each channel position, and the volume is controlled to be louder at channel positions close to the central position (x0, y0, z0) and quieter at positions farther away.
  • the output sound control unit 108 pre-calculates and stores sound output control parameters corresponding to various sound source positions around the position of the user listening to the sound, with the position being set as the central position.
  • the output voice control unit 108 uses the control parameters calculated in advance and the parameters corresponding to the set position of the speaking user to execute voice output control for each user's speech.
  • the recognition directions of the speech of the other users b to d by the user a, 11a are as follows:
  • the voice of user b, 11b is heard from the right side of user a, 11a.
  • the voice of user c, 11c is heard from the front of user a, 11a.
  • the voice of user d, 11d is heard from diagonally in front and to the right of user a, 11a.
  • multiple channels are set around a three-dimensional space centered on the position of the user who is the listener.
  • the position of the user listening to the sound (i.e., the listening position) is set as the center position (x0, y0) in an xy two-dimensional space, and various virtual sound source positions (x1, y1) to (xn, yn) are set around it.
  • the audio output channels from the virtual sound source positions (x1, y1) to (xn, yn) are defined as ch1 to chn.
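  • The pre-calculation of channel-specific parameters can be sketched as follows for the two-dimensional case. This is only an illustration under stated assumptions: the function name `precompute_channel_params`, the representation of a direction parameter as an azimuth angle, and the inverse-distance gain law are not taken from this document, which does not specify the form of the control parameters.

```python
import math


def precompute_channel_params(center, sources, ref_distance=1.0):
    """Return {channel: (azimuth_deg, gain)} for each virtual source position.

    Assumptions (hypothetical, for illustration only):
    - the listener at `center` faces the +y axis;
    - azimuth is measured clockwise from the front (0 = front, +90 = right);
    - gain follows an inverse-distance law capped at 1.0, so channels near
      the center are louder and distant channels are quieter.
    """
    x0, y0 = center
    params = {}
    for channel, (x, y) in sources.items():
        dx, dy = x - x0, y - y0
        distance = math.hypot(dx, dy)
        azimuth = math.degrees(math.atan2(dx, dy))
        gain = min(1.0, ref_distance / distance) if distance > 0 else 1.0
        params[channel] = (azimuth, gain)
    return params


# Hypothetical channel layout around the listening position (x0, y0) = (0, 0).
channels = {"ch1": (0.0, 1.0), "ch2": (1.0, 1.0), "ch3": (1.0, 0.0), "ch4": (0.0, 2.0)}
table = precompute_channel_params((0.0, 0.0), channels)
```

  • Because the table depends only on the geometry, it can be computed once and looked up whenever a user placed at one of the channel positions speaks.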
  • FIG. 10 illustrates an example of a process for controlling the output of voices of other users b to d, which is executed by the output voice control unit 108 of the user terminal a, 21a of the user a, 11a.
  • the position of user a, 11a is set at the center position (x0, y0) of the xy two-dimensional plane shown in FIG.
  • the other users b, 11b to d, 11d are positioned according to the sequence shown in FIG. 6 described above, that is, are positioned as shown in FIG. 5 described above. That is, user b, 11b is to the right of user a, 11a; user c, 11c is in front of user a, 11a; and user d, 11d is diagonally in front of and to the right of user a, 11a.
  • This user arrangement is associated with the virtual sound source positions on the xy two-dimensional plane for which the channel-corresponding parameters have been calculated, as shown in FIG.
  • User b, 11b is set at the location chq(xq, yq).
  • User c, 11c is set at the location chp(xp, yp).
  • User d, 11d is set at the location chn(xn,yn).
  • the output voice control unit 108 of the user terminal a, 21a of the user a, 11a controls the output of the speech of each user b to d according to the channel positions corresponding to the users b to d as shown in Figure 10, and controls the output of each user's speech by using voice output control parameters corresponding to the channel positions corresponding to each user.
  • a control voice utilizing a control parameter corresponding to chq is output to the voice output unit (headphone) of user a, 11a.
  • a control voice using a control parameter corresponding to the chp is output to the voice output unit (headphone) of the user a, 11a.
  • a control voice using a control parameter corresponding to the chn is output to the voice output unit (headphone) of the user a, 11a.
  • the control parameters of the multiple channel positions around the speaking user's setting position are synthesized as described above to calculate the parameters corresponding to the speaking user's setting position, and the calculated parameters are used to perform voice output control for the user's speech.
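  • One possible way to synthesize parameters for a set position that does not coincide with a channel position is sketched below. The synthesis method itself is not given in this section, so the inverse-distance weighting used here, the function name `synthesize_params`, and the representation of a parameter set as a tuple of numbers are all assumptions made for illustration.

```python
import math


def synthesize_params(position, channel_positions, channel_params, k=2):
    """Blend the parameter tuples of the k channels nearest to `position`.

    Weights are proportional to 1 / distance (an assumed scheme). If the
    position coincides with a channel, that channel's parameters are used.
    """
    nearest = sorted(
        channel_positions,
        key=lambda ch: math.dist(position, channel_positions[ch]),
    )[:k]
    weights = {}
    for ch in nearest:
        d = math.dist(position, channel_positions[ch])
        if d == 0.0:
            return tuple(channel_params[ch])
        weights[ch] = 1.0 / d
    total = sum(weights.values())
    size = len(channel_params[nearest[0]])
    return tuple(
        sum(weights[ch] * channel_params[ch][i] for ch in nearest) / total
        for i in range(size)
    )


# Hypothetical two-value parameter sets, e.g. (left gain, right gain).
channel_positions = {"chp": (0.0, 1.0), "chq": (1.0, 0.0)}
channel_params = {"chp": (0.5, 0.5), "chq": (0.1, 0.9)}
blended = synthesize_params((0.5, 0.5), channel_positions, channel_params)
```

  • A position midway between two channels receives the average of their parameters, and the result moves toward a channel's parameters as the position approaches that channel.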
  • the recognition direction of the speech of the other users b to d by the user a, 11a is set as follows.
  • the voice of user b, 11b is heard from the right side of user a, 11a.
  • the voice of user c, 11c is heard from the front of user a, 11a.
  • the voice of user d, 11d is heard from diagonally in front and to the right of user a, 11a.
  • the output voice control unit 108 of the user terminal a, 21a controls the voice direction and volume of the speech of each user according to the user position stored in the user position information storage unit 103.
  • This output voice control process allows the user a, 11a to hear the speech of the other users b to d as if it were coming from the respective user positions, as shown in FIG. 7.
  • FIG. 7 shows an example of processing by the output voice control unit 108 of user terminal a, 21a, but the other users b to d also execute similar processing in their respective user terminals b, 21b to d, 21d.
  • User terminal b, 21b to user terminal d, 21d of each of users b to d sets the user itself at the center position, analyzes the positions of the other users, and controls the voice of each user so that the speech of the other users comes from the position of the user.
  • FIG. 11 shows a specific example in which user b, 11b recognizes the direction from which the speech of other users a, c, and d is coming under the control of user terminal b, 21b used by user b, 11b.
  • the recognition directions of the speech of other users a, c, and d by user b, 11b are as follows:
  • the voice of user a, 11a is heard from the left side of user b, 11b.
  • the voice of user c, 11c is heard from a position diagonally in front and to the left of user b, 11b.
  • the voice of user d, 11d is heard from the front of user b, 11b.
  • the output voice control unit 108 of the user terminal b, 21b controls the voice direction and volume of the speech of each user according to the user position stored in the user position information storage unit 103.
  • This output voice control process allows user b, 11b to hear the speech of the other users a, c, and d as if it were coming from the user positions shown in FIG
  • FIG. 12 is a diagram showing a specific example of the output voice control process of the output voice control unit 108 of the user terminal b, 21b.
  • the output voice control unit 108 of the user terminal b, 21b sets the position of user b, 11b at the center position (x0, y0) of the xy two-dimensional plane shown in FIG
  • the other users a, c, and d are positioned according to the sequence shown in FIG. 6 described above, that is, are positioned as shown in FIG. That is, user a, 11a is to the left of user b, 11b; user c, 11c is diagonally in front of and to the left of user b, 11b; and user d, 11d is in front of user b, 11b.
  • This user arrangement is associated with the virtual sound source positions on the xy two-dimensional plane on which the channel-corresponding parameters have been calculated, as shown in FIG.
  • User a, 11a is set at location chr(xr,yr).
  • User c, 11c is set at the location chs(xs, ys).
  • User d, 11d is set at the location cht(xt, yt).
  • the output voice control unit 108 of the user terminal b, 21b of the user b, 11b controls the output of the speech of each user a, c, d according to the channel positions corresponding to the users a, c, d as shown in Figure 12, and controls the output of each user's speech by using voice output control parameters corresponding to the channel positions corresponding to each user.
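  • The listener-centered processing performed independently on each terminal can be sketched as follows. The shared coordinates, the assumption that users a and b both face the +y direction, and the function name `relative_direction` are hypothetical and serve only to reproduce the directions listed above for users a and b.

```python
import math


def relative_direction(listener, facing_deg, speaker):
    """Return (azimuth_deg, distance) of `speaker` as seen from `listener`.

    `facing_deg` is the listener's facing direction, measured clockwise from
    the +y axis. Azimuth 0 is straight ahead, positive is to the right,
    negative is to the left.
    """
    dx = speaker[0] - listener[0]
    dy = speaker[1] - listener[1]
    absolute = math.degrees(math.atan2(dx, dy))
    azimuth = (absolute - facing_deg + 180.0) % 360.0 - 180.0
    return azimuth, math.hypot(dx, dy)


# Hypothetical shared layout: a and b side by side, c opposite a, d opposite b.
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (1.0, 1.0)}


def directions_for(listener_id, facing_deg=0.0):
    """Place the listener at the center and locate every other user."""
    me = positions[listener_id]
    return {
        other: relative_direction(me, facing_deg, pos)
        for other, pos in positions.items()
        if other != listener_id
    }


from_a = directions_for("a")  # b: right, c: front, d: front-right
from_b = directions_for("b")  # a: left, c: front-left, d: front
```

  • The same position table therefore yields a different set of directions and distances on each terminal, because each terminal treats its own user as the center position.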
  • the output voice control unit 108 controls not only the direction of speech according to the user's position, but also the volume according to the user's position, so that the volume of a distant user is made smaller than the volume of a closer user.
  • the speech of each user is controlled so that it can be heard from the set position of each user according to the position of each user stored in the user position information storage unit 103, allowing each user to experience the same sensation as if they were actually present in the same space.
  • the processing example described with reference to Figures 5 to 12 is an example of a setting in which four users a to d hold a conference in a conference room.
  • the output audio control unit 108 of each of the user terminals a, 21a to d, 21d acquires the background sound of the conference room from the audio data storage unit 105 in which the background sound is stored, and outputs it from the audio output unit 109.
  • each user hears the speech of the other user together with the background sound of the conference room.
  • the background sound in the conference room is, as described above, almost silent audio data or actual audio data of air conditioning sounds.
  • FIG. 13 shows an example in which four users a to d are having a conversation in an environment in which various background sounds are present.
  • FIG. 13 shows an example in which four users a to d are having a conversation using background data of a cafe where various background sounds are present.
  • the user layout is the same as that in the conference room described above with reference to FIG.
  • the output audio control unit 108 of each of the user terminals a, 21a to d, 21d acquires the cafe background sound from the audio data storage unit 105 in which the background sound is stored, and outputs it from the audio output unit 109.
  • the background sounds include, for example, the sound of a coffee siphon, the sound of coffee cups, background music, and the like.
  • FIG. 14 shows an example of an output voice control process executed by the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a.
  • the output voice control unit 108 of the user terminal a, 21a acquires background sounds of a cafe, such as the sound of a coffee siphon, the sound of a coffee cup, background music, etc., from the voice data storage unit 105, and outputs them via the voice output unit 109. Furthermore, the control unit also controls the direction of the speech of other users b to d as follows.
  • the voice of user b, 11b is controlled so as to be heard from the right side of user a, 11a.
  • the voice of user c, 11c is controlled so as to be heard from the front of user a, 11a.
  • the voice of user d, 11d is controlled so as to be heard from a position diagonally in front and to the right of user a, 11a.
  • user a, 11a can get the feeling that the four users a to d are gathered in a cafe and having a conversation.
  • FIG. 15 shows an example in which four users a to d are having a conversation using background data of a cafe with many people around.
  • the user layout is the same as that in the conference room described above with reference to FIG.
  • the output audio control unit 108 of each of the user terminals a, 21a to d, 21d acquires the background sound of a cafe with many people around from the audio data storage unit 105 in which the background sound is stored, and outputs it from the audio output unit 109.
  • the background sounds include the voices of many people talking, the sound of coffee cups, background music, and the like.
  • FIG. 16 shows an example of an output voice control process executed by the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a.
  • the output voice control unit 108 of the user terminal a, 21a acquires background sounds of a cafe with many people around, such as background sounds consisting of the voices of many people, the sound of coffee cups, background music, etc., from the voice data storage unit 105, and outputs the acquired sounds via the voice output unit 109. Furthermore, the control unit 108 also controls the direction of the speech of other users b to d as follows.
  • the voice of user b, 11b is controlled so as to be heard from the right side of user a, 11a.
  • the voice of user c, 11c is controlled so as to be heard from the front of user a, 11a.
  • the voice of user d, 11d is controlled so as to be heard from a position diagonally in front and to the right of user a, 11a.
  • user a, 11a can get the feeling that the four users a to d are gathered in a cafe with many people and are having a conversation.
  • the position of each of multiple users participating in a conversation is determined in advance, and voice output control of user speech and display control of user images (avatars or real images) are performed according to the determined position of each user.
  • the embodiment described below is an embodiment in which, for example, the positions of multiple users participating in a conversation are not determined in advance, and voice output control of user speech and display control of user images (avatars or real images) are performed according to the degree of intimacy between the users.
  • the user terminal a, 21a of user a, 11a executes audio output control to increase the volume of the speech of users who are close to (have a high level of intimacy with) user a, 11a, and also executes display control to change the display position to a position close to the display position of user a or in front of the user a.
  • audio output control is performed to reduce the volume of speech of users who are not on good terms with user a, 11a (low intimacy level), and display control is also performed to change the display position to a position farther away or behind the display position of user a.
  • FIG. 17 is a diagram for explaining the sequence of the background determination process and the user placement determination process executed before the start of a conversation between users, which is similar to the pre-processing described in the first embodiment.
  • the processing sequence for setting the background and determining the position of each user can be executed, for example, according to the procedure of steps (S11) to (S14) in FIG.
  • In step S11, user a, 11a proposes the following conversation setting to the other users b to d: "Let's set it up as a meeting in the park."
  • each user operates their own user terminal to perform an operation to select background data of a park as background data (background image data, background audio data). For example, a UI for selecting background data is displayed on each user terminal, and the user uses this UI to select background data of the park.
  • the output audio control unit 108 of each user terminal is set to input park audio data from the audio data storage unit 105 and output the park audio data via the audio output unit 109.
  • it is set to output park audio data including the sounds of birds chirping and a babbling brook.
  • the output image control unit 110 of each user terminal is configured to input image data of the park from the image data storage unit 106 and output a background image of the park to the display unit 112 via the image output unit 111.
  • In step S13, user a, 11a makes the following proposal regarding the setting of user positions to the other users b to d: "Is it okay to have a setting where friends can talk freely with each other?"
  • This process can be performed by utilizing the user position determination unit (UI) 102 of the user terminal described with reference to FIG.
  • user location data such as that shown on the right side of (S13) in FIG. 17 is displayed on each user terminal, and each user can confirm the user positions proposed by user a, 11a.
  • In step S14, when all users a to d agree to the setting proposed by user a, 11a, i.e., friends having a free conversation in a park, the conversation begins.
  • the user terminal 21 is provided with an intimacy calculation unit that calculates the intimacy between users.
  • the intimacy calculation unit of the user terminal 21 analyzes the conversation situation between the users, and further analyzes the preference information of other users input by the user, to successively calculate and update the intimacy between the users. A specific example of the intimacy degree calculation process will be described later.
  • FIG. 18 shows an example of intimacy calculation performed by the intimacy calculation unit 121 of the user terminal a, 21a of the user a, 11a, and an example of control of the display position of each user according to the calculated intimacy performed by the output image control unit 110.
  • the intimacy calculation unit 121 analyzes the preference information of each user that the user a, 11a inputs to the user terminal a, 21a, as well as the amount of conversation between users in the past and present, and calculates the intimacy between the user a, 11a and the other users b to d.
  • the graph shown in FIG. 18A is a graph showing an example of the intimacy calculated by the intimacy calculation unit 121.
  • the degree of intimacy between users is expressed as a value between 0 and 10.
  • a degree of intimacy between users of 0 is the lowest level of intimacy, indicating that the users are on the worst terms with each other.
  • a degree of intimacy between users of 10 is the highest level of intimacy, indicating that the users are on the best terms with each other.
  • the graph shown in FIG. 18A indicates the following state of intimacy between users.
  • the intimacy level between user a and user b is 10, which indicates that the relationship between user a and user b is the best.
  • the intimacy level between user a and user c is 2, which indicates that the relationship between user a and user c is not very good.
  • the intimacy level between user a and user d is 5, which indicates that the relationship between user a and user d is neither good nor bad and is normal.
  • the output image control unit 110 of the user terminal a, 21a inputs the intimacy shown in Figure 18 (a), i.e., the intimacy information calculated by the intimacy calculation unit 121, and controls the display position of each user according to the calculated intimacy information.
  • the user display position according to the intimacy information determined by the output image control unit 110 is set as shown in FIG.
  • the output image control unit 110 determines the display position of each user according to the intimacy information as follows.
  • the display position of user b who has an intimacy level of 10 and is the closest to user a, is located very close (distance L1) to user a and almost directly in front of user a.
  • the display position of user c who is not on good terms with user a and has an intimacy level of 2, is located far away from user a (distance L3) and is located almost behind user a.
  • the display position of user d whose intimacy level is 5 and who is neither on good nor bad terms with user a, is neither close nor far from user a (distance L2), and is diagonally in front of user a.
  • the distances L1, L2, and L3 have a magnitude relationship of L1 < L2 < L3.
  • the output image control unit 110 performs control such that the higher a user's degree of intimacy with the user of the terminal, the shorter the distance from the display position of the user of the terminal, and the closer to the front of the user of the terminal that user is displayed.
  • Conversely, the lower a user's degree of intimacy with the user of the terminal, the longer the distance from the display position of the user of the terminal, and the farther from the front of the user of the terminal that user is displayed.
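  • The relationship between intimacy and display position can be sketched as follows. The function name `display_offset`, the distance range, and the quadratic angle curve are assumptions chosen only so that the example values 10, 5, and 2 land roughly in front, diagonally in front, and toward the rear; the actual mapping used by the output image control unit 110 is not specified here.

```python
import math


def display_offset(intimacy, min_dist=1.0, max_dist=5.0):
    """Map an intimacy value (0 to 10) to an (x, y) display offset.

    The offset is relative to the user of the terminal, whose front is +y.
    High intimacy gives a short distance and a small angle from the front;
    low intimacy gives a long distance and a large angle (toward the rear).
    """
    if not 0.0 <= intimacy <= 10.0:
        raise ValueError("intimacy must be between 0 and 10")
    closeness = intimacy / 10.0
    distance = max_dist - (max_dist - min_dist) * closeness
    # Assumed curve: 0 degrees at intimacy 10, 180 degrees at intimacy 0.
    angle = math.radians(180.0 * (1.0 - closeness) ** 2)
    return distance * math.sin(angle), distance * math.cos(angle)


# Intimacy values from the example: b = 10, d = 5, c = 2.
offsets = {user: display_offset(value) for user, value in {"b": 10, "d": 5, "c": 2}.items()}
distances = {user: math.hypot(x, y) for user, (x, y) in offsets.items()}
```

  • With these assumed constants the distances satisfy L1 < L2 < L3 for users b, d, and c, and only user c receives a negative y offset, i.e., a position behind the user of the terminal.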
  • FIG. 19 shows a specific example of an image that the image output unit 111 displays on the display unit 112 according to the display position of each user determined by the output image control unit 110 according to the intimacy information calculated by the intimacy calculation unit 121.
  • FIG. 19(b) shows the display positions of each user determined by the output image control unit 110, described above with reference to FIG. 18(b), in accordance with the calculated intimacy information of the intimacy calculation unit 121.
  • the image output unit 111 of the user terminal a, 21a inputs the display position information of each user determined by the output image control unit 110, and displays the images (avatars or actual images) of each user, i.e., user a, 11a to user d, 11d, on a background image (background image of a park), as shown in Figure 19(c).
  • user a, 11a proceeds with the conversation while looking at this image displayed on the user terminal a, 21a.
  • User b, who is closest to user a, 11a, is displayed in a position close to and in front of user a, and as a result, user a, 11a is more likely to actively talk to user b, 11b.
  • user c, who is on the worst terms with user a, 11a, is displayed behind user a, away from user a. As a result, user a, 11a does not talk much with user c, 11c.
  • the display position of each user is determined according to the intimacy calculated by the intimacy calculation unit 121 of the user terminal used by each user. Therefore, the display position of each user may be set differently on each of the user terminals a to d used by users a to d.
  • the user terminal a, 21a of the user a, 11a executes voice output control to increase the volume of speech of users who are on good terms (have a high degree of intimacy) with the user a, 11a.
  • voice output control is executed to make the volume of the speech of a user who is not on good terms with user a, 11a (having a low level of intimacy) low.
  • FIG. 20 shows a graph illustrating an example of the control process of the user speech output volume according to the degree of intimacy executed by the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a.
  • the graph shown in FIG. 20 shows the degree of intimacy between users on the horizontal axis and the user speech output volume on the vertical axis. As can be seen from the graph, the higher the degree of intimacy between users, the louder the output volume of user utterances.
  • the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a executes an output volume control process to increase the output volume of speech from a user who has a high degree of intimacy with the user a, 11a.
  • Conversely, for speech from a user who has a low degree of intimacy with user a, 11a, output volume control processing is executed to reduce the output volume.
  • the intimacy between users is calculated by the intimacy calculation unit 121 .
  • the degree of intimacy between the user a, 11a and the users b to d calculated by the intimacy calculation unit 121 of the user terminal a, 21a used by the user a, 11a is assumed to be the same as that described above with reference to FIG.
  • the intimacy calculation unit 121 of the user terminal a, 21a used by the user a, 11a calculates the following inter-user intimacy degree.
  • the intimacy level between user a and user b is 10, i.e., high intimacy level.
  • the intimacy level between user a and user c is 2, i.e., low intimacy level.
  • the intimacy level between user a and user d is 5, that is, medium intimacy level.
  • the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a controls the output volume of each speech of users b to d as follows, as shown in FIG. 21.
  • the volume of the speech of user b, 11b, whose intimacy level is 10, i.e., whose level of intimacy is high, is controlled to be set to a high volume (Vol. 3) and output to a speaker such as a headphone of the user via the audio output unit 109.
  • the volume of the speech of user d, 11d, whose intimacy level is 5, i.e., whose level of intimacy is medium, is controlled to be medium volume (Vol. 2) and output to a speaker such as a headphone of the user via the audio output unit 109.
  • the output voice control unit 108 of the user terminal 21 executes an output volume control process that increases the output volume for speech from users who have a high level of intimacy with the user using the user terminal 21, and decreases the output volume for speech from users who have a low level of intimacy.
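  • The monotonic relationship between intimacy and speech output volume can be sketched as follows. The linear curve, the gain range, and the function name `speech_gain` are assumptions; this document only states that the output volume increases with the degree of intimacy.

```python
def speech_gain(intimacy, min_gain=0.2, max_gain=1.0):
    """Return a speech output gain that rises linearly with intimacy (0 to 10).

    min_gain and max_gain are hypothetical values; a low-intimacy user is
    made quieter but is never muted in this sketch.
    """
    if not 0.0 <= intimacy <= 10.0:
        raise ValueError("intimacy must be between 0 and 10")
    return min_gain + (max_gain - min_gain) * intimacy / 10.0


# Intimacy values from the example: b = 10 (high), d = 5 (medium), c = 2 (low).
gains = {user: speech_gain(value) for user, value in {"b": 10, "d": 5, "c": 2}.items()}
```

  • The resulting ordering (user b loudest, user d in between, user c quietest) corresponds to the high, medium, and low volume levels described above.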
  • the output voice control unit 108 of the user terminal 21 also controls the output volume of background sound in addition to controlling the output volume of the user's speech. As shown in FIG. 22, the output volume control process for the background sound reduces the output volume of the background sound when a user with a high level of intimacy with the user using the user terminal 21 speaks, and increases the output volume of the background sound when a user with a low level of intimacy speaks.
  • the output voice control unit 108 of the user terminal a, 21a used by the user a, 11a executes the following control as shown in FIG. While user b, 11b, whose intimacy level is 10, i.e., whose level of intimacy is high, is speaking, volume control is executed to set the output volume of the background sound to a low volume (Vol. b1) and output the sound to a speaker such as a headphone of the user via the audio output unit 109.
  • While user d, 11d, whose intimacy level is 5, i.e., whose level of intimacy is medium, is speaking, volume control is performed to set the output volume of the background sound to a medium volume (Vol. b2), and the sound is output to a speaker such as a headphone of the user via the audio output unit 109.
  • While user c, 11c, whose intimacy level is 2, i.e., whose level of intimacy is low, is speaking, volume control is executed to set the output volume of the background sound to a high volume (Vol. b3), and the sound is output to a speaker such as a headphone of the user via the audio output unit 109.
  • the output voice control unit 108 of the user terminal 21 also controls the output volume of the background sound in addition to controlling the output volume of the user's speech.
  • the output voice control unit 108 of the user terminal 21 in this embodiment 2 executes an output volume control process for a user with a high level of intimacy to increase the volume of the user's speech and decrease the volume of the background sound, while for a user with a low level of intimacy, it executes an output volume control process to decrease the volume of the user's speech and increase the volume of the background sound.
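The relationship between intimacy, speech volume, and background volume described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the thresholds `HIGH_INTIMACY` and `MEDIUM_INTIMACY`, the function name, and the label "Vol. 1" for low-intimacy speech are assumptions, since the text only gives intimacy 10 as "high" and 5 as "medium".

```python
from dataclasses import dataclass

# Hypothetical thresholds (assumptions): the text only states that
# intimacy 10 counts as "high" and intimacy 5 counts as "medium".
HIGH_INTIMACY = 8
MEDIUM_INTIMACY = 4


@dataclass(frozen=True)
class VolumeSetting:
    speech: str      # volume label applied to the speaker's utterance
    background: str  # volume label applied to the background sound


def volume_for_speaker(intimacy: float) -> VolumeSetting:
    """Pick speech and background volume labels for the current speaker.

    Higher intimacy raises the speech volume and lowers the background
    volume; lower intimacy does the opposite.
    """
    if intimacy >= HIGH_INTIMACY:
        return VolumeSetting(speech="Vol. 3", background="Vol. b1")
    if intimacy >= MEDIUM_INTIMACY:
        return VolumeSetting(speech="Vol. 2", background="Vol. b2")
    # "Vol. 1" is an assumed label for the lowest speech volume.
    return VolumeSetting(speech="Vol. 1", background="Vol. b3")
```

With these assumed thresholds, a speaker with intimacy 10 is mapped to loud speech over quiet background sound, and a speaker with intimacy 2 to quiet speech over loud background sound.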
  • FIG. 23 shows an example of the configuration of a user terminal a, 21a used by a user a, 11a, who is a participant in the conversation via the communication network described with reference to FIG.
  • the user terminals a, 21a to d, 21d used by the users a, 11a to d, 11d all have substantially the same configuration as the example shown in FIG.
  • the user terminals 21a to 21d are composed of information processing devices capable of communication, such as PCs, smartphones, tablet terminals, etc.
  • the user terminal a, 21a has a communication unit 101, a user position determination unit (UI) 102, a user position information storage unit 103, a background data acquisition unit 104, an audio data storage unit 105, an image data storage unit 106, an audio data receiving unit 107, an output audio control unit 108, an audio output unit 109, an output image control unit 110, an image output unit 111, a display unit 112, an audio input unit 113, a camera 114, an image input unit 115, a data transmission unit 116, and an intimacy calculation unit 121.
  • the configuration of the user terminal a, 21a shown in FIG. 23 is a configuration in which an intimacy calculation unit 121 is added to the configuration of the user terminal a, 21a described with reference to FIG. 4 in the previous embodiment 1.
  • the configuration other than the intimacy calculation unit 121 is similar to the configuration described with reference to FIG. 4 in the first embodiment, and therefore description thereof will be omitted.
  • the user terminal a, 21a of the second embodiment controls the voice output of user speech and the display of user images (avatars or real images) according to the intimacy of each of a plurality of users participating in a conversation, for example.
  • the intimacy calculation unit 121 calculates the intimacy that serves as the basis for the audio output control process and the image display control process.
  • the intimacy calculated by the intimacy calculation unit 121 is output to the output sound control unit 108 and the output image control unit 110 as shown in FIG.
  • the output audio control unit 108 and the output image control unit 110 execute audio output control and image output control in accordance with the intimacy calculated by the intimacy calculation unit 121. That is, for example, the control described above with reference to FIGS.
  • the intimacy calculation unit 121 analyzes the preference information of each user that the user a, 11a inputs to the user terminal a, 21a, as well as the amount of conversation between users in the past and present, and calculates the intimacy between the user a, 11a and the other users b to d.
  • FIG. 24 is a diagram showing an example of a detailed configuration of the intimacy calculation unit 121.
  • the intimacy calculation unit 121 has a user preference input unit (UI) 141, a user preference analysis unit 142, a user preference information storage unit 143, a conversation density analysis unit 144, and an intimacy calculation unit 145.
  • the user preference input unit (UI) 141 is an input unit (UI) that enables a user to directly input a preference level for each of the other users. For example, user a, 11a inputs a preference level (e.g., Lev. 0 to Lev. 5) for each of the other users b to d.
  • the preference level (Lev. 0 to Lev. 5) for each other user input by the user via the user preference input unit (UI) 141 is input to the user preference analysis unit 142.
  • the user preference analysis unit 142 calculates the final preference level (e.g., Lev. 0 to Lev. 5) of other users as viewed from the user using the user terminal based on the preference levels (Lev. 0 to Lev. 5) of other users input to the user preference input unit (UI) 141 and the analysis results of conversations between users input via the communication unit.
  • the user preference analysis unit 142 executes, for example, the following analysis process as an analysis process of the conversation between users input via the communication unit.
  • the user preference analysis unit 142 performs, for example, conversation analysis processing to estimate the degree of intimacy between users.
  • the estimation of the preference level based on the conversation analysis results may be configured to utilize the results of a learning process of various conversation data, for example.
  • the final preference level (Lev. 0 to Lev. 5) data for each other user calculated by the user preference analysis unit 142 is stored in the user preference information storage unit 143.
  • the conversation density analysis unit 144 executes an analysis process of the conversation between users input via the communication unit, and calculates the conversation density between each user.
  • the conversation density analysis unit 144 executes an analysis process of conversations between users, and calculates the conversation density level (Lev. 0 to Lev. 5) between each user.
  • the conversation density analysis unit 144 performs an analysis process of conversations between users, taking into account, for example, the amount of direct conversation between users, the amount of voice chat, and even the number of times a user's name is called, and calculates the conversation density level (Lev. 0 to Lev. 5) between each user.
  • the conversation density level (Lev. 0 to Lev. 5) calculated by the conversation density analysis unit 144 is input to the intimacy calculation unit 145.
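One way to picture how the listed factors could be folded into a single Lev. 0 to Lev. 5 value is the sketch below. The text does not give the actual calculation, so the weighted-sum form, the weights `w_talk`, `w_chat`, `w_name`, and the `full_scale` normalizer are all assumptions chosen only for illustration.

```python
import math


def conversation_density_level(
    direct_talk_sec: float,
    voice_chat_sec: float,
    name_calls: int,
    w_talk: float = 1.0,        # assumed weight for direct conversation time
    w_chat: float = 0.5,        # assumed weight for voice chat time
    w_name: float = 30.0,       # assumed weight per name call
    full_scale: float = 3600.0,  # assumed score that maps to Lev. 5
) -> int:
    """Quantize a weighted activity score into a level from 0 to 5."""
    score = w_talk * direct_talk_sec + w_chat * voice_chat_sec + w_name * name_calls
    ratio = min(max(score / full_scale, 0.0), 1.0)
    # Round half up to the nearest integer level.
    return math.floor(ratio * 5 + 0.5)
```

A learned model could replace the fixed weights; the only property the sketch relies on is that more conversation yields a higher level.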
  • the intimacy calculation unit 145 calculates the intimacy level between users using the user preference level (Lev. 0 to Lev. 5) stored in the user preference information storage unit 143 and the conversation density level (Lev. 0 to Lev. 5) calculated by the conversation density analysis unit 144.
  • the intimacy calculation unit 145 of the user terminal a, 21a calculates the intimacy between the user a, 11a, who is the user of the user terminal a, 21a, and each of the other users b to d.
  • the intimacy calculation unit 145 first calculates a "user preference-based intimacy," which is an intimacy level according to the user preference level (Lev. 0 to Lev. 5) stored in the user preference information storage unit 143.
  • the graph shown in FIG. 25 shows an example of the calculation process of the "user preference-based intimacy" calculated by the intimacy calculation unit 145.
  • the horizontal axis of the graph shows the user preference level (Lev. 0 to Lev. 5) stored in the user preference information storage unit 143, and the vertical axis shows the "user preference-based intimacy."
  • the user most preferred by user a, 11a is user b, 11b, and the "user preference-based intimacy" for user b, 11b is calculated to be the highest value (approximately 9.0).
  • the user next most preferred by user a, 11a is user d, 11d, and the "user preference-based intimacy" for user d, 11d is calculated as the next highest value (approximately 5.8).
  • the user least preferred by user a, 11a is user c, 11c, and the "user preference-based intimacy" for user c, 11c is calculated to be the lowest value (approximately 2.1).
  • the intimacy calculation unit 145 then calculates a "conversation density-based intimacy," which is an intimacy level according to the conversation density level (Lev. 0 to Lev. 5) calculated by the conversation density analysis unit 144.
  • FIG. 26 shows an example of the calculation process of the "conversation density-based intimacy" calculated by the intimacy calculation unit 145.
  • the horizontal axis of the graph shown in FIG. 26 indicates the conversation density level (Lev. 0 to Lev. 5) calculated by the conversation density analysis unit 144, and the vertical axis indicates the "conversation density-based intimacy."
  • the user with the highest conversation density with user a, 11a is user b, 11b, and the "conversation density-based intimacy" with user b, 11b is calculated to be the highest value (approximately 7.2).
  • the next user with the highest conversation density with user a, 11a is user d, 11d, and the "conversation density-based intimacy" for user d, 11d is calculated as the next highest value (approximately 4.5).
  • the user with the lowest conversation density with user a, 11a is user c, 11c, and the "conversation density-based intimacy" for user c, 11c is calculated to be the lowest value (approximately 1.8).
  • the intimacy calculation unit 145 performs calculation processing using the "user preference-based intimacy" calculated according to the graph shown in FIG. 25 and the "conversation density-based intimacy" calculated according to the graph shown in FIG. 26 to calculate the final "intimacy" for the user.
  • Final intimacy: $r = \alpha p + \beta q$, where $p$ is the user preference-based intimacy, $q$ is the conversation density-based intimacy, and $\alpha$ and $\beta$ are weighting coefficients.
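The combination of the two intimacy values can be sketched as a weighted sum. This is only an illustration under stated assumptions: the coefficient names `alpha` and `beta`, their default values of 0.5, and the clamping to the 0 to 10 range are not given in the text and are chosen here for the example.

```python
def final_intimacy(p: float, q: float, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Combine preference-based intimacy p and density-based intimacy q.

    alpha and beta are assumed weighting coefficients; the result is
    clamped to the 0-10 range used for the final intimacy r.
    """
    r = alpha * p + beta * q
    return max(0.0, min(10.0, r))
```

Different coefficient choices shift the balance between the explicitly entered preference and the observed conversation density.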
  • the final intimacy r (0 to 10) values for these three users b to d calculated by the intimacy calculation unit 145 are input to the output audio control unit 108 and the output image control unit 110, as shown in FIGS. 23 and 24.
  • the output audio control unit 108 and the output image control unit 110 execute audio output control and image output control according to the final intimacy r (0 to 10) for the three users b to d input from the intimacy calculation unit 121. That is, for example, the audio output control and image output control as described above with reference to FIGS.
  • the user preference analysis unit 142 and the conversation density analysis unit 144 continue the analysis process by inputting the conversation information via the communication unit 101 even while a conversation is taking place between users, successively updating the "user preference-based intimacy" and the "conversation density-based intimacy", and inputting the updated data to the intimacy calculation unit 145.
  • the intimacy calculation unit 145 uses the latest "user preference-based intimacy" and "conversation density-based intimacy" input from the user preference analysis unit 142 and conversation density analysis unit 144 to perform processing to successively update the final intimacy value, and continuously inputs the updated values to the output audio control unit 108 and the output image control unit 110.
  • the output audio control unit 108 and the output image control unit 110 perform processing to change the control mode successively according to the latest intimacy value that is updated during the conversation.
  • FIG. 27 is a diagram showing an example of a display control process executed by the output image control unit 110 while a conversation is being carried out between a plurality of users a to d, and shows examples of display data at the following two times.
  • (a) Time t0
  • (b) Time t1. Note that time t1 is a certain time after time t0.
  • the output image control unit 110 executes a display control process as shown in the upper part (a) of FIG.
  • the display position of user b, who has an intimacy level of 10 and is on the best terms with user a, is very close to user a (distance L1) and almost directly in front of user a.
  • the display position of user c, who is not on good terms with user a and has an intimacy level of 2, is far from user a (distance L3) and almost behind user a.
  • the display position of user d, who has an intimacy level of 5 and is on neither good nor bad terms with user a, is neither close to nor far from user a (distance L2) and diagonally in front of user a.
  • the distances L1, L2, and L3 satisfy the relationship $L1 < L2 < L3$.
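The placement rule (higher intimacy means closer and more directly in front) can be sketched as a mapping from intimacy to a distance and an angle. The numbers are assumptions for illustration only: the text gives no distances or angles, so `min_dist`, `max_dist`, and the anchor points in `ANGLE_POINTS` are hypothetical values picked to match "directly in front", "diagonally in front", and "almost behind".

```python
from typing import List, Tuple

# Assumed (intimacy, angle in degrees) anchor points; 0 = directly in
# front of the viewer, 180 = directly behind.
ANGLE_POINTS: List[Tuple[float, float]] = [
    (0.0, 180.0),
    (2.0, 150.0),
    (5.0, 45.0),
    (10.0, 0.0),
]


def _interp(x: float, points: List[Tuple[float, float]]) -> float:
    """Piecewise-linear interpolation over points sorted by x."""
    x = min(max(x, points[0][0]), points[-1][0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


def display_position(
    intimacy: float, min_dist: float = 1.0, max_dist: float = 5.0
) -> Tuple[float, float]:
    """Return (distance, angle_deg) for a user with the given intimacy (0-10)."""
    t = min(max(intimacy / 10.0, 0.0), 1.0)
    distance = max_dist - (max_dist - min_dist) * t  # closer as intimacy rises
    angle_deg = _interp(intimacy, ANGLE_POINTS)
    return distance, angle_deg
```

Calling `display_position` for intimacy values 10, 5, and 2 yields strictly increasing distances, which mirrors the ordering of L1, L2, and L3.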
  • FIG. 27(b) shows an example of a display control process at time t1, a certain time after time t0.
  • a conversation between users a to d continues between times t0 and t1, and the user preference analysis unit 142 and the conversation density analysis unit 144 input conversation information via the communication unit 101 and continue their analysis processing.
  • the user preference analysis unit 142 and the conversation density analysis unit 144 successively update the "user preference-based intimacy" and the "conversation density-based intimacy" as a result of this analysis process, and input the updated data to the intimacy calculation unit 145.
  • the intimacy calculation unit 145 updates the final intimacy value by using the latest "user preference-based intimacy" and "conversation density-based intimacy" input from the user preference analysis unit 142 and conversation density analysis unit 144. This updated value is input to the output audio control unit 108 and the output image control unit 110.
  • the output image control unit 110 executes a display control process as shown in the lower part (b) of FIG.
  • the display position of user b, who has an intimacy level of 8 and is on the best terms with user a, is very close to user a (distance L1') and almost directly in front of user a.
  • the output audio control unit 108 also changes the mode of the output voice control process for each user utterance in accordance with changes in the degree of intimacy for each user and changes in the display position.
  • that is, voice direction control is performed so that the speech of each user can be heard from each user's new display position.
  • an example of changing the display position of a user is described as the display control process performed by the output image control unit 110 in response to a change in intimacy level, but the output image control unit 110 may also perform a process of changing the image (avatar image or actual image) of the user to be displayed.
  • the output image control unit 110 may execute display control of a user image (avatar image or real image) according to such intimacy.
  • Example 3 Example in which different background data is used on each user terminal.
  • the embodiments described above basically assume that the user terminals of the users conversing via a network display common background data during the conversation.
  • background data can be set individually by each user terminal, and it is also possible for the user terminals of each user having a conversation over the network to output different individual background data to their respective user terminals while the conversation takes place over the network.
  • the third embodiment described below is an embodiment that performs such processing.
  • FIG. 29 shows an example in which each user terminal of a user who is engaged in a conversation over a network outputs different background data to each user terminal.
  • the user a, 11a sets the user terminal a, 21a to output background data of a cafe, and the user terminal a, 21a outputs a background image and background sound of the cafe.
  • user b, 11b sets user terminal b, 21b to output background data of a park, and user terminal b, 21b outputs a background image and background sound of the park.
  • user c, 11c sets user terminal c, 21c to output background data of a live music venue, and user terminal c, 21c outputs background images and background sounds of the live music venue. Furthermore, user d, 11d, has set user terminal d, 21d to output park background data in the same manner as user terminal b, 21b, and user terminal d, 21d outputs a park background image and park background sound.
  • users a, 11a through d, 11d can converse with each other while outputting different background data to their respective user terminals a, 21a through d, 21d.
  • the user positions displayed on each user terminal may also be determined individually for each user terminal. For example, it is possible to set a fixed user position as described in the first embodiment, or a user position determined according to the intimacy level as described in the second embodiment.
  • each user terminal can set whether to continuously output background data set on the user's own terminal regardless of whether or not the user is conversing with another user, or to output background data set on the other user's user terminal when conversing with another user to the user's own terminal.
  • These settings can be made individually on each user terminal, and can be set, for example, using a UI or the like.
  • Processing Example 1 A processing example in which, when talking to another user and having a conversation with the other user, the background data set on the user's own device is set to be continuously output.
  • Processing Example 2 A processing example in which, when talking to another user and having a conversation with the other user, the background audio data alone among the background data set on the user's own device is switched to the background audio set on the conversation partner's user device.
  • FIG. 30 shows a user a, 11a, and a user b, 11b, who are engaged in a conversation via communication.
  • a user a, 11a is using a user terminal a, 21a, which is set to output background data of a cafe.
  • a user b, 11b is using a user terminal b, 21b, which is set to output background data of a park.
  • user a, 11a talks to user b, 11b.
  • the user b, 11b responds to the speech from the user a, 11a by making an utterance.
  • the user b, 11b recognizes the message from the user a, 11a as being from the user a, 11a who is in the park, which is the background data set in the user terminal b, 21b.
  • a response utterance by the user b, 11b is transmitted from the user terminal b, 21b to the user terminal a, 21a via the communication network.
  • the data transmitted from user terminal b, 21b to user terminal a, 21a is only the speech voice data of user b, 11b, and the background data set in user terminal b, 21b, i.e., background image data and background voice data of the park, is not transmitted.
  • User terminal a, 21a receives only the speech voice data of user b, 11b from user terminal b, 21b.
  • the output voice control unit 108 of user terminal a, 21a executes output control on the received speech voice data of user b, 11b, and the controlled voice is output. For example, it is output from headphones worn by user a, 11a connected to user terminal a, 21a.
  • the control process executed in response to the response utterance of user b, 11b is, for example, the control process according to the embodiment 1 described above. That is, the voice output is controlled so that the utterance of user b, 11b can be heard from the relative position of user b, 11b to user a, 11a displayed on user terminal a, 21a shown in FIG. 30.
  • the background sounds of the cafe set on the user terminal a, 21a are also continuously output from the headphones worn by the user a, 11a, and the response utterance of the user b, 11b is output from the headphones along with the background sounds of the cafe.
  • this output voice control process allows user a, 11a to have a conversation while recognizing that user b, 11b is also in the same cafe as user a, 11a.
  • Next, Processing Example 2 is described, that is, a processing example in which, when talking to another user and having a conversation with the other user, only the background audio data among the background data set on one's own terminal is switched to the background audio set on the user terminal of the conversation partner.
  • FIG. 31 also shows users a, 11a and b, 11b, who are engaged in a conversation via communication, as in FIG.
  • a user a, 11a is using a user terminal a, 21a, which is set to output background data of a cafe.
  • a user b, 11b is using a user terminal b, 21b, which is set to output background data of a park.
  • user a, 11a talks to user b, 11b.
  • the user b, 11b responds to the speech from the user a, 11a by making a speech.
  • the user b, 11b recognizes the message from the user a, 11a as being from the user a, 11a who is in the park, which is the background data set in the user terminal b, 21b.
  • the response utterance by the user b, 11b is transmitted from the user terminal b, 21b to the user terminal a, 21a via the communication network.
  • in this Processing Example 2, not only the speech voice data of the user b, 11b but also the background voice data of the park set in the user terminal b, 21b is transmitted from the user terminal b, 21b to the user terminal a, 21a.
  • the background image data of the park is not transmitted.
  • User terminal a, 21a receives the speech voice data of user b, 11b and the background voice data of the park from user terminal b, 21b.
  • the output voice control unit 108 of user terminal a, 21a executes output control on the received background voice data of the park and the speech voice data of user b, 11b, and outputs the controlled voice.
  • the voice is output from headphones worn by user a, 11a connected to user terminal a, 21a.
  • the output voice control unit 108 of the user terminal a, 21a performs the same control process as that described with reference to FIG. 30 (Processing Example 1) for the response utterance of the user b, 11b. That is, this is the control process according to the previously described Example 1, and the voice output is controlled so that the utterance of the user b, 11b can be heard from the relative position of the user b, 11b to the user a, 11a displayed on the user terminal a, 21a shown in FIG. 31.
  • the headphones worn by user a, 11a output the background sounds of the park received from user terminal b, 21b, such as the sounds of birds chirping, only while the speech of user b, 11b is being output.
  • in other words, the headphones worn by user a, 11a normally output the background audio data of the cafe, which is the background sound set in user terminal a, 21a, and only while user b, 11b is speaking is the output switched to the background sounds of the park received from user terminal b, 21b, such as the sounds of birds chirping.
  • this output voice control process allows user a, 11a to recognize that user b, 11b is conversing from a park.
  • FIG. 32 also shows users a, 11a and b, 11b, who are engaged in a conversation via communication.
  • the user a, 11a is using the user terminal a, 21a, which has been set to output background data of a cafe.
  • Fig. 32 shows a state after the background image of the user terminal a, 21a has been switched to background data of a park.
  • a user b, 11b is using a user terminal b, 21b, which is set to output background data of a park.
  • user a, 11a talks to user b, 11b.
  • the user b, 11b responds to the speech from the user a, 11a by making an utterance.
  • the user b, 11b recognizes the message from the user a, 11a as being from the user a, 11a who is in the park, which is the background data set in the user terminal b, 21b.
  • the response utterance by the user b, 11b is transmitted from the user terminal b, 21b to the user terminal a, 21a via the communication network.
  • in this Processing Example 3, not only the speech voice data of user b, 11b but also background voice data of the park and background image data of the park set in user terminal b, 21b are transmitted from user terminal b, 21b to user terminal a, 21a.
  • User terminal a, 21a receives speech data of user b, 11b, background audio data of the park, and background image data of the park from user terminal b, 21b.
  • the output audio control unit 108 of the user terminal a, 21a executes output control on the received background audio data of the park and the speech audio data of the user b, 11b, and outputs the controlled audio.
  • the audio is output from headphones worn by the user a, 11a connected to the user terminal a, 21a.
  • the output voice control unit 108 of the user terminal a, 21a performs the same control process as that described with reference to FIG. 30 (Processing Example 1) for the response utterance of the user b, 11b. That is, this is the control process according to the previously described Example 1, and the voice output is controlled so that the utterance of the user b, 11b can be heard from the relative position of the user b, 11b to the user a, 11a displayed on the user terminal a, 21a shown in FIG. 32.
  • the output image control unit 110 of the user terminal a, 21a outputs the background image data of the park received from the user terminal b, 21b to the display unit of the user terminal a, 21a in accordance with the output timing of the response utterance of the user b, 11b. This is the state of the display unit of the user terminal a, 21a shown in FIG.
  • the user image (avatar image or real image) of each user placed on the background image of the park is placed according to the user position stored in the user position information storage unit 103 of the user terminal a, 21a. In other words, it is displayed at the same position as the user position placed on the image of the cafe, which is the background data set in the user terminal a, 21a.
  • these output image control processes and output audio control processes enable user a, 11a to recognize that he or she is having a conversation with user b, 11b in a park while the speech of user b, 11b is being output.
  • the background data of user terminal a, 21a is switched back to the original cafe background data.
  • the background image of the cafe is displayed on the display unit, and the background sound of the cafe is output from the headphones.
  • user a (11a) feels as if he or she has momentarily moved from the cafe to the park to talk to user b (11b) only when user b (11b) outputs their speech, and feels as if he or she has returned to the original cafe when user b (11b) finishes speaking.
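The receiver-side behavior of Processing Examples 1 to 3 can be summarized as a small selection function: which background image and background sound a terminal outputs depends on its setting and on whether the conversation partner is currently speaking. This is a sketch under assumptions; the names `BackgroundMode`, `Background`, and `background_to_output` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BackgroundMode(Enum):
    KEEP_OWN = 1             # Processing Example 1: always use own background
    PARTNER_AUDIO = 2        # Processing Example 2: switch background sound only
    PARTNER_AUDIO_IMAGE = 3  # Processing Example 3: switch sound and image


@dataclass(frozen=True)
class Background:
    image: str
    audio: str


def background_to_output(
    mode: BackgroundMode,
    own: Background,
    partner: Optional[Background],
    partner_speaking: bool,
) -> Background:
    """Choose the background to output on the user's own terminal."""
    # Outside the partner's utterance, the terminal's own background is used.
    if not partner_speaking or partner is None or mode is BackgroundMode.KEEP_OWN:
        return own
    if mode is BackgroundMode.PARTNER_AUDIO:
        return Background(image=own.image, audio=partner.audio)
    return Background(image=partner.image, audio=partner.audio)
```

For a terminal set to a cafe with a partner set to a park, the function returns the cafe background whenever the partner is silent, so the output reverts automatically when the partner finishes speaking.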
  • Processing Example 4 a processing example in which the background data set on one's own terminal is set to be continuously output when another user speaks to the user and the user converses with the other user.
  • FIG. 33 like FIGS. 30 to 32, shows users a, 11a and b, 11b, who are engaged in a conversation via communication.
  • a user a, 11a is using a user terminal a, 21a, which is set to output background data of a cafe.
  • a user b, 11b is using a user terminal b, 21b, which is set to output background data of a park.
  • user a, 11a is approached by user b, 11b.
  • the user b, 11b recognizes that he is talking to the user a, 11a who is in a park, which is background data set in the user terminal b, 21b.
  • the speech by the user b, 11b is transmitted from the user terminal b, 21b to the user terminal a, 21a via the communication network.
  • the data transmitted from user terminal b, 21b to user terminal a, 21a is only the speech voice data of user b, 11b, and the background data set in user terminal b, 21b, i.e., background image data and background voice data of the park, is not transmitted.
  • User terminal a, 21a receives only the speech voice data of user b, 11b from user terminal b, 21b.
  • the output voice control unit 108 of user terminal a, 21a executes output control on the received speech voice data of user b, 11b, and the controlled voice is output. For example, it is output from headphones worn by user a, 11a connected to user terminal a, 21a.
  • the control process executed for the utterance of user b, 11b is, for example, the control process according to the embodiment 1 described above. That is, the voice output is controlled so that the utterance of user b, 11b can be heard from the relative position of user b, 11b to user a, 11a displayed on user terminal a, 21a shown in FIG. 33.
  • the background sounds of the cafe set on the user terminal a, 21a are also continuously output from the headphones worn by the user a, 11a, and the utterance of the user b, 11b is output from the headphones along with the background sounds of the cafe.
  • this output voice control process allows user a, 11a to have a conversation while recognizing that user b, 11b is also in the same cafe as user a, 11a.
  • user b, 11b can converse with user a, 11a while recognizing that they are in a park, which is the background data set in user terminal b, 21b.
  • Processing Example 5 A processing example in which, when a new user is spoken to during a conversation between multiple users, background audio data of the new user's user terminal is set to be transmitted to and output from the user terminals of the multiple users in the conversation.
  • Processing Example 6 A processing example in which, when a new user is spoken to during a conversation between multiple users, background audio data and background image data of the new user's user terminal are set to be transmitted to and output from the user terminals of the multiple users in the conversation.
  • Processing Example 7 A processing example in which, when a new user is spoken to during a conversation between multiple users, background audio data of the new user's user terminal is set to be transmitted to and output from the user terminals of the multiple users in the conversation.
  • FIG. 34 shows a user a, 11a, and a user b, 11b, who are engaged in a conversation, as well as another new user c, 11c.
  • a user a, 11a is using a user terminal a, 21a, which is set to output background data of a cafe.
  • a user c, 11c is using a user terminal c, 21c, which is set to output background data of a live music venue.
  • user a, 11a, or user b, 11b speaks to user c, 11c.
  • the user c, 11c for example, executes a response utterance in response to the speech from the user a, 11a.
  • the user c, 11c recognizes the message from the user a, 11a as being from the user a, 11a who is in the live music venue, which is the background data set in the user terminal c, 21c.
  • a response utterance by the user c, 11c is transmitted from the user terminal c, 21c to the user terminal a, 21a and the user terminal b, 21b via the communication network.
  • in this Processing Example 5, not only the speech voice data of the user c, 11c but also the background voice data of the live music venue set in the user terminal c, 21c is transmitted from the user terminal c, 21c to the user terminal a, 21a and the user terminal b, 21b.
  • background image data of the live music venue is not transmitted.
  • User terminal a, 21a and user terminal b, 21b receive the speech data of user c, 11c and the background audio data of the live music venue from user terminal c, 21c.
  • the output audio control unit 108 of user terminal a, 21a and user terminal b, 21b executes output control on the received background audio data of the live music venue and the speech data of user c, 11c, and outputs the controlled audio.
  • the audio is output from headphones connected to user terminal a, 21a and user terminal b, 21b.
  • the output voice control unit 108 of the user terminal a, 21a performs the same control process as that described with reference to FIG. 30 (Processing Example 1) in response to the response utterance of the user c, 11c. That is, this is the control process according to the previously described Example 1, and voice output control is performed so that the utterance of the user c, 11c can be heard from the relative positions of the users c, 11c to the users a, 11a displayed on the user terminal a, 21a shown in FIG. 34, and the voice output is output.
  • the background sounds of the live music venue received from user terminal c, 11c are output from the headphones worn by user a, 11a only when user c, 11c outputs their speech.
  • the headphones worn by user a, 11a output the background audio data of a cafe, which is the background sound set in user terminal a, 21a, except when user c, 11c speaks, but only when user c, 11c speaks, the background sound is switched to the background sound of a live music venue received from user terminal c, 11c.
  • This output voice control process allows users a and 11a to recognize that users c and 11c are having a conversation in a setting where they are in a live music venue.
  • the same process is also executed in the user terminal b, 21b.
  • the speech of the user c, 11c, and the background sound of the live music venue are also transmitted to the user terminal b, 21b, and these voice data are output via the user terminal b, 21b.
  • the background data set in the user terminal b, 21b is background data of a park
  • the background sound of the park is switched to the background sound of a live music venue and output only at the timing when the speech of the user c, 11c is output.
  • This output audio control process allows users b and 11b to recognize that users c and 11c are having a conversation in a setting where they are in a live music venue.
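  • The background-sound switching described for Processing Example 5 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `BackgroundAudioSelector`, the string labels for backgrounds, and the boolean timeline are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackgroundAudioSelector:
    """Chooses which background sound a receiving terminal plays (hypothetical helper)."""

    own_background: str  # background set on this terminal, e.g. "cafe"

    def select(self, partner_speaking: bool,
               partner_background: Optional[str]) -> str:
        # Switch to the partner's background only while the partner's
        # speech is being output and a background was actually received.
        if partner_speaking and partner_background is not None:
            return partner_background
        return self.own_background


terminal_a = BackgroundAudioSelector(own_background="cafe")
terminal_b = BackgroundAudioSelector(own_background="park")

# Assumed timeline: silence, user c speaks, silence again.
timeline = [False, True, False]
heard_by_a = [terminal_a.select(s, "live_music_venue") for s in timeline]
heard_by_b = [terminal_b.select(s, "live_music_venue") for s in timeline]
```

  • Each receiving terminal keeps its own setting and only overrides it for the duration of the partner's speech, so user a falls back to the cafe and user b to the park without any extra state.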
  • FIG. 35, like FIG. 34, shows user a, 11a and user b, 11b, who are engaged in a conversation, together with a new user c, 11c.
  • User a, 11a is using user terminal a, 21a, which is set to output background data of a cafe.
  • FIG. 35 shows the state after the background image of user terminal a, 21a has been switched to the background data of a live music venue.
  • User c, 11c is using user terminal c, 21c, which is set to output background data of a live music venue.
  • User a, 11a or user b, 11b speaks to user c, 11c.
  • User c, 11c then, for example, makes a response utterance in reply to the speech from user a, 11a.
  • User c, 11c perceives the speech from user a, 11a as coming from a user a, 11a who is in the live music venue, which is the background data set in user terminal c, 21c.
  • The response utterance of user c, 11c is transmitted from user terminal c, 21c to user terminal a, 21a and user terminal b, 21b via the communication network.
  • In this Processing Example 6, user terminal c, 21c transmits to user terminal a, 21a and user terminal b, 21b not only the speech data of user c, 11c but also the background audio data and background image data of the live music venue set in user terminal c, 21c.
  • User terminal a, 21a and user terminal b, 21b receive the speech data of user c, 11c, together with the background audio data and background image data of the live music venue, from user terminal c, 21c.
  • The output audio control unit 108 of each of user terminal a, 21a and user terminal b, 21b executes output control on the received background audio data of the live music venue and the speech data of user c, 11c, and outputs the controlled audio.
  • The audio is output from headphones connected to user terminal a, 21a and user terminal b, 21b.
  • For the response utterance of user c, 11c, the output audio control unit 108 of user terminal a, 21a performs the same control process as that described with reference to FIG. 30 (Processing Example 1). That is, following Processing Example 1, it controls the audio output so that the utterance of user c, 11c is heard from the position of user c, 11c relative to user a, 11a as displayed on user terminal a, 21a shown in FIG. 34.
  • The background sound of the live music venue received from user terminal c, 21c is output from the headphones worn by user a, 11a only while the speech of user c, 11c is output.
  • In other words, the headphones worn by user a, 11a normally output the cafe background audio set in user terminal a, 21a; only while user c, 11c speaks is the background sound switched to that of the live music venue received from user terminal c, 21c.
  • The output image control unit 110 of user terminal a, 21a outputs the background image data of the live music venue received from user terminal c, 21c to the display unit of user terminal a, 21a in step with the output timing of the response utterance of user c, 11c. This is the state of the display unit of the user terminal a, 21a shown in FIG.
  • The user images (avatar images or real images) placed on the background image of the live music venue displayed on the display unit of user terminal a, 21a are positioned according to the user positions stored in the user position information storage unit 103 of user terminal a, 21a. In other words, each user is displayed at the same position as on the image of the cafe, which is the background data set in user terminal a, 21a.
  • The same process is also executed in user terminal b, 21b.
  • The speech of user c, 11c and the background image data and background audio data of the live music venue are also transmitted to user terminal b, 21b, and the image data and audio data are output via user terminal b, 21b.
  • If the background data set in user terminal b, 21b is background data of a park, the background image of the park is switched to the background image of the live music venue and displayed only while the speech of user c, 11c is output, and the background sound is likewise switched from that of the park to that of the live music venue.
  • As a result, user b, 11b can feel that the conversation with user c, 11c is taking place with both of them at a live music venue.
  • These output image control and output audio control processes enable user a, 11a and user b, 11b to feel that they are conversing with user c, 11c at a live music venue while the speech of user c, 11c is output.
  • Once the speech output of user c, 11c ends, the background data of user terminal a, 21a is switched back to the original background data of the cafe, and the background data of user terminal b, 21b is likewise switched back to its original background data, for example, that of a park.
  • The display unit of user terminal a, 21a then displays the background image of the cafe, and the background sound of the cafe is output from the headphones. Likewise, the display unit of user terminal b, 21b displays the background image of the park, and the background sound of the park is output from the headphones.
  • User a, 11a and user b, 11b can thus feel as if they have instantly moved from the cafe or park to the live music venue only while the speech of user c, 11c is output, and as if they have returned to the original cafe or park once user c, 11c finishes speaking.
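  • The switch-and-revert behaviour of Processing Example 6 can be illustrated with a small state holder. This is only a sketch under stated assumptions: `Background`, `TerminalView`, the file-name strings, and the event method names are hypothetical, and the stored positions stand in for the contents of the user position information storage unit 103.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Background:
    name: str
    image: str  # placeholder for image data
    audio: str  # placeholder for audio data


@dataclass
class TerminalView:
    """Receiving-terminal state (hypothetical): own background plus stored user positions."""

    own: Background
    positions: Dict[str, Tuple[float, float]]
    current: Background = field(init=False)

    def __post_init__(self) -> None:
        self.current = self.own

    def on_partner_speech_start(self, received: Background) -> None:
        # Show and play the partner's background while the speech is output.
        self.current = received

    def on_partner_speech_end(self) -> None:
        # Revert to the background set on this terminal.
        self.current = self.own

    def render(self) -> dict:
        # Avatars keep their stored positions whichever background is shown.
        return {"image": self.current.image,
                "audio": self.current.audio,
                "avatars": dict(self.positions)}


cafe = Background("cafe", "cafe.png", "cafe.wav")
venue = Background("live_music_venue", "venue.png", "venue.wav")
view = TerminalView(own=cafe, positions={"a": (0.1, 0.9), "c": (0.6, 0.4)})

before = view.render()
view.on_partner_speech_start(venue)
during = view.render()
view.on_partner_speech_end()
after = view.render()
```

  • Keeping the positions separate from the background is what lets the avatars stay where they were on the cafe image when the venue image is swapped in.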
  • FIG. 36, like FIGS. 34 and 35, shows user a, 11a and user b, 11b, who are engaged in a conversation, together with a new user c, 11c.
  • User a, 11a is using user terminal a, 21a, which is set to output background data of a cafe.
  • User c, 11c is using user terminal c, 21c, which is set to output background data of a live music venue.
  • User a, 11a or user b, 11b is spoken to by user c, 11c.
  • User c, 11c perceives the speech directed to user a, 11a as speech directed to a user a, 11a who is in the live music venue, which is the background data set in user terminal c, 21c.
  • The speech of user c, 11c is transmitted from user terminal c, 21c to user terminal a, 21a and user terminal b, 21b via the communication network.
  • In this Processing Example 7, user terminal c, 21c transmits to user terminal a, 21a and user terminal b, 21b not only the speech data of user c, 11c but also the background audio data of the live music venue set in user terminal c, 21c.
  • Background image data of the live music venue is not transmitted.
  • User terminal a, 21a and user terminal b, 21b receive the speech data of user c, 11c and the background audio data of the live music venue from user terminal c, 21c.
  • The output audio control unit 108 of each of user terminal a, 21a and user terminal b, 21b executes output control on the received background audio data of the live music venue and the speech data of user c, 11c, and outputs the controlled audio.
  • The audio is output from headphones connected to user terminal a, 21a and user terminal b, 21b.
  • For the speech of user c, 11c, the output audio control unit 108 of user terminal a, 21a performs the same control process as that described with reference to FIG. 30 (Processing Example 1). That is, following Processing Example 1, it controls the audio output so that the speech of user c, 11c is heard from the position of user c, 11c relative to user a, 11a as displayed on user terminal a, 21a shown in FIG. 34.
  • The background sound of the live music venue received from user terminal c, 21c is output from the headphones worn by user a, 11a only while the speech of user c, 11c is output.
  • In other words, the headphones worn by user a, 11a normally output the cafe background audio set in user terminal a, 21a; only while user c, 11c speaks is the background sound switched to that of the live music venue received from user terminal c, 21c.
  • This output audio control process allows user a, 11a to recognize that user c, 11c is conversing in a setting where user c, 11c is at a live music venue.
  • The same process is also executed in user terminal b, 21b.
  • The speech of user c, 11c and the background sound of the live music venue are also transmitted to user terminal b, 21b, and this audio data is output via user terminal b, 21b.
  • If the background data set in user terminal b, 21b is background data of a park, the background sound of the park is switched to the background sound of the live music venue only while the speech of user c, 11c is output.
  • This output audio control process allows user b, 11b to recognize that user c, 11c is conversing in a setting where user c, 11c is at a live music venue.
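  • The difference between the processing examples lies in what the speaking user's terminal sends alongside the speech: background audio only (Processing Examples 5 and 7) or background audio plus background image (Processing Example 6). A minimal sketch of such a payload follows; the `UtterancePacket` type, its field names, and the `send_image` flag are assumptions made for illustration, since the text does not specify a message format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UtterancePacket:
    """Hypothetical payload sent from the speaker's terminal."""

    speaker_id: str
    speech_audio: bytes
    background_audio: Optional[bytes] = None
    background_image: Optional[bytes] = None


def build_packet(speaker_id: str, speech_audio: bytes,
                 background_audio: bytes, background_image: bytes,
                 send_image: bool) -> UtterancePacket:
    # Background audio always accompanies the speech in these examples;
    # the background image is attached only when send_image is True.
    return UtterancePacket(
        speaker_id=speaker_id,
        speech_audio=speech_audio,
        background_audio=background_audio,
        background_image=background_image if send_image else None,
    )


audio_only = build_packet("c", b"speech", b"venue-audio", b"venue-image",
                          send_image=False)
audio_and_image = build_packet("c", b"speech", b"venue-audio", b"venue-image",
                               send_image=True)
```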
  • A number of processing examples in which different background data is used in each user terminal have been described above with reference to FIGS.
  • Users a, 11a to d, 11d can talk to each other while different background data is output on each of user terminals a, 21a to d, 21d, and the user positions displayed can also differ from one user terminal to another. That is, the user positions can be freely set either to the fixed user positions described in the first embodiment or to positions according to the intimacy level described in the second embodiment.
  • Step S21: First, user a, 11a selects, on the display unit of user terminal a, 21a, the background data with which to start a conversation via the network.
  • The example shown in FIG. 37 is a background data selection UI for choosing the location of a conversation via the network.
  • Two selectable candidates, "conference room" and "cafe," are displayed as background data that can be used as the setting of a talk room.
  • User a, 11a selects "cafe" as the background data and touches the "enter" button.
  • Step S22: When "cafe" is selected as the background data by the user operation in step S21, a background image of the cafe is displayed on the display unit of user terminal a, 21a in step S22. Furthermore, a user image (avatar image) of user a, 11a is displayed on the background image of the cafe.
  • The display position of the user image (avatar image) of user a, 11a can be set to any position within the background image according to the preference of user a, 11a. Alternatively, as shown in the example of FIG. 37 (S22), the user image (avatar image) may be displayed at a predetermined position such as the lower left of the background image.
  • The background sound of the cafe is output from the speaker of user terminal a, 21a or from headphones connected to user terminal a, 21a.
  • The display position of user b, 11b in the display data of user terminal a, 21a shown in FIG. 37 is determined by the process of the second embodiment described above, that is, according to the degree of intimacy. In this state, user a, 11a and user b, 11b start a conversation.
  • Step S24: In the next step S24 shown in FIG. 38, user c, 11c enters the cafe. User c, 11c can enter the same cafe as user a, 11a and user b, 11b by performing the same operations as steps S21 and S22 on user terminal c, 21c.
  • The display position of user c, 11c in the display data of user terminal a, 21a shown in FIG. 38 (S24) is also determined by the process of the second embodiment described above, that is, according to the degree of intimacy. In this state, user a, 11a, user b, 11b, and user c, 11c can converse with each other.
  • The output audio control unit 108 of user terminal a, 21a executes control to adjust the voice direction and volume of each user's speech according to the display position of each user.
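  • One way to picture the adjustment of voice direction and volume according to display position is a simple stereo panner. The sketch below is an assumption-laden illustration, not the method of the disclosure: it uses constant-power panning for direction and inverse-distance attenuation for volume, treats +y on the display as "straight ahead" of the user, and the function name `stereo_gains` and parameter `ref_distance` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def stereo_gains(self_pos: Point, partner_pos: Point,
                 ref_distance: float = 1.0) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a partner's speech."""
    dx = partner_pos[0] - self_pos[0]
    dy = partner_pos[1] - self_pos[1]
    distance = math.hypot(dx, dy)

    # Volume: full level within ref_distance, inverse-distance falloff beyond.
    gain = min(1.0, ref_distance / max(distance, 1e-9))

    # Direction: azimuth measured from straight ahead (+y), positive to the right.
    azimuth = math.atan2(dx, dy)
    pan = math.sin(azimuth)  # -1.0 = fully left, +1.0 = fully right

    # Constant-power pan law keeps left^2 + right^2 equal to gain^2.
    theta = (pan + 1.0) * math.pi / 4.0
    return gain * math.cos(theta), gain * math.sin(theta)


right_side = stereo_gains((0.0, 0.0), (1.0, 0.0))
left_side = stereo_gains((0.0, 0.0), (-1.0, 0.0))
far_ahead = stereo_gains((0.0, 0.0), (0.0, 2.0))
```

  • A partner displayed to the right is heard mainly in the right channel, and a partner displayed farther away is heard more quietly. Real spatial audio over headphones would more likely use head-related transfer functions; the panner only shows how position can drive both direction and level.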
  • Step S25: The next step S25 shows an example in which user a, 11a, user b, 11b, and user c, 11c converse with each other, the intimacy between user a, 11a and user c, 11c increases as a result, and the display positions of the users are updated in response to this increase in intimacy.
  • Step S26: In the next step S26 shown in FIG. 39, user c, 11c leaves the cafe and user d, 11d enters the cafe. User d, 11d can enter the same cafe as user a, 11a and user b, 11b by performing the same operations as steps S21 and S22 on user terminal d, 21d.
  • The output audio control unit 108 of user terminal a, 21a executes control to adjust the voice direction and volume of each user's speech according to the display position of each user.
  • Step S27: Step S27 shows an example in which user a, 11a, user b, 11b, and user d, 11d converse with each other, the intimacy between user a, 11a, user b, 11b, and user c, 11c changes as a result, and the display position of each user is updated in response to this change in intimacy.
  • The output image control unit 110 of user terminal a, 21a executes a user display position update process that changes the display positions of user c, 11c and user d, 11d.
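  • The display position update driven by intimacy can be sketched as a mapping from an intimacy level to a distance from the user's own position, with higher intimacy giving a shorter distance. The linear mapping, the [0, 1] intimacy range, and the names `place_partner`, `min_dist`, and `max_dist` are assumptions for illustration only.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def place_partner(self_pos: Point, bearing_rad: float, intimacy: float,
                  min_dist: float = 0.5, max_dist: float = 3.0) -> Point:
    """Place a partner along a bearing; higher intimacy means closer."""
    level = min(1.0, max(0.0, intimacy))  # clamp to the assumed [0, 1] range
    distance = max_dist - level * (max_dist - min_dist)
    # Bearing 0 points straight ahead (+y); positive bearings turn right.
    return (self_pos[0] + distance * math.sin(bearing_rad),
            self_pos[1] + distance * math.cos(bearing_rad))


def dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


me = (0.0, 0.0)
before_chat = place_partner(me, math.radians(30), intimacy=0.2)
after_chat = place_partner(me, math.radians(30), intimacy=0.8)
closest = place_partner(me, 0.0, intimacy=1.0)
```

  • Recomputing the position whenever the intimacy value changes, and feeding the new position to the audio control, would move both the avatar and the perceived voice closer together.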
  • An example of background data (image, sound) switching processing will be described with reference to FIGS. 40 and 41.
  • First, with reference to FIG. 40, an example of processing for switching to an image of the same cafe background seen from a different direction will be described.
  • Step S31 is a state in which user a, 11a, user b, 11b, and user d, 11d are having a conversation while a background image of a cafe, which is background data set by user a, 11a, and background sound are being output to user terminal a, 21a.
  • Step S32 is an example in which, when user d, 11d speaks, the background data set in user terminal d, 21d, that is, background data (image data, audio data) including image data of the cafe seen from a different direction, is received from user terminal d, 21d, the terminal used by user d, 11d, together with the user utterance, and is output on user terminal a, 21a.
  • Step S42 shows a processing example in which the user a, 11a responds to a call from the user c, 11c who is not in the cafe.
  • When user terminal a, 21a receives an utterance from user c, 11c, who has not entered the cafe, it also receives the background data set in user terminal c, 21c, that is, background data (image data, audio data) including image data of the live music venue, from user terminal c, 21c, the terminal used by user c, 11c, together with the user utterance.
  • When user terminal a, 21a outputs the call-out utterance from user c, 11c through a speaker or headphones, it displays the background data received from user terminal c, 21c, that is, the image data of the live music venue, on the display unit, and outputs the audio data of the live music venue through the speaker or headphones.
  • This process allows user a, 11a to recognize that user c, 11c is calling from the live music venue and, if interested, to move to the live music venue and talk with user c, 11c.
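  • The call-out from a user outside the current room can be thought of as a temporary preview followed by an optional move. The sketch below is hypothetical throughout: the text does not describe how the move is triggered, so the `CallPreview` class, its method names, and the `move` flag are assumptions used only to make the sequence concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallPreview:
    """Tracks the room a user is in and any room being previewed (hypothetical)."""

    own_room: str
    previewing: Optional[str] = None

    def on_call_received(self, caller_room: str) -> str:
        # While the call-out utterance plays, show the caller's background.
        self.previewing = caller_room
        return caller_room

    def on_call_finished(self, move: bool) -> str:
        # If the user is interested, move to the caller's room;
        # otherwise fall back to the room the user was already in.
        if move and self.previewing is not None:
            self.own_room = self.previewing
        self.previewing = None
        return self.own_room


user_a = CallPreview(own_room="cafe")
shown_during_call = user_a.on_call_received("live_music_venue")
room_if_declined = user_a.on_call_finished(move=False)

user_a.on_call_received("live_music_venue")
room_if_accepted = user_a.on_call_finished(move=True)
```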
  • Step S51 is a state in which user a, 11a, user b, 11b, and user d, 11d are having a conversation while a background image of a cafe, which is background data set by user a, 11a, and background sound are being output to user terminal a, 21a.
  • the image of user a, 11a displayed on the display unit of user terminal a, 21a here is a virtual character image, i.e., an avatar image representing user a.
  • Step S52 shows an example in which user a, 11a activates the camera of user terminal a, 21a, captures a facial image of user a, 11a, and switches the avatar image of user a, 11a displayed on the display unit of user terminal a, 21a to an actual image of user a, 11a, i.e., an image captured by the camera.
  • the facial image of user a, 11a captured by the camera of user terminal a, 21a is also sent to user terminal b, 21b of user b, 11b, who is engaged in a conversation over the network, and user terminal d, 21d of user d, 11d, who is engaged in a conversation over the network, and the actual image (camera-captured image) of user a, 11a is also displayed on the display unit of these user terminals.
  • FIG. 43 is a diagram showing an example of the hardware configuration of the user terminal 21 and the server of the present disclosure. The hardware configuration shown in FIG. 43 will now be described.
  • CPU (Central Processing Unit) 301 functions as a control unit or data processing unit that executes various processes according to programs stored in ROM (Read Only Memory) 302 or storage unit 308. For example, it executes processes according to the sequences described in the above-mentioned embodiments.
  • RAM (Random Access Memory) 303 stores programs and data executed by CPU 301.
  • CPU 301, ROM 302, and RAM 303 are interconnected by bus 304.
  • The CPU 301 is connected to an input/output interface 305 via the bus 304. The input/output interface 305 is connected to an input unit 306 consisting of various switches, a keyboard, a mouse, a microphone, a sensor, etc., and to an output unit 307 consisting of a display, a speaker, etc.
  • The CPU 301 executes various processes in response to commands input from the input unit 306, and outputs the process results to, for example, the output unit 307.
  • The storage unit 308 connected to the input/output interface 305 is, for example, a hard disk, and stores the programs executed by the CPU 301 and various data.
  • The communication unit 309 functions as a transmitter/receiver for Wi-Fi communication, Bluetooth (registered trademark) (BT) communication, and other data communication via networks such as the Internet or a local area network, and communicates with external devices.
  • The drive 310 connected to the input/output interface 305 drives removable media 311, such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory such as a memory card, and records or reads data.
  • The technology disclosed in this specification can have the following configurations.
  • An information processing device in which the output audio control unit performs voice direction control so that the user utterance is heard as an utterance coming from the user position of the conversation partner relative to a predefined self-position.
  • The user position of the conversation partner relative to the self-position is set such that the higher the intimacy with the conversation partner, the closer the conversation partner is to the self-position.
  • (6) The information processing device according to any one of (1) to (5), further comprising an intimacy calculation unit that calculates a degree of intimacy with a conversation partner user, wherein the user position of the conversation partner relative to the self-position is determined according to the degree of intimacy calculated by the intimacy calculation unit.
  • (7) The information processing device according to (6), wherein the intimacy calculation unit calculates the degree of intimacy according to a preference of the user of the information processing device for the conversation partner user.
  • (8) The information processing device according to (7), wherein the intimacy calculation unit analyzes the preference of the user of the information processing device for the conversation partner user based on a past history.
  • (9) The information processing device according to any one of (6) to (8), wherein the intimacy calculation unit calculates the degree of intimacy according to the communication between the user of the information processing device and the conversation partner user.
  • (10) The information processing device according to any one of (1) to (9), further comprising an output image control unit that controls image output to a display unit, wherein the output image control unit executes a process of displaying, on the display unit, a self user image representing the user and a user image of the conversation partner.
  • (11) The information processing device, wherein the output image control unit displays a background image determined by the user of the information processing device on the display unit, and displays the user images of the users in the conversation on the background image.
  • (12) The information processing device according to (10) or (11), wherein the output image control unit executes a process of displaying the self user image representing the user and the user image of the conversation partner at predetermined fixed positions.
  • (13) The information processing device according to any one of (6) to (12), further comprising an output image control unit that controls image output to a display unit, wherein the output image control unit determines the display position of the user image of the conversation partner relative to the user image representing the user according to the intimacy calculated by the intimacy calculation unit.
  • (14) The information processing device according to any one of (10) to (13), wherein, at the timing of outputting the user utterance of the conversation partner, the output image control unit executes a process of switching the background image displayed on the display unit from the background image set on the user's own terminal to the background image set on the user terminal of the conversation partner.
  • (15) The information processing device according to any one of (1) to (14), wherein the output audio control unit executes a process of outputting, via an audio output unit, background sound determined by the user of the information processing device.
  • (16) The information processing device according to any one of (1) to (15), wherein, at the timing of outputting the user utterance of the conversation partner, the output audio control unit executes a process of switching the background audio output to the audio output unit from the background audio set on the user's own terminal to the background audio set on the user terminal of the conversation partner.
  • (17) An information processing method executed in an information processing device, the information processing device including a communication unit that receives a user utterance of a conversation partner via a network and an output audio control unit that controls output of the user utterance, wherein the output audio control unit executes voice direction control so that the user utterance is heard as coming from the user position of the conversation partner relative to a predefined self-position.
  • (18) A program for causing an information processing device to execute information processing, the information processing device including a communication unit that receives a user utterance of a conversation partner via a network and an output audio control unit that controls output of the user utterance, wherein the program causes the output audio control unit to execute voice direction control so that the user utterance is heard as coming from the user position of the conversation partner relative to a predefined self-position.
  • A program recording the processing sequence can be installed and executed in memory within a computer built into dedicated hardware, or the program can be installed and executed on a general-purpose computer capable of executing various processes.
  • The program can be pre-recorded on a recording medium.
  • The program can also be received via a network such as a LAN (Local Area Network) or the Internet, and installed on a recording medium such as an internal hard disk.
  • A system here refers to a logical collective configuration of multiple devices, and is not limited to devices housed in the same enclosure.
  • A configuration is realized in which voice direction control is performed so that the utterance of a conversation partner received via a network is heard as coming from the user position of the conversation partner relative to a predefined self-position.
  • The device has a communication unit that receives a user utterance from a conversation partner via a network, and an output audio control unit that executes output control of the user utterance.
  • The output audio control unit executes voice direction control and volume control so that the user utterance is heard as an utterance from the user position of the conversation partner relative to the predefined self-position.
  • The user position of the conversation partner relative to the self-position is set either to a predetermined fixed position or according to the intimacy with the conversation partner; the higher the intimacy, the closer the position is set.
  • This configuration realizes voice direction control in which the utterance of a conversation partner received via a network is heard as coming from the conversation partner's user position relative to the predefined self-position.
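  • The configurations above state that intimacy may depend on the user's preference for the partner, and the reference signs list names a conversation density analysis unit 144 alongside the intimacy calculation unit 145. How these inputs are combined is not given in this section, so the following is purely an illustrative assumption: a weighted average of a preference score and a conversation density score, both normalized to [0, 1]. The function names, the `saturation_per_minute` constant, and the weights are hypothetical.

```python
def clamp01(value: float) -> float:
    return min(1.0, max(0.0, value))


def conversation_density(utterance_count: int, window_seconds: float,
                         saturation_per_minute: float = 10.0) -> float:
    """Normalize utterances per minute to [0, 1] (assumed measure of density)."""
    if window_seconds <= 0:
        return 0.0
    per_minute = utterance_count * 60.0 / window_seconds
    return clamp01(per_minute / saturation_per_minute)


def intimacy(preference: float, density: float,
             preference_weight: float = 0.5) -> float:
    """Combine preference and conversation density into one intimacy score."""
    w = clamp01(preference_weight)
    return w * clamp01(preference) + (1.0 - w) * clamp01(density)


density_ac = conversation_density(utterance_count=5, window_seconds=60.0)
intimacy_ac = intimacy(preference=1.0, density=density_ac)
```

  • A score of this kind could then be passed to whatever routine sets the partner's position, so that a partner with a higher score is placed closer to the self-position.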
  • REFERENCE SIGNS LIST
  • Communication management server
  • 70 Background data providing server
  • 101 Communication unit
  • 102 User position determination unit (UI)
  • 103 User position information storage unit
  • 104 Background data acquisition unit
  • 105 Audio data storage unit
  • 106 Image data storage unit
  • 107 Audio data reception unit
  • 108 Output audio control unit
  • 109 Audio output unit
  • 110 Output image control unit
  • 111 Image output unit
  • 112 Display unit
  • 113 Audio input unit
  • 114 Camera
  • 115 Image input unit
  • 116 Data transmission unit
  • 121 Intimacy calculation unit
  • 141 User preference input unit (UI)
  • 142 User preference analysis unit
  • 143 User preference information storage unit
  • 144 Conversation density analysis unit
  • 145 Intimacy calculation unit
  • 301 CPU
  • 302 ROM
  • 303 RAM
  • 304 Bus
  • 305 Input/output interface
  • 306 Input unit
  • 307 Output unit
  • 308 Storage unit
  • 309 Communication unit
  • 310 Drive
  • 311 Removable media

PCT/JP2024/002446 2023-03-08 2024-01-26 Information processing device, information processing method, and program Ceased WO2024185334A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202480015970.0A CN120883628A (zh) 2023-03-08 2024-01-26 Information processing device, information processing method, and program
JP2025505116A JPWO2024185334A1 2023-03-08 2024-01-26
EP24766709.0A EP4679862A1 (en) 2023-03-08 2024-01-26 Information processing device, information processing method, and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2023-035310 2023-03-08
JP2023035310 2023-03-08

Publications (1)

Publication Number Publication Date
WO2024185334A1 (ja) 2024-09-12

Family

ID=92674808

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/002446 Ceased WO2024185334A1 (ja) 2023-03-08 2024-01-26 Information processing device, information processing method, and program

Country Status (4)

Country Link
EP (1) EP4679862A1
JP (1) JPWO2024185334A1
CN (1) CN120883628A
WO (1) WO2024185334A1

Also Published As

Publication number Publication date
CN120883628A (zh) 2025-10-31
EP4679862A1 (en) 2026-01-14
JPWO2024185334A1 2024-09-12

Similar Documents

Publication Publication Date Title
US10491643B2 (en) Intelligent augmented audio conference calling using headphones
EP3039677B1 (en) Multidimensional virtual learning system and method
CN110035250A (zh) Audio processing method, processing device, terminal, and computer-readable storage medium
US8340267B2 (en) Audio transforms in connection with multiparty communication
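KR20100097739A (ko) Method for controlling communications between at least two users of a communication system
JP7062126B1 (ja) Terminal, information processing method, program, and recording medium
CN118413804A (zh) Audio apparatus, audio distribution system, and method of operating the same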
US12370448B2 (en) 3D spatialisation of voice chat
US20240340605A1 (en) Information processing device and method, and program
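JP7143874B2 (ja) Information processing device, information processing method, and program
CN113066504A (zh) Audio transmission method, device, and computer storage medium
JP7160263B2 (ja) Information processing system, information processing device, and program
CN116114241A (zh) Information processing device, information processing terminal, information processing method, and program
CN114339542A (zh) Volume adjustment method, apparatus, electronic device, and medium
KR20230015302A (ko) Information processing device, information processing method, and program
KR20230038165A (ko) Method for providing a speech video and computing device for performing the same
JP7150114B1 (ja) Communication support system, communication support method, and communication support program
WO2024185334A1 (ja) Information processing device, information processing method, and program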
US12328566B2 (en) Information processing device, information processing terminal, information processing method, and program
WO2022008075A1 (en) Methods, system and communication device for handling digitally represented speech from users involved in a teleconference
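JP7191146B2 (ja) Distribution server, distribution method, and program
CN117641191A (zh) Sound processing method, sound pickup system, and electronic device
CN115705839A (zh) Voice playback method, apparatus, computer device, and storage medium
JP7687339B2 (ja) Information processing device, information processing terminal, information processing method, and program
CN115550600B (zh) Method for identifying the sound source of audio data, storage medium, and electronic device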

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24766709

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2025505116

Country of ref document: JP

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 2025505116

Country of ref document: JP

WWE WIPO information: entry into national phase

Ref document number: 202480015970.0

Country of ref document: CN

WWE WIPO information: entry into national phase

Ref document number: 2024766709

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWP WIPO information: published in national office

Ref document number: 202480015970.0

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 2024766709

Country of ref document: EP

Effective date: 20251008

WWP WIPO information: published in national office

Ref document number: 2024766709

Country of ref document: EP