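WO2023243059A1 - Information presentation device, information presentation method, and information presentation program - Google Patents
Information presentation device, information presentation method, and information presentation program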
- Publication number
- WO2023243059A1 (PCT/JP2022/024206)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- participant
- information
- sound source
- source position
- terminals
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
Definitions
- One aspect of the present invention relates to an information presentation device, an information presentation method, and an information presentation program.
- Non-Patent Document 1 proposes a technique for expressing it with an avatar. Furthermore, Non-Patent Document 2 proposes a technique using a robot.
- With these proposed techniques, there is a limit to the number of people that can be visually displayed, and it is difficult to apply them to conversations with a large number of people.
- Non-Patent Documents 1 and 2 cannot be applied to such uses.
- This invention has been made in view of the above-mentioned circumstances, and aims to provide an information presentation technique that allows the user to perceive an appropriate sense of distance from the speaker without interfering with the visual information to be projected.
- An information presentation device presents audio information, acquired via a network from a first participant terminal among a plurality of participant terminals, to one or more second participant terminals different from the first participant terminal via the network, and includes a sound source position specifying unit, an audio presentation unit, and a visual effect presentation unit.
- The sound source position specifying unit defines, for each of the second participant terminals, a sound source position according to the roles assigned to the participants who use the plurality of participant terminals. The sound source position is the position of the first participant, who is the participant at the first participant terminal, relative to the position of the second participant, who is the participant at that second participant terminal.
- The audio presentation unit generates, for each of the one or more second participant terminals, sound field information in which the audio information from the first participant terminal is localized as a sound image based on the sound source position of the first participant, and transmits it to each of the second participant terminals.
- The visual effect presentation unit generates a visual effect based on the sound source position of the first participant for each of the one or more second participant terminals, and transmits the generated visual effect to each of the second participant terminals.
- FIG. 1 is a diagram showing an example of the configuration of an information presentation system according to an embodiment of the present invention.
- FIG. 2 is a block diagram showing an example of the hardware configuration of a communication server as an embodiment of the information presentation device of the present invention.
- FIG. 3 is a block diagram showing an example of the software configuration of the communication server.
- FIG. 4 is a flowchart illustrating an example of the procedure and contents of the preparation process executed by the control unit of the communication server.
- FIG. 5 is a diagram showing an example of a preset data set stored in the conversation type database of the communication server.
- FIG. 6 is a schematic diagram showing the positional relationship of participants indicated by the example preset data set of FIG. 5.
- FIG. 7 is a diagram showing an example of the contents of the participant assignment table stored in the participant information database of the communication server by the process of registering participant information and roles in the preparation process shown in FIG. 4.
- FIG. 8 is a diagram illustrating an example of a participant prescribed data set stored in the prescribed information database of the communication server through the process of defining the sound source position of each participant in the preparation process shown in FIG. 4.
- FIG. 9 is a schematic diagram showing the positional relationship of each participant based on the participant regulation data set of the example of FIG. 8.
- FIG. 10 is a schematic diagram showing the positional relationship with visitor A in the example of FIG. 9 as a reference.
- FIG. 11 is a diagram showing an example of the contents of the sound source position regulation data table for visitor A stored in the regulation information database of the communication server through the process of defining the sound source position of each participant in the preparation process shown in FIG. 4.
- FIG. 12 is a schematic diagram showing a virtual position of a slide in a positional relationship with visitor A as a reference.
- FIG. 13 is a schematic diagram showing an example of how the slide of FIG. 12 appears on the display screen of visitor A at the virtual position.
- FIG. 14 is a schematic diagram illustrating an example of what visitor A sees on the display screen when the reference position for depth representation is fixed at the virtual position of the slide.
- FIG. 15 is a schematic diagram illustrating an example of what the visitor A sees on the display screen when the reference position for depth expression is dynamically changed depending on the speaker.
- FIG. 16 is a flowchart illustrating an example of the processing procedure and processing contents of conversation processing executed by the control unit of the communication server.
- FIG. 17 is a diagram showing an example of a preset data set stored in the conversation type database in the first modification.
- FIG. 18 is a schematic diagram showing omission of the Y-axis setting in the second modification.
- FIG. 1 is a diagram showing an example of the configuration of an information presentation system in an embodiment of the present invention.
- The information presentation system of this embodiment includes, as its main component, a communication server CS, which is an embodiment of the information presentation device of this invention.
- In the information presentation system, this communication server CS, an organizer terminal OT used by an organizer who holds online communication with a large number of people, and a plurality of participant terminals PT1 to PTn (n is an arbitrary integer) used by participants participating in the online communication are connected through the network NW.
- The network NW is, for example, the Internet.
- The network NW may be any network, such as a LAN (Local Area Network), as long as it is capable of transmitting the above information data.
- Online communication among many people is a conversation that frequently takes place online, and the types of conversations include, for example, meetings, business negotiations, academic conferences, exhibitions, university lectures and discussions. Furthermore, online communication among a large number of people may be unidirectional as long as the conversation involves multiple roles. For example, such conversation types include panel discussions among experts, live sports commentary with player commentary, product sales, plays, and the like.
- Although only one organizer terminal OT is shown in FIG. 1 as a representative, the information presentation system of this embodiment may of course include a plurality of organizer terminals OT.
- The organizer terminal OT and the participant terminals PT1 to PTn may be any devices, such as PCs (Personal Computers), smartphones, and glasses-type devices, that can output audio and video and can converse with others via a network NW such as the Internet.
- FIGS. 2 and 3 are block diagrams showing examples of the hardware configuration and the software configuration of the communication server CS, respectively.
- The communication server CS consists of, for example, a server computer installed on the web or in the cloud. Note that the communication server CS may be a PC that is the organizer terminal OT or one of the participant terminals PT1 to PTn.
- The communication server CS includes a control unit 1, to which a storage unit having a program storage unit 2 and a data storage unit 3, and a communication interface unit 4 are connected via a bus 5. Note that in FIGS. 2 and 3, the interface is indicated as I/F.
- The control unit 1 is a hardware processor such as a CPU (Central Processing Unit). For example, by using a multi-core, multi-threaded CPU, a plurality of information processes can be executed simultaneously.
- The control unit 1 may include multiple hardware processors.
- The communication interface unit 4, under the control of the control unit 1, sends and receives information data to and from the organizer terminal OT and each of the participant terminals PT1 to PTn.
- The program storage unit 2 is configured by combining, as storage media, a non-volatile memory that can be written to and read from at any time, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), with a non-volatile memory such as a ROM (Read Only Memory).
- The program storage unit 2 stores, in addition to middleware such as an OS (Operating System), application programs necessary for inputting the above-mentioned information required for information presentation in one embodiment and for transmitting a registration request for that information.
- The data storage unit 3 is configured by combining, as storage media, a non-volatile memory that can be written to and read from at any time, such as an HDD or SSD, with a volatile memory such as a RAM (Random Access Memory).
- The data storage unit 3 is provided, in its storage area, with a conversation type database 31, a participant information database 32, a regulation information database 33, and a generated information database 34 as the main storage units necessary for carrying out an embodiment of the present invention. Note that in FIGS. 2 and 3, the database is indicated as DB.
- The conversation type database 31 stores preset data sets corresponding to each type of online communication with a large number of people, such as meetings, business negotiations, academic conferences, exhibitions, university lectures and discussions.
- The preset data set includes the roles of the participants and the sound source position and direction for each role.
- The conversation type database 31 can also store data sets that the organizer has edited from the preset data sets.
- The participant information database 32 stores user information, such as a user ID, login information such as a password, and a name, for all users who use this information presentation system. Furthermore, for each user selected as a participant by the user acting as organizer (the organizer can also be a participant), the participant information database 32 stores information on the role assigned by the organizer.
- The regulation information database 33 stores information, defined for each participant, regarding the sound source positions of the other participants.
- The generated information database 34 stores sound field information and visual effect information generated for each participant.
- The control unit 1 includes, as processing function units necessary for implementing one embodiment, a conversation type setting unit 11, a conversation type editing unit 12, a participant information registration unit 13, a sound source position specifying unit 14, an audio acquisition unit 15, a sound field generation unit 16, an audio reproduction unit 17, a visual effect generation unit 18, and a visual effect expression unit 19. All of these processing function units are realized by causing the hardware processor of the control unit 1 to execute an application program stored in the program storage unit 2.
- The processing functions of these processing function units may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit), a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit).
- The conversation type setting unit 11 communicates with the organizer terminal OT via the network NW through the communication interface unit 4, presents the list of conversation types stored as preset data sets in the conversation type database 31 to the organizer terminal OT, and accepts the selection of a conversation type from the organizer terminal OT.
- The conversation type setting unit 11 stores the preset data set of the conversation type selected by the organizer in the conversation type database 31 as the selected data set for the online communication to be held.
- The conversation type editing unit 12 communicates with the organizer terminal OT via the network NW through the communication interface unit 4, presents the contents of the selected data set stored in the conversation type database 31 to the organizer terminal OT, and accepts edits to the selected data set from the organizer terminal OT.
- The conversation type editing unit 12 reflects the editing results in the selected data set stored in the conversation type database 31.
- The participant information registration unit 13 communicates with the organizer terminal OT via the network NW through the communication interface unit 4, accepts role assignments for each participant from the organizer terminal OT, and stores information on the assigned roles in the participant information database 32.
- The sound source position specifying unit 14 defines the sound source position of each participant according to the role of each participant stored in the participant information database 32.
- The sound source position specifying unit 14 stores information on the sound source position defined for each participant in the regulation information database 33.
- The sound source position specifying unit 14 further determines the visual expression to be presented to each of the participant terminals PT1 to PTn based on the positional relationship of the sound source positions of each participant. This visual expression will be explained in detail in the description of the operation.
- The sound source position specifying unit 14 stores the determined visual expression in the regulation information database 33.
- The audio acquisition unit 15 communicates, via the network NW through the communication interface unit 4, with the participant terminals PT1 to PTn of the participants in the online communication stored in the participant information database 32, and acquires audio information from each of the participant terminals PT1 to PTn.
- Based on the information on the sound source position defined for each participant stored in the regulation information database 33, the sound field generation unit 16 determines the position of the participant whose terminal is the source of the audio information acquired by the audio acquisition unit 15, relative to the position of each participant in the online communication in which that participant takes part. Then, the sound field generation unit 16 generates sound field information to be provided to each participant other than the source based on the determination result.
- The sound field information is information for outputting audio information as a spatial sound image using stereophonic sound technology.
- The sound field generation unit 16 stores the generated sound field information for each participant in the generated information database 34.
- The audio reproduction unit 17 applies the acquired audio information to the sound field information, stored in the generated information database 34, for each participant terminal other than the participant terminal that is the source of the audio information. That is, the audio reproduction unit 17 generates sound field information in which the acquired audio information is localized as a sound image. Then, the audio reproduction unit 17 transmits the sound field information in which the audio information is localized, via the network NW through the communication interface unit 4, to the participant terminal of each participant except the participant terminal that is the source of the audio information.
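To make the idea of localizing a sound image concrete, here is a minimal Python sketch that derives left and right channel gains from a distance and a direction. It is only an illustration: it swaps in simple constant-power stereo panning with inverse-distance attenuation, which is not necessarily the stereophonic technique the communication server CS uses. The function name `stereo_gains`, the parameter `ref_distance`, and the sign convention (positive angles to the listener's left) are all assumptions.

```python
import math


def stereo_gains(distance: float, theta_deg: float, ref_distance: float = 1.0):
    """Return (left_gain, right_gain) for a source at the given distance and angle.

    theta_deg is the source direction relative to the listener's facing
    direction (assumed convention: positive = left, negative = right).
    """
    # Clamp to the frontal half-plane; sources behind the listener are
    # treated as fully left or fully right in this simplified model.
    theta = max(-90.0, min(90.0, theta_deg))
    # Map +90 deg (left) .. -90 deg (right) onto a pan value 0 .. 1.
    pan = (90.0 - theta) / 180.0
    angle = pan * math.pi / 2.0
    # Inverse-distance attenuation, never amplifying above unity.
    attenuation = ref_distance / max(distance, ref_distance)
    # Constant-power panning: left^2 + right^2 == attenuation^2.
    return attenuation * math.cos(angle), attenuation * math.sin(angle)
```

Multiplying each audio sample by these two gains yields a two-channel signal whose apparent position shifts with the angle and whose loudness falls off with distance.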
- Based on the information on the sound source position defined for each participant stored in the regulation information database 33, the visual effect generation unit 18 determines the position of the participant whose terminal is the source of the audio information acquired by the audio acquisition unit 15, relative to the position of each participant in the online communication in which that participant takes part. Then, based on the determination result, the visual effect generation unit 18 generates visual effect information to be provided to each participant according to the visual expression of each participant stored in the regulation information database 33.
- The visual effect information is information regarding the visual expression presented when audio information is output at the participant terminals.
- The visual effect generation unit 18 stores the generated visual effect information for each participant in the generated information database 34.
- When the audio acquisition unit 15 acquires audio information from any of the participant terminals PT1 to PTn, the visual effect generation unit 18 applies the acquired audio information to the visual effect information, stored in the generated information database 34, for each participant terminal except the participant terminal that is the source of the audio information. That is, the visual effect generation unit 18 adds a visual effect representing the position of the acquired audio information to the visual effect information. Then, the visual effect generation unit 18 transmits the generated visual effect information to be provided to each participant to the visual effect expression unit 19.
- The visual effect expression unit 19 communicates with the participant terminals PT1 to PTn via the network NW through the communication interface unit 4, and transmits the visual effect information generated for each participant to the participant terminal of each participant in the online communication.
- FIG. 4 is a flowchart showing an example of the procedure and contents of the preparation process executed by the control unit 1 of the communication server CS.
- The control unit 1 starts this preparation process when the communication interface unit 4 receives, via the network NW, a preparation request sent from an organizer terminal OT used by an organizer who intends to hold online communication.
- The preparation process is basically carried out with the organizer terminal OT; no processing is performed with the participant terminals PT1 to PTn.
- When the preparation process is started, the control unit 1 operates as the conversation type setting unit 11 and performs a process of setting a conversation type, such as conference or exhibition, from the organizer terminal OT (step S101). Specifically, the control unit 1 communicates with the organizer terminal OT via the network NW through the communication interface unit 4, presents the list of conversation types stored as preset data sets in the conversation type database 31 to the organizer terminal OT, and accepts the selection of a conversation type from the organizer terminal OT.
- The preset data set lists the main roles in the conversation type, with the sound source position and direction for each role set in advance.
- FIG. 5 is a diagram showing an example of a preset data set 311 stored in the conversation type database 31.
- The example shown in FIG. 5 is a preset data set 311 whose conversation type is "exhibition". That is, in the preset data set for this "exhibition", there are participant roles such as "exhibitor EH", "attendant AT", "expert EP", and "visitor VI", and the sound source position and direction are set for each role.
- FIG. 6 is a schematic diagram showing the positional relationship of participants indicated by the example preset data set of FIG. 5. The origin (0, 0, 0) of the sound source positions may be set to the sound source position coordinates of any one role, such as visitor VI, or may be set near the center of the four roles.
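As a rough illustration of what such a preset data set might hold, the following Python sketch models one entry per role with a sound source position and a facing direction. The class name `RolePreset`, the field names, the helper `preset_for`, and every coordinate and angle value are assumptions made for illustration; the actual contents of FIG. 5 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePreset:
    role: str                             # role code, e.g. "EH" for exhibitor
    position: tuple[float, float, float]  # sound source position (x, y, z)
    direction_deg: float                  # facing direction in degrees


# Hypothetical "exhibition" preset; every number below is a placeholder.
EXHIBITION_PRESET = [
    RolePreset("EH", (0.0, 5.0, 0.0), 270.0),
    RolePreset("AT", (3.0, 2.0, 0.0), 180.0),
    RolePreset("EP", (-1.0, 1.0, 0.0), 315.0),
    RolePreset("VI", (0.0, 0.0, 0.0), 90.0),  # visitor placed at the origin
]


def preset_for(role: str) -> RolePreset:
    """Return the preset entry for a role code, or raise KeyError."""
    for entry in EXHIBITION_PRESET:
        if entry.role == role:
            return entry
    raise KeyError(role)
```

Placing the visitor role at the origin here mirrors one of the two origin choices mentioned above; centering the origin between the roles would only shift all coordinates by a constant offset.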
- The conversation type setting unit 11 separately stores the preset data set of the conversation type selected by the organizer in the conversation type database 31 as the selected data set for the online communication to be held.
- The control unit 1 operates as the conversation type editing unit 12 and executes a process of editing the roles, sound source positions, and directions from the organizer terminal OT (step S102).
- The selected data set selected in step S101 and stored in the conversation type database 31 is in a general format and may not suit the online communication intended by the organizer. Therefore, the control unit 1 communicates with the organizer terminal OT via the network NW through the communication interface unit 4, presents the selected data set stored in the conversation type database 31 to the organizer terminal OT, and accepts customization by the organizer. Then, the control unit 1 reflects the customization results in the selected data set stored in the conversation type database 31.
- The process of step S102 can be skipped.
- The control unit 1 operates as the participant information registration unit 13 and performs a process of registering participant information and roles (step S103). Specifically, the control unit 1 communicates with the organizer terminal OT via the network NW through the communication interface unit 4 and accepts the selection, from among the users stored in the participant information database 32, of the participants who will take part in the online communication to be held. Then, the control unit 1 accepts the assignment of each participant to one of the roles constituting the conversation of the selected conversation type. Multiple participants may be assigned to a single role. Then, the control unit 1 stores information on the roles assigned by the organizer in the participant information database 32. Note that users who are not stored in the participant information database 32 may be newly registered from the organizer terminal OT.
- FIG. 7 is a diagram showing an example of the contents of the participant assignment table 321 of the participant information database 32 that stores information on roles assigned in this way.
- In this example, two users are assigned the role of exhibitor EH, one user the role of attendant AT, one user the role of expert EP, and three users the role of visitor VI.
- The control unit 1 operates as the sound source position specifying unit 14 and performs a process of defining the sound source position of each participant (step S104). Specifically, the control unit 1 defines the sound source position of each participant according to the assigned role, based on the preset data set 311 stored in the conversation type database 31 and the participant assignment table 321 stored in the participant information database 32. At this time, the control unit 1 first determines the number of people in each role from the participant assignment table 321, adds that number to the preset data set 311 stored in the conversation type database 31, and thereby creates a participant regulation data set including information on the number of participants, which it stores in the regulation information database 33.
- FIG. 8 is a diagram showing an example of the participant regulation data set 331 stored in the regulation information database 33.
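The counting step described for S104 can be sketched as follows in Python. This is an illustrative sketch only: the user IDs, the dictionary layout, and the function names `count_by_role` and `build_regulation_data_set` are hypothetical, and the head counts simply follow the exhibition example (two exhibitors, one attendant, one expert, three visitors).

```python
from collections import Counter

# Hypothetical participant assignment table: user ID -> role code.
ASSIGNMENTS = {
    "user01": "EH", "user02": "EH",
    "user03": "AT",
    "user04": "EP",
    "user05": "VI", "user06": "VI", "user07": "VI",
}


def count_by_role(assignments: dict[str, str]) -> dict[str, int]:
    """Count how many participants hold each role."""
    return dict(Counter(assignments.values()))


def build_regulation_data_set(preset: list[dict], assignments: dict[str, str]) -> list[dict]:
    """Copy each preset entry and attach the number of participants in that role."""
    counts = count_by_role(assignments)
    return [{**entry, "count": counts.get(entry["role"], 0)} for entry in preset]
```

For instance, `build_regulation_data_set([{"role": "EH"}, {"role": "VI"}], ASSIGNMENTS)` attaches a count of 2 to the exhibitor entry and 3 to the visitor entry, leaving the preset entries themselves unmodified.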
- When multiple participants are assigned to one role, such as exhibitor or visitor, the control unit 1 can simply place the sound source positions of those participants at the same coordinates. Alternatively, the control unit 1 may arrange the sound source positions of the plurality of participants randomly within a certain distance of the sound source position stored in the participant regulation data set 331, or distribute them uniformly.
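The three placement options above might look like the following Python sketch. It is a simplification under stated assumptions: positions are reduced to a 2D plan view, "uniform" distribution is interpreted as even spacing on a circle around the role's position, and the function names `place_same`, `place_random`, and `place_evenly` are hypothetical.

```python
import math
import random

Point = tuple[float, float]


def place_same(center: Point, n: int) -> list[Point]:
    """Put all n participants on the role's own coordinates."""
    return [center] * n


def place_random(center: Point, n: int, radius: float, rng: random.Random) -> list[Point]:
    """Scatter n participants at random within `radius` of the role's position."""
    points = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())  # sqrt gives uniform density over the disc
        a = rng.uniform(0.0, 2.0 * math.pi)
        points.append((center[0] + r * math.cos(a), center[1] + r * math.sin(a)))
    return points


def place_evenly(center: Point, n: int, radius: float) -> list[Point]:
    """Space n participants at equal angles on a circle around the role's position."""
    return [
        (center[0] + radius * math.cos(2.0 * math.pi * k / n),
         center[1] + radius * math.sin(2.0 * math.pi * k / n))
        for k in range(n)
    ]
```

Passing a seeded `random.Random` instance keeps the random layout reproducible, which may matter if every participant's table has to be derived from the same positions.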
- FIG. 9 is a schematic diagram showing the positional relationship of the participants when the sound source positions of multiple participants assigned to one role are defined based on the participant regulation data set 331 in the example of FIG. 8.
- The control unit 1 stores information on the sound source position and direction for each of the participants thus defined in the regulation information database 33.
- Based on the sound source position information for each participant, the control unit 1 creates, for each participant, a sound source position regulation data table referenced to that participant's sound source position, and stores it in the regulation information database 33.
- FIG. 10 is a schematic diagram showing the positional relationship with visitor A in the example of FIG. 9 as a reference.
- In FIG. 10, EH-A is exhibitor A, EH-B is exhibitor B, VI-A is visitor A, VI-B is visitor B, and VI-C is visitor C.
- With visitor A VI-A as the reference, exhibitors EH-A and EH-B are at a far distance in front, attendant AT is at a middle distance to the right, the other visitors VI-B and VI-C are at a close distance, and expert EP is at a close distance to the front left.
- FIG. 11 is a diagram showing, as an example of the contents of the sound source position regulation data table 332 stored in the regulation information database 33, the sound source position regulation data table 332 for visitor A VI-A, who is in the positional relationship shown in FIG. 10 with respect to the other participants.
- The control unit 1 calculates the distance D between the sound source position of visitor A and the sound source position of each participant based on the information on the sound source position and direction for each participant stored in the regulation information database 33, and stores this distance D in the sound source position regulation data table 332. Further, the control unit 1 calculates an angle from the sound source position and direction of visitor A and the sound source position of each participant, and stores this angle in the sound source position regulation data table 332 as the direction θ.
- The control unit 1 similarly creates a sound source position regulation data table 332 for each of the other participants, namely visitors VI-B and VI-C, exhibitors EH-A and EH-B, attendant AT, and expert EP, and stores them in the regulation information database 33.
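One way the distance D and the direction angle could be computed is sketched below in Python. The sketch assumes a 2D plan view (the source uses three coordinates, and this section does not state which axes are horizontal), measures the facing direction counterclockwise from the +x axis in degrees, and reports the angle relative to the listener's facing direction. The function name and these conventions are assumptions, not taken from the source.

```python
import math


def relative_distance_and_direction(listener_pos, listener_facing_deg, source_pos):
    """Return (D, theta_deg) of a source as seen from a listener.

    theta_deg is relative to the listener's facing direction and lies in
    [-180, 180): 0 = straight ahead, positive = counterclockwise (left),
    negative = clockwise (right).
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)
    # Absolute bearing of the source, counterclockwise from the +x axis.
    bearing = math.degrees(math.atan2(dy, dx))
    # Subtract the facing direction and wrap into [-180, 180).
    theta = (bearing - listener_facing_deg + 180.0) % 360.0 - 180.0
    return distance, theta
```

Running this once per other participant, with visitor A as the listener, would produce one row of distance and direction per participant, which is the shape of table described above.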
- The control unit 1 determines a visual expression for each participant based on the positional relationship of the sound source positions of each participant stored in the regulation information database 33, and stores it in the regulation information database 33 (step S105). Specifically, the control unit 1 determines, for each participant, what expression to use according to the distance to the sound source positions of the other participants, and where on the screen to output that expression according to the direction of the sound source positions of the other participants.
- FIG. 12 is a schematic diagram showing a virtual position SVP of a slide in a positional relationship with visitor A as a reference. The control unit 1 thus determines the reference position for depth representation for each participant.
- FIG. 13 is a schematic diagram showing an example of how the slide in FIG. 12 appears on the display screen SC on the participant terminal of visitor A VI-A at the virtual position SVP.
- symbols SY such as ripples that represent the sound source position
- The content of the slide SL may be image-analyzed so that the symbol SY is intentionally displayed while avoiding areas with a high amount of information (specifically, areas with small letters, areas with rapid color changes, etc.), or the transparency of the symbol during display may be temporarily increased, to alleviate the difficulty of viewing the slide SL.
- The control unit 1 determines, using other design techniques, a display form that expresses a sense of distance without interfering with the viewing of the slide SL, that is, a reference position and a distance from the reference position.
- Composition techniques include, for example, drawing perspective lines PL centered on the slide SL, either inside or outside the slide SL.
- Shading techniques include gradually darkening the shading SH from the edges and outside of the slide SL toward the center of the screen, or displaying light and dark alternately.
- Size, brightness, and layer techniques include displaying the symbol SY larger, brighter, or closer to the slide SL as its distance from the slide SL decreases.
- Focus techniques include blurring the symbol SY as the distance from the slide SL increases.
- Animation techniques include increasing the number of ripples shown as the symbol SY as the volume increases.
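The size, brightness, focus, and animation techniques listed above can be combined into a single mapping from distance and volume to drawing parameters, as in the following Python sketch. All names and constants here (`symbol_style`, `max_distance`, `max_ripples`, the 0.7 and 0.6 scaling factors, the 8 pixel blur) are hypothetical choices for illustration; the source does not give concrete values.

```python
def symbol_style(distance: float, volume: float,
                 max_distance: float = 10.0, max_ripples: int = 5) -> dict:
    """Map a speaker's distance and volume to drawing parameters for the symbol.

    distance: distance from the depth reference position (same unit as max_distance).
    volume:   speech volume normalised to the range 0..1.
    """
    # Normalise distance to 0..1 (0 = at the reference position, 1 = far away).
    d = min(max(distance / max_distance, 0.0), 1.0)
    v = min(max(volume, 0.0), 1.0)
    return {
        "scale": 1.0 - 0.7 * d,       # nearer symbols are drawn larger
        "brightness": 1.0 - 0.6 * d,  # nearer symbols are drawn brighter
        "blur_px": 8.0 * d,           # farther symbols are more blurred
        "ripples": 1 + round(v * (max_ripples - 1)),  # louder speech, more ripples
    }
```

Clamping both inputs keeps the output within a fixed range, so a speaker beyond `max_distance` is simply drawn at the smallest, dimmest, most blurred setting.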
- For example, as a shading technique, the reference position BP for depth expression is made the brightest, and as a focus technique, the symbol SY is displayed more blurred the farther it is from the reference position BP. Note that in the case of pattern (2) shown in FIG. 15, it is desirable to determine priorities among roles or participants so that the reference position can be uniquely determined even when a plurality of participants speak at the same time.
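A simple way to make the reference speaker unique when several participants speak at once is a fixed priority order over roles with a deterministic tie-break, as sketched below in Python. The priority order, the role codes used as keys, and the function name `pick_reference_speaker` are assumptions for illustration; the source only says that priorities should be determined, not what they are.

```python
from typing import Optional

# Hypothetical role priorities; a lower number means a higher priority.
ROLE_PRIORITY = {"EH": 0, "EP": 1, "AT": 2, "VI": 3}


def pick_reference_speaker(active_speakers: list[tuple[str, str]]) -> Optional[str]:
    """Choose one speaker as the depth reference.

    active_speakers holds (participant_id, role_code) pairs for everyone
    currently speaking. Returns the chosen participant_id, or None if
    nobody is speaking.
    """
    if not active_speakers:
        return None
    # Order by role priority first, then by participant ID so that two
    # speakers with the same role are still resolved deterministically.
    unknown = len(ROLE_PRIORITY)  # roles without a priority sort last
    best = min(active_speakers,
               key=lambda s: (ROLE_PRIORITY.get(s[1], unknown), s[0]))
    return best[0]
```

With this ordering, an exhibitor and a visitor speaking simultaneously would resolve to the exhibitor, and two visitors would resolve to whichever has the smaller ID.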
- The control unit 1 determines whether the communication interface unit 4 has received an instruction to end the preparation process from the organizer terminal OT via the network NW (step S106). If it determines that the instruction has not yet been received, the control unit 1 returns to step S101 described above and prepares for another online communication. If it determines that the instruction has been received, the control unit 1 ends the preparation process.
- FIG. 16 is a flowchart showing an example of the processing procedure and processing contents of conversation processing executed by the control unit 1.
- the control unit 1 executes the conversation process shown in this flowchart for each online communication set by the organizer.
- the control unit 1 can perform the processing shown in this flowchart in parallel for a plurality of online communications held at the same time.
- When the communication interface unit 4 receives a command to start online communication via the network NW, the control unit 1 starts this conversation processing for that online communication. The control unit 1 then determines whether there is a new participant (step S111). For example, when the control unit 1 receives an online communication start command, it determines that the participant of the participant terminal that sent the start command is a new participant. In addition, when a participation command is received from a participant terminal, among the participant terminals PT1 to PTn of the participants stored in the participant information database 32 as participants of the online communication, for which sound field information and the like are not yet stored in the generated information database 34, the control unit 1 determines that the participant of the participant terminal that sent the participation command is a new participant.
- The control unit 1 operates as the sound field generation unit 16 and generates a sound field for the new participant, taking into consideration the positional relationship between the participants (step S112). Specifically, for the participant terminal of the new participant, the control unit 1 generates sound field information for localizing the sound images of audio information transmitted from the participant terminals of the other participants, based on the positional relationship between the participants stored in the sound source position regulation data table 332 of the regulation information database 33. The control unit 1 stores the generated sound field information for the new participant in the generated information database 34.
- The control unit 1 operates as the visual effect generation unit 18 and generates a visual effect for the new participant (step S113). Specifically, the control unit 1 generates visual effect information for that participant's terminal based on the position of the new participant stored in the sound source position regulation data table 332 of the regulation information database 33 and the visual representation stored in the regulation information database 33. This visual effect information is, for example, the information shown in FIGS. 13 to 15 except for the symbol SY. The control unit 1 stores the generated visual effect information for the new participant in the generated information database 34.
- The control unit 1 operates as the visual effect expression unit 19 and provides the generated visual effect to the new participant via the network NW through the communication interface unit 4 (step S114). Specifically, the control unit 1 transmits the visual effect information generated in step S113 to the participant terminal of the new participant.
- The control unit 1 operates as the audio playback unit 17 and, taking into account the positional relationship between the participants, plays back the audio to the other participants, excluding the source of the audio information (step S116).
- Specifically, the control unit 1 applies the input audio information to the sound field information stored in the generation information database 34 for each of the participant terminals PT1 to PTn excluding the participant terminal PTi that is the source of the audio information. That is, the audio playback unit 17 generates, for each of the other participants, sound field information in which the sound image of the input audio information is localized.
- For the participant terminal PTa, the control unit 1 generates sound field information that localizes the sound image of the audio based on the audio information at a position determined by the distance D r4-a_r1-a and the direction θ r4-a_r1-a.
- Similarly, the control unit 1 can generate sound field information for the participant terminals of the participants to whom the roles of Exhibitor B EH-B, Attendant AT, Expert EP, Visitor B VI-B, and Visitor C VI-C are assigned.
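A distance and direction of this kind can be derived from two positions with elementary trigonometry. The sketch below is only an assumption about the geometry: it takes positions on the horizontal X-Z plane, treats +Z as "straight ahead" for the listener, and measures the angle in degrees with positive values to the right. None of these conventions, nor the function name, come from the source.

```python
import math
from typing import Tuple

Position = Tuple[float, float]  # (x, z) on the horizontal plane (assumed)


def relative_distance_direction(listener: Position,
                                speaker: Position) -> Tuple[float, float]:
    """Return (distance, direction in degrees) of the speaker from the listener."""
    dx = speaker[0] - listener[0]
    dz = speaker[1] - listener[1]
    distance = math.hypot(dx, dz)
    # 0 degrees = straight ahead (+Z); positive = toward the right (+X).
    direction = math.degrees(math.atan2(dx, dz))
    return distance, direction
```

A speaker two units straight ahead gives a distance of 2 and a direction of 0 degrees; a speaker three units to the right gives a direction of 90 degrees.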
- The control unit 1 transmits the sound field information generated for each of the other participants, via the communication interface unit 4, to the participant terminals PT1 to PTn other than the participant terminal PTi that is the source of the audio information.
- The control unit 1 operates as the visual effect generation unit 18 and generates visual effects for the other participants in consideration of the positional relationship between the participants (step S117). Specifically, the control unit 1 applies the input audio information to the visual effect information stored in the generation information database 34 for each of the participant terminals PT1 to PTn other than the participant terminal PTi that is the source of the audio information; that is, it adds the symbol SY as a visual effect.
- The control unit 1 determines whether any participant is leaving the online communication (step S119). If so, the control unit 1 deletes the sound field information and visual effect information for that participant stored in the generated information database 34 (step S120).
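The conversation processing described in steps S111 to S120 can be pictured as three event handlers over a per-terminal store. The following is a deliberately simplified sketch: the class, its method names, and the string placeholders standing in for sound field and visual effect information are all hypothetical, and sending the updated visual effect to the other terminals after step S117 is an assumption, since the text here only describes its generation.

```python
from typing import Dict, List, Tuple


class ConversationState:
    """Toy stand-in for the control unit 1 and the generated information database 34."""

    def __init__(self) -> None:
        self.generated: Dict[str, Dict[str, str]] = {}
        self.sent: List[Tuple[str, str, str]] = []  # (terminal, kind, payload)

    def join(self, terminal: str) -> None:
        if terminal in self.generated:  # S111: not a new participant
            return
        self.generated[terminal] = {
            "sound_field": f"sound-field-for-{terminal}",      # S112
            "visual_effect": f"visual-effect-for-{terminal}",  # S113
        }
        # S114: provide the visual effect to the new participant.
        self.sent.append((terminal, "visual_effect",
                          self.generated[terminal]["visual_effect"]))

    def audio(self, source: str, audio: str) -> None:
        for terminal, info in self.generated.items():
            if terminal == source:  # never play audio back to its source
                continue
            # S116: apply the audio to the stored sound field and send it.
            self.sent.append((terminal, "sound_field",
                              f"{info['sound_field']}+{audio}"))
            # S117: add the symbol to the stored visual effect (sending assumed).
            self.sent.append((terminal, "visual_effect",
                              f"{info['visual_effect']}+symbol({source})"))

    def leave(self, terminal: str) -> None:
        # S119-S120: delete the information stored for the leaving participant.
        self.generated.pop(terminal, None)
```

Keeping the per-terminal information in one store mirrors the text's use of a single database for both sound field and visual effect information, which is what lets a departure be handled by a single deletion.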
- The information presentation device includes a sound source position defining unit 14 that defines a sound source position, which is the position of a first participant who is the participant of a first participant terminal, with reference to the position of a second participant.
- It also includes the sound field generation unit 16 and the audio playback unit 17, which function as an audio presentation unit that, for each of one or more second participant terminals, generates sound field information in which the audio information from the first participant terminal is localized based on the sound source position of the first participant, and transmits the sound field information to each of the second participant terminals.
- It further includes a visual effect generation unit 18 and a visual effect expression unit 19, which function as a visual effect presentation unit that generates a visual effect based on the sound source position of the first participant for each of the one or more second participant terminals and transmits it to each of the second participant terminals.
- The sound image of each speaker is localized individually based on the role, and each sound image position is effectively visualized using visual effects. It is thereby possible to provide an information presentation technique that allows the user to perceive an appropriate sense of distance from the speaker without interfering with the visual information to be projected.
- The sound source position defining unit determines, based on the positional relationship of the first and second participants, a visual expression for visualizing a position corresponding to the sound image localization position on the display screen of each of the second participant terminals, and the visual effect presentation unit generates the visual effect according to the visual expression determined by the sound source position defining unit. Therefore, according to one embodiment, by determining a visual expression for each participant based on the positional relationship with the speaker, a visual effect can be generated quickly even if the speaker changes one after another. Visual effects can thus be presented without time lag and without making participants feel uncomfortable.
- The visual representation includes a display form representing a reference position and a distance from the reference position. Therefore, according to one embodiment, by changing a visual expression used in design, such as color shading or blurring, depending on the reference position or the distance from it, the sense of distance in the depth direction can be visualized with a small amount of information. That is, the depth of the sound image can be expressed without interfering with existing visual information.
- The device further comprises the conversation type database 31, which functions as a first storage unit that stores sound source positions for each role, and the participant information database 32, which functions as a second storage unit that stores the roles assigned to each participant. The sound source position defining unit defines the sound source position for each second participant based on the sound source positions stored in the first storage unit and the roles stored in the second storage unit. Therefore, according to one embodiment, by preparing the information necessary for defining the sound source position in advance, the sound source positions of multiple participants can easily be defined for an arbitrary participant.
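The two-table lookup described above can be sketched with plain dictionaries. This is an illustrative assumption about the data layout, not the actual schema of databases 31 and 32: `role_positions` stands in for the first storage unit, `participant_roles` for the second, and positions are expressed relative to the listener by simple subtraction.

```python
from typing import Dict, Tuple

Position = Tuple[float, float, float]  # (x, y, z)


def define_sound_source_positions(role_positions: Dict[str, Position],
                                  participant_roles: Dict[str, str],
                                  listener: str) -> Dict[str, Position]:
    """For one listener, return every other participant's position relative to them."""
    lx, ly, lz = role_positions[participant_roles[listener]]
    relative: Dict[str, Position] = {}
    for participant, role in participant_roles.items():
        if participant == listener:
            continue  # a participant is not a sound source for themselves
        x, y, z = role_positions[role]
        relative[participant] = (x - lx, y - ly, z - lz)
    return relative
```

Because positions are keyed by role rather than by participant, two participants with the same role resolve to the same sound source position in this sketch.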
- The participant information registration unit 13 is further provided, which functions as a participant registration unit that assigns a role to each of the participants of the plurality of participant terminals and stores it in the second storage unit. Thus, according to one embodiment, roles can be arbitrarily assigned to participants. Note that the same role may be assigned to multiple participants.
- A common positional relationship is used for all roles and all participants. However, the relationship need not be common, and the optimal positional relationship may be constructed individually for each. For example, in the case of "exhibition" as a conversation type in online communication, participants who participate as "visitors" want the "exhibitor" to be far away in front of them. On the other hand, for participants who participate as "exhibitors", "visitors" are expected to be close by on the right. Furthermore, in discussions such as "meetings" as a conversation type in online communication, participants have requests such as wanting people with similar ideas to be close to them.
- FIG. 17 is a diagram showing an example of the preset data set 311 stored in the conversation type database 31 in this first modification.
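One way to picture a per-role positional relationship is a preset keyed first by the listener's role and then by the speaker's role. The table below is invented for illustration and is not the content of FIG. 17; the coordinate values, the axis convention (x to the right, z to the front), the default position, and the function name are all assumptions.

```python
from typing import Dict, Tuple

Position = Tuple[float, float, float]  # (x, y, z), x = right, z = front (assumed)

# Hypothetical preset: listener role -> speaker role -> position seen from the listener.
PRESET: Dict[str, Dict[str, Position]] = {
    "visitor": {"exhibitor": (0.0, 0.0, 5.0)},   # far away, in front
    "exhibitor": {"visitor": (1.0, 0.0, 1.0)},   # close by, to the right
}
DEFAULT_POSITION: Position = (0.0, 0.0, 2.0)     # fallback for unlisted pairs


def sound_source_position(listener_role: str, speaker_role: str) -> Position:
    """Look up where a speaker of one role is placed for a listener of another role."""
    return PRESET.get(listener_role, {}).get(speaker_role, DEFAULT_POSITION)
```

With this layout the same pair of participants can be placed asymmetrically, for example far in front from the visitor's side and close on the right from the exhibitor's side.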
- The sound source position is represented by values on the X, Y, and Z axes. However, if the value on any of the X, Y, or Z axes is the same for the sound source positions of all roles, the setting of that axis may be omitted.
- FIG. 18 is a schematic diagram showing omission of the Y-axis setting in the second modification.
- If the heights (Y-axis coordinates) of all the roles are the same, the sound source positions can be set as two-dimensional coordinates on the X and Z axes.
- Various types of processing based on the sound source position can also be performed based only on the coordinates of the XZ axes.
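The axis omission can be illustrated with a small helper that drops any axis on which all roles share the same value. The dictionary layout and the function name are assumptions for this sketch.

```python
from typing import Dict


def omit_constant_axes(positions: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Drop every axis whose value is identical across all roles."""
    axes = ("x", "y", "z")
    # An axis is kept only if at least two roles differ on it.
    kept = [a for a in axes if len({p[a] for p in positions.values()}) > 1]
    return {role: {a: p[a] for a in kept} for role, p in positions.items()}
```

When every role has the same height, the result contains only the `x` and `z` entries, so later distance and direction calculations can work on two-dimensional coordinates.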
- The control unit 1 may automatically assign roles to each participant without depending on settings from the organizer. For example, the control unit 1 can perform the assignment based on user affiliation information stored in the participant information database 32 in advance. Furthermore, by accumulating the past conversation content, conversation amount, and conversation timing of each user, the control unit 1 can infer the role from the accumulated information. For example, in online communication with the conversation type "exhibition", the control unit 1 can estimate the likelihood of each role for a user, such as a user who speaks a lot in the first half of the conversation being likely to be an "exhibitor", and assign the role with the higher likelihood.
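A very simple version of such an inference is sketched below. It is a hypothetical heuristic, not the estimation method of the application: it totals the speaking time of utterances that start in the first half of a past conversation and labels the top speaker "exhibitor" and everyone else "visitor". The input format and the function name are assumptions.

```python
from typing import Dict, List, Tuple

Utterance = Tuple[str, float, float]  # (participant, start time, duration) in seconds


def infer_roles(utterances: List[Utterance],
                conversation_length: float) -> Dict[str, str]:
    """Guess roles from who speaks most during the first half of the conversation."""
    half = conversation_length / 2
    first_half: Dict[str, float] = {}
    participants = set()
    for who, start, duration in utterances:
        participants.add(who)
        if start < half:
            first_half[who] = first_half.get(who, 0.0) + duration
    if not first_half:
        # No early speech at all: nobody stands out as an exhibitor.
        return {p: "visitor" for p in participants}
    likely = max(first_half, key=lambda p: first_half[p])
    return {p: ("exhibitor" if p == likely else "visitor") for p in participants}
```

A real system would presumably combine several signals (content, amount, timing) into a likelihood per role; this sketch uses only one of them to keep the idea visible.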
- When determining the visual expression based on the positional relationship between the sound source positions, the control unit 1 does not need to use a common visual expression for all roles and all participants, and may change it for each. For example, in response to feedback from each participant during or after the conversation, the control unit 1 can adjust the type and degree of emphasis of the visual expression.
- The visual expression determined by the control unit 1 may be editable from the organizer terminal OT, for example by intentionally moving the speaker's sound source position at any timing to change the sense of distance.
- For example, the organizer may be able to change the visual expression, such as by shortening the distance during explanations to make the voice sound familiar, or by changing the distance when an actor plays multiple roles during a play.
- The present invention is applicable to all online communication involving audio, including conversation scenes as described in the embodiments, and it is also applicable to some real-world (offline) uses. For example, it can be applied to scenes such as audio guides at art museums, where the subject wears earphones or headphones and listens to the explanatory voice of an invisible speaker while viewing the content (stereophonic sound only). In addition, by wearing AR (Augmented Reality) glasses, it can be applied to scenes that also express visual effects (stereophonic sound + visual effects).
- Although the embodiment shows a case where the information presentation device is composed of one communication server CS, it may be composed of a plurality of servers.
- a server that performs preparation processing and a server that performs conversation processing may be separated, or the servers may be separated according to the type of conversation.
- the present invention is not limited to the above-described embodiments as they are, but can be embodied by modifying the constituent elements at the implementation stage without departing from the spirit of the invention.
- various inventions can be formed by appropriately combining the plurality of components disclosed in the above embodiments. For example, some components may be deleted from all the components shown in the embodiments. Furthermore, components from different embodiments may be combined as appropriate.
- 1...Control unit; 2...Program storage unit; 3...Data storage unit; 4...Communication interface unit; 5...Bus; 11...Conversation type setting unit; 12...Conversation type editing unit; 13...Participant information registration unit; 14...Sound source position specifying unit; 15...Audio acquisition unit; 16...Sound field generation unit; 17...Audio reproduction unit; 18...Visual effect generation unit; 19...Visual effect expression unit; 31...Conversation type database; 32...Participant information database; 33...Regulation information database; 34...Generation information database; 311...Preset data set; 321...Participant assignment table; 331...Participant regulation data set; 332...Sound source position regulation data table; AT...Attendant; BP...Reference position for depth expression; CS...Communication server; EH, EH-A, EH-B...Exhibitor; EP...Expert; NW...Network; OT...Organizer terminal; PL...Perspective line; PT1 to PTn...Participant terminal; SC...Display screen; SH...Shading; SL...Slide; SVP...Slide virtual position; SY...Symbols; VI, VI-
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2022/024206 WO2023243059A1 (ja) | 2022-06-16 | 2022-06-16 | 情報提示装置、情報提示方法及び情報提示プログラム |
| JP2024528044A JPWO2023243059A1 | 2022-06-16 | 2022-06-16 | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023243059A1 (ja) | 2023-12-21 |
Family
ID=89192595
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2022/024206 Ceased WO2023243059A1 (ja) | 2022-06-16 | 2022-06-16 | 情報提示装置、情報提示方法及び情報提示プログラム |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2023243059A1 |
| WO (1) | WO2023243059A1 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2012147420A (ja) * | 2010-12-22 | 2012-08-02 | Ricoh Co Ltd | 画像処理装置、及び画像処理システム |
| CN111025233A (zh) * | 2019-11-13 | 2020-04-17 | 阿里巴巴集团控股有限公司 | 一种声源方向定位方法和装置、语音设备和系统 |
| WO2020240724A1 (ja) * | 2019-05-29 | 2020-12-03 | 日本電気株式会社 | 光ファイバセンシングシステム、光ファイバセンシング機器及び音出力方法 |
| JP2022054192A (ja) * | 2020-09-25 | 2022-04-06 | 大日本印刷株式会社 | リモート会議システム、サーバ、写真撮影装置、音声出力方法、及びプログラム |
-
2022
- 2022-06-16 WO PCT/JP2022/024206 patent/WO2023243059A1/ja not_active Ceased
- 2022-06-16 JP JP2024528044A patent/JPWO2023243059A1/ja active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2023243059A1 | 2023-12-21 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22946875; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024528044; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22946875; Country of ref document: EP; Kind code of ref document: A1 |